Commit 174c105 by joethequant · 1 parent: 80c79c2

Update README.md

added additional info

Files changed (1)
  1. README.md +49 -0
@@ -62,6 +62,55 @@ Performance and analytics:
  ## How to Use
  Instructions on how to use the model, including example prompts and API documentation, are available in the [Code Repository](https://github.com/joethequant/docker_streamlit_antibody_protein_generation).

+ ### Example Code
+
+ ```python
+ from models.progen.modeling_progen import ProGenForCausalLM
+ import torch
+ from tokenizers import Tokenizer
+
+ # Define the model identifier from Hugging Face's model hub
+ model_path = 'AntibodyGeneration/fine-tuned-progen2-small'
+
+ # Load the model and tokenizer (tokenizer.json is expected in the working directory)
+ model = ProGenForCausalLM.from_pretrained(model_path)
+ tokenizer = Tokenizer.from_file('tokenizer.json')
+
+ # Define your sequence and other parameters
+ target_sequence = 'MQIPQAPWPVVWAVLQLGWRPGWFLDSPDRPWNPPTFSPALLVVTEGDNATFTCSFSNTSESFVLNWYRMSPSNQTDKLAAFPEDRSQPGQDCRFRVTQLPNGRDFHMSVVRARRNDSGTYLCGAISLAPKAQIKESLRAELRVTERRAEVPTAHPSPSPRPAGQFQTLVVGVVGGLLGSLVLLVWVLAVICSRAARGTIGARRTGQPLKEDPSAVPVFSVDYGELDFQWREKTPEPPVPCVPEQTEYATIVFPSGMGTSSPARRGSADGPRSAQPLRPEDGHCSWPL'
+ number_of_sequences = 2
+
+ # Move the model to CUDA if available and switch to inference mode
+ device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
+ model = model.to(device).eval()
+
+ # Tokenize the sequence: tokenizers.Tokenizer is not callable, so use encode() and build the tensor manually
+ input_ids = torch.tensor([tokenizer.encode(target_sequence).ids], device=device)
+ # Assumption: tokenizer.json defines a '<|pad|>' token, as the ProGen2 tokenizer does
+ pad_token_id = tokenizer.token_to_id('<|pad|>')
+
+ # Generate sequences
+ with torch.no_grad():
+     output = model.generate(input_ids, max_length=1024, pad_token_id=pad_token_id, do_sample=True, top_p=0.9, temperature=0.8, num_return_sequences=number_of_sequences)
+
+ # Decode the output to get the generated sequences
+ generated_sequences = [tokenizer.decode(output_seq.tolist(), skip_special_tokens=True) for output_seq in output]
+ ```
+
+ ## Links
+
+ - [Hugging Face Model Repository](https://huggingface.co/AntibodyGeneration)
+ - [Web Demo](https://orca-app-ygzbp.ondigitalocean.app/Demo_Antibody_Generator)
+ - [Open-Source RunPod Serverless REST API](https://github.com/joethequant/docker_protein_generator)
+ - [The Code for this App](https://github.com/joethequant/docker_streamlit_antibody_protein_generation)
+
+ ## Additional Resources and Links
+ - [ProGen Foundation Models](https://github.com/salesforce/progen)
+ - [ANARCI GitHub](https://github.com/oxpig/ANARCI)
+ - [ANARCI Webserver](http://opig.stats.ox.ac.uk/webapps/anarci/)
+ - [TAP: Therapeutic Antibody Profiler](https://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/sabpred/tap)
+ - [ESMFold](https://esmatlas.com/resources?action=fold)
+
  ## Limitations and Future Work
  - Predictions require experimental validation for practical use.
  - Future improvements will focus on incorporating diverse training data and enhancing prediction accuracy for the efficacy of generated antibodies.
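
Because generated sequences still need validation, a cheap sanity filter can be useful before passing the `generated_sequences` list from the example to downstream tools. The sketch below is illustrative only: `filter_valid_sequences` and `VALID_AMINO_ACIDS` are hypothetical names that do not come from the repository, and the sketch assumes decoded outputs may contain characters other than the 20 standard amino-acid letters.

```python
# Hypothetical helper, not part of the repository: keeps only plausible protein strings.
VALID_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def filter_valid_sequences(sequences, min_length=1):
    """Return unique sequences made only of the 20 standard amino-acid letters."""
    seen = set()
    kept = []
    for seq in sequences:
        cleaned = seq.strip().upper()
        if len(cleaned) < min_length:
            continue  # drop empty or too-short outputs
        if not set(cleaned) <= VALID_AMINO_ACIDS:
            continue  # drop outputs containing non-residue characters
        if cleaned in seen:
            continue  # drop exact duplicates
        seen.add(cleaned)
        kept.append(cleaned)
    return kept


# Stand-in strings for illustration only; in practice pass `generated_sequences`.
example_sequences = ["EVQLVESGG", "evqlvesgg ", "EVQ1LVE", "", "QVQLQESGP"]
print(filter_valid_sequences(example_sequences))
```

This only checks the alphabet and removes duplicates. It says nothing about whether a sequence folds or binds, so it does not replace the experimental validation noted above.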