Gemma 1.1 2B IT: Merged Model v2 (DrugBank KG-to-Text)
Model Summary
This is the full merged model of Gemma 1.1 2B IT, fine-tuned with LoRA to generate fluent, low-hallucination natural language drug descriptions from pharmaceutical RDF knowledge graph triples sourced from DrugBank. It was developed as part of a UEL-Depixen industrial placement research project focused on building trustworthy, domain-specific small language models (SLMs).
This is the recommended model for inference; no additional adapter loading is required.
For the LoRA adapter only, use: BSVGK/gemma-1.1-2b-it-drugbank-kg2text-lora-v2
Key Results
| Metric | Score |
|---|---|
| BLEU Score | 0.9737 |
| BERTScore F1 | 0.9896 |
| Fact F1 | 0.9966 |
| Hallucination Rate | 0.54% |
| Test Samples | 254 unseen DrugBank entries |
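The card does not state how Fact F1 and the hallucination rate are computed. As a rough illustration only, the sketch below assumes each generated description has already been reduced to a set of (subject, predicate, object) facts by some extractor that is not shown. Under that assumption it scores Fact F1 as set overlap with the input triples and treats a generated fact as hallucinated when it has no match in the input. The function name `fact_scores`, the toy triples, and both metric definitions are assumptions, not the author's evaluation code.

```python
def fact_scores(source_facts, generated_facts):
    """Return (precision, recall, f1, hallucination_rate) for two fact sets.

    Assumed definitions (not taken from the model card):
      precision          = supported generated facts / all generated facts
      recall             = supported generated facts / all source facts
      hallucination_rate = unsupported generated facts / all generated facts
    """
    source = set(source_facts)
    generated = set(generated_facts)
    supported = source & generated

    precision = len(supported) / len(generated) if generated else 0.0
    recall = len(supported) / len(source) if source else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    hallucination_rate = (
        len(generated - source) / len(generated) if generated else 0.0
    )
    return precision, recall, f1, hallucination_rate


# Toy data: the third generated fact names the wrong interacting drug.
source = {
    ("DrugA", "hasIndication", "Condition_X"),
    ("DrugA", "hasMechanism", "Mechanism_Y"),
    ("DrugA", "hasInteraction", "DrugB"),
}
generated = {
    ("DrugA", "hasIndication", "Condition_X"),
    ("DrugA", "hasMechanism", "Mechanism_Y"),
    ("DrugA", "hasInteraction", "DrugC"),
}

precision, recall, f1, hallucination_rate = fact_scores(source, generated)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f} halluc={hallucination_rate:.3f}")
```

A corpus-level figure would aggregate these counts over all test samples; whether the reported numbers are micro- or macro-averaged is not specified in the card.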
Model Details
- Base Model: google/gemma-1.1-2b-it
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Task: KG-to-Text (RDF triples → fluent drug descriptions)
- Domain: Pharmaceutical (DrugBank)
- Training Dataset: 2,537 verified DrugBank samples (RDF triples paired with descriptions)
- Hardware: NVIDIA A100
- Framework: PyTorch, Hugging Face PEFT, TRL (SFTTrainer)
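Merging a LoRA adapter folds the low-rank update into the base weights, which is why a merged model needs no adapter at inference time. The toy example below demonstrates the standard LoRA identity W' = W + (alpha / r) * B A on tiny hand-written matrices, and checks that the merged weight gives the same output as the base weight plus the adapter path. The matrix values, the rank `r = 1`, and `alpha = 2.0` are made up for illustration; the card does not list this model's LoRA hyperparameters.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]


# Illustrative values only (not the model's real weights or hyperparameters).
W = [[1.0, 0.0], [0.0, 1.0]]  # frozen base weight, shape (out=2, in=2)
B = [[1.0], [2.0]]            # LoRA "up" matrix, shape (out=2, r=1)
A = [[0.5, -0.5]]             # LoRA "down" matrix, shape (r=1, in=2)
r, alpha = 1, 2.0
scaling = alpha / r

# Merge: fold the low-rank update into the base weight once, ahead of time.
W_merged = add(W, scale(matmul(B, A), scaling))

# Compare the two inference paths on one input column vector.
x = [[3.0], [1.0]]
y_adapter = add(matmul(W, x), scale(matmul(B, matmul(A, x)), scaling))
y_merged = matmul(W_merged, x)

print(W_merged)   # merged weight
print(y_adapter)  # base + adapter path
print(y_merged)   # merged path
```

Both paths produce the same output, so the merged checkpoint trades extra storage for a simpler loading path with no PEFT dependency.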
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the merged model directly; no adapter is needed
model_id = "BSVGK/gemma-1.1-2b-it-drugbank-kg2text-merged-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = """Generate a natural language description from the following RDF triples:
Triples:
- DrugA hasIndication Condition_X
- DrugA hasMechanism Mechanism_Y
- DrugA hasInteraction DrugB
Description:"""

# Tokenize the prompt, generate up to 256 new tokens, and decode the result
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
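For more than a handful of drugs it may be convenient to build the prompt programmatically. The helper below is a minimal sketch that reproduces the prompt layout from the usage example for an arbitrary list of triples, plus a small function that strips the echoed prompt from the decoded output (the decoded sequence contains the prompt followed by the generated text). The names `build_prompt` and `extract_description` are hypothetical and not part of the model repository.

```python
INSTRUCTION = "Generate a natural language description from the following RDF triples:"
MARKER = "Description:"


def build_prompt(triples):
    """Format (subject, predicate, object) triples into the usage-example prompt."""
    lines = [INSTRUCTION, "Triples:"]
    lines += [f"- {s} {p} {o}" for s, p, o in triples]
    lines.append(MARKER)
    return "\n".join(lines)


def extract_description(decoded):
    """Return the text after the first 'Description:' marker, or the whole
    string (stripped) if the marker is absent."""
    _, sep, tail = decoded.partition(MARKER)
    return tail.strip() if sep else decoded.strip()


triples = [
    ("DrugA", "hasIndication", "Condition_X"),
    ("DrugA", "hasMechanism", "Mechanism_Y"),
    ("DrugA", "hasInteraction", "DrugB"),
]
prompt = build_prompt(triples)
print(prompt)
```

The resulting `prompt` string can be passed to the tokenizer exactly as in the usage example, and `extract_description` can be applied to the decoded output to keep only the generated description.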
Dataset
- Dataset: BSVGK/drugbank_dataset
- Size: 2,537 training + 254 test samples
- Source: DrugBank pharmaceutical database
- Format: RDF Triples → Natural Language Drug Description
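The dataset can presumably be pulled from the Hub with the `datasets` library. The sketch below loads it by the ID given above and summarises each split rather than assuming any column names, since the card does not document the schema. The function names are hypothetical, and the call may need a configuration name or authentication if the repository is set up that way.

```python
def load_drugbank_splits(dataset_id="BSVGK/drugbank_dataset"):
    """Download the dataset from the Hugging Face Hub (requires `datasets`)."""
    # Imported lazily so this module can be imported without the dependency.
    from datasets import load_dataset

    return load_dataset(dataset_id)


def describe_splits(splits):
    """Map each split name to (row count, column names) for quick inspection."""
    return {
        name: (split.num_rows, list(split.column_names))
        for name, split in splits.items()
    }


# Typical use (needs network access, so it is left commented out here):
# for name, (rows, columns) in describe_splits(load_drugbank_splits()).items():
#     print(name, rows, columns)
```

Inspecting the column names first avoids hard-coding field names that may not match the published dataset.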
Comparison with LoRA Adapter
| | Merged Model v2 | LoRA Adapter v2 |
|---|---|---|
| Inference | Direct; no base model needed | Requires base model + PEFT |
| Storage | Larger (full model) | Smaller (adapter only) |
| Speed | Faster to load | Slower to load |
| Recommended for | Production inference | Research & experimentation |
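For the adapter route in the table above, the usual PEFT pattern is to load the base model first and then attach the adapter. The sketch below uses the base model and adapter IDs named in this card; the function name `load_adapter_model` and the optional `merge` flag are illustrative. Access to the Gemma base model may require accepting its licence on the Hub.

```python
BASE_ID = "google/gemma-1.1-2b-it"
ADAPTER_ID = "BSVGK/gemma-1.1-2b-it-drugbank-kg2text-lora-v2"


def load_adapter_model(base_id=BASE_ID, adapter_id=ADAPTER_ID, merge=False):
    """Load the base model, attach the LoRA adapter, and optionally merge it.

    Requires `transformers` and `peft`; both are imported lazily so that
    defining this function does not need them installed.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)

    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)

    if merge:
        # Fold the adapter into the base weights and drop the PEFT wrapper.
        model = model.merge_and_unload()
    return tokenizer, model


# Typical use (downloads several GB, so it is left commented out here):
# tokenizer, model = load_adapter_model(merge=True)
```

With `merge=True` the returned model should behave like a merged checkpoint, although the card does not confirm that the published merged model was produced this way.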
Intended Use
- Pharmaceutical knowledge graph verbalisation
- Drug information summarisation and description generation
- Research in trustworthy and hallucination-free biomedical NLP
- Natural language generation from biomedical knowledge graphs
Out of Scope
- Non-pharmaceutical domains
- Clinical diagnosis or medical advice
- General purpose text generation
Important Notice
This model is intended for research purposes only. It should not be used for clinical decision-making or medical advice. Always consult a qualified healthcare professional.
Citation
@misc{bubathula2026drugbank_merged,
  author = {Sai Venkata Gopala Krishna Bubathula},
  title = {Gemma 1.1 2B IT Merged Model v2: KG-to-Text Generation for DrugBank Pharmaceutical Data},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/BSVGK/gemma-1.1-2b-it-drugbank-kg2text-merged-v2},
  institution = {University of East London \& Depixen}
}
Developer
Sai Venkata Gopala Krishna Bubathula
- MSc Big Data Technologies, University of East London
- AI Engineer, UEL-Depixen Industrial Placement
- GitHub
- LoRA Adapter: BSVGK/gemma-1.1-2b-it-drugbank-kg2text-lora-v2
- LinkedIn