QLoRA: Efficient Finetuning of Quantized LLMs (arXiv:2305.14314)
How to use adity12345/qwen2.5-1.5b-medical-finetuned with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B")
model = PeftModel.from_pretrained(base_model, "adity12345/qwen2.5-1.5b-medical-finetuned")

This model is a finetuned version of Qwen/Qwen2.5-1.5B trained using QLoRA (4-bit quantization + LoRA).
Pretraining Details:
Finetuning Details:
LoRA Configuration:
Training:
Dataset: Medical text corpus for domain adaptation
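To give a rough sense of why LoRA adapters are small, the sketch below compares the trainable-parameter count of one fully finetuned linear layer with that of a LoRA adapter on the same layer. The layer dimensions and rank are illustrative assumptions, not values taken from this adapter's configuration, and `lora_param_counts` is a hypothetical helper.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one linear layer.

    Full finetuning updates the whole d_out x d_in weight matrix.
    LoRA freezes it and trains two low-rank factors instead:
    A with shape (rank, d_in) and B with shape (d_out, rank).
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Illustrative values only; not read from this adapter's config.
full, lora = lora_param_counts(d_in=1536, d_out=1536, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

The ratio shrinks as the layer gets wider, because the LoRA count grows linearly with the layer dimensions while the full count grows with their product.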
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "adity12345/qwen2.5-1.5b-medical-finetuned")
tokenizer = AutoTokenizer.from_pretrained("adity12345/qwen2.5-1.5b-medical-finetuned")
# Generate text
prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
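The `temperature` and `top_p` arguments above control how each next token is drawn. The following is a conceptual sketch of temperature scaling followed by nucleus (top-p) filtering in plain Python. It is not the Transformers implementation; `sample_top_p` and the toy logits are hypothetical and exist only for illustration.

```python
import math
import random


def sample_top_p(logits, temperature=0.7, top_p=0.9, rng=random):
    """Sample one token index using temperature scaling and top-p filtering."""
    # Temperature: divide logits before the softmax; values below 1 sharpen
    # the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Nucleus: keep the smallest set of most-likely tokens whose cumulative
    # probability reaches top_p.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Draw from the kept tokens, weighted by their probabilities
    # (random.choices normalizes the weights internally).
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


toy_logits = [4.0, 2.0, 1.0, -1.0]  # hypothetical scores for a 4-token vocabulary
print(sample_top_p(toy_logits, rng=random.Random(0)))
```

With these toy logits the first token alone already carries more than 90% of the probability mass after temperature scaling, so it is the only candidate kept. Raising `top_p` or `temperature` admits more candidates and makes the output more varied.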
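import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Quantize the base model weights to 4-bit NF4; compute in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
# Load the quantized base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)
# Load LoRA adapters on top of the quantized base model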
model = PeftModel.from_pretrained(base_model, "adity12345/qwen2.5-1.5b-medical-finetuned")
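As a back-of-the-envelope check on what 4-bit loading saves, the sketch below estimates weight storage at 16 bits and at 4 bits per parameter. It assumes a nominal 1.5 billion parameters (taken from the model name, so an approximation) and ignores quantization constants, any layers kept in higher precision, activations, and the KV cache. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 1.5e9  # nominal size implied by the model name; an approximation

print(f"bf16 : {weight_memory_gb(NUM_PARAMS, 16):.2f} GB")
print(f"4-bit: {weight_memory_gb(NUM_PARAMS, 4):.2f} GB")
```

Actual memory use will be higher than either figure, so treat the numbers as a lower bound on weight storage rather than a measurement.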
@misc{qwen2.5-1.5b-medical-finetuned,
author = {Your Name},
title = {adity12345/qwen2.5-1.5b-medical-finetuned},
year = {2024},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/adity12345/qwen2.5-1.5b-medical-finetuned}}
}
This model inherits the Apache 2.0 license from the base Qwen2.5 model.