5CD-AI/Vietnamese-alpaca-gpt4-gg-translated
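LoRA adapter (rank=16) for Llama 3.2 3B Instruct, fine-tuned on the Vietnamese Alpaca dataset. Of the three ranks compared (8, 16, 64), rank 16 reached the lowest perplexity (5.22) and the lowest peak VRAM (6.23 GB).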
Base model (4-bit quantized): unsloth/Llama-3.2-3B-Instruct-bnb-4bit
LoRA target modules: q_proj, v_proj
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load base model
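# The checkpoint is already stored in 4-bit (bitsandbytes) form, so its
# quantization settings are read from the repo config; no load_in_4bit flag is needed.
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    device_map="auto",
)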
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "luckyman2907/lab21-llama3.2-3b-r16")
tokenizer = AutoTokenizer.from_pretrained("luckyman2907/lab21-llama3.2-3b-r16")
# Generate
prompt = "Giải thích về machine learning"
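# Move inputs to wherever device_map placed the model instead of hard-coding "cuda"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)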
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
| Rank | Trainable Params | Train Time | Peak VRAM | Perplexity | Status |
|---|---|---|---|---|---|
| 8 | 2.3M | 4.01 min | 7.00 GB | 5.30 | Underfitting |
| 16 | 4.6M | 4.32 min | 6.23 GB | 5.22 | ⭐ Best |
| 64 | 18.4M | 3.97 min | 7.97 GB | 5.23 | Diminishing returns |
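The trainable parameter counts in the table can be cross-checked by hand. The sketch below assumes the published Llama 3.2 3B dimensions (hidden size 3072, 28 decoder layers, 8 key/value heads of dimension 128), which are not stated in this card, and that LoRA is applied only to q_proj and v_proj. The function names are illustrative.

```python
# Assumed Llama 3.2 3B dimensions (taken from the public model config, not from this card).
HIDDEN_SIZE = 3072
NUM_LAYERS = 28
NUM_KV_HEADS = 8
HEAD_DIM = 128


def lora_params_for_linear(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds two matrices to a linear layer: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


def lora_trainable_params(rank: int) -> int:
    """Total LoRA parameters when only q_proj and v_proj are adapted in every layer."""
    q_out = HIDDEN_SIZE              # q_proj maps 3072 -> 3072
    v_out = NUM_KV_HEADS * HEAD_DIM  # v_proj maps 3072 -> 1024 (grouped-query attention)
    per_layer = (
        lora_params_for_linear(HIDDEN_SIZE, q_out, rank)
        + lora_params_for_linear(HIDDEN_SIZE, v_out, rank)
    )
    return NUM_LAYERS * per_layer


for rank in (8, 16, 64):
    print(f"rank={rank:>2}: {lora_trainable_params(rank) / 1e6:.1f}M trainable parameters")
```

Under these assumptions the counts come out to 2.3M, 4.6M, and 18.4M, matching the table. The count scales linearly with rank.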
The fine-tuned model shows significant improvements over the base model.
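Perplexity, the metric reported in the table above, is commonly defined as the exponential of the mean per-token negative log-likelihood. The sketch below shows that definition only. It is not the evaluation script used for this adapter, and the log-probabilities are made-up toy values.

```python
import math


def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood), using natural logs."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Toy example (hypothetical values): a model that assigns probability 0.25
# to every observed token has a perplexity of about 4.
toy_log_probs = [math.log(0.25)] * 8
print(round(perplexity(toy_log_probs), 6))
```

Read the other way, a perplexity of 5.22 corresponds to a mean per-token loss of about `math.log(5.22)`, roughly 1.65 nats.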
@misc{lab21-lora-r16,
  author = {luckyman2907},
  title = {LoRA Adapter (Rank 16) - Llama 3.2 3B Vietnamese},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/luckyman2907/lab21-llama3.2-3b-r16}}
}
Apache 2.0 (following base model license)
Base model
meta-llama/Llama-3.2-3B-Instruct