LoRA Adapter (Rank 16) - Llama 3.2 3B Vietnamese ⭐

LoRA adapter (rank=16) fine-tuned on the Vietnamese Alpaca dataset. Of the three ranks tested (8, 16, 64), it offers the best balance between performance and efficiency.

Model Details

Base Model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
LoRA Rank: 16
LoRA Alpha: 32
Target Modules: q_proj, v_proj
Dataset: Vietnamese Alpaca (180 train samples, 20 eval samples)
Training Framework: Unsloth + TRL SFTTrainer
Quantization: 4-bit (QLoRA)

Metrics

Trainable Parameters: 4,587,520 (0.14% of total)
Training Time: 4.32 minutes on T4
Peak VRAM: 6.23 GB (lowest among all ranks!)
Eval Loss: 1.6530
Perplexity: 5.22 (best among all ranks!)
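
The trainable parameter count and the perplexity can both be checked by hand. The sketch below assumes the layer dimensions from the public Llama 3.2 3B configuration (hidden size 3072, 28 layers, 8 key/value heads of dimension 128); these numbers are not stated in this card. It also assumes perplexity is reported as the exponential of the eval loss.

```python
import math

# Assumed Llama 3.2 3B dimensions (from the public model config, not this card).
HIDDEN_SIZE = 3072
NUM_LAYERS = 28
KV_DIM = 8 * 128  # 8 key/value heads x head dimension 128 = 1024

def lora_params(rank, in_features, out_features):
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)

rank = 16
q_proj = lora_params(rank, HIDDEN_SIZE, HIDDEN_SIZE)  # 98,304 per layer
v_proj = lora_params(rank, HIDDEN_SIZE, KV_DIM)       # 65,536 per layer
trainable = NUM_LAYERS * (q_proj + v_proj)

eval_loss = 1.6530
perplexity = math.exp(eval_loss)

print(f"Trainable parameters: {trainable:,}")
print(f"Perplexity: {perplexity:.2f}")
```

Under these assumptions the count comes out to 4,587,520 and the perplexity to 5.22, matching the reported figures.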

Usage

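from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model. This checkpoint is already quantized to 4-bit with
# bitsandbytes, so no extra quantization flag is needed (bitsandbytes must be installed).
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Llama-3.2-3B-Instruct-bnb-4bit",
    device_map="auto",
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "luckyman2907/lab21-llama3.2-3b-r16")
tokenizer = AutoTokenizer.from_pretrained("luckyman2907/lab21-llama3.2-3b-r16")

# Generate. The prompt means "Explain machine learning".
prompt = "Giải thích về machine learning"
# Move inputs to wherever device_map placed the model
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))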

Training Details

Epochs: 3
Learning Rate: 2e-4
Scheduler: Cosine with warmup (10%)
Batch Size: 8 (effective, via gradient accumulation)
Optimizer: AdamW 8-bit
GPU: Tesla T4 (16 GB)
Training Cost: ~$0.03 (4.32 minutes @ $0.35/hr)
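
The schedule and cost figures above can be sketched in a few lines. This is an illustration under stated assumptions, not the trainer's exact bookkeeping: it assumes the final partial batch of each epoch counts as one optimizer step and that the warmup length is rounded up. The function name `cosine_lr_with_warmup` is hypothetical; it implements the usual linear-warmup, cosine-decay formula.

```python
import math

def cosine_lr_with_warmup(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup from 0 to peak_lr, then cosine decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

train_samples = 180
effective_batch_size = 8
epochs = 3
peak_lr = 2e-4

# Assumption: a partial final batch still counts as one optimizer step.
steps_per_epoch = math.ceil(train_samples / effective_batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(0.10 * total_steps)  # 10% warmup, rounded up

for step in (0, warmup_steps, total_steps // 2, total_steps):
    lr = cosine_lr_with_warmup(step, total_steps, warmup_steps, peak_lr)
    print(f"step {step:3d}: lr = {lr:.2e}")

# Cost: minutes converted to hours, times the hourly rate.
training_cost = 4.32 / 60 * 0.35
print(f"Training cost: ${training_cost:.4f}")
```

Under these assumptions the run has 23 steps per epoch, 69 steps in total, and 7 warmup steps, so the learning rate peaks at 2e-4 after about a tenth of training and decays to zero by the final step.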

Comparison with Other Ranks

Rank	Trainable Params	Train Time	Peak VRAM	Perplexity	Status
8	2.3M	4.01 min	7.00 GB	5.30	Underfitting
16	4.6M	4.32 min	6.23 GB	5.22	⭐ Best
64	18.4M	3.97 min	7.97 GB	5.23	Diminishing returns

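Why Rank 16 Is the Best Choice

Lowest VRAM usage (6.23 GB) - 11% less than r8, 22% less than r64
Best perplexity (5.22) - ahead of r8 (5.30) and marginally ahead of r64 (5.23); with only 20 eval samples, the gap to r64 is small
Balanced capacity - of the three ranks tested, 4.6M params worked best for 180 training samples
Cost-effective - training takes about 4 minutes, comparable to the other ranks, with the best results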
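
The relative differences can be recomputed directly from the comparison table. The sketch below only rearranges the reported numbers; the helper name `percent_less` is hypothetical, and each saving is measured relative to the other rank's own peak VRAM.

```python
# Figures copied from the comparison table.
peak_vram_gb = {8: 7.00, 16: 6.23, 64: 7.97}
perplexity = {8: 5.30, 16: 5.22, 64: 5.23}

def percent_less(value, reference):
    """How much smaller `value` is than `reference`, as a percentage of `reference`."""
    return 100.0 * (reference - value) / reference

for other in (8, 64):
    vram_saving = percent_less(peak_vram_gb[16], peak_vram_gb[other])
    ppl_gap = perplexity[other] - perplexity[16]
    print(f"r16 vs r{other}: {vram_saving:.1f}% less VRAM, perplexity lower by {ppl_gap:.2f}")
```

Measured the other way round, rank 8 uses about 12% more VRAM than rank 16. The perplexity gap to rank 64 is only 0.01, so on a 20-sample eval set the two are close to indistinguishable.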

Qualitative Improvements

Compared with the base model, the fine-tuned model shows the following qualitative improvements:

✅ Better instruction-following format
✅ More concise and structured responses
✅ Improved Vietnamese language generation
✅ Better code formatting (markdown, syntax)
✅ Reduced hallucination

Limitations

Dataset size is small (180 samples) - may not generalize to all Vietnamese tasks
Technical concepts (LoRA, RAG) are not well-learned due to dataset limitations
Best suited for general instruction-following tasks

Citation

@misc{lab21-lora-r16,
  author = {luckyman2907},
  title = {LoRA Adapter (Rank 16) - Llama 3.2 3B Vietnamese},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/luckyman2907/lab21-llama3.2-3b-r16}}
}

Related Models

License

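Apache 2.0 for the adapter weights. The base model, Llama 3.2 3B Instruct, is released under the Llama 3.2 Community License rather than Apache 2.0, and that license applies whenever the adapter is used with the base model.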

Acknowledgments

Base model: Llama 3.2 3B Instruct
Dataset: Vietnamese Alpaca GPT4
Training framework: Unsloth
