Qwen3-4B Elderly Vietnamese (SFT merged)

Merged LoRA adapter + base for Qwen/Qwen3-4B-Instruct-2507. Trained on 47.5k DiaSynth Vietnamese elderly dialogues. See insights.md and metrics.json.

  • Train loss: 0.876, Eval loss: 0.861
  • Train steps: 8913, Runtime: 21.3h
  • Base: bf16 LoRA r=32, PiSSA init, 3 epochs, eff batch 16

Source adapter: https://huggingface.co/quannguyen204/qwen3-4b-elderly-sft-lora

Downloads last month
348
Safetensors
Model size
4B params
Tensor type
F16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for quannguyen204/qwen3-4b-elderly-sft-merged

Adapter
(5529)
this model
Adapters
1 model