vukien2301/qwen3-8b-teacher-sft-gpt54mini-math_cot-lora-adapter
Teacher SFT LoRA adapter (checkpoint-2000) on top of Qwen/Qwen3-8B, trained on
vukien2301/ultrainteract_math_cot_gpt54mini
(78,349 math CoTs from UltraInteract, regenerated with GPT-5.4-mini).
Note: Training was stopped early at the checkpoint above (~80% / ~65% through 1 epoch).
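As a rough sanity check on the note above, the fraction of the epoch covered by checkpoint-2000 can be estimated from the dataset size and the effective batch size listed in the training config. This is a sketch under stated assumptions: that "checkpoint-2000" means 2,000 optimizer steps, that all 78,349 examples were used for training (no held-out split), and that training ran on a single device. Under those assumptions the result is about 82%, consistent with the ~80% figure.

```python
import math

NUM_EXAMPLES = 78_349      # dataset size from the description
PER_DEVICE_BATCH = 4       # from the training config
GRAD_ACCUM_STEPS = 8       # from the training config
CHECKPOINT_STEP = 2_000    # assumption: counted in optimizer steps

# Sequences consumed per optimizer step on one device.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS

# Sequences seen when the checkpoint was written.
examples_seen = CHECKPOINT_STEP * effective_batch

# Optimizer steps a full epoch would take (assumes no held-out split).
steps_per_epoch = math.ceil(NUM_EXAMPLES / effective_batch)

fraction_of_epoch = examples_seen / NUM_EXAMPLES
print(f"effective batch:   {effective_batch}")
print(f"examples seen:     {examples_seen}")
print(f"steps per epoch:   {steps_per_epoch}")
print(f"fraction of epoch: {fraction_of_epoch:.1%}")
```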
Training config
- Adapter: pure LoRA (not DoRA, so vLLM serves this directly with enable_lora)
- LoRA r=128, alpha=256, dropout=0.05
- Target modules: q,k,v,o,gate,up,down
- LR 2e-5, 1 epoch (early-stopped), batch 4 x grad-accum 8 (eff. 32)
- Raw completion format (no chat template): "{question}\n{model_prediction}" + EOS
- Hardware: 1x B200, bf16, flash_attention_2, gradient checkpointing
Usage (vLLM)
from vllm import LLM
from vllm.lora.request import LoRARequest

# max_lora_rank must be at least the adapter's rank (r=128).
llm = LLM(model="Qwen/Qwen3-8B", enable_lora=True, max_lora_rank=128)
out = llm.generate(
    "What is 2+2?",
    lora_request=LoRARequest("teacher", 1, lora_path="<path-to-this-repo>"),
)
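
Because the adapter was trained on raw "{question}\n{model_prediction}" text rather than a chat template, it may help to end each prompt with the same newline separator. It also helps to set an explicit generation length, since vLLM's default SamplingParams allow only a very short completion (max_tokens defaults to 16, as far as I know), which would cut a chain of thought off. The following is a minimal sketch, not the author's evaluation setup: `build_prompt`, `completion_texts`, and `run_demo` are hypothetical helper names, and the decoding settings (temperature=0.0, max_tokens=1024) are assumptions.

```python
def build_prompt(question: str) -> str:
    """Mirror the training format "{question}\\n{model_prediction}" up to the separator.

    Assumption: the model continues best when the prompt ends with the same
    newline that separated question and prediction during training.
    """
    return f"{question}\n"


def completion_texts(request_outputs) -> list[str]:
    """Return the text of the first completion for each vLLM RequestOutput."""
    return [request_output.outputs[0].text for request_output in request_outputs]


def run_demo(lora_path: str, questions: list[str]) -> list[str]:
    # Imported lazily so the helpers above work without vLLM installed.
    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    llm = LLM(model="Qwen/Qwen3-8B", enable_lora=True, max_lora_rank=128)

    # Assumed decoding settings; they are not taken from the model card.
    params = SamplingParams(temperature=0.0, max_tokens=1024)

    outputs = llm.generate(
        [build_prompt(q) for q in questions],
        sampling_params=params,
        lora_request=LoRARequest("teacher", 1, lora_path=lora_path),
    )
    return completion_texts(outputs)
```

Calling `run_demo("<path-to-this-repo>", ["What is 2+2?"])` would return one generated string per question; the path placeholder is left as in the usage snippet above.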