Anyone successfully reproduced this model with Jackrong's GitHub notebook? I'm getting results below baseline and wondering if it's just me.

#26
by sunboy - opened

The shared notebook (Jackrong's LLM Fine-tuning Guide) has been incredibly helpful for learning how to post-train an LLM for improved coding performance. I downloaded Jackrong's trained/reference model and confirmed it does outperform the baseline (Qwen3.5-27B).

However, when I followed the notebook (Qwopus3.5 27B SFT Google Colab) to train my own model, the results came in below baseline, so I'm wondering if anyone else has experienced the same issue.

Below is a comparison between the baseline, the model I trained using Jackrong's notebook, and Jackrong's published model.

[Image: compare]

My setup was nearly identical to the notebook, with one exception to avoid OOM: I used PER_DEV_BS=4, GRAD_ACCUM=9 instead of PER_DEV_BS=6, GRAD_ACCUM=6. My understanding is that this should only affect training speed (since the effective batch size remains the same) without significantly impacting model quality.
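For anyone who wants to double-check the arithmetic, here is a minimal sketch of the effective batch size calculation. The helper name `effective_batch_size` and the `num_devices=1` default are my own assumptions for illustration (a single-GPU Colab run); they are not taken from the notebook.

```python
def effective_batch_size(per_device_bs: int, grad_accum: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step.

    num_devices=1 is an assumption (single-GPU run), not a value from the notebook.
    """
    return per_device_bs * grad_accum * num_devices


# Settings from the notebook vs. the settings I used to avoid OOM.
notebook_ebs = effective_batch_size(per_device_bs=6, grad_accum=6)
my_ebs = effective_batch_size(per_device_bs=4, grad_accum=9)

print(f"notebook: {notebook_ebs}, mine: {my_ebs}, equal: {notebook_ebs == my_ebs}")
```

Both configurations come out to 36 samples per optimizer step. One caveat: a matching effective batch size only gives matching gradients if the loss is normalized the same way across accumulation steps. With variable-length sequences, averaging the loss per micro-batch can weight tokens differently than one large batch would, so this may be worth ruling out.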
