# Round 5: 3B Anchor-Count Scaling Sweep

**Repo:** https://huggingface.co/CK0607/cross-model-lora-prediction-3b

**Models:** X=`Qwen/Qwen2.5-3B-Instruct` → Y=`meta-llama/Llama-3.2-3B-Instruct`

## Recipe + anchors

LoRA recipe: r=16, alpha=32, targets=`['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj']`, epochs=3.0, train=1500, bs=8, lr=0.0002, max_seq_len=512, bf16.

New Round 5 anchors trained under `round5/X` and `round5/Y`: `['aqua_rat_numeric', 'math_counting_easy', 'mawps', 'mbpp_sanitized', 'humaneval', 'conala_curated', 'medmcqa_easy', 'pubmedqa_pqal']`.

Drop list: `[]`.

R4 adapters were reused from the Modal volume; no R4 adapter was retrained.

## Scaling table: gap_recovered, mean ± std over 5 stratified seeds

| N | mean | global_ridge | topk8_global_ridge |
|---:|---:|---:|---:|
| 4 | -0.058 ± 0.176 | 0.188 ± 0.012 | 0.188 ± 0.012 |
| 8 | -0.140 ± 0.161 | 0.201 ± 0.012 | 0.201 ± 0.012 |
| 12 | -0.062 ± 0.120 | 0.208 ± 0.008 | 0.208 ± 0.008 |
| 16 | -0.010 ± 0.088 | 0.212 ± 0.006 | 0.212 ± 0.006 |
| 24 | 0.024 ± 0.000 | 0.218 ± 0.000 | 0.216 ± 0.000 |

## Figure

![3B anchor-count scaling](figures/exp_scaling_3b.png)

## Interpretation

Top-K does not beat global_ridge at any N. The two curves are identical to three decimals from N=4 through N=16 (for example, 0.188 vs 0.188 at N=4), and top-K falls slightly below global_ridge at N=24 (0.216 vs 0.218). global_ridge rises monotonically with N, from 0.188 at N=4 to 0.218 at N=24, and its seed-to-seed std shrinks from 0.012 to 0.000 over the same range. Across N, the `mean` baseline ranges from -0.140 to 0.024, while the best learned curve (global_ridge) reaches 0.218 gap recovered at N=24.
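
The comparisons in the interpretation can be recomputed directly from the scaling table. The following is a minimal sketch that uses only the reported means; the `TABLE` dictionary and the `margins` helper are illustrative names and are not part of the repo.

```python
# Reported gap_recovered values copied from the scaling table, as (mean, std).
# TABLE and margins() are hypothetical helpers for illustration only.
TABLE = {
    4: {"mean": (-0.058, 0.176), "global_ridge": (0.188, 0.012),
        "topk8_global_ridge": (0.188, 0.012)},
    8: {"mean": (-0.140, 0.161), "global_ridge": (0.201, 0.012),
        "topk8_global_ridge": (0.201, 0.012)},
    12: {"mean": (-0.062, 0.120), "global_ridge": (0.208, 0.008),
         "topk8_global_ridge": (0.208, 0.008)},
    16: {"mean": (-0.010, 0.088), "global_ridge": (0.212, 0.006),
         "topk8_global_ridge": (0.212, 0.006)},
    24: {"mean": (0.024, 0.000), "global_ridge": (0.218, 0.000),
         "topk8_global_ridge": (0.216, 0.000)},
}


def margins(table):
    """Return, per anchor count N, how far global_ridge sits above the mean
    baseline and how far topk8_global_ridge sits above global_ridge."""
    rows = []
    for n in sorted(table):
        row = table[n]
        ridge = row["global_ridge"][0]
        rows.append({
            "N": n,
            "ridge_minus_mean": round(ridge - row["mean"][0], 3),
            "topk_minus_ridge": round(row["topk8_global_ridge"][0] - ridge, 3),
        })
    return rows


if __name__ == "__main__":
    for r in margins(TABLE):
        print(f"N={r['N']:>2}  ridge-mean={r['ridge_minus_mean']:+.3f}  "
              f"topk-ridge={r['topk_minus_ridge']:+.3f}")
```

A non-positive `topk_minus_ridge` at every N means top-K never exceeds global_ridge in the reported means. The sketch ignores the std columns, so it says nothing about whether any difference is significant.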