Cross-Model LoRA Adapter Translation — Round 4
Repo: https://huggingface.co/CK0607/cross-model-lora-prediction-3b
Models: X=Qwen/Qwen2.5-3B-Instruct → Y=meta-llama/Llama-3.2-3B-Instruct
Diff vs Round 3
- Kept Round 3 3B model pair and mapping algorithms unchanged.
- Replaced broken held-outs: math_algebra_medium→gsm8k_test_500, humaneval_plus→mbpp_test_held, mmlu_college_chemistry→openbookqa_test.
- Retrained only the bounded Round 4 pool: 16 matched X/Y anchors plus 6 X held-out conditioning adapters and 6 Y oracle adapters.
- Stronger recipe: LoRA r=16, alpha=32, targets=['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj'], epochs=3.0, train_per_task=1500, lr=0.0002, bf16, max_len=512.
- Recomputed Top-K cosine selection from the new r=16/full-target X adapter space.
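For reference, the recipe in the list above can be written as a plain configuration mapping. This is only a sketch, not the repo's training script: the name `ROUND4_RECIPE` and its key names are hypothetical, and only the values come from the list. The keys `r`, `lora_alpha`, and `target_modules` mirror the parameter names of `peft.LoraConfig`, on the assumption that PEFT was used for training.

```python
# Hypothetical container for the Round 4 recipe; values copied from the list above.
ROUND4_RECIPE = {
    "lora": {
        "r": 16,
        "lora_alpha": 32,
        "target_modules": [
            "q_proj", "k_proj", "v_proj", "o_proj",
            "gate_proj", "up_proj", "down_proj",
        ],
    },
    "training": {
        "num_epochs": 3.0,
        "train_per_task": 1500,
        "learning_rate": 2e-4,
        "precision": "bf16",
        "max_len": 512,
    },
}

# Standard LoRA scales the low-rank update by alpha / r.
lora_scaling = ROUND4_RECIPE["lora"]["lora_alpha"] / ROUND4_RECIPE["lora"]["r"]
print(lora_scaling)  # 2.0
```

With r=16 and alpha=32 the update is scaled by 2.0, and the seven target modules cover every attention and MLP projection in each block.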
Experiment 1 — Main table
Rows with oracle - base_Y < 3 pp are flagged as not usable for averages.
| Domain | Task | base_Y | mean | global_ridge | pertensor_ridge | topk8_global_ridge | topk8_pertensor_ridge | pertensor_mlp | oracle | oracle_minus_base_pp | usable | gap_recovered |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| math | gsm_hard | 0.063 | 0.057 | 0.060 | 0.067 | 0.067 | 0.063 | 0.073 | 0.150 | 8.667 | True | 0.115 |
| math | gsm8k_test_500 | 0.080 | 0.093 | 0.100 | 0.100 | 0.093 | 0.097 | 0.100 | 0.293 | 21.333 | True | 0.094 |
| code | mbpp_test_held | 0.230 | 0.240 | 0.250 | 0.250 | 0.250 | 0.250 | 0.240 | 0.320 | 9.000 | True | 0.222 |
| code | mbpp_plus | 0.217 | 0.213 | 0.280 | 0.270 | 0.270 | 0.267 | 0.210 | 0.450 | 23.333 | True | 0.271 |
| science | arc_challenge | 0.716 | 0.732 | 0.736 | 0.729 | 0.736 | 0.729 | 0.739 | 0.722 | 0.669 | False | 5.000 |
| science | openbookqa_test | 0.710 | 0.760 | 0.747 | 0.743 | 0.713 | 0.717 | 0.753 | 0.983 | 27.333 | True | 0.183 |
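The `usable` and `gap_recovered` columns can be sketched from the rounded table values as follows. The 3 pp threshold is the rule stated above the table. The `gap_recovered` formula, (best method score - base_Y) / (oracle - base_Y) with the best taken over all six method columns including `mean`, is an assumption inferred from the table: it reproduces the usable rows to within rounding but not the excluded arc_challenge row. All function and variable names are hypothetical.

```python
# Rounded values from the main table: (task, base_Y, best method score, oracle).
ROWS = [
    ("gsm_hard", 0.063, 0.073, 0.150),
    ("gsm8k_test_500", 0.080, 0.100, 0.293),
    ("mbpp_test_held", 0.230, 0.250, 0.320),
    ("mbpp_plus", 0.217, 0.280, 0.450),
    ("arc_challenge", 0.716, 0.739, 0.722),
    ("openbookqa_test", 0.710, 0.760, 0.983),
]
USABLE_THRESHOLD_PP = 3.0  # rows with oracle - base_Y below this are flagged


def oracle_gap_pp(base, oracle):
    """Oracle headroom over the base model, in percentage points."""
    return (oracle - base) * 100.0


def is_usable(base, oracle):
    """A row is usable for averages only if the oracle adds at least 3 pp."""
    return oracle_gap_pp(base, oracle) >= USABLE_THRESHOLD_PP


def gap_recovered(base, best, oracle):
    """Assumed formula: fraction of the oracle headroom the best method recovers."""
    return (best - base) / (oracle - base)


usable_tasks = [task for task, base, _, oracle in ROWS if is_usable(base, oracle)]
print(usable_tasks)
```

Because the inputs are rounded to three decimals, recomputed values can differ from the table in the last digit.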
Headline
- Best learned method minus mean baseline, averaged over usable held-outs: 0.0187
- Usable held-outs: ['gsm_hard', 'gsm8k_test_500', 'mbpp_test_held', 'mbpp_plus', 'openbookqa_test']
- Excluded held-outs: ['arc_challenge']
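The headline number can be approximately reproduced from the main table. The sketch below assumes that "learned" means the five non-`mean` method columns (global_ridge, pertensor_ridge, topk8_global_ridge, topk8_pertensor_ridge, pertensor_mlp) and that the best one is picked per task; the names `USABLE_ROWS` and `headline_delta` are hypothetical.

```python
# Rounded scores for the usable held-outs: mean baseline and the five learned methods.
USABLE_ROWS = {
    "gsm_hard": {"mean": 0.057, "learned": [0.060, 0.067, 0.067, 0.063, 0.073]},
    "gsm8k_test_500": {"mean": 0.093, "learned": [0.100, 0.100, 0.093, 0.097, 0.100]},
    "mbpp_test_held": {"mean": 0.240, "learned": [0.250, 0.250, 0.250, 0.250, 0.240]},
    "mbpp_plus": {"mean": 0.213, "learned": [0.280, 0.270, 0.270, 0.267, 0.210]},
    "openbookqa_test": {"mean": 0.760, "learned": [0.747, 0.743, 0.713, 0.717, 0.753]},
}


def headline_delta(rows):
    """Average over tasks of (best learned score - mean baseline score)."""
    deltas = [max(r["learned"]) - r["mean"] for r in rows.values()]
    return sum(deltas) / len(deltas)


delta = headline_delta(USABLE_ROWS)
print(f"{delta:.4f}")  # within rounding error of the reported 0.0187
```

The average is positive even though openbookqa_test contributes a negative term, since on that task the mean baseline (0.760) beats every learned method.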
Top-K selection log
| Held-out | topk8_global_ridge | topk8_pertensor_ridge |
|---|---|---|
| gsm_hard | ['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology'] | ['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology'] |
| gsm8k_test_500 | ['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology'] | ['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology'] |
| mbpp_test_held | ['mbpp_sanitized', 'math_counting_easy', 'humaneval', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy'] | ['mbpp_sanitized', 'math_counting_easy', 'humaneval', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy'] |
| mbpp_plus | ['mbpp_sanitized', 'humaneval', 'math_counting_easy', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy'] | ['mbpp_sanitized', 'humaneval', 'math_counting_easy', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy'] |
| arc_challenge | ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_counting_easy', 'mbpp_sanitized', 'humaneval', 'multiarith', 'math_algebra_easy'] | ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_counting_easy', 'mbpp_sanitized', 'humaneval', 'multiarith', 'math_algebra_easy'] |
| openbookqa_test | ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mbpp_sanitized', 'math_counting_easy', 'mmlu_elementary_math', 'humaneval', 'multiarith', 'math_algebra_easy'] | ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mbpp_sanitized', 'math_counting_easy', 'mmlu_elementary_math', 'humaneval', 'multiarith', 'math_algebra_easy'] |
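The selections logged above come from ranking anchors by cosine similarity to the held-out X adapter. A minimal sketch of that ranking follows, assuming each adapter has already been flattened into a single vector; how the repo flattens the r=16 adapters is not shown here. The toy 3-dimensional vectors and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_anchors(query, anchors, k=8):
    """Return the names of the k anchors most similar to the query, best first."""
    ranked = sorted(anchors.items(), key=lambda item: cosine(query, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy stand-ins for flattened X-side adapter vectors (not real adapter weights).
anchors = {
    "mbpp_sanitized": [1.0, 0.1, 0.0],
    "humaneval": [0.9, 0.3, 0.0],
    "multiarith": [0.0, 1.0, 0.2],
    "mmlu_high_school_physics": [0.1, 0.2, 1.0],
}
held_out = [1.0, 0.0, 0.0]

print(top_k_anchors(held_out, anchors, k=2))  # ['mbpp_sanitized', 'humaneval']
```

With k=8 and a pool of 16 anchors, the same routine would return half of the pool for each held-out task, in similarity order as in the log.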
Experiment 2 — Anchor-count + Top-K scaling
Experiment 3 — Cross-domain transfer
| Held-out domain | Best anchor pool | Top-K actual selections (top-3) |
|---|---|---|
| math | science-only | {'gsm_hard': ['mmlu_high_school_physics', 'mmlu_elementary_math', 'mmlu_high_school_biology'], 'gsm8k_test_500': ['mmlu_high_school_physics', 'mmlu_elementary_math', 'mmlu_high_school_biology']} |
| code | code-only | {'mbpp_test_held': ['mbpp_sanitized', 'humaneval', 'mbpp'], 'mbpp_plus': ['mbpp_sanitized', 'humaneval', 'mbpp']} |
| science | science-only | {'arc_challenge': ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math'], 'openbookqa_test': ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math']} |
Honest failure modes
- Excluded from averages: arc_challenge has oracle - base_Y = 0.67 pp, below the 3 pp usability threshold.
- Code-task evaluation remains cheap answer-string/span matching, not sandboxed unit tests; code numbers are adapter-locality proxies, not pass@1.
- Math uses numeric extraction and equality; misformatted or non-numeric generations are counted as wrong.
- Top-K and ridge methods are exactly the prior mapping family; no new mapping method was added.
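To make the math-scoring caveat concrete, here is a minimal sketch of numeric extraction plus equality. It is not the repo's scorer: the regex, the choice of the last number in the generation, and the function names are all assumptions made for illustration.

```python
import re

# Optional sign, digits with optional thousands separators, optional decimal part.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_last_number(text):
    """Return the last number found in the text as a float, or None if there is none."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def is_correct(generation, gold, tol=1e-6):
    """Score a generation as correct only if its last number equals the gold answer."""
    predicted = extract_last_number(generation)
    return predicted is not None and abs(predicted - gold) <= tol


print(is_correct("The answer is 42.", 42.0))     # True
print(is_correct("The answer is forty-two", 42.0))  # False: no numeric token
print(is_correct("x = 7 and y = 8", 7.0))        # False: last number is 8
```

The second and third calls show how a correct answer in words, or a correct number followed by another number, is scored as wrong under this kind of rule.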

