# Cross-Model LoRA Adapter Translation — Round 4

**Repo:** https://huggingface.co/CK0607/cross-model-lora-prediction-3b

**Models:** X=`Qwen/Qwen2.5-3B-Instruct` → Y=`meta-llama/Llama-3.2-3B-Instruct`

## Diff vs Round 3

- Kept Round 3 3B model pair and mapping algorithms unchanged.
- Replaced broken held-outs: `math_algebra_medium` → `gsm8k_test_500`, `humaneval_plus` → `mbpp_test_held`, `mmlu_college_chemistry` → `openbookqa_test`.
- Retrained only the bounded Round 4 pool: 16 matched X/Y anchors plus 6 X held-out conditioning adapters and 6 Y oracle adapters.
- Stronger recipe: LoRA r=16, alpha=32, targets=['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj'], epochs=3.0, train_per_task=1500, lr=0.0002, bf16, max_len=512.
- Recomputed Top-K cosine selection from the new r=16/full-target X adapter space.

## Experiment 1 — Main table

Rows with `oracle - base_Y < 3 pp` are flagged as not usable for averages.

| Domain | Task | base_Y | mean | global_ridge | pertensor_ridge | topk8_global_ridge | topk8_pertensor_ridge | pertensor_mlp | oracle | oracle_minus_base_pp | usable | gap_recovered |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| math | gsm_hard | 0.063 | 0.057 | 0.060 | 0.067 | 0.067 | 0.063 | 0.073 | 0.150 | 8.667 | True | 0.115 |
| math | gsm8k_test_500 | 0.080 | 0.093 | 0.100 | 0.100 | 0.093 | 0.097 | 0.100 | 0.293 | 21.333 | True | 0.094 |
| code | mbpp_test_held | 0.230 | 0.240 | 0.250 | 0.250 | 0.250 | 0.250 | 0.240 | 0.320 | 9.000 | True | 0.222 |
| code | mbpp_plus | 0.217 | 0.213 | 0.280 | 0.270 | 0.270 | 0.267 | 0.210 | 0.450 | 23.333 | True | 0.271 |
| science | arc_challenge | 0.716 | 0.732 | 0.736 | 0.729 | 0.736 | 0.729 | 0.739 | 0.722 | 0.669 | False | 5.000 |
| science | openbookqa_test | 0.710 | 0.760 | 0.747 | 0.743 | 0.713 | 0.717 | 0.753 | 0.983 | 27.333 | True | 0.183 |

## Headline

- Best learned method minus mean baseline, averaged over usable held-outs: `0.0187`
- Usable held-outs: `['gsm_hard', 'gsm8k_test_500', 'mbpp_test_held', 'mbpp_plus', 'openbookqa_test']`
- Excluded held-outs: `['arc_challenge']`

## Top-K selection log

| Held-out | topk8_global_ridge | topk8_pertensor_ridge |
|---|---|---|
| gsm_hard | `['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology']` | `['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology']` |
| gsm8k_test_500 | `['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology']` | `['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology']` |
| mbpp_test_held | `['mbpp_sanitized', 'math_counting_easy', 'humaneval', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy']` | `['mbpp_sanitized', 'math_counting_easy', 'humaneval', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy']` |
| mbpp_plus | `['mbpp_sanitized', 'humaneval', 'math_counting_easy', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy']` | `['mbpp_sanitized', 'humaneval', 'math_counting_easy', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy']` |
| arc_challenge | `['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_counting_easy', 'mbpp_sanitized', 'humaneval', 'multiarith', 'math_algebra_easy']` | `['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_counting_easy', 'mbpp_sanitized', 'humaneval', 'multiarith', 'math_algebra_easy']` |
| openbookqa_test | `['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mbpp_sanitized', 'math_counting_easy', 'mmlu_elementary_math', 'humaneval', 'multiarith', 'math_algebra_easy']` | `['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mbpp_sanitized', 'math_counting_easy', 'mmlu_elementary_math', 'humaneval', 'multiarith', 'math_algebra_easy']` |

## Experiment 2 — Anchor-count + Top-K scaling

![Anchor scaling](figures/exp2_anchor_scaling_r4.png)

## Experiment 3 — Cross-domain transfer

![Transfer heatmap](figures/exp3_transfer_heatmap_r4.png)

| Held-out domain | Best anchor pool | Top-K actual selections (top-3) |
|---|---|---|
| math | science-only | `{'gsm_hard': ['mmlu_high_school_physics', 'mmlu_elementary_math', 'mmlu_high_school_biology'], 'gsm8k_test_500': ['mmlu_high_school_physics', 'mmlu_elementary_math', 'mmlu_high_school_biology']}` |
| code | code-only | `{'mbpp_test_held': ['mbpp_sanitized', 'humaneval', 'mbpp'], 'mbpp_plus': ['mbpp_sanitized', 'humaneval', 'mbpp']}` |
| science | science-only | `{'arc_challenge': ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math'], 'openbookqa_test': ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math']}` |

## Honest failure modes

- Excluded from averages: `arc_challenge` has oracle-base = 0.67 pp.
- Code-task evaluation remains cheap answer-string/span matching, not sandboxed unit tests; code numbers are adapter-locality proxies, not pass@1.
- Math uses numeric extraction/equality; formatting or non-numeric generations are counted wrong.
- Top-K and ridge methods are exactly the prior mapping family; no new mapping method was added.
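
## Recomputing the headline (sketch)

The `usable` flag and the Headline number in Experiment 1 can be cross-checked from the main-table values. The sketch below is not the repo's code: all names (`ROWS`, `is_usable`, `headline`, `gap_recovered`) are hypothetical, the inputs are the table's 3-decimal rounded entries, and the `gap_recovered` formula is an assumed reading (best of mean and learned methods, minus `base_Y`, over `oracle - base_Y`) that matches the usable rows to within rounding.

```python
# Sketch only: inputs are the rounded entries of the Experiment 1 main table.
# All names below are hypothetical and do not come from the repo.

# Column order of the learned (non-baseline) methods in each row.
LEARNED = ["global_ridge", "pertensor_ridge", "topk8_global_ridge",
           "topk8_pertensor_ridge", "pertensor_mlp"]
USABLE_MIN_PP = 3.0  # rows with oracle - base_Y below this are excluded

# task -> (base_Y, mean, [learned scores in LEARNED order], oracle_minus_base_pp)
ROWS = {
    "gsm_hard":        (0.063, 0.057, [0.060, 0.067, 0.067, 0.063, 0.073], 8.667),
    "gsm8k_test_500":  (0.080, 0.093, [0.100, 0.100, 0.093, 0.097, 0.100], 21.333),
    "mbpp_test_held":  (0.230, 0.240, [0.250, 0.250, 0.250, 0.250, 0.240], 9.000),
    "mbpp_plus":       (0.217, 0.213, [0.280, 0.270, 0.270, 0.267, 0.210], 23.333),
    "arc_challenge":   (0.716, 0.732, [0.736, 0.729, 0.736, 0.729, 0.739], 0.669),
    "openbookqa_test": (0.710, 0.760, [0.747, 0.743, 0.713, 0.717, 0.753], 27.333),
}


def is_usable(row):
    """A held-out is usable when the oracle beats base_Y by at least 3 pp."""
    return row[3] >= USABLE_MIN_PP


def usable_tasks(rows=ROWS):
    """Names of usable held-outs, in table order."""
    return [task for task, row in rows.items() if is_usable(row)]


def headline(rows=ROWS):
    """Mean over usable tasks of (best learned method - mean baseline)."""
    deltas = [max(row[2]) - row[1] for row in rows.values() if is_usable(row)]
    return sum(deltas) / len(deltas)


def gap_recovered(row):
    """Assumed reading: (best of mean/learned - base_Y) / (oracle - base_Y)."""
    base, mean, learned, gap_pp = row
    best = max([mean] + learned)
    return (best - base) / (gap_pp / 100.0)


print(usable_tasks())
print(round(headline(), 4))
```

With the rounded table inputs the headline comes out at about 0.0186; the reported `0.0187` was presumably computed from unrounded scores.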