# Cross-Model LoRA Adapter Translation — Round 4

**Repo:** https://huggingface.co/CK0607/cross-model-lora-prediction-3b
**Models:** X=`Qwen/Qwen2.5-3B-Instruct` → Y=`meta-llama/Llama-3.2-3B-Instruct`

## Diff vs Round 3

- Kept the Round 3 3B model pair and mapping algorithms unchanged.
- Replaced broken held-outs: `math_algebra_medium` → `gsm8k_test_500`, `humaneval_plus` → `mbpp_test_held`, `mmlu_college_chemistry` → `openbookqa_test`.
- Retrained only the bounded Round 4 pool: 16 matched X/Y anchors plus 6 X held-out conditioning adapters and 6 Y oracle adapters.
- Stronger recipe: LoRA `r=16`, `alpha=32`, `targets=['q_proj', 'k_proj', 'v_proj', 'o_proj', 'gate_proj', 'up_proj', 'down_proj']`, `epochs=3`, `train_per_task=1500`, `lr=2e-4`, bf16, `max_len=512`.
- Recomputed Top-K cosine selection from the new r=16/full-target X adapter space.
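
The Top-K cosine selection mentioned in the last bullet can be sketched as follows. This is a minimal illustration, assuming each X-side adapter has been flattened into one vector; the function name `topk_cosine`, the anchor names, and the toy vectors are hypothetical and not taken from the repo.

```python
import numpy as np

def topk_cosine(query, anchors, k=8):
    """Rank anchor names by cosine similarity to `query`, highest first."""
    names = list(anchors)
    mat = np.stack([np.asarray(anchors[n], dtype=float) for n in names])
    q = np.asarray(query, dtype=float)
    # Normalize so the dot product equals the cosine similarity.
    q = q / np.linalg.norm(q)
    mat = mat / np.linalg.norm(mat, axis=1, keepdims=True)
    sims = mat @ q
    order = np.argsort(-sims)[:k]
    return [(names[i], float(sims[i])) for i in order]

# Toy stand-ins for flattened X-side LoRA weights (made-up values).
anchors = {
    "anchor_a": [0.9, 0.1, 0.0],
    "anchor_b": [0.0, 1.0, 0.0],
    "anchor_c": [0.5, 0.5, 0.0],
}
held_out = [1.0, 0.0, 0.0]

selected = topk_cosine(held_out, anchors, k=2)
print([name for name, _ in selected])
```

In practice the vectors would come from the trained `r=16` adapters, and how they are flattened (per tensor or concatenated across all targets) changes the ranking, so that choice should be recorded alongside the selection log.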

## Experiment 1 — Main table

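Score columns are fractions in [0, 1]; `oracle_minus_base_pp` is the oracle's gain over `base_Y` in percentage points. Rows with `oracle - base_Y < 3 pp` are flagged `usable = False` and excluded from averages: with so little headroom, `gap_recovered` is not meaningful (for example, 5.000 for `arc_challenge`).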

| Domain | Task | base_Y | mean | global_ridge | pertensor_ridge | topk8_global_ridge | topk8_pertensor_ridge | pertensor_mlp | oracle | oracle_minus_base_pp | usable | gap_recovered |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| math | gsm_hard | 0.063 | 0.057 | 0.060 | 0.067 | 0.067 | 0.063 | 0.073 | 0.150 | 8.667 | True | 0.115 |
| math | gsm8k_test_500 | 0.080 | 0.093 | 0.100 | 0.100 | 0.093 | 0.097 | 0.100 | 0.293 | 21.333 | True | 0.094 |
| code | mbpp_test_held | 0.230 | 0.240 | 0.250 | 0.250 | 0.250 | 0.250 | 0.240 | 0.320 | 9.000 | True | 0.222 |
| code | mbpp_plus | 0.217 | 0.213 | 0.280 | 0.270 | 0.270 | 0.267 | 0.210 | 0.450 | 23.333 | True | 0.271 |
| science | arc_challenge | 0.716 | 0.732 | 0.736 | 0.729 | 0.736 | 0.729 | 0.739 | 0.722 | 0.669 | False | 5.000 |
| science | openbookqa_test | 0.710 | 0.760 | 0.747 | 0.743 | 0.713 | 0.717 | 0.753 | 0.983 | 27.333 | True | 0.183 |
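
The derived columns can be spot-checked from the score columns. The sketch below assumes `gap_recovered = (best - base_Y) / (oracle - base_Y)`, where `best` is the highest score among all predicted-adapter columns including `mean`. That formula is inferred from the table, not stated by the source; it matches the usable rows to within rounding of the three-decimal inputs. The helper name `row_summary` is hypothetical.

```python
USABLE_MIN_GAP_PP = 3.0  # usability threshold stated above the table

def row_summary(base_y, oracle, method_scores):
    """Return (oracle_minus_base_pp, usable, gap_recovered) for one row."""
    gap = oracle - base_y
    gap_pp = 100.0 * gap
    usable = gap_pp >= USABLE_MIN_GAP_PP
    best = max(method_scores.values())
    # Assumed definition: fraction of the base-to-oracle gap that is closed.
    gap_recovered = (best - base_y) / gap if gap != 0 else float("nan")
    return gap_pp, usable, gap_recovered

# gsm_hard row, copied from the table (rounded values).
gsm_hard = {
    "mean": 0.057, "global_ridge": 0.060, "pertensor_ridge": 0.067,
    "topk8_global_ridge": 0.067, "topk8_pertensor_ridge": 0.063,
    "pertensor_mlp": 0.073,
}
gap_pp, usable, recovered = row_summary(0.063, 0.150, gsm_hard)
print(f"{gap_pp:.1f} pp, usable={usable}, gap_recovered={recovered:.3f}")

# arc_challenge: oracle barely beats base_Y, so the row is flagged unusable.
_, arc_usable, _ = row_summary(0.716, 0.722, {"pertensor_mlp": 0.739})
print(arc_usable)
```

Because the inputs here are the rounded table values, `oracle_minus_base_pp` comes out as 8.7 rather than the reported 8.667.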


## Headline

- Per-task best learned method minus the `mean` baseline, averaged over the 5 usable held-outs: `0.0187` (about 1.9 pp)
- Usable held-outs: `['gsm_hard', 'gsm8k_test_500', 'mbpp_test_held', 'mbpp_plus', 'openbookqa_test']`
- Excluded held-outs: `['arc_challenge']`
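
The headline number can be approximately reproduced from the main table. The sketch below assumes "best learned method" means the per-task maximum over the five learned columns (everything except `mean`, `base_Y` and `oracle`); that reading is an inference, though it agrees with the reported value up to rounding.

```python
# Scores copied from the main table (rounded to three decimals).
# "learned" order: global_ridge, pertensor_ridge, topk8_global_ridge,
# topk8_pertensor_ridge, pertensor_mlp.
rows = {
    "gsm_hard":        {"mean": 0.057, "learned": [0.060, 0.067, 0.067, 0.063, 0.073]},
    "gsm8k_test_500":  {"mean": 0.093, "learned": [0.100, 0.100, 0.093, 0.097, 0.100]},
    "mbpp_test_held":  {"mean": 0.240, "learned": [0.250, 0.250, 0.250, 0.250, 0.240]},
    "mbpp_plus":       {"mean": 0.213, "learned": [0.280, 0.270, 0.270, 0.267, 0.210]},
    "openbookqa_test": {"mean": 0.760, "learned": [0.747, 0.743, 0.713, 0.717, 0.753]},
}

# Per-task margin of the best learned method over the mean baseline.
deltas = {task: max(r["learned"]) - r["mean"] for task, r in rows.items()}
headline = sum(deltas.values()) / len(deltas)

for task, delta in deltas.items():
    print(f"{task:16s} {delta:+.3f}")
print(f"average: {headline:.4f}")  # about 0.0186 from the rounded inputs
```

Two details stand out in the per-task margins: `mbpp_plus` contributes most of the average, and on `openbookqa_test` the `mean` baseline beats every learned method, so that margin is negative.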

## Top-K selection log

| Held-out | topk8_global_ridge | topk8_pertensor_ridge |
|---|---|---|
| gsm_hard | `['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology']` | `['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology']` |
| gsm8k_test_500 | `['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology']` | `['math_counting_easy', 'mbpp_sanitized', 'mmlu_high_school_physics', 'humaneval', 'multiarith', 'math_algebra_easy', 'mmlu_elementary_math', 'mmlu_high_school_biology']` |
| mbpp_test_held | `['mbpp_sanitized', 'math_counting_easy', 'humaneval', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy']` | `['mbpp_sanitized', 'math_counting_easy', 'humaneval', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy']` |
| mbpp_plus | `['mbpp_sanitized', 'humaneval', 'math_counting_easy', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy']` | `['mbpp_sanitized', 'humaneval', 'math_counting_easy', 'mmlu_high_school_physics', 'multiarith', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_algebra_easy']` |
| arc_challenge | `['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_counting_easy', 'mbpp_sanitized', 'humaneval', 'multiarith', 'math_algebra_easy']` | `['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math', 'math_counting_easy', 'mbpp_sanitized', 'humaneval', 'multiarith', 'math_algebra_easy']` |
| openbookqa_test | `['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mbpp_sanitized', 'math_counting_easy', 'mmlu_elementary_math', 'humaneval', 'multiarith', 'math_algebra_easy']` | `['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mbpp_sanitized', 'math_counting_easy', 'mmlu_elementary_math', 'humaneval', 'multiarith', 'math_algebra_easy']` |
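
For orientation, here is one way a ridge map fitted on the selected anchors could look. The source does not specify the regression inputs or outputs, so this sketch makes several assumptions: adapters are flattened into vectors, the map goes from X-side vectors to Y-side vectors, and the fit uses the dual (kernel) form because the number of anchors is far smaller than the vector dimension. The name `ridge_predict` and all data are hypothetical.

```python
import numpy as np

def ridge_predict(x_anchors, y_anchors, x_query, lam=1e-2):
    """Predict a Y-side vector for x_query from K matched anchor pairs.

    Dual-form ridge: y_hat = x X^T (X X^T + lam I)^-1 Y, which only needs
    a K x K solve and equals the primal solution x (X^T X + lam I)^-1 X^T Y.
    """
    k = x_anchors.shape[0]
    gram = x_anchors @ x_anchors.T                      # (K, K)
    coeffs = np.linalg.solve(gram + lam * np.eye(k),    # (K,)
                             x_anchors @ x_query)
    return coeffs @ y_anchors                           # (d_y,)

# Synthetic check: 8 anchors, a known linear map, and a query in their span.
rng = np.random.default_rng(0)
x_anchors = rng.normal(size=(8, 32))
w_true = rng.normal(size=(32, 16))
y_anchors = x_anchors @ w_true
x_query = rng.normal(size=8) @ x_anchors

y_hat = ridge_predict(x_anchors, y_anchors, x_query, lam=1e-8)
print(np.allclose(y_hat, x_query @ w_true, atol=1e-4))
```

Under this reading, a "global" variant would fit one map over the concatenated adapter vector and a "pertensor" variant would fit one map per weight tensor; that interpretation of the column names is also an assumption.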

## Experiment 2 — Anchor-count + Top-K scaling

![Anchor scaling](figures/exp2_anchor_scaling_r4.png)

## Experiment 3 — Cross-domain transfer

![Transfer heatmap](figures/exp3_transfer_heatmap_r4.png)

| Held-out domain | Best anchor pool | Top-K actual selections (top-3) |
|---|---|---|
| math | science-only | `{'gsm_hard': ['mmlu_high_school_physics', 'mmlu_elementary_math', 'mmlu_high_school_biology'], 'gsm8k_test_500': ['mmlu_high_school_physics', 'mmlu_elementary_math', 'mmlu_high_school_biology']}` |
| code | code-only | `{'mbpp_test_held': ['mbpp_sanitized', 'humaneval', 'mbpp'], 'mbpp_plus': ['mbpp_sanitized', 'humaneval', 'mbpp']}` |
| science | science-only | `{'arc_challenge': ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math'], 'openbookqa_test': ['mmlu_high_school_physics', 'mmlu_high_school_biology', 'mmlu_elementary_math']}` |

## Honest failure modes

- `arc_challenge` is excluded from averages: its oracle gain over `base_Y` is only 0.67 pp, below the 3 pp usability threshold.
- Code-task evaluation remains cheap answer-string/span matching, not sandboxed unit tests; code numbers are adapter-locality proxies, not pass@1.
- Math uses numeric extraction and equality; generations with unexpected formatting or non-numeric answers are counted as wrong.
- Top-K and ridge methods are exactly the prior mapping family; no new mapping method was added.
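
To make the math-scoring caveat concrete, here is a minimal sketch of a numeric extraction and equality scorer. It is an illustration of the failure mode, not the evaluation code used in this round; the regex, the "last number wins" rule, and the function names are assumptions.

```python
import re

# Optional sign, digits with optional thousands separators, optional decimals.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")

def extract_last_number(text):
    """Return the last number found in `text` as a float, or None."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))

def is_correct(generation, gold, tol=1e-6):
    """Score a generation by numeric equality with the gold answer."""
    pred = extract_last_number(generation)
    return pred is not None and abs(pred - gold) <= tol

print(is_correct("So the total is 1,234.", 1234.0))   # correctly formatted
print(is_correct("I am not sure.", 1234.0))           # non-numeric answer
print(is_correct("The answer is 3/4", 0.75))          # fraction format
```

The third call shows the issue: a right answer written as a fraction is scored wrong because the extractor sees `4`, not `0.75`. Reported math numbers are therefore a lower bound on what a more lenient parser would give.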