---
library_name: peft
base_model: openbmb/MiniCPM5-1B
license: apache-2.0
tags:
  - lora
  - qlora
  - build-small-hackathon
  - well-tuned
  - math
---

# math-lora

QLoRA adapter for **math**, fine-tuned from `openbmb/MiniCPM5-1B` on `meta-math/MetaMathQA` + `tatsu-lab/alpaca` (format: `mix`).

Trained, evaluated, and gated on [Modal](https://modal.com/docs/guide) via `research/modal/` (app `slm-finetune-benchmark`).

## Benchmark gate

- eval profile: `math`
- gate: **PASSED**

| check | value | result |
| --- | ---: | --- |
| gsm8k >= 0.05 | 0.4000 | pass |
| gsm8k improve >= 0.02 | 0.0700 | pass |
| arc_challenge regress <= 0.03 | -0.0500 | pass |
| hellaswag regress <= 0.03 | 0.0000 | pass |
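| piqa regress <= 0.03 | 0.0200 | pass |

`regress` values are baseline minus candidate, so a negative value means the candidate improved; `improve` is candidate minus baseline.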

## lm-eval results

| task | metric | baseline | candidate | delta |
| --- | --- | ---: | ---: | ---: |
| arc_challenge | acc,none | 0.3200 | 0.3700 | +0.0500 |
| gsm8k | exact_match,strict-match | 0.3300 | 0.4000 | +0.0700 |
| hellaswag | acc,none | 0.4300 | 0.4300 | +0.0000 |
| piqa | acc,none | 0.7200 | 0.7000 | -0.0200 |
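
The gate checks can be recomputed from the table above. The following is a minimal sketch, not the actual gate code in `research/modal/`: the function name `gate` and its parameters are hypothetical, the thresholds are copied from the benchmark gate table, and it assumes that regression is measured as baseline minus candidate.

```python
# (baseline, candidate) scores copied from the lm-eval results table.
results = {
    "arc_challenge": (0.32, 0.37),
    "gsm8k": (0.33, 0.40),
    "hellaswag": (0.43, 0.43),
    "piqa": (0.72, 0.70),
}


def gate(results, target="gsm8k", min_score=0.05, min_improve=0.02, max_regress=0.03):
    """Hypothetical re-implementation of the gate: returns {check name: passed}."""
    checks = {}
    base, cand = results[target]
    # The target task must clear an absolute floor and improve over the baseline.
    checks[f"{target} >= {min_score}"] = cand >= min_score
    checks[f"{target} improve >= {min_improve}"] = round(cand - base, 4) >= min_improve
    # Every other task may regress by at most max_regress (baseline minus candidate).
    for task, (b, c) in results.items():
        if task == target:
            continue
        checks[f"{task} regress <= {max_regress}"] = round(b - c, 4) <= max_regress
    return checks


checks = gate(results)
passed = all(checks.values())
for name, ok in checks.items():
    print(f"{name}: {'pass' if ok else 'fail'}")
print("gate:", "PASSED" if passed else "FAILED")
```

Differences are rounded to four decimals before comparison so that floating-point noise does not flip a check that sits exactly on a threshold.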

## Training

- dataset: `/repo/research/data/education-lesson-chat.jsonl`
- mode: `qlora`
- samples: 3528 train, 72 eval
- final train loss: 0.340698
- eval loss: 0.494981

## Load with PEFT

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "openbmb/MiniCPM5-1B"
adapter = "MSGEncrypted/minicpm5-1b-math-lora"

# Load the base model and tokenizer first.
tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype="auto", device_map="auto", trust_remote_code=True
)
# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(model, adapter)
```
```
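
Once the adapter is loaded, answers can be generated with the standard `transformers` generation API. The helper below is a minimal sketch: the name `generate_answer` and the example question are hypothetical, and it assumes a plain-text prompt with greedy decoding. The prompt format used during training (`mix`) is not documented here, so results may improve with a prompt that matches it.

```python
def generate_answer(model, tokenizer, question, max_new_tokens=256):
    """Greedy-decode a completion for `question` and return only the new text."""
    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so that only the generated continuation is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)


# Usage, with `model` and `tokenizer` loaded as in the snippet above:
# print(generate_answer(model, tokenizer, "What is 17 * 23? Show your work."))
```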