MSGEncrypted
/

minicpm5-1b-math-lora

build-small-hackathon

Model card Files Files and versions

minicpm5-1b-math-lora / README.md

MSGEncrypted's picture

Publish math-lora (gate passed: gsm8k)

e574f0b verified 15 days ago

|

History Blame Contribute Delete

2.11 kB

	---
	library_name: peft
	base_model: openbmb/MiniCPM5-1B
	license: apache-2.0
	tags:
	- lora
	- qlora
	- build-small-hackathon
	- well-tuned
	- math
	---

	# math-lora

	QLoRA adapter for math, fine-tuned from `openbmb/MiniCPM5-1B` on `meta-math/MetaMathQA` + `tatsu-lab/alpaca` (format: `mix`).

	Trained, evaluated, and gated on [Modal](https://modal.com/docs/guide) via `research/modal/` (app `slm-finetune-benchmark`).

	## Benchmark gate

	- skill eval profile: `math`
	- gate: PASSED

	### Skill checks

	\| check \| value \| result \|
	\| --- \| ---: \| --- \|
	\| gsm8k >= 0.05 \| 0.4000 \| pass \|
	\| gsm8k improve >= 0.02 \| 0.0700 \| pass \|
	\| arc_challenge regress <= 0.03 \| -0.0500 \| pass \|
	\| hellaswag regress <= 0.03 \| 0.0000 \| pass \|
	\| piqa regress <= 0.03 \| 0.0200 \| pass \|

	- general eval profile: `compare_study`

	### General checks

	\| check \| value \| result \|
	\| --- \| ---: \| --- \|
	\| arc_easy regress <= 0.03 \| -0.0300 \| pass \|
	\| arc_challenge regress <= 0.03 \| -0.0400 \| pass \|
	\| hellaswag regress <= 0.03 \| 0.0100 \| pass \|
	\| piqa regress <= 0.03 \| 0.0100 \| pass \|
	\| boolq regress <= 0.03 \| -0.0300 \| pass \|
	\| gsm8k regress <= 0.03 \| -0.0700 \| pass \|


	## lm-eval results

	\| task \| metric \| baseline \| candidate \| delta \|
	\| --- \| --- \| ---: \| ---: \| ---: \|
	\| arc_challenge \| acc,none \| 0.3200 \| 0.3700 \| +0.0500 \|
	\| gsm8k \| exact_match,strict-match \| 0.3300 \| 0.4000 \| +0.0700 \|
	\| hellaswag \| acc,none \| 0.4300 \| 0.4300 \| +0.0000 \|
	\| piqa \| acc,none \| 0.7200 \| 0.7000 \| -0.0200 \|

	## Training

	- dataset: `/repo/research/data/education-lesson-chat.jsonl`
	- mode: `qlora`
	- samples: {'train': 3528, 'eval': 72}
	- final train loss: 0.340698
	- eval loss: 0.494981

	## Load with PEFT

	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	base = "openbmb/MiniCPM5-1B"
	adapter = "MSGEncrypted/minicpm5-1b-math-lora"

	tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
	model = AutoModelForCausalLM.from_pretrained(
	base, torch_dtype="auto", device_map="auto", trust_remote_code=True
	)
	model = PeftModel.from_pretrained(model, adapter)
	```