---
license: apache-2.0
library_name: transformers
base_model: Qwen/Qwen2.5-Math-1.5B
tags:
- math
- reinforcement-learning
- grpo
- checkpoint
---

# od2961/qwen2.5-Math-1.5b-drx-grpo-readmeflash-node302-seed43

This repo stores organized Dr.X-GRPO checkpoints for **seed 43** from our `Qwen2.5-Math-1.5B` readme-flash math-reasoning training run.

## Layout

- `checkpoints/step_XXXXX/`: exported model checkpoints ready for inference
- `eval_results/`: per-benchmark JSON outputs saved at eval boundaries
- `metadata/checkpoints_index.json`: machine-readable checkpoint manifest
- `metadata/run_manifest.json`: local run provenance

## Source Run

- Local save root: `/n/fs/similarity/maxent-grpo/var/data/oat_zero_drx_grpo_1p5b_20260416_100347_seed43`
- Local run dir: `/n/fs/similarity/maxent-grpo/var/data/oat_zero_drx_grpo_1p5b_20260416_100347_seed43/qwen2.5-Math-1.5b-r1-zero-drx-grpo-node302-seed43_0416T10:04:41`
- Run name: `qwen2.5-Math-1.5b-r1-zero-drx-grpo-node302-seed43_0416T10:04:41`
- Objective: `Dr.X-GRPO`

## Available Checkpoints

| Step | Size (GiB) | Eval files |
| --- | ---: | --- |
| `step_00000` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00016` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00032` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00048` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00064` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00080` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00096` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00112` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00128` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00144` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00160` | `2.89` | aime, amc, math, minerva, olympiad_bench |
| `step_00176` | `2.89` | aime, amc, math, minerva, olympiad_bench |
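
Because each checkpoint lives in its own `checkpoints/step_XXXXX/` folder, loading one means pointing at that subfolder rather than the repo root. The following is a minimal sketch, not a tested recipe for this repo. It assumes each step folder is a standard Transformers export that also contains the tokenizer files, which this card does not confirm. The helper names `checkpoint_subfolder` and `load_checkpoint` are hypothetical; the `subfolder` argument is the regular `from_pretrained` parameter.

```python
from typing import List

REPO_ID = "od2961/qwen2.5-Math-1.5b-drx-grpo-readmeflash-node302-seed43"

# Steps listed in the "Available Checkpoints" table: 0, 16, ..., 176.
AVAILABLE_STEPS: List[int] = list(range(0, 177, 16))


def checkpoint_subfolder(step: int) -> str:
    """Return the repo-relative folder for a step, e.g. checkpoints/step_00176."""
    if step not in AVAILABLE_STEPS:
        raise ValueError(f"step {step} is not in the checkpoint table")
    return f"checkpoints/step_{step:05d}"


def load_checkpoint(step: int):
    """Load tokenizer and model for one step (downloads the weights)."""
    # Imported lazily so the path helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    subfolder = checkpoint_subfolder(step)
    # Assumption: tokenizer files sit next to the weights in each step folder.
    tokenizer = AutoTokenizer.from_pretrained(REPO_ID, subfolder=subfolder)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, subfolder=subfolder)
    return tokenizer, model
```

Calling `load_checkpoint(176)` would then fetch the last listed step. If the step folders turn out not to include tokenizer files, loading the tokenizer from the base model `Qwen/Qwen2.5-Math-1.5B` is a likely fallback.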