MSG committed on
Commit
6cea344
·
1 Parent(s): 514a84f

Feat/finetuning model (#18)

* notebook path

* modal plan

* finetune app

* pyproject and notebook

* usage modal

* usage volume modal

* usage build

* finetune app modal fix

* modal error build

* finetune app volumes

* server app container

* server app and experiment

* common fix

* evals

* common fix

* docs gpu server

* loop common stuff

* wip finetuning

* wip finetuning

* wip finetuning

* server app

* server app fix

* server app and docs

* docs

* docs and fix import

* wip experiment

* next todo

* experimental and fix pyproject

* experimental and fix pyproject

* fix stuff

* update readme

.cursor/plans/modal_finetune_benchmark_ac96d473.plan.md ADDED
@@ -0,0 +1,268 @@
1
+ ---
2
+ name: Modal Finetune Benchmark
3
+ overview: Add a Modal GPU pipeline that runs existing `research/finetune.py` and `slm-lm-eval` on OpenBMB MiniCPM5-1B across multiple datasets, persists checkpoints to Modal Volumes, and provides a companion Modal Notebook for interactive exploration — targeting both the Modal partner track and the Well-Tuned finetuning track.
4
+ todos:
5
+ - id: modal-scaffold
6
+ content: Create research/modal/finetune_app.py with Image, Volumes, HF secret, GPU functions
7
+ status: completed
8
+ - id: experiments-yaml
9
+ content: Add research/modal/experiments.yaml with lesson/alpaca/smoltalk job matrix + smoke limits
10
+ status: completed
11
+ - id: train-worker
12
+ content: Implement finetune_one() subprocess wrapper around research/finetune.py with volume commit
13
+ status: completed
14
+ - id: eval-worker
15
+ content: Implement run_lm_eval() subprocess wrapper for baseline + post-train comparison
16
+ status: completed
17
+ - id: sweep-entrypoint
18
+ content: Add @app.local_entrypoint sweep() to run baseline, map trainings, then eval each checkpoint
19
+ status: completed
20
+ - id: modal-notebook
21
+ content: Create research/notebook/minicpm5-modal-finetune.ipynb for interactive OpenBMB MiniCPM5-1B demo
22
+ status: completed
23
+ - id: docs-deps
24
+ content: Add research/modal/README.md, modal dependency group in pyproject.toml, HF_TOKEN note in .env.example
25
+ status: completed
26
+ isProject: false
27
+ ---
28
+
29
+ # Modal Finetuning + Benchmark Plan
30
+
31
+ ## Goal
32
+
33
+ Run GPU fine-tuning and academic benchmarks **without local CUDA**, reusing your existing scripts:
34
+
35
+ - Training: [`research/finetune.py`](research/finetune.py) (LoRA/QLoRA on `openbmb/MiniCPM5-1B`)
36
+ - Benchmarks: `slm-lm-eval` via [`research/evals/`](research/evals/) (ARC, HellaSwag, GSM8K, …)
37
+ - Datasets: lesson chat (default), plus Hub sets already documented in finetune docstring
38
+
39
+ Deliverables for hackathon tracks:
40
+
41
+ | Track | What judges see |
42
+ |-------|-----------------|
43
+ | **Modal** | `modal run` job + Modal Volume/Notebook link in README |
44
+ | **Well-Tuned / Finetuning** | Before/after `lm-eval` on base vs LoRA adapter, weights in `models/finetuned/` or HF Hub |
45
+
46
+ ## Current state (no Modal yet)
47
+
48
+ - [`research/finetune.py`](research/finetune.py) is a self-contained CLI: it resolves `minicpm5-1b` from [`models.yaml`](models.yaml) and supports `--dataset`, `--format`, `--mode`, and optional `--lm-eval-after`.
49
+ - Eval harness lives in workspace package `slm-evals`; smoke config at [`research/evals/configs/lm_eval_smoke.yaml`](research/evals/configs/lm_eval_smoke.yaml).
50
+ - [`research/notebook/gemma-finetune.ipynb`](research/notebook/gemma-finetune.ipynb) has early OpenBMB load cells but no training loop — good skeleton for a Modal Notebook.
51
+ - Root [`pyproject.toml`](pyproject.toml) already defines `finetune` and `lm-eval` dependency groups (torch, peft, bitsandbytes, lm-eval).
52
+
53
+ ```mermaid
54
+ flowchart TB
55
+ subgraph local [Your laptop]
56
+ cli["modal run research/modal/finetune_app.py"]
57
+ nb["Modal GPU Notebook"]
58
+ end
59
+
60
+ subgraph modal [Modal cloud]
61
+ img["Image: torch + uv sync finetune/lm-eval"]
62
+ fn_train["@app.function gpu=A10G finetune_one"]
63
+ fn_eval["@app.function gpu=A10G run_lm_eval"]
64
+ vol_hf["Volume: hf-cache"]
65
+ vol_out["Volume: slm-finetune"]
66
+ end
67
+
68
+ subgraph repo_scripts [Mounted repo]
69
+ ft["research/finetune.py"]
70
+ eval["slm-lm-eval"]
71
+ data["research/data/*.jsonl"]
72
+ end
73
+
74
+ cli --> fn_train
75
+ cli --> fn_eval
76
+ nb --> ft
77
+ fn_train --> ft
78
+ fn_eval --> eval
79
+ fn_train --> vol_hf
80
+ fn_train --> vol_out
81
+ fn_eval --> vol_hf
82
+ fn_eval --> vol_out
83
+ img --> fn_train
84
+ img --> fn_eval
85
+ ```
86
+
87
+ ## Architecture
88
+
89
+ ### 1. New Modal module: `research/modal/`
90
+
91
+ Create a small Modal package (2–3 files, no refactor of `finetune.py`):
92
+
93
+ | File | Role |
94
+ |------|------|
95
+ | `research/modal/finetune_app.py` | Main `modal.App`, image, volumes, `@app.function` workers |
96
+ | `research/modal/experiments.yaml` | Dataset sweep matrix (name, hub id, format, max_samples) |
97
+ | `research/modal/README.md` | Setup (`modal setup`), secrets, run commands |
98
+
99
+ **Image** (per [Modal CUDA guide](https://modal.com/docs/guide/cuda)):
100
+
101
+ ```python
102
+ image = (
103
+ modal.Image.debian_slim(python_version="3.12")
104
+ .apt_install("git")
105
+ .pip_install("uv")
106
+ .add_local_file("pyproject.toml", "/repo/pyproject.toml", copy=True)
107
+ .add_local_file("uv.lock", "/repo/uv.lock", copy=True)
108
+ # ... copy workspace members needed for finetune + slm-evals
109
+ .run_commands(
110
+ "cd /repo && uv sync --frozen --group finetune --group lm-eval --package slm-evals"
111
+ )
112
+ .add_local_dir("research", remote_path="/repo/research")
113
+ .add_local_dir("libs/inference", remote_path="/repo/libs/inference")
114
+ .add_local_file("models.yaml", "/repo/models.yaml")
115
+ )
116
+ ```
117
+
118
+ Use `pip_install("torch")` on the image **or** let `uv sync` pull torch; either works on Modal because the [driver API is pre-installed](https://modal.com/docs/guide/cuda).
119
+
120
+ **Volumes** (persist across runs):
121
+
122
+ - `hf-cache` → mount at `/root/.cache/huggingface` (model + dataset cache)
123
+ - `slm-finetune` → mount at `/vol/finetuned` (adapters, `training_results.json`, lm-eval `results/`)
124
+
125
+ **Secrets**: `modal.Secret.from_name("huggingface")` with `HF_TOKEN` for gated models and faster Hub downloads.
126
+
127
+ **GPU**: `gpu="A10G"` default (24 GB is plenty for MiniCPM5-1B LoRA at `max_len=1024`). Use `gpu="T4"` for QLoRA smoke tests; bump to `A100` only if you scale `batch_size` or `max_len`.
128
+
129
+ ### 2. Training worker — wrap existing CLI
130
+
131
+ Do **not** rewrite training logic. Each Modal function shells into your script:
132
+
133
+ ```python
134
+ @app.function(gpu="A10G", volumes={...}, secrets=[...], timeout=7200)
135
+ def finetune_one(job: dict) -> dict:
136
+     out = f"/vol/finetuned/{job['name']}"
137
+     cmd = [
138
+         "uv", "run", "python", "research/finetune.py",
139
+         "--preset", "minicpm5-1b",
140
+         "--mode", job.get("mode", "lora"),
141
+         "--dataset", job["dataset"],
142
+         "--format", job["format"],
143
+         "--out", out,
144
+         "--trust_remote_code",  # implicit via preset; alternatively set TRUST_REMOTE_CODE=true in env
145
+         *optional_flags(job),
146
+     ]
147
+     subprocess.run(cmd, cwd="/repo", check=True, env={**os.environ, "HF_HOME": "/root/.cache/huggingface"})
148
+     vol_finetune.commit()
149
+     return json.loads(Path(out, "training_results.json").read_text())
150
+ ```
151
+
152
+ Key env vars to pass through (already supported by [`finetune.py`](research/finetune.py)):
153
+
154
+ - `FINETUNE_DATASET_CONFIG`, `FINETUNE_DATASET_SPLIT`, `FINETUNE_MAX_SAMPLES`
155
+ - `TRUST_REMOTE_CODE=true` (required for `openbmb/MiniCPM5-1B`)
156
+
157
+ ### 3. Benchmark worker — baseline + per-checkpoint
158
+
159
+ Separate function so you can re-eval without re-training:
160
+
161
+ ```python
162
+ @app.function(gpu="A10G", volumes={...}, timeout=3600)
163
+ def run_lm_eval(*, experiment_name: str, preset: str | None = None,
164
+ model_path: str | None = None, adapter_path: str | None = None,
165
+ config: str = "research/evals/configs/lm_eval_smoke.yaml",
166
+ compare_to: str | None = None) -> dict:
167
+     ...  # uv run --package slm-evals slm-lm-eval ...
168
+ ```
169
+
170
+ **Suggested experiment matrix** in `experiments.yaml`:
171
+
172
+ | Job name | Dataset | Format | Notes |
173
+ |----------|---------|--------|-------|
174
+ | `lesson-lora` | `research/data/education-lesson-chat.jsonl` | `chat` | Primary Well-Tuned story |
175
+ | `alpaca-lora` | `tatsu-lab/alpaca` | `alpaca` | General instruction |
176
+ | `smoltalk-lora` | `HuggingFaceTB/smoltalk` | `chat` | `dataset_config: all`, `split: train[:500]` |
177
+
178
+ Smoke flags for hackathon time budget: `--max_steps 100` or `FINETUNE_MAX_SAMPLES=200`, plus lm-eval `limit: 25` from [`lm_eval_smoke.yaml`](research/evals/configs/lm_eval_smoke.yaml).
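
One way to apply those smoke limits is to cap each job before dispatching it. The sketch below writes the three jobs from the table as plain dicts and applies both caps at once for simplicity. The key names and the helper `with_smoke_limits` are assumptions; the real `experiments.yaml` schema may differ.

```python
from copy import deepcopy

# The three jobs from the table above, written as plain dicts.
# Key names are assumed; the real experiments.yaml schema may differ.
JOBS = [
    {"name": "lesson-lora", "dataset": "research/data/education-lesson-chat.jsonl", "format": "chat"},
    {"name": "alpaca-lora", "dataset": "tatsu-lab/alpaca", "format": "alpaca"},
    {"name": "smoltalk-lora", "dataset": "HuggingFaceTB/smoltalk", "format": "chat",
     "dataset_config": "all", "split": "train[:500]"},
]


def with_smoke_limits(job: dict, max_steps: int = 100, max_samples: int = 200) -> dict:
    """Return a capped copy of the job for a quick smoke run (input is not modified)."""
    smoke = deepcopy(job)
    # Never raise a limit the job already sets lower than the smoke cap.
    smoke["max_steps"] = min(job.get("max_steps", max_steps), max_steps)
    smoke["max_samples"] = min(job.get("max_samples", max_samples), max_samples)
    return smoke


smoke_jobs = [with_smoke_limits(job) for job in JOBS]
for job in smoke_jobs:
    print(job["name"], job["max_steps"], job["max_samples"])
```

Copying instead of mutating keeps the full-size matrix available for the real sweep in the same process.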
179
+
180
+ ### 4. Orchestration — `local_entrypoint`
181
+
182
+ ```python
183
+ @app.local_entrypoint()
184
+ def sweep(train: bool = True, eval_only: bool = False):
185
+     jobs = yaml.safe_load(open("research/modal/experiments.yaml"))
186
+     if not eval_only:
187
+         baseline = run_lm_eval.remote(
188
+             experiment_name="minicpm5-1b__baseline",
189
+             preset="minicpm5-1b",
190
+             config="research/evals/configs/lm_eval_compare_study.yaml",
191
+         )
192
+         for job, result in zip(jobs["finetune"], finetune_one.map(jobs["finetune"])):  # map() keeps input order
193
+             run_lm_eval.remote(
194
+                 experiment_name=f"{result['preset']}__{job['name']}",
195
+                 model_path="openbmb/MiniCPM5-1B",
196
+                 adapter_path=result["output_dir"],
197
+                 compare_to=baseline["results_json"],
198
+             )
199
+ ```
200
+
201
+ Use `.map()` for parallel dataset runs only if budget allows; otherwise run the jobs sequentially with `for job in jobs["finetune"]: finetune_one.remote(job)`.
202
+
203
+ ### 5. Modal GPU Notebook (OpenBMB)
204
+
205
+ Create [`research/notebook/minicpm5-modal-finetune.ipynb`](research/notebook/minicpm5-modal-finetune.ipynb):
206
+
207
+ 1. **Setup cell** — `pip install` / `uv sync` finetune group; verify `nvidia-smi` (Modal Notebooks have GPU per [Modal intro](https://modal.com/docs/guide)).
208
+ 2. **Clone or mount repo** — `git clone` your hackathon repo or upload `research/finetune.py` + `models.yaml` + lesson JSONL.
209
+ 3. **Smoke train** — `%run research/finetune.py --preset minicpm5-1b --mode lora --max_steps 20`
210
+ 4. **Inline eval** — `%run` or subprocess `slm-lm-eval --profile smoke --preset minicpm5-1b-lesson-lora` (after registering adapter path in a temp preset or passing `--model` + `--adapter`).
211
+ 5. **Sample generation** — reuse the smoke block at end of `finetune.py`.
212
+
213
+ Notebook is the **demo video** surface; `finetune_app.py` is the **reproducible** surface for judges.
214
+
215
+ Optional: use Modal Sandbox [`Sandbox.exec`](https://modal.com/docs/guide/sandbox-spawn) only for one-off shell probes (`nvidia-smi`, `python -c "import torch"`) — not for full training (Functions + Volumes are the right primitive).
216
+
217
+ ### 6. Pulling results back locally
218
+
219
+ After `modal run`:
220
+
221
+ ```bash
222
+ modal volume get slm-finetune minicpm5-1b-lesson-lora ./models/finetuned/minicpm5-1b-lesson-lora
223
+ modal volume get slm-finetune results/lm_eval ./results/lm_eval
224
+ ```
225
+
226
+ Then wire Space via existing preset [`minicpm5-1b-lesson-lora`](models.yaml) (`adapter_path: ./models/finetuned/minicpm5-1b-lora`).
227
+
228
+ Optional stretch: push adapter to `build-small-hackathon/<your-space>-lora` with `huggingface_hub` in a post-train Modal function.
229
+
230
+ ## Setup checklist (one-time)
231
+
232
+ 1. `pip install modal && modal setup` ([getting started](https://modal.com/docs/guide))
233
+ 2. `modal secret create huggingface HF_TOKEN=<token>`
234
+ 3. `uv sync --group finetune --group lm-eval` locally (validates lockfile before image build)
235
+ 4. First smoke: `modal run research/modal/finetune_app.py --max-steps 20 --dataset lesson`
236
+
237
+ ## Hackathon submission narrative
238
+
239
+ Document in root README or `research/modal/README.md`:
240
+
241
+ 1. **Modal track** — link to Modal app name, example `modal run` output, screenshot of Volume or Notebook.
242
+ 2. **Finetuning track** — table from `comparison.md` / `summary.md` showing base vs lesson-LoRA on same lm-eval config (fair comparison per [`research/USAGE.md`](research/USAGE.md) verification checklist).
243
+ 3. **Space integration** — `ACTIVE_MODEL=minicpm5-1b-lesson-lora` after downloading adapter.
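
The before/after table for the finetuning track could be produced by a helper along these lines. This is a sketch under stated assumptions: `comparison_table` is a hypothetical name, both inputs are assumed to map task name to `{metric: score}` (the real lm-eval results JSON may nest or name metrics differently), and the scores are placeholders, not measurements.

```python
def comparison_table(base: dict, tuned: dict, metric: str = "acc") -> str:
    """Render a Markdown table of per-task scores and deltas.

    Assumes both inputs map task name -> {metric: score}; the real
    lm-eval results JSON may be shaped differently.
    """
    rows = ["| Task | Base | LoRA | Delta |", "|------|------|------|-------|"]
    for task in sorted(base.keys() & tuned.keys()):  # only tasks present in both runs
        b, t = base[task][metric], tuned[task][metric]
        rows.append(f"| {task} | {b:.3f} | {t:.3f} | {t - b:+.3f} |")
    return "\n".join(rows)


# Placeholder scores, purely to show the output shape.
base = {"arc_challenge": {"acc": 0.300}, "hellaswag": {"acc": 0.400}}
tuned = {"arc_challenge": {"acc": 0.325}, "hellaswag": {"acc": 0.390}}
table = comparison_table(base, tuned)
print(table)
```

Restricting the rows to tasks present in both runs is what keeps the comparison fair when the two evaluations used different configs by mistake.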
244
+
245
+ ## Files to add (minimal diff)
246
+
247
+ - `research/modal/finetune_app.py` — Modal app (~150 lines)
248
+ - `research/modal/experiments.yaml` — 3 dataset jobs + eval config pointers
249
+ - `research/modal/README.md` — commands only
250
+ - `research/notebook/minicpm5-modal-finetune.ipynb` — notebook path
251
+ - Root `pyproject.toml` — add optional `modal` dependency group: `modal>=0.73`
252
+ - `.env.example` — note `HF_TOKEN` for Modal secret (no token in repo)
253
+
254
+ ## What we intentionally skip
255
+
256
+ - Refactoring `finetune.py` into an importable library (subprocess wrapper is enough)
257
+ - Running agentic benchmarks (BFCL/GAIA) on Modal first pass — heavier deps; add later if time
258
+ - Modal Sandboxes for training loops — Functions are simpler and support GPU + Volumes
259
+
260
+ ## Risk mitigations
261
+
262
+ | Risk | Mitigation |
263
+ |------|------------|
264
+ | OpenBMB `trust_remote_code` | Set `TRUST_REMOTE_CODE=true` in Modal function env |
265
+ | Image build slow | Cache `hf-cache` Volume; pin `uv.lock` |
266
+ | OOM on small GPU | `--mode qlora`, `max_len=512`, `batch_size=1` (auto in [`_apply_low_vram_defaults`](research/finetune.py)) |
267
+ | lm-eval path assumptions | Run from `/repo` cwd; `slm_evals` resolves `_REPO_ROOT` four parents up from its module |
268
+ | Volume not persisted | Call `volume.commit()` after train/eval |
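
The "four parents up" note in the table is why the repo layout must be preserved under `/repo` in the image. A minimal illustration, assuming a hypothetical module path (the real file name and depth inside `slm_evals` may differ):

```python
from pathlib import PurePosixPath

# Hypothetical module location inside the image; the real file may live elsewhere.
module = PurePosixPath("/repo/research/evals/slm_evals/paths.py")

# parents[0] is the containing directory, so "four parents up" is parents[3].
repo_root = module.parents[3]
print(repo_root)
```

If the package were copied to a shallower or deeper path, the same expression would point at the wrong directory and relative config paths would stop resolving.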
.env.example CHANGED
@@ -39,6 +39,10 @@ ALLOW_MODEL_SWITCH=false
39
  # ACTIVE_MODEL=gemma-merged-local
40
  # MODEL_ID=./gemma_merged_model
41
 
 
 
 
 
42
  # --- Fine-tuning (research/finetune.py) ---
43
  # FINETUNE_PRESET=minicpm5-1b
44
  # FINETUNE_MODEL=openbmb/MiniCPM5-1B
 
39
  # ACTIVE_MODEL=gemma-merged-local
40
  # MODEL_ID=./gemma_merged_model
41
 
42
+ # --- Modal (research/modal/finetune_app.py) ---
43
+ # Create secret: modal secret create huggingface HF_TOKEN=<token>
44
+ # HF_TOKEN=hf_...
45
+
46
  # --- Fine-tuning (research/finetune.py) ---
47
  # FINETUNE_PRESET=minicpm5-1b
48
  # FINETUNE_MODEL=openbmb/MiniCPM5-1B
README.md CHANGED
@@ -48,6 +48,9 @@ uv run --package gradio-space python -m gradio_space.app
48
 
49
  Open [http://localhost:7860](http://localhost:7860).
50
 
 
 
 
51
  ### Studio UI (Off Brand track)
52
 
53
  The default landing page is a **custom AI Studio workspace** at `/` — not default Gradio chrome. It uses **Gradio 6 Server mode** (`gradio.Server`): Material 3 layout, sidebar + workspace (Research → Slides → Language lessons), and `@server.api` endpoints wired to the same Python backends as Classic.
@@ -57,8 +60,25 @@ The default landing page is a **custom AI Studio workspace** at `/` — not defa
57
 
58
  See [apps/gradio-space/README.md](apps/gradio-space/README.md) for API names and a 2-minute judge demo script.
59
 
60
- - **Lesson slides** topic, grade, slide count → downloadable PowerPoint
61
- - **Research Agent** — scrape/index sources into MemRAG, then ask questions offline with citations
62
 
63
  ## How it works
64
 
@@ -108,9 +128,15 @@ See [`.env.example`](.env.example) and [`models.yaml`](models.yaml) for model pr
108
 
109
  A root `Dockerfile` is kept for a later **Docker SDK** deploy (flip README to `sdk: docker`). See [USAGE.md](USAGE.md).
110
 
111
- ## Hackathon checklist
112
 
113
- - **Track:** Backyard AI — lesson slide builder for a teacher you know
114
  - Space live under build-small-hackathon
115
  - Demo video: [YouTube](https://www.youtube.com/watch?v=bwtOiZvJ-7k) — real user enters topic → download `.pptx` → show agent trace
116
  - Social post published
@@ -123,7 +149,8 @@ A root `Dockerfile` is kept for a later **Docker SDK** deploy (flip README to `s
123
  - **OpenBMB** — `openbmb/MiniCPM5-1B`
124
  - **Sharing is Caring** — upload traces with `scripts/upload_trace.py`
125
  - **Off-the-Grid** — local inference only (no cloud LLM API)
126
- - **Well-Tuned** — optional fine-tuned preset in `models.yaml` (Phase 2)
 
127
 
128
  ## Agent trace upload
129
 
 
48
 
49
  Open [http://localhost:7860](http://localhost:7860).
50
 
51
+ - **Lesson slides** — topic, grade, slide count → downloadable PowerPoint
52
+ - **Research Agent** — scrape/index sources into MemRAG, then ask questions offline with citations
53
+
54
  ### Studio UI (Off Brand track)
55
 
56
  The default landing page is a **custom AI Studio workspace** at `/` — not default Gradio chrome. It uses **Gradio 6 Server mode** (`gradio.Server`): Material 3 layout, sidebar + workspace (Research → Slides → Language lessons), and `@server.api` endpoints wired to the same Python backends as Classic.
 
60
 
61
  See [apps/gradio-space/README.md](apps/gradio-space/README.md) for API names and a 2-minute judge demo script.
62
 
63
+ ### Modal + Fine-tuning track (Well-Tuned)
64
+
65
+ Cloud GPU **train → eval → gate → publish** for a skill-matrix of QLoRA adapters on `openbmb/MiniCPM5-1B` — no local CUDA required. Each job in [`research/modal/experiments.yaml`](research/modal/experiments.yaml) (math, science, coding, reasoning, teaching, …) fine-tunes with [`research/finetune.py`](research/finetune.py), benchmarks with `slm-lm-eval`, gates on per-skill `goals`, and publishes passing adapters to the Hub.
66
+
67
+ - **Modal (partner track)** — `modal run` / warm GPU worker, Volume artifacts, optional [Modal Notebook](research/notebook/minicpm5-modal-finetune.ipynb)
68
+ - **Well-Tuned badge** — before/after lm-eval per skill + gated Hub publish (`MSGEncrypted/minicpm5-1b-<skill>-lora`)
69
+
70
+ Full runbook: [`research/modal/README.md`](research/modal/README.md) · agent loop: [`research/modal/SERVER.md`](research/modal/SERVER.md) · local research overview: [`research/USAGE.md`](research/USAGE.md)
71
+
72
+ ```bash
73
+ uv sync --group modal
74
+ modal setup && modal secret create huggingface HF_TOKEN=<token>
75
+
76
+ modal run research/modal/server_app.py --ping # health check
77
+ modal run research/modal/server_app.py --job math-lora --max-steps 20 --no-publish # cheap smoke
78
+ modal run research/modal/server_app.py --pipeline # full sweep: baselines → train → eval → gate → publish
79
+ ```
80
+
81
+ Pull a passing adapter into the Space: `modal volume get slm-finetune math-lora ./models/finetuned/minicpm5-1b-lora`, then set `ACTIVE_MODEL=minicpm5-1b-lesson-lora`.
82
 
83
  ## How it works
84
 
 
128
 
129
  A root `Dockerfile` is kept for a later **Docker SDK** deploy (flip README to `sdk: docker`). See [USAGE.md](USAGE.md).
130
 
131
+ ## Hackathon tracks & checklist
132
+
133
+ | Track | What we ship |
134
+ | ----- | ------------ |
135
+ | **Backyard AI** (primary) | Lesson slide builder for a teacher you know — topic + grade → downloadable `.pptx` |
136
+ | **Off Brand** | Custom Studio UI at `/` (Gradio 6 Server mode, not default Gradio chrome) |
137
+ | **Modal** (partner) | GPU `train → eval → gate → publish` on [Modal](https://modal.com) — [`research/modal/`](research/modal/) |
138
+ | **Well-Tuned** (finetuning) | Skill-matrix QLoRA adapters on MiniCPM5-1B, lm-eval gates, Hub publish |
139
 
 
140
  - Space live under build-small-hackathon
141
  - Demo video: [YouTube](https://www.youtube.com/watch?v=bwtOiZvJ-7k) — real user enters topic → download `.pptx` → show agent trace
142
  - Social post published
 
149
  - **OpenBMB** — `openbmb/MiniCPM5-1B`
150
  - **Sharing is Caring** — upload traces with `scripts/upload_trace.py`
151
  - **Off-the-Grid** — local inference only (no cloud LLM API)
152
+ - **Well-Tuned** — per-skill QLoRA adapters trained + gated + published via the [Modal + Fine-tuning track](#modal--fine-tuning-track-well-tuned)
153
+ - **Modal** — same pipeline; see [`research/modal/README.md`](research/modal/README.md)
154
 
155
  ## Agent trace upload
156
 
TODO.md ADDED
@@ -0,0 +1,44 @@
1
+ # Hackathon badge/track TODO
2
+
3
+ Strategy: one **Backyard AI** submission, stacking as many merit badges, sponsor
4
+ awards, and special awards as credibly fit the small-model / local-first story.
5
+ Deadline: **June 15, 2026** (Space + demo video + social post).
6
+
7
+ This PR (`feat/finetuning_model`) focuses on **🎯 Well-Tuned** + **Modal**. Everything
8
+ below is parked for follow-up PRs.
9
+
10
+ ## In this PR (finetuning + Modal) — done here
11
+ - [x] Make published adapters **public** so judges can verify the Well-Tuned badge
12
+ (`research/modal/experiments.yaml`: `private: false`).
13
+ - [x] Add hackathon discoverability tags + license to the published model card
14
+ (`research/modal/_common.py: render_model_card`).
15
+
16
+ ## 🦙 Llama Champion badge (cheap, high value)
17
+ - [ ] Run the Space on the **llama.cpp / GGUF** backend (`libs/inference/src/inference/llama_cpp.py`).
18
+ - [ ] Confirm MiniCPM5-1B has a GGUF (or convert/quantize one) — keep OpenBMB story intact.
19
+ - [ ] Document the llama.cpp path in README + Space (which `ACTIVE_MODEL` preset).
20
+
21
+ ## 📓 Field Notes badge (cheapest miss — no blog exists yet)
22
+ - [ ] Write a blog post / report on the fine-tuning + Modal pipeline:
23
+ skill-matrix QLoRA -> lm-eval -> per-skill gate -> Hub publish.
24
+ - [ ] Publish it (HF blog / personal) and link from README.
25
+ - [ ] This badge + the others clinches **Bonus Quest Champion ($2k)**.
26
+
27
+ ## README + submission hygiene
28
+ - [ ] Update README badge checklist to reflect full strategy (add Llama Champion, Field Notes).
29
+ - [ ] Best Demo: polished demo video (real teacher -> topic -> .pptx download -> trace).
30
+ - [ ] Social post published (required for submission).
31
+ - [ ] Community Choice: share the Space widely.
32
+
33
+ ## Decided NOT to chase (conflicts with MiniCPM / local-first core)
34
+ - OpenAI Track — requires OpenAI models; collides with Tiny Titan / OpenBMB / Off-the-Grid.
35
+ - NVIDIA Nemotron — requires Nemotron model; same conflict.
36
+ - Thousand Token Wood — different main track; can't be in both.
37
+
38
+ ## Badge scorecard (target = all 6 + Bonus Quest Champion)
39
+ - [x] 🔌 Off the Grid — local inference only
40
+ - [x] 🎨 Off-Brand — custom Studio UI (Gradio 6 Server mode)
41
+ - [x] 📡 Sharing is Caring — agent trace upload
42
+ - [~] 🎯 Well-Tuned — pipeline ready; needs a passing public adapter on the Hub
43
+ - [ ] 🦙 Llama Champion — see above
44
+ - [ ] 📓 Field Notes — see above
pyproject.toml CHANGED
@@ -28,6 +28,10 @@ evals = [
28
  lm-eval = [
29
  "slm-evals[lm-eval]",
30
  ]
 
 
 
 
31
 
32
  [tool.uv.workspace]
33
  members = [
 
28
  lm-eval = [
29
  "slm-evals[lm-eval]",
30
  ]
31
+ modal = [
32
+ "modal>=0.73.0",
33
+ "pyyaml>=6.0",
34
+ ]
35
 
36
  [tool.uv.workspace]
37
  members = [
research/USAGE.md CHANGED
@@ -27,6 +27,65 @@ uv sync --group lm-eval
27
  | `finetune` | `research/finetune.py` | `peft`, `datasets`, `bitsandbytes` (QLoRA) |
28
  | `evals` | `slm-evals` workspace member | `slm-benchmark` CLI |
29
  | `lm-eval` | `slm-evals[lm-eval]` | `slm-lm-eval` CLI (GSM8K, ARC, HellaSwag, …) |
30
 
31
  ---
32
 
 
27
  | `finetune` | `research/finetune.py` | `peft`, `datasets`, `bitsandbytes` (QLoRA) |
28
  | `evals` | `slm-evals` workspace member | `slm-benchmark` CLI |
29
  | `lm-eval` | `slm-evals[lm-eval]` | `slm-lm-eval` CLI (GSM8K, ARC, HellaSwag, …) |
30
+ | `modal` | `research/modal/finetune_app.py` | Cloud GPU train + eval via [Modal](https://modal.com/docs/guide) |
31
+ | `modal` | `research/modal/server_app.py` | Long-lived warm GPU worker for human/AI iteration loops |
32
+
33
+ ---
34
+
35
+ ## 0. Modal cloud GPU (`research/modal/`)
36
+
37
+ Run a **skill-matrix** of QLoRA fine-tunes **without local CUDA**: each job in
38
+ [`modal/experiments.yaml`](modal/experiments.yaml) trains one adapter for a
39
+ category (math, science, coding, reasoning, teaching, instructions), evaluates
40
+ it against a matching `slm-lm-eval` profile vs. a per-profile baseline, checks
41
+ the result against `goals`, and — only if the gate passes — publishes the
42
+ adapter to the Hugging Face Hub. Adapters + results are saved to Modal Volume
43
+ `slm-finetune`.
44
+
45
+ ```bash
46
+ uv sync --group modal
47
+ modal setup
48
+ modal secret create huggingface HF_TOKEN=<token> # needs write access for Hub publish
49
+
50
+ # Smoke run for one skill: baseline -> train -> eval -> gate -> publish -> pull
51
+ modal run research/modal/finetune_app.py --job math-lora --max-steps 20
52
+
53
+ # Whole skill matrix
54
+ modal run research/modal/finetune_app.py
55
+
56
+ # One category, train+eval only (no Hub push)
57
+ modal run research/modal/finetune_app.py --category science --no-publish
58
+
59
+ # Re-check the gate and publish an already-evaluated job
60
+ modal run research/modal/finetune_app.py::publish_only --job math-lora
61
+
62
+ # Pull adapters + lm-eval results without re-running anything
63
+ modal run research/modal/finetune_app.py::pull --category math
64
+ ```
65
+
66
+ Set real values for `defaults.hub_org` and each job's `publish.hub_repo` in
67
+ `experiments.yaml` (placeholder: `your-hf-username`) before publishing — repos
68
+ are created automatically. Jobs with no `goals` (e.g. `alpaca-lora`) are
69
+ trained/evaluated but never gated or published (local-only).
70
+
71
+ For a multi-hour session on **one warm GPU** (iterative human/AI loop without
72
+ re-downloading weights each run), use `research/modal/server_app.py` instead —
73
+ same skill-matrix pipeline (`--job`/`--category`/`--pipeline`/`--publish-only`)
74
+ on a deployed `GpuWorker`.
75
+
76
+ Full guide: **[modal/README.md](modal/README.md)** · **Agent loop:** **[modal/SERVER.md](modal/SERVER.md)** · [Modal Volumes](https://modal.com/docs/guide/volumes) · [Modal Notebooks](https://modal.com/docs/guide/notebooks)
77
+
78
+ **Iterative loop (one warm GPU, many runs):**
79
+
80
+ ```bash
81
+ modal deploy research/modal/server_app.py
82
+ modal run -d research/modal/server_app.py --hours 6 # keep worker alive
83
+ modal run research/modal/server_app.py --ping # verify
84
+ modal run research/modal/server_app.py --job lesson-lora --max-steps 20
85
+ modal app stop slm-gpu-worker -y # when done
86
+ ```
87
+
88
+ Interactive notebook: upload [`research/notebook/minicpm5-modal-finetune.ipynb`](notebook/minicpm5-modal-finetune.ipynb) at [modal.com/notebooks](https://modal.com/notebooks), attach GPU + Volume `slm-finetune` + Secret `huggingface`.
89
 
90
  ---
91
 
research/data/science-tutor-chat.jsonl ADDED
@@ -0,0 +1,10 @@
1
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "What is photosynthesis?"}, {"role": "assistant", "content": "Photosynthesis is how plants make their own food. Using sunlight, water, and carbon dioxide from the air, a plant's leaves produce glucose (sugar) for energy and release oxygen as a byproduct. For example, a tree's leaves absorb sunlight during the day and release the oxygen we breathe."}]}
2
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "Why does ice float on water?"}, {"role": "assistant", "content": "Ice floats because it is less dense than liquid water. When water freezes, its molecules arrange into a crystal structure that takes up more space, so the same mass of ice has more volume than liquid water. For example, an ice cube floats in a glass of water with most of it submerged but the top sticking out."}]}
3
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "What causes the water cycle?"}, {"role": "assistant", "content": "The water cycle is driven by the sun's heat. The sun evaporates water from oceans and lakes into vapor, which rises and cools to form clouds (condensation). When clouds get heavy, water falls back as rain or snow (precipitation), and it eventually flows back to the oceans. For example, rain that falls on mountains flows downhill into rivers and back to the sea."}]}
4
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "What is Newton's first law of motion?"}, {"role": "assistant", "content": "Newton's first law says an object at rest stays at rest, and an object in motion stays in motion at the same speed and direction, unless a force acts on it. This is also called inertia. For example, a hockey puck slides across ice and only slows down because of friction, not because it 'runs out' of motion on its own."}]}
5
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "What are the three states of matter?"}, {"role": "assistant", "content": "The three common states of matter are solid, liquid, and gas. Solids have a fixed shape and volume, liquids take the shape of their container but keep their volume, and gases expand to fill their container. For example, water can be ice (solid), liquid water, or steam (gas), depending on its temperature."}]}
6
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "How does a simple electric circuit work?"}, {"role": "assistant", "content": "A simple circuit needs a power source, a conductor (wire), and a load (like a bulb), all connected in a loop. Electrons flow from the power source through the wire, through the load where their energy is converted (e.g., into light), and back to the source. For example, connecting a battery to a light bulb with two wires lets current flow and lights the bulb; cutting one wire breaks the circuit and the bulb turns off."}]}
7
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "What is the difference between a producer and a consumer in an ecosystem?"}, {"role": "assistant", "content": "Producers, like plants and algae, make their own food using sunlight through photosynthesis. Consumers cannot make their own food and must eat other organisms to get energy. For example, grass is a producer, a rabbit that eats the grass is a consumer, and a fox that eats the rabbit is also a consumer."}]}
8
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "Why do we see different phases of the Moon?"}, {"role": "assistant", "content": "The Moon's phases happen because we see different amounts of its sunlit side as it orbits Earth. The Moon doesn't change shape; only the portion we can see lit by the sun changes. For example, during a full moon, the entire visible side is lit because the Moon is on the opposite side of Earth from the sun, while during a new moon the lit side faces away from us."}]}
9
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "What is the function of the human heart?"}, {"role": "assistant", "content": "The heart is a muscular pump that circulates blood through the body. It pushes oxygen-rich blood from the lungs out to the body's tissues, and pumps oxygen-poor blood back to the lungs to pick up more oxygen. For example, each heartbeat you feel is the heart contracting to push blood through your blood vessels."}]}
10
+ {"messages": [{"role": "system", "content": "You are a friendly science tutor for middle-school students. Explain concepts in simple terms and give one concrete example."}, {"role": "user", "content": "What is an acid and a base?"}, {"role": "assistant", "content": "Acids are substances that release hydrogen ions (H+) in water and taste sour, while bases release hydroxide ions (OH-) and taste bitter or feel slippery. Acidity is measured on the pH scale, where values below 7 are acidic and above 7 are basic. For example, lemon juice is acidic (low pH) and baking soda dissolved in water is basic (high pH)."}]}
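Rows in this chat-format JSONL file share one shape: a `messages` list with `system`, `user`, and `assistant` turns. A minimal sketch of a pre-training sanity check for that shape follows. The helper name `validate_chat_row` and the exact rules (all three roles present, non-empty content, assistant turn last) are assumptions for illustration; they are not taken from `research/finetune.py`.

```python
import json

# Roles every row in the files above contains; treating all three as required is an assumption.
REQUIRED_ROLES = ("system", "user", "assistant")


def validate_chat_row(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty list means valid)."""
    try:
        row = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    messages = row.get("messages") if isinstance(row, dict) else None
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]

    problems = []
    roles = []
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {i} is not an object")
            continue
        roles.append(message.get("role"))
        content = message.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i} has empty content")

    for role in REQUIRED_ROLES:
        if role not in roles:
            problems.append(f"no {role!r} message")
    if roles and roles[-1] != "assistant":
        problems.append("last message is not from the assistant")
    return problems
```

Running this over each line of a dataset before launching a GPU job is a cheap way to catch malformed rows locally rather than partway through a paid run.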
research/evals/configs/eval_profiles.yaml CHANGED
@@ -25,6 +25,25 @@ profiles:
25
  - arc_challenge
26
  - hellaswag
27
 
28
  understanding:
29
  tool: slm-lm-eval
30
  claim: Better language understanding
 
25
  - arc_challenge
26
  - hellaswag
27
 
28
+ math:
29
+ tool: slm-lm-eval
30
+ claim: Better math reasoning
31
+ description: Grade-school math word problems (GSM8K) + abstract reasoning QA.
32
+ config: lm_eval_math.yaml
33
+ tasks:
34
+ - gsm8k
35
+ - arc_challenge
36
+
37
+ science:
38
+ tool: slm-lm-eval
39
+ claim: Better science knowledge
40
+ description: Science fact recall (SciQ, OpenBookQA) + science reasoning QA.
41
+ config: lm_eval_science.yaml
42
+ tasks:
43
+ - sciq
44
+ - openbookqa
45
+ - arc_challenge
46
+
47
  understanding:
48
  tool: slm-lm-eval
49
  claim: Better language understanding
research/evals/configs/lm_eval_instructions.yaml CHANGED
@@ -1,6 +1,6 @@
1
  # Instruction following profile — IFEval (verifiable constraints)
2
  # Run: slm-lm-eval --profile instructions --preset minicpm5-1b
3
- # Requires lm-eval extras; install with: uv sync --group lm-eval
4
 
5
  profile: instructions
6
  claim: Better instruction following
 
1
  # Instruction following profile — IFEval (verifiable constraints)
2
  # Run: slm-lm-eval --profile instructions --preset minicpm5-1b
3
+ # Requires lm-eval[ifeval] extras; install with: uv sync --group lm-eval
4
 
5
  profile: instructions
6
  claim: Better instruction following
research/evals/configs/lm_eval_math.yaml ADDED
@@ -0,0 +1,18 @@
1
+ # Math profile — grade-school word problems + abstract reasoning QA
2
+ # Run: slm-lm-eval --profile math --preset minicpm5-1b --experiment-name math-baseline
3
+
4
+ profile: math
5
+ claim: Better math reasoning
6
+
7
+ tasks:
8
+ - gsm8k
9
+ - arc_challenge
10
+
11
+ num_fewshot: 5
12
+ limit: 100
13
+ seed: 42
14
+ batch_size: auto
15
+ device: auto
16
+ dtype: bfloat16
17
+ trust_remote_code: true
18
+ output_dir: results/lm_eval
research/evals/configs/lm_eval_science.yaml ADDED
@@ -0,0 +1,19 @@
1
+ # Science profile — fact recall + elementary science reasoning
2
+ # Run: slm-lm-eval --profile science --preset minicpm5-1b --experiment-name science-baseline
3
+
4
+ profile: science
5
+ claim: Better science knowledge
6
+
7
+ tasks:
8
+ - sciq
9
+ - openbookqa
10
+ - arc_challenge
11
+
12
+ num_fewshot: 0
13
+ limit: 100
14
+ seed: 42
15
+ batch_size: auto
16
+ device: auto
17
+ dtype: bfloat16
18
+ trust_remote_code: true
19
+ output_dir: results/lm_eval
research/evals/pyproject.toml CHANGED
@@ -19,7 +19,7 @@ dependencies = [
19
 
20
  [project.optional-dependencies]
21
  lm-eval = [
22
- "lm-eval[hf]>=0.4.9",
23
  ]
24
 
25
  [project.scripts]
 
19
 
20
  [project.optional-dependencies]
21
  lm-eval = [
22
+ "lm-eval[hf,ifeval]>=0.4.9",
23
  ]
24
 
25
  [project.scripts]
research/modal/README.md ADDED
@@ -0,0 +1,774 @@
1
+ # Modal finetune + benchmark
2
+
3
+ GPU fine-tuning + benchmarking + Hub publishing on [Modal](https://modal.com/docs/guide) for `openbmb/MiniCPM5-1B`, wrapping the existing [`research/finetune.py`](../finetune.py) and `slm-lm-eval`.
4
+
5
+ Use this when you have no local CUDA GPU but want a hackathon-quality
6
+ **train → eval → gate → publish** loop for a whole **skill matrix** of QLoRA
7
+ adapters (math, science, coding, reasoning, teaching, instructions).
8
+
9
+ | Track | What you ship |
10
+ | ----- | ------------- |
11
+ | **Modal** | `modal run` skill-matrix pipeline, Volume artifacts, optional Modal Notebook |
12
+ | **Well-Tuned** | Per-skill before/after `lm-eval` + gated Hub publish for each LoRA |
13
+
14
+ ---
15
+
16
+ ## Layout
17
+
18
+ ```text
19
+ research/modal/
20
+ ├── _common.py # Shared image, volumes, command builders, gate + publish helpers
21
+ ├── finetune_app.py # One-shot batch pipeline (slm-finetune-benchmark): main, publish_only, pull
22
+ ├── server_app.py # Long-lived GPU worker (slm-gpu-worker): GpuWorker.run_pipeline
23
+ ├── experiments.yaml # Skill matrix: jobs, eval_profile, goals, publish
24
+ ├── README.md # Full Modal docs (this file)
25
+ └── SERVER.md # Human + AI agent loop runbook (quick reference)
26
+ ```
27
+
28
+ Interactive path: [`research/notebook/minicpm5-modal-finetune.ipynb`](../notebook/minicpm5-modal-finetune.ipynb) (Modal GPU Notebook).
29
+
30
+ ### Which app to use
31
+
32
+ | App | CLI | Best for |
33
+ | --- | --- | --- |
34
+ | **`finetune_app.py`** | `modal run research/modal/finetune_app.py` | Full sweep, CI-style batch, parallel jobs |
35
+ | **`server_app.py`** | `modal deploy` + `modal run research/modal/server_app.py` | Multi-hour session, iterative human/AI loops on **one warm GPU** |
36
+
37
+ Both apps share [`_common.py`](_common.py): same image, `hf-cache` / `slm-finetune` volumes, and wrappers around [`research/finetune.py`](../finetune.py) + `slm-lm-eval`.
38
+
39
+ ---
40
+
41
+ ## One-time setup
42
+
43
+ ```bash
44
+ # Modal CLI + auth
45
+ pip install modal
46
+ modal setup
47
+
48
+ # HF token (downloads + Hub upload). Same token as huggingface-cli login.
49
+ modal secret create huggingface HF_TOKEN=<your-hf-token>
50
+
51
+ # Optional: validate deps before first image build
52
+ uv sync --group finetune --group lm-eval --package slm-evals
53
+ uv sync --group modal # local orchestration only
54
+ ```
55
+
56
+ `HF_TOKEN` must be a [write token](https://huggingface.co/settings/tokens) if you plan to push adapters to the Hub.
57
+
58
+ ---
59
+
60
+ ## Run training + benchmarks
61
+
62
+ Run all commands from the **repo root**. `finetune_app.py` runs the full **skill-matrix
63
+ pipeline**: per-profile baseline lm-eval → finetune each job's QLoRA adapter →
64
+ post-train lm-eval vs. that baseline → check `goals` (gate) → publish to the
65
+ Hugging Face Hub if the gate passes → pull adapter + results to your laptop.
66
+
67
+ ```bash
68
+ # Full sweep: every job in experiments.yaml
69
+ modal run research/modal/finetune_app.py
70
+
71
+ # One skill (cheap smoke run)
72
+ modal run research/modal/finetune_app.py --job math-lora --max-steps 20
73
+
74
+ # One category (e.g. all "science" jobs)
75
+ modal run research/modal/finetune_app.py --category science
76
+
77
+ # Re-run lm-eval (+ gate + publish) only — adapter already on Volume
78
+ modal run research/modal/finetune_app.py --eval-only --job math-lora
79
+
80
+ # Train + eval but skip the Hub push and the local download
81
+ modal run research/modal/finetune_app.py --no-publish --no-pull
82
+
83
+ # Train/eval jobs in parallel (one GPU per job — higher cost)
84
+ modal run research/modal/finetune_app.py --parallel
85
+
86
+ # Re-run just the gate + Hub publish for an already-evaluated job
87
+ modal run research/modal/finetune_app.py::publish_only --job math-lora
88
+
89
+ # Pull adapters + lm-eval results for a category without re-running anything
90
+ modal run research/modal/finetune_app.py::pull --category math
91
+ ```
92
+
93
+ Jobs live in [`experiments.yaml`](experiments.yaml) — a **skill matrix**, one
94
+ QLoRA adapter per category, each evaluated against the matching
95
+ `eval_profile` from [`research/evals/configs/eval_profiles.yaml`](../evals/configs/eval_profiles.yaml):
96
+
97
+ | Job | Category | Dataset (format) | Eval profile | `goals` task | Publish |
98
+ | --- | -------- | ----------------- | ------------ | ------------- | ------- |
99
+ | `teaching-lora` | teaching | `research/data/education-lesson-chat.jsonl` (`chat`) | `instructions` | `ifeval` | ✅ |
100
+ | `science-lora` | science | `research/data/science-tutor-chat.jsonl` (`chat`) | `science` | `sciq` (+ `arc_challenge` guard) | ✅ |
101
+ | `math-lora` | math | `TIGER-Lab/MathInstruct` (`alpaca`) | `math` | `gsm8k` (+ `arc_challenge` guard) | ✅ |
102
+ | `coding-lora` | coding | `iamtarun/python_code_instructions_18k_alpaca` (`alpaca`) | `code` | `mbpp` | ✅ |
103
+ | `reasoning-lora` | reasoning | `HuggingFaceTB/smoltalk` (`chat`) | `reasoning` | `gsm8k` (+ `hellaswag` guard) | ✅ |
104
+ | `alpaca-lora` | instructions | `tatsu-lab/alpaca` (`alpaca`) | `instructions` | — (no `goals`) | local-only |
105
+
106
+ Before publishing, replace `defaults.hub_org` and each job's `publish.hub_repo`
107
+ in `experiments.yaml` with your Hugging Face username/org (defaults to the
108
+ placeholder `your-hf-username`).
109
+
110
+ Edit `defaults.max_steps`, per-job `gpu`, or per-job `max_samples` /
111
+ `dataset_split` in `experiments.yaml` to balance cost vs quality. See
112
+ [Benchmark gate & Hugging Face Hub publish](#benchmark-gate--hugging-face-hub-publish)
113
+ for the `goals`/`publish` schema.
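The `--job`, `--category`, and `--max-steps` flags narrow and override what `experiments.yaml` defines. A minimal sketch of that selection logic, assuming jobs are plain dicts with `name` and `category` keys, looks like this. The helper names `select_jobs` and `resolve_max_steps` are hypothetical, and the precedence order (CLI override, then a per-job `max_steps`, then `defaults.max_steps`) is an assumption rather than the confirmed behaviour of `finetune_app.py`.

```python
def select_jobs(jobs, job=None, category=None):
    """Return the jobs matching an optional job name and/or category filter."""
    selected = [
        j for j in jobs
        if (job is None or j["name"] == job)
        and (category is None or j.get("category") == category)
    ]
    if not selected:
        raise ValueError(f"no jobs match job={job!r}, category={category!r}")
    return selected


def resolve_max_steps(job, defaults, cli_override=None):
    """Pick the training step count: CLI flag, then per-job value, then defaults."""
    if cli_override is not None:
        return cli_override
    if "max_steps" in job:
        return job["max_steps"]
    return defaults["max_steps"]
```

Failing loudly on an empty selection avoids a run that starts, spends time on image startup, and then silently does nothing because of a mistyped job name.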
114
+
115
+ ### CLI flags (`finetune_app.py`)
116
+
117
+ `main` (default entrypoint — full pipeline):
118
+
119
+ | Flag | Default | Meaning |
120
+ | ---- | ------- | ------- |
121
+ | `--train` / `--no-train` | train on | Run finetune jobs |
122
+ | `--eval-only` | off | Skip train + baselines; eval existing Volume checkpoints |
123
+ | `--parallel` | off | `finetune_one.spawn()` per job instead of sequential |
124
+ | `--job` | all jobs | Run one job name from `experiments.yaml` |
125
+ | `--category` | all categories | Run all jobs with this `category` |
126
+ | `--max-steps` | from YAML | Override training steps |
127
+ | `--publish` / `--no-publish` | publish on | Push to `publish.hub_repo` if the gate passes |
128
+ | `--pull` / `--no-pull` | pull on | `modal volume get` the adapter + lm-eval results after each job |
129
+
130
+ `publish_only` (separate entrypoint — `::publish_only`):
131
+
132
+ | Flag | Default | Meaning |
133
+ | ---- | ------- | ------- |
134
+ | `--job` | required | Re-check the gate against existing results and publish if it passes |
135
+
136
+ `pull` (separate entrypoint — `::pull`):
137
+
138
+ | Flag | Default | Meaning |
139
+ | ---- | ------- | ------- |
140
+ | `--job` | — | Pull one job's adapter + results |
141
+ | `--category` | — | Pull all jobs in a category |
142
+ | `--dest` | `models/finetuned` | Local destination directory |
143
+
144
+ ---
145
+
146
+ ## GPU worker (`server_app.py`) — human + AI agent loops
147
+
148
+ Use this when you want **one warm A10G container** for several hours and many train/eval commands **without** reinstalling deps or re-downloading HF weights each time.
149
+
150
+ **Quick runbook:** see [`SERVER.md`](SERVER.md) (copy-paste commands for humans and coding agents).
151
+
152
+ ### Deploy once
153
+
154
+ ```bash
155
+ modal deploy research/modal/server_app.py
156
+ ```
157
+
158
+ App name: **`slm-gpu-worker`**. Dashboard: `modal app list` or the URL printed after deploy.
159
+
160
+ `GpuWorker` keeps `min_containers=1` while deployed, mounts `hf-cache` + `slm-finetune`, and reuses the same container for sequential `.remote()` calls when possible.
161
+
162
+ ### Two-terminal loop (recommended)
163
+
164
+ **Terminal 1 — keep worker alive** (default 4h; blocks unless detached):
165
+
166
+ ```bash
167
+ modal run research/modal/server_app.py
168
+ # or free your terminal:
169
+ modal run -d research/modal/server_app.py --hours 6
170
+ ```
171
+
172
+ **Terminal 2 — run experiments on the warm GPU** (repeat as often as you like):
173
+
174
+ ```bash
175
+ # Full skill-matrix pipeline for one job on the warm container:
176
+ # per-profile baseline → train → eval → gate → publish → pull
177
+ modal run research/modal/server_app.py --job math-lora --max-steps 20
178
+
179
+ # All jobs in a category
180
+ modal run research/modal/server_app.py --category science
181
+
182
+ # Whole matrix, but skip the Hub push
183
+ modal run research/modal/server_app.py --pipeline --no-publish
184
+
185
+ # Re-eval (+ gate + publish) an existing adapter on Volume
186
+ modal run research/modal/server_app.py --eval-only --job math-lora
187
+
188
+ # Re-check the gate and publish using already-computed results
189
+ modal run research/modal/server_app.py --publish-only --job math-lora
190
+
191
+ # Arbitrary command in /repo (same env as finetune.py)
192
+ modal run research/modal/server_app.py --cmd "uv run python research/finetune.py --help"
193
+
194
+ # Health check
195
+ modal run research/modal/server_app.py --ping
196
+ ```
197
+
198
+ Task flags (`--job`, `--category`, `--cmd`, `--pipeline`, `--eval-only`, `--publish-only`, `--ping`) automatically disable the default keep-alive mode.
199
+
200
+ ### CLI flags (`server_app.py`)
201
+
202
+ | Flag | Default | Meaning |
203
+ | ---- | ------- | ------- |
204
+ | *(none)* | `serve=True` | Keep `GpuWorker` alive (`keep_alive`) |
205
+ | `--hours` | `4` | Keep-alive duration |
206
+ | `--no-serve` | — | Skip keep-alive (auto when any task flag is set) |
207
+ | `--job` | — | Run the skill-matrix pipeline for one job |
208
+ | `--category` | — | Run the skill-matrix pipeline for all jobs in a category |
209
+ | `--pipeline` | off | Run the skill-matrix pipeline for all jobs |
210
+ | `--max-steps` | from YAML | Override training steps |
211
+ | `--eval-only` | off | Pipeline eval/gate/publish path only (skip baselines + train) |
212
+ | `--publish` / `--no-publish` | publish on | Push to `publish.hub_repo` if the gate passes |
213
+ | `--publish-only` | off | Re-check the gate against existing results and publish (requires `--job`) |
214
+ | `--pull` / `--no-pull` | pull on | `modal volume get` adapter + results after the pipeline |
215
+ | `--cmd` | — | Shell command (parsed with `shlex`) |
216
+ | `--ping` | off | Return worker status JSON |
217
+
218
+ ### `GpuWorker` methods (for notebooks / Python callers)
219
+
220
+ After `modal deploy`, call from Python:
221
+
222
+ ```python
223
+ import modal
224
+
225
+ Worker = modal.Cls.from_name("slm-gpu-worker", "GpuWorker")
226
+ w = Worker()
227
+
228
+ w.ping.remote()
229
+ w.finetune.remote({"name": "math-lora", "dataset": "...", "format": "alpaca", "max_steps": 20})
230
+ w.lm_eval.remote(experiment_name="math-lora__math", config="research/evals/configs/lm_eval_math.yaml", adapter_path="/vol/finetuned/math-lora")
231
+ w.exec_cmd.remote(["uv", "run", "python", "research/finetune.py", "--help"])
232
+ w.run_pipeline.remote(job_names=["math-lora"], max_steps=20)
233
+
234
+ # Gate + publish (only pushes to the Hub if gate_result["passed"])
235
+ gate = w.check_gate.remote(
236
+ candidate_results_path="/vol/finetuned/results/lm_eval/math-lora__math/results.json",
237
+ baseline_results_path="/vol/finetuned/results/lm_eval/minicpm5-1b__baseline__math/results.json",
238
+ goals={"task": "gsm8k", "min_score": 0.05, "min_improve": 0.02},
239
+ )
240
+ w.publish_adapter.remote(job=..., adapter_dir="/vol/finetuned/math-lora", gate_result=gate)  # remaining keyword arguments elided
241
+ ```
242
+
243
+ Inside the class, `run_pipeline` chains `lm_eval` (baselines) → `finetune` → `lm_eval` (candidate) → `check_gate` → `publish_adapter` via `.local()`, so everything runs in the **same** container without extra cold starts.
244
+
245
+ ### Persistence (what survives between commands)
246
+
247
+ | Layer | Survives | Notes |
248
+ | ----- | -------- | ----- |
249
+ | **Image** (`uv sync` baked in) | Across all runs | Rebuilds only when image definition changes |
250
+ | **`hf-cache` Volume** | Across runs | Base weights + datasets; committed after each job |
251
+ | **`slm-finetune` Volume** | Across runs | Adapters + lm-eval results |
252
+ | **Warm container** | While deployed + idle < `scaledown_window` | `min_containers=1`; max idle grace **3600s** (Modal limit) |
253
+ | **`keep_alive` loop** | Up to `--hours` | Container stays active; no scale-down during loop |
254
+
255
+ ### Stop / logs
256
+
257
+ ```bash
258
+ modal app logs slm-gpu-worker -f # stream logs
259
+ modal app stop slm-gpu-worker # stop deployed app + warm pool
260
+ modal app stop slm-gpu-worker -y # no confirmation prompt
261
+ ```
262
+
263
+ Refs: [`modal app`](https://modal.com/docs/reference/cli/app) · [`modal run`](https://modal.com/docs/reference/cli/run) · [`modal shell`](https://modal.com/docs/reference/cli/shell)
264
+
265
+ ### Agent loop pattern
266
+
267
+ For an AI agent iterating on finetune hyperparameters or eval configs:
268
+
269
+ 1. Ensure worker is up: `modal run research/modal/server_app.py --ping` → `{"status": "ok"}`.
270
+ 2. If ping fails, human or agent runs `modal deploy research/modal/server_app.py` then `modal run -d research/modal/server_app.py --hours 6`.
271
+ 3. Agent runs smoke train+eval+gate (no publish yet): `--job math-lora --max-steps 5 --no-publish`.
272
+ 4. Agent re-evals without retraining: `--eval-only --job math-lora`.
273
+ 5. Agent reads results: `modal volume get slm-finetune results/lm_eval/math-lora__math ./results/lm_eval/math-lora__math` or `modal volume ls slm-finetune`.
274
+ 6. Agent adjusts `experiments.yaml`'s `goals`/`max_steps`/`max_samples`, repeats from step 3.
275
+ 7. Once the gate passes and `hub_org`/`hub_repo` are real: `--publish-only --job math-lora`, or just drop `--no-publish`.
276
+ 8. When done: `modal app stop slm-gpu-worker` (optional, stops GPU billing from warm pool).
277
+
278
+ See [`SERVER.md`](SERVER.md) for a structured checklist and error recovery table.
279
+
280
+ ---
281
+
282
+ ## What gets saved on Modal
283
+
284
+ Modal persists artifacts on [**Volumes**](https://modal.com/docs/guide/volumes) — a distributed filesystem optimized for write-once, read-many workloads like model checkpoints. Files written only to the container disk (outside the mount path) are **not** saved.
285
+
286
+ | Volume | Mount in container | Contents |
287
+ | ------ | ------------------ | -------- |
288
+ | `slm-finetune` | `/vol/finetuned` | LoRA adapters, `training_results.json`, lm-eval `results/` |
289
+ | `hf-cache` | `/root/.cache/huggingface` | Cached base weights + datasets |
290
+
291
+ Volumes are created lazily on first run (`create_if_missing=True` in [`finetune_app.py`](finetune_app.py)).
292
+
293
+ ### Commits and visibility
294
+
295
+ Per the [Volumes guide](https://modal.com/docs/guide/volumes):
296
+
297
+ - **`volume.commit()`** — persist writes so other containers and `modal volume get` can see them. Our workers call this after each train/eval job.
298
+ - **Background commits** — Modal also snapshots attached Volumes every few seconds and on container shutdown, but explicit `commit()` is safest before download.
299
+ - **`volume.reload()`** — needed only if the *same* container must see writes from another container without restarting. Each `finetune_one.remote()` / `run_lm_eval.remote()` starts fresh and mounts the latest committed state.
300
+
301
+ Training writes under `/vol/finetuned/...` (the mount), not `/repo/models/...`. That matches Modal’s [model checkpointing](https://modal.com/docs/guide/volumes#model-checkpointing) pattern: point `finetune.py --out` at the Volume path.
302
+
303
+ ### Per-job adapter layout
304
+
305
+ Each finetune job writes to a Volume path named after the job (e.g. `math-lora/`).
306
+ lm-eval results live under `results/lm_eval/`, named
307
+ `<job_name>__<eval_profile>` for candidates and `<preset>__baseline__<eval_profile>`
308
+ for the shared per-profile baselines:
309
+
310
+ ```text
311
+ slm-finetune (Volume)
312
+ ├── math-lora/
313
+ │ ├── adapter_config.json
314
+ │ ├── adapter_model.safetensors # or adapter_model.bin
315
+ │ ├── tokenizer files…
316
+ │ ├── training_results.json
317
+ │ └── README.md # model card, written by publish_adapter
318
+ ├── science-lora/
319
+ ├── coding-lora/
320
+ ├── reasoning-lora/
321
+ ├── teaching-lora/
322
+ ├── alpaca-lora/
323
+ └── results/lm_eval/
324
+ ├── minicpm5-1b__baseline__math/ # shared by all "math" profile jobs
325
+ ├── minicpm5-1b__baseline__science/
326
+ ├── minicpm5-1b__baseline__instructions/
327
+ ├── math-lora__math/
328
+ ├── science-lora__science/
329
+ └── ...
330
+ ```
331
+
332
+ Because `eval_profile` is shared across jobs (e.g. `teaching-lora` and
333
+ `alpaca-lora` both use `instructions`), the `instructions` baseline is computed
334
+ once per pipeline run and reused for both jobs' gates.
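The naming scheme and the baseline reuse described above can be sketched in a few lines. The function names here are hypothetical, and the default preset string `minicpm5-1b` is taken from the layout listing; how `_common.py` actually builds these names is not shown in this README.

```python
def candidate_name(job_name: str, eval_profile: str) -> str:
    """Result folder name for a fine-tuned adapter: <job_name>__<eval_profile>."""
    return f"{job_name}__{eval_profile}"


def baseline_name(preset: str, eval_profile: str) -> str:
    """Result folder name for the shared baseline: <preset>__baseline__<eval_profile>."""
    return f"{preset}__baseline__{eval_profile}"


def baselines_needed(jobs, preset="minicpm5-1b"):
    """Return one baseline name per distinct eval profile, in first-seen order."""
    seen = {}
    for job in jobs:
        profile = job["eval_profile"]
        # setdefault keeps the first entry, so a shared profile is listed only once.
        seen.setdefault(profile, baseline_name(preset, profile))
    return list(seen.values())
```

With `teaching-lora` and `alpaca-lora` both on the `instructions` profile, `baselines_needed` yields a single `minicpm5-1b__baseline__instructions` entry, which is the deduplication the paragraph above describes.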
335
+
336
+ ---
337
+
338
+ ## Volume CLI (browse, download, upload)
339
+
340
+ Official reference: [Modal Volumes guide](https://modal.com/docs/guide/volumes) · [CLI reference](https://modal.com/docs/reference/cli/volume)
341
+
342
+ ### Create or list volumes
343
+
344
+ ```bash
345
+ modal volume list
346
+ modal volume create slm-finetune # optional; app creates on first run
347
+ modal volume ls slm-finetune
348
+ modal volume ls slm-finetune lesson-lora
349
+ ```
350
+
351
+ ### Browse in a shell
352
+
353
+ Volumes are mounted under `/mnt` in an interactive shell:
354
+
355
+ ```bash
356
+ modal shell --volume slm-finetune
357
+ # inside shell:
358
+ ls /mnt/slm-finetune
359
+ ls /mnt/slm-finetune/lesson-lora
360
+ du -sh /mnt/slm-finetune/lesson-lora
361
+ ```
362
+
363
+ Use `du` for size — Volumes do not report accurate `df` / `disk_usage()` values ([docs](https://modal.com/docs/guide/volumes#disk-usage-reporting)).
364
+
365
+ ### Download LoRA to your machine
366
+
367
+ **Use the CLI for adapter weights.** The Modal web UI only supports downloads up to **16 MB** per file; `adapter_model.safetensors` is usually larger ([docs](https://modal.com/docs/guide/volumes#downloading-a-file-from-a-volume)).
368
+
369
+ ```bash
370
+ mkdir -p ./models/finetuned
371
+
372
+ # One job folder → local path expected by models.yaml
373
+ modal volume get slm-finetune lesson-lora ./models/finetuned/minicpm5-1b-lora
374
+
375
+ # lm-eval artifacts
376
+ mkdir -p ./results
377
+ modal volume get slm-finetune results/lm_eval ./results/lm_eval
378
+
379
+ # Entire volume (large)
380
+ modal volume get slm-finetune / ./modal-artifacts
381
+ ```
382
+
383
+ Job folders use the **job name** from `experiments.yaml` (`lesson-lora`), not `minicpm5-1b-lora`. Root [`models.yaml`](../../models.yaml) preset `minicpm5-1b-lesson-lora` expects `./models/finetuned/minicpm5-1b-lora`.
384
+
385
+ If you downloaded to a different folder name:
386
+
387
+ ```bash
388
+ modal volume get slm-finetune lesson-lora ./models/finetuned/lesson-lora
389
+ cp -r ./models/finetuned/lesson-lora ./models/finetuned/minicpm5-1b-lora
390
+ ```
391
+
392
+ ### Upload to a Volume from local
393
+
394
+ Push a local adapter or merged checkpoint back to Modal ([`modal volume put`](https://modal.com/docs/reference/cli/volume)):
395
+
396
+ ```bash
397
+ modal volume put slm-finetune ./models/finetuned/minicpm5-1b-lora lesson-lora
398
+ ```
399
+
400
+ Or from Python ([`batch_upload`](https://modal.com/docs/guide/volumes#using-a-volume-from-local-code)):
401
+
402
+ ```python
403
+ import modal
404
+
405
+ vol = modal.Volume.from_name("slm-finetune")
406
+ with vol.batch_upload() as batch:
407
+ batch.put_directory(
408
+ "./models/finetuned/minicpm5-1b-lora",
409
+ "/lesson-lora",
410
+ )
411
+ ```
412
+
413
+ ### Copy within a Volume
414
+
415
+ ```bash
416
+ modal volume cp slm-finetune lesson-lora lesson-lora-backup
417
+ ```
418
+
419
+ ### Parallel training note
420
+
421
+ With `--parallel`, multiple jobs write to **different** folders on the same Volume. On Volumes v1, avoid more than ~5 concurrent writers/commits ([docs](https://modal.com/docs/guide/volumes#volume-commits-and-reloads)). Prefer sequential runs unless you use Volumes v2 (`modal volume create --version=2`).
422
+
423
+ ---
424
+
425
+ ## Use downloaded weights locally
426
+
427
+ ```bash
428
+ # Gradio / inference preset
429
+ export ACTIVE_MODEL=minicpm5-1b-lesson-lora
430
+
431
+ uv run --package gradio-space python -m gradio_space.app
432
+
433
+ # lm-eval on downloaded adapter
434
+ uv run --package slm-evals slm-lm-eval \
435
+ --config research/evals/configs/lm_eval_smoke.yaml \
436
+ --preset minicpm5-1b-lesson-lora \
437
+ --experiment-name minicpm5-1b-lora__local-check
438
+ ```
439
+
440
+ ### Optional: merge LoRA into full weights locally
441
+
442
+ Adapters are small; merged weights are easier to use on some deployment targets.
443
+
444
+ ```bash
445
+ uv run python research/finetune.py \
446
+ --merge ./models/finetuned/minicpm5-1b-lora \
447
+ --out ./models/finetuned/minicpm5-1b-lora-merged
448
+ ```
449
+
450
+ Then use preset `minicpm5-1b-lesson-merged` or `--model ./models/finetuned/minicpm5-1b-lora-merged`.
451
+
452
+ ---
453
+
454
+ ## Benchmark gate & Hugging Face Hub publish
455
+
456
+ `finetune_app.py` / `server_app.py` publish adapters to the Hub **automatically**,
457
+ but only when a job's lm-eval results pass its `goals`. This is the
458
+ "only ship it if it's actually better" gate.
459
+
460
+ ### `goals` schema (per job in `experiments.yaml`)
461
+
462
+ ```yaml
463
+ goals:
464
+ task: gsm8k # lm-eval task name, scored via primary_metric() (same as summary.md)
465
+ min_score: 0.05 # candidate score must be >= this
466
+ min_improve: 0.02 # candidate - baseline must be >= this (baseline = per-profile baseline run)
467
+ guard_tasks: # optional regression guards — must NOT regress more than max_regress
468
+ - task: arc_challenge
469
+ max_regress: 0.03
470
+ ```
471
+
472
+ A job with no `goals` (e.g. `alpaca-lora`) is never gated and never published —
473
+ it's local-only (still trained, evaluated, and pulled to your laptop).
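To make the `goals` semantics concrete, here is a minimal sketch of a gate evaluation. It assumes scores have already been reduced to one float per task (the real `check_gate` reads `results.json` files and uses `primary_metric()`, which this sketch does not reproduce), and the function name `gate_from_scores` is hypothetical.

```python
def gate_from_scores(candidate, baseline, goals):
    """Evaluate a `goals` block against per-task scores.

    candidate / baseline: dicts mapping lm-eval task name -> score (float).
    Returns {"passed": bool, "checks": [...]}, mirroring the shape described above.
    """
    checks = []
    task = goals["task"]
    cand_score = candidate[task]

    if "min_score" in goals:
        checks.append({
            "check": "min_score", "task": task,
            "value": cand_score, "threshold": goals["min_score"],
            "passed": cand_score >= goals["min_score"],
        })
    if "min_improve" in goals:
        delta = cand_score - baseline[task]
        checks.append({
            "check": "min_improve", "task": task,
            "value": delta, "threshold": goals["min_improve"],
            "passed": delta >= goals["min_improve"],
        })
    # Guard tasks fail the gate when the candidate drops by more than max_regress.
    for guard in goals.get("guard_tasks", []):
        regress = baseline[guard["task"]] - candidate[guard["task"]]
        checks.append({
            "check": "max_regress", "task": guard["task"],
            "value": regress, "threshold": guard["max_regress"],
            "passed": regress <= guard["max_regress"],
        })
    return {"passed": all(c["passed"] for c in checks), "checks": checks}
```

Every check must pass for the gate to pass, so a large `gsm8k` gain cannot offset an `arc_challenge` regression beyond the guard threshold.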
474
+
475
+ ### `publish` schema (per job)
476
+
477
+ ```yaml
478
+ publish:
479
+ hub_repo: your-hf-username/minicpm5-1b-math-lora
480
+ private: false # public so judges can verify the Well-Tuned badge; set true to keep it hidden
481
+ ```
482
+
483
+ ### What happens on a passing gate
484
+
485
+ 1. `run_lm_eval` writes `results/lm_eval/<job>__<profile>/results.json`.
486
+ 2. `check_gate` compares it against `results/lm_eval/<preset>__baseline__<profile>/results.json`
487
+ using the `goals` above → `{"passed": bool, "checks": [...]}`.
488
+ 3. If `passed` and `publish` is set, `publish_adapter`:
489
+ - renders a model card (`README.md`) into the adapter directory — base model,
490
+ gate checks table, full lm-eval baseline-vs-candidate-vs-delta table,
491
+ training stats, and a PEFT load snippet
492
+ - `huggingface_hub.HfApi().create_repo(..., exist_ok=True)` +
493
+ `upload_folder(...)` to `publish.hub_repo`
494
+
495
+ If the gate fails, nothing is pushed — rerun with different `max_steps` /
496
+ dataset / `goals`, then `modal run research/modal/finetune_app.py::publish_only --job <name>`
497
+ once it passes (re-checks the gate against the latest results before publishing).
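The model card includes a baseline-vs-candidate-vs-delta table. A minimal sketch of how such a Markdown table could be rendered from per-task scores is shown below. The function name `render_delta_table`, the flat score dicts, and the three-decimal formatting are assumptions for illustration; the card template used by `publish_adapter` is not shown in this README.

```python
def render_delta_table(baseline, candidate):
    """Render a Markdown table comparing per-task scores.

    baseline / candidate: dicts mapping lm-eval task name -> score (float).
    Only tasks present in both dicts are listed, sorted by name.
    """
    rows = [
        "| Task | Baseline | Candidate | Delta |",
        "| --- | --- | --- | --- |",
    ]
    for task in sorted(set(baseline) & set(candidate)):
        delta = candidate[task] - baseline[task]
        # The "+" format flag always shows the sign, so regressions stand out.
        rows.append(
            f"| {task} | {baseline[task]:.3f} | {candidate[task]:.3f} | {delta:+.3f} |"
        )
    return "\n".join(rows)
```

Restricting the table to tasks present in both runs keeps a candidate evaluated on a different profile from producing rows with missing values.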
498
+
499
+ ### Setup
500
+
501
+ ```bash
502
+ huggingface-cli login
503
+ # or: export HF_TOKEN=hf_... (needs write access; same token as `modal secret create huggingface`)
504
+ ```
505
+
506
+ Set real values for `defaults.hub_org` and each job's `publish.hub_repo` in
507
+ `experiments.yaml` before running with `--publish` (the default). Repos are
508
+ created automatically (`exist_ok=True`) — no need to pre-create them on huggingface.co.
509
+
510
+ ---
511
+
512
+ ## Manual Hugging Face Hub publish (fallback)
513
+
514
+ Use this if you'd rather download an adapter and push it yourself — e.g. for
515
+ **merged full weights**, or adapters trained before the gate/publish pipeline
516
+ existed.
517
+
518
+ ### Prerequisites
519
+
520
+ ```bash
521
+ huggingface-cli login
522
+ # or: export HF_TOKEN=hf_...
523
+ ```
524
+
525
+ Create an empty model repo on Hugging Face (e.g. `your-user/minicpm5-1b-lesson-lora`).
526
+
527
+ ### Option A — Upload LoRA adapter (recommended)
528
+
529
+ After `modal volume get`:
530
+
531
+ ```bash
532
+ ADAPTER=./models/finetuned/minicpm5-1b-lora
533
+ REPO=your-user/minicpm5-1b-lesson-lora
534
+
535
+ huggingface-cli upload "$REPO" "$ADAPTER" . \
536
+ --repo-type model \
537
+ --commit-message "Lesson LoRA from Modal finetune"
538
+ ```
539
+
540
+ Add a minimal `README.md` in the adapter folder before upload (or edit on the Hub) documenting the base model:
541
+
542
+ ```markdown
543
+ # MiniCPM5-1B lesson LoRA
544
+
545
+ - Base model: [openbmb/MiniCPM5-1B](https://huggingface.co/openbmb/MiniCPM5-1B)
546
+ - Dataset: education lesson chat (Build Small hackathon)
547
+ - Load with PEFT: `PeftModel.from_pretrained(base, "your-user/minicpm5-1b-lesson-lora")`
548
+ ```
549
+
550
+ **Load from Hub in Python:**
551
+
552
+ ```python
553
+ from peft import PeftModel
554
+ from transformers import AutoModelForCausalLM, AutoTokenizer
555
+
556
+ base = "openbmb/MiniCPM5-1B"
557
+ adapter = "your-user/minicpm5-1b-lesson-lora"
558
+
559
+ tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
560
+ model = AutoModelForCausalLM.from_pretrained(
561
+ base, torch_dtype="auto", device_map="auto", trust_remote_code=True
562
+ )
563
+ model = PeftModel.from_pretrained(model, adapter)
564
+ ```
565
+
566
+ ### Option B — Upload merged weights
567
+
568
+ ```bash
569
+ uv run python research/finetune.py \
570
+ --merge ./models/finetuned/minicpm5-1b-lora \
571
+ --out ./models/finetuned/minicpm5-1b-lora-merged
572
+
573
+ huggingface-cli upload your-user/minicpm5-1b-lesson-merged \
574
+ ./models/finetuned/minicpm5-1b-lora-merged . \
575
+ --repo-type model
576
+ ```
577
+
578
+ Consumers set `MODEL_ID=your-user/minicpm5-1b-lesson-merged` with no adapter.
579
+
580
+ ### Option C — Upload from Modal shell (no local download)
581
+
582
+ Browse the Volume in a shell ([docs](https://modal.com/docs/guide/volumes#using-a-volume-from-outside-of-modal)):
583
+
584
+ ```bash
585
+ modal shell --volume slm-finetune
586
+ ```
587
+
588
+ Inside the shell (volume at `/mnt/slm-finetune`):
589
+
590
+ ```bash
591
+ pip install huggingface_hub
592
+ export HF_TOKEN=... # write token
593
+ huggingface-cli upload your-user/minicpm5-1b-lesson-lora \
594
+ /mnt/slm-finetune/lesson-lora . --repo-type model
595
+ ```
596
+
597
+ Downloading to your laptop first (Option A) usually makes the adapter easier to review before publishing.
598
+
599
+ ### Use on Hugging Face Space
600
+
601
+ **LoRA on Space (Gradio SDK):**
602
+
603
+ 1. Upload adapter repo (Option A).
604
+ 2. In Space **Settings → Repository secrets**, set `HF_TOKEN` if the base model needs it.
605
+ 3. In Space env vars:
606
+
607
+ ```bash
608
+ ACTIVE_MODEL=minicpm5-1b
609
+ # Override adapter via custom preset or env — e.g. add to models.yaml on Space:
610
+ # adapter_path: your-user/minicpm5-1b-lesson-lora # Hub id works if peft resolves it
611
+ ```
612
+
613
+ For the shipped Space, the reliable path is: download adapter → commit into repo under `models/finetuned/` → `ACTIVE_MODEL=minicpm5-1b-lesson-lora`, or upload **merged** weights and point `MODEL_ID` at your Hub repo.
614
+
615
+ **Merged on Space:**
616
+
617
+ ```bash
618
+ ACTIVE_MODEL=custom
619
+ MODEL_ID=your-user/minicpm5-1b-lesson-merged
620
+ TRUST_REMOTE_CODE=true
621
+ ```
622
+
623
+ ---
624
+
625
+ ## Modal Notebooks (interactive GPU)
626
+
627
+ Official guide: [Modal Notebooks](https://modal.com/docs/guide/notebooks)
628
+
629
+ Use a hosted Jupyter kernel on Modal for demos, pair programming, and quick experiments. For reproducible sweeps and CI-style runs, prefer `modal run research/modal/finetune_app.py`.
630
+
631
+ ### Getting started
632
+
633
+ 1. Open [modal.com/notebooks](https://modal.com/notebooks) and **upload** [`research/notebook/minicpm5-modal-finetune.ipynb`](../notebook/minicpm5-modal-finetune.ipynb) (or create a notebook and copy the cells).
634
+ 2. In the **sidebar → Compute profile**, enable a **GPU** (e.g. A10G). Notebooks are serverless: you pay only while the kernel runs; idle shutdown defaults to 10 minutes.
635
+ 3. Attach resources in the sidebar **Files** panel:
636
+ - **Volume** `slm-finetune` → appears under `/mnt/slm-finetune` (share checkpoints with `modal run` jobs)
637
+ - **Secret** `huggingface` → injects `HF_TOKEN` for Hub downloads
638
+ 4. Run cells top to bottom.
639
+
640
+ The default notebook image includes PyTorch, Transformers, and NumPy. Install extras with:
641
+
642
+ ```python
643
+ %uv pip install uv peft bitsandbytes datasets
644
+ ```
645
+
646
+ ### Persist checkpoints on a Volume
647
+
648
+ The container filesystem is **ephemeral**. Anything under `/root` is lost when the kernel stops. Write adapters to an attached Volume:
649
+
650
+ ```python
651
+ OUT = "/mnt/slm-finetune/lesson-lora-notebook" # survives kernel restarts
652
+ ```
653
+
654
+ After training, download from the **Files** panel (⬇) or locally:
655
+
656
+ ```bash
657
+ modal volume get slm-finetune lesson-lora-notebook ./models/finetuned/minicpm5-1b-lora
658
+ ```
659
+
660
+ ### Custom image (optional, full repo deps)
661
+
662
+ To match the `modal run` environment exactly, deploy the app image once:
663
+
664
+ ```bash
665
+ modal deploy research/modal/finetune_app.py
666
+ ```
667
+
668
+ Then in the notebook sidebar, search for function `finetune_one` from app `slm-finetune-benchmark` and select that image as the kernel.
669
+
670
+ Or call deployed functions from a cell with [`%modal` magic](https://modal.com/docs/guide/notebooks#cell-magic):
671
+
672
+ ```python
673
+ %modal from slm-finetune-benchmark import finetune_one
674
+
675
+ finetune_one.remote({
676
+ "name": "lesson-lora",
677
+ "dataset": "research/data/education-lesson-chat.jsonl",
678
+ "format": "chat",
679
+ "max_steps": 20,
680
+ })
681
+ ```
682
+
683
+ (Requires `modal deploy` and the repo baked into the image.)
684
+
685
+ ### Share for hackathon judges
686
+
687
+ Use **Share** in the notebook editor → **public unlisted link** → **Can view and run** so reviewers can fork and execute without a Modal account ([docs](https://modal.com/docs/guide/notebooks#access-and-sharing)).
688
+
689
+ ### Notebook vs `modal run`
690
+
691
+ | | Modal Notebook | `modal run finetune_app.py` |
692
+ | --- | --- | --- |
693
+ | Best for | Demo video, exploration | Reproducible sweep, Volume + lm-eval pipeline |
694
+ | GPU | Sidebar compute profile | `gpu="A10G"` on functions |
695
+ | Persistence | Attach Volume in sidebar | `slm-finetune` Volume auto-mounted |
696
+ | Cost | Per kernel uptime | Per function invocation |
697
+
698
+ ---
699
+
700
+ ## Architecture
701
+
702
+ ```mermaid
703
+ flowchart LR
704
+ subgraph batch [finetune_app.py — batch]
705
+ laptop1["modal run finetune_app\n--job/--category"] --> base["run_lm_eval\n(per-profile baseline)"]
706
+ laptop1 --> train["finetune_one"]
707
+ train --> eval["run_lm_eval\n(candidate)"]
708
+ eval --> gate["check_gate\n(goals)"]
709
+ gate -- passed --> pub["publish_adapter"]
710
+ end
711
+ subgraph worker [server_app.py — warm loop]
712
+ laptop2["modal run server_app\n--job/--category/--pipeline"] --> gpu["GpuWorker A10G"]
713
+ gpu --> rp["run_pipeline\n(baseline -> train -> eval -> gate -> publish)"]
714
+ end
715
+ base --> vol["Volume slm-finetune"]
716
+ train --> vol
717
+ eval --> vol
718
+ gate --> vol
719
+ rp --> vol
720
+ gpu --> hfc["Volume hf-cache"]
721
+ pub --> hub["Hugging Face Hub\n(publish.hub_repo)"]
722
+ rp --> hub
723
+ vol --> get["modal volume get\n(pull)"]
724
+ get --> local["models/finetuned/<job>"]
725
+ local --> space["HF Space ACTIVE_MODEL"]
726
+ ```
727
+
728
+ | Resource | Role |
729
+ | -------- | ---- |
730
+ | App `slm-finetune-benchmark` | One-shot batch pipeline (`finetune_app.py`): `main`, `publish_only`, `pull` |
731
+ | App `slm-gpu-worker` | Long-lived GPU worker (`server_app.py`): `GpuWorker.run_pipeline` |
732
+ | GPU `A10G` (or per-job `gpu:` override) | Default for train + eval |
733
+ | Secret `huggingface` | `HF_TOKEN` for HF downloads + Hub publish |
734
+ | [`_common.py`](_common.py) | Shared image, volumes, command builders, gate (`evaluate_gate`/`check_gate_files`), publish (`publish_adapter_files`, `render_model_card`) |
735
+ | [`experiments.yaml`](experiments.yaml) | Skill matrix: jobs, `eval_profile`, `goals`, `publish` |
736
+ | [`eval_profiles.yaml`](../evals/configs/eval_profiles.yaml) | Maps `eval_profile` → lm-eval config + task list |
737
+ | [`finetune.py`](../finetune.py) | Training logic (unchanged) |
738
+ | `slm-lm-eval` | Academic benchmarks |
739
+
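The gate step in the diagram reduces to a few threshold comparisons. The following is a simplified, standalone sketch of that idea, not the repo's `evaluate_gate`: it takes plain scores instead of lm-eval result dicts, and the function name `gate_passes` and its signature are hypothetical. Like the real gate, it fails when no check is configured or when a baseline score needed for a comparison is missing.

```python
from typing import Optional, Sequence, Tuple

# (baseline_score, candidate_score, max_regress) for one guard task
Guard = Tuple[Optional[float], Optional[float], float]


def gate_passes(
    candidate: Optional[float],
    baseline: Optional[float],
    min_score: Optional[float] = None,
    min_improve: Optional[float] = None,
    guards: Sequence[Guard] = (),
) -> bool:
    """Return True only if every configured check passes (and at least one exists)."""
    checks = []
    if min_score is not None:
        checks.append(candidate is not None and candidate >= min_score)
    if min_improve is not None:
        checks.append(
            candidate is not None
            and baseline is not None
            and candidate - baseline >= min_improve
        )
    for g_base, g_cand, max_regress in guards:
        # A guard fails when the candidate drops more than max_regress below baseline.
        checks.append(
            g_base is not None and g_cand is not None and g_base - g_cand <= max_regress
        )
    return bool(checks) and all(checks)
```

Thresholds are compared exactly, so a delta that rounds to the threshold in a report can still fail the gate by a floating-point margin.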
740
+ ---
741
+
742
+ ## Troubleshooting
743
+
744
+ | Symptom | Fix |
745
+ | ------- | --- |
746
+ | `Secret huggingface not found` | `modal secret create huggingface HF_TOKEN=...` |
747
+ | Volume empty after run | Job may have failed; check with `modal volume ls slm-finetune` and ensure writes went to `/vol/finetuned`, not `/repo` |
748
+ | `modal volume get` missing files | Check that the job's `commit()` completed; a container that is already running must call `volume.reload()` to see commits made elsewhere |
749
+ | Large file won't download in UI | Use `modal volume get` CLI (16 MB UI limit) |
750
+ | `modal volume get` path wrong | Job name = top-level folder (e.g. `math-lora`, not `minicpm5-1b-lora`) |
751
+ | Gate fails / `published: false, reason: "gate failed"` | Check `gate.checks` in the output; adjust `goals` (`min_score`/`min_improve`/`guard_tasks`), `max_steps`, or dataset, then rerun |
752
+ | `published: false, reason: "no publish config..."` | Job has no `publish:` block in `experiments.yaml` (intentional for local-only jobs like `alpaca-lora`) |
753
+ | `Unknown eval_profile ...` | Check `eval_profile` in `experiments.yaml` matches a key in `research/evals/configs/eval_profiles.yaml` |
754
+ | Hub upload 403 | Use a write `HF_TOKEN`; repos are created automatically (`exist_ok=True`), no need to pre-create |
755
+ | Still publishing to `your-hf-username/...` | Edit `defaults.hub_org` and each job's `publish.hub_repo` in `experiments.yaml` |
756
+ | Space cannot find adapter | Use merged weights or copy adapter into repo `models/finetuned/` |
757
+ | Image build slow | `hf-cache` Volume caches weights across runs |
758
+ | OOM on GPU | Set `mode: qlora` for the job in `experiments.yaml`; lower `max_len` in `finetune.py`; or set a per-job `gpu:` with more VRAM |
759
+ | `scaledown_window` deploy error | Must be 2–3600s (we use 3600); see `_common.py` |
760
+ | `server_app` ping fails | `modal deploy research/modal/server_app.py`; start keep-alive: `modal run -d research/modal/server_app.py` |
761
+ | Jobs hit different containers | Deploy first; use `server_app.py` not `finetune_app.py` for warm loop |
762
+ | Worker still billing after done | `modal app stop slm-gpu-worker` |
763
+
764
+ ---
765
+
766
+ ## Hackathon checklist
767
+
768
+ 1. Link or screenshot of Modal app run (`slm-finetune-benchmark` or `slm-gpu-worker`), including the `--- summary ---` table (skill, category, gate, published, hub_repo).
769
+ 2. `results/lm_eval/<job>__<profile>/comparison.md` — baseline vs candidate per skill.
770
+ 3. At least one adapter with `goals` that passed the gate and published to the Hub (model card auto-generated).
771
+ 4. Adapter on Volume or Hub + `ACTIVE_MODEL=minicpm5-1b-<skill>-lora` on Space.
772
+ 5. Optional: Notebook recording of smoke train cell.
773
+
774
+ See also: [`SERVER.md`](SERVER.md) · [research/USAGE.md](../USAGE.md) · [Modal Volumes](https://modal.com/docs/guide/volumes) · [Modal Notebooks](https://modal.com/docs/guide/notebooks) · [Modal CUDA](https://modal.com/docs/guide/cuda)
research/modal/SERVER.md ADDED
@@ -0,0 +1,204 @@
1
+ # GPU worker runbook (`server_app.py`)
2
+
3
+ Long-lived Modal GPU for iterative finetune / eval loops. Intended for **humans** and **AI coding agents** running many experiments from the same warm container.
4
+
5
+ **Full docs:** [README.md](README.md) · **Code:** [`server_app.py`](server_app.py) · **Jobs:** [`experiments.yaml`](experiments.yaml)
6
+
7
+ ---
8
+
9
+ ## Prerequisites
10
+
11
+ Run from **repo root**.
12
+
13
+ ```bash
14
+ pip install modal
15
+ modal setup
16
+ modal secret create huggingface HF_TOKEN=<your-hf-token> # once
17
+ modal deploy research/modal/server_app.py # once per image change
18
+ ```
19
+
20
+ | Name | Value |
21
+ | ---- | ----- |
22
+ | App | `slm-gpu-worker` |
23
+ | Class | `GpuWorker` |
24
+ | GPU | `A10G` |
25
+ | Volumes | `hf-cache` → `/root/.cache/huggingface`, `slm-finetune` → `/vol/finetuned` |
26
+
27
+ ---
28
+
29
+ ## Start session (human or agent)
30
+
31
+ ```bash
32
+ # Option A: block terminal (default 4h keep-alive)
33
+ modal run research/modal/server_app.py
34
+
35
+ # Option B: detached — preferred for agent loops
36
+ modal run -d research/modal/server_app.py --hours 6
37
+
38
+ # Verify worker
39
+ modal run research/modal/server_app.py --ping
40
+ # → {"status": "ok", "app": "slm-gpu-worker"}
41
+ ```
42
+
43
+ ---
44
+
45
+ ## Experiment commands (repeat freely)
46
+
47
+ All commands use the deployed warm worker when `modal deploy` has been run.
48
+
49
+ ```bash
50
+ # --- Train ---
51
+ modal run research/modal/server_app.py --job lesson-lora --max-steps 20
52
+ modal run research/modal/server_app.py --job alpaca-lora --max-steps 50
53
+ modal run research/modal/server_app.py --job smoltalk-lora --max-steps 50
54
+
55
+ # --- Eval only (adapter must exist on Volume) ---
56
+ modal run research/modal/server_app.py --eval-only --job lesson-lora
57
+ modal run research/modal/server_app.py --eval-only # all jobs in experiments.yaml
58
+
59
+ # --- Full pipeline (same container: baseline → train → eval) ---
60
+ modal run research/modal/server_app.py --pipeline --job lesson-lora --max-steps 20
61
+ modal run research/modal/server_app.py --pipeline --job lesson-lora --max-steps 20 --skip-baseline
62
+
63
+ # --- Custom finetune.py flags ---
64
+ modal run research/modal/server_app.py --cmd \
65
+ "uv run python research/finetune.py --preset minicpm5-1b --mode lora \
66
+ --dataset research/data/education-lesson-chat.jsonl --format chat \
67
+ --out /vol/finetuned/lesson-lora --max_steps 10"
68
+
69
+ # --- Custom lm-eval ---
70
+ modal run research/modal/server_app.py --cmd \
71
+ "uv run --package slm-evals slm-lm-eval \
72
+ --config research/evals/configs/lm_eval_smoke.yaml \
73
+ --experiment-name lesson-lora__manual \
74
+ --output-dir /vol/finetuned/results/lm_eval \
75
+ --model openbmb/MiniCPM5-1B \
76
+ --adapter /vol/finetuned/lesson-lora"
77
+ ```
78
+
79
+ Job names and datasets: [`experiments.yaml`](experiments.yaml).
80
+
81
+ ---
82
+
83
+ ## Inspect results (human or agent)
84
+
85
+ ```bash
86
+ # List Volume
87
+ modal volume ls slm-finetune
88
+ modal volume ls slm-finetune lesson-lora
89
+ modal volume ls slm-finetune results/lm_eval
90
+
91
+ # Download to laptop
92
+ modal volume get slm-finetune lesson-lora ./models/finetuned/minicpm5-1b-lora
93
+ modal volume get slm-finetune results/lm_eval ./results/lm_eval
94
+
95
+ # Stream worker logs
96
+ modal app logs slm-gpu-worker -f
97
+ ```
98
+
99
+ Key artifacts on Volume:
100
+
101
+ | Path | Content |
102
+ | ---- | ------- |
103
+ | `/vol/finetuned/<job>/` | LoRA adapter + `training_results.json` |
104
+ | `/vol/finetuned/results/lm_eval/<exp>/` | `results.json`, `summary.md`, `comparison.md` |
105
+
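The paths in the table are container paths, while `modal volume get` expects paths relative to the Volume root. A small helper can do the translation; this is a sketch under the mount point listed above (`slm-finetune` at `/vol/finetuned`), and the function name `volume_get_args` is hypothetical.

```python
from pathlib import PurePosixPath

# Mount point of the slm-finetune Volume inside the worker container.
MOUNT = PurePosixPath("/vol/finetuned")


def volume_get_args(container_path: str, local_dest: str) -> list[str]:
    """Build a `modal volume get` command for a path as seen inside the container."""
    # relative_to() raises ValueError when the path is outside the mount,
    # which is exactly the case where nothing was persisted to the Volume.
    rel = PurePosixPath(container_path).relative_to(MOUNT)
    return ["modal", "volume", "get", "slm-finetune", str(rel), local_dest]
```

The returned list can be passed to `subprocess.run`. A `ValueError` here is a useful signal that an output path such as `/repo/models/...` was never on the Volume.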
106
+ ---
107
+
108
+ ## End session
109
+
110
+ ```bash
111
+ modal app stop slm-gpu-worker -y
112
+ ```
113
+
114
+ Stops the deployed app and warm GPU pool. Volume data is retained.
115
+
116
+ ---
117
+
118
+ ## AI agent loop (structured)
119
+
120
+ Use this sequence when an agent is iterating on training or eval without local CUDA.
121
+
122
+ ```
123
+ 1. CHECK modal run research/modal/server_app.py --ping
124
+ 2. BOOT if ping fails → modal deploy ... then modal run -d ... --hours 6
125
+ 3. SMOKE modal run ... --job lesson-lora --max-steps 5
126
+ 4. EVAL modal run ... --eval-only --job lesson-lora
127
+ 5. READ modal volume ls slm-finetune results/lm_eval
128
+ modal volume get ... (or read comparison.md locally after get)
129
+ 6. ADJUST edit experiments.yaml OR pass --max-steps / --lm-eval-config
130
+ 7. GOTO 3 until metrics acceptable
131
+ 8. PULL modal volume get slm-finetune lesson-lora ./models/finetuned/minicpm5-1b-lora
132
+ 9. STOP modal app stop slm-gpu-worker -y (optional, saves GPU cost)
133
+ ```
134
+
135
+ ### Agent decision rules
136
+
137
+ | Situation | Action |
138
+ | --------- | ------ |
139
+ | First time in repo | `modal deploy research/modal/server_app.py` |
140
+ | `ping` returns ok | Skip boot; run task commands |
141
+ | `ping` fails / timeout | `modal run -d research/modal/server_app.py --hours 6`, retry ping |
142
+ | Train OOM | `--cmd` with `--mode qlora`, or lower `max_len` in `finetune.py` (lowering `--max-steps` shortens the run but does not reduce peak memory) |
143
+ | Eval missing adapter | Train first, or `modal volume ls slm-finetune <job>` |
144
+ | Need batch parallel GPUs | Use `finetune_app.py --parallel` instead |
145
+ | Need one-shot CI sweep | Use `finetune_app.py` (not server) |
146
+ | Image / code changed | Re-run `modal deploy research/modal/server_app.py` |
147
+
148
+ ### Python API (agents in Modal notebook or scripts)
149
+
150
+ ```python
151
+ import modal
152
+
153
+ Worker = modal.Cls.from_name("slm-gpu-worker", "GpuWorker")
154
+ w = Worker()
155
+
156
+ assert w.ping.remote()["status"] == "ok"
157
+
158
+ w.finetune.remote({
159
+ "name": "lesson-lora",
160
+ "preset": "minicpm5-1b",
161
+ "mode": "lora",
162
+ "dataset": "research/data/education-lesson-chat.jsonl",
163
+ "format": "chat",
164
+ "max_steps": 20,
165
+ })
166
+
167
+ w.run_pipeline.remote(job_names=["lesson-lora"], max_steps=20)
168
+ ```
169
+
170
+ ---
171
+
172
+ ## `finetune_app.py` vs `server_app.py`
173
+
174
+ | | `finetune_app.py` | `server_app.py` |
175
+ | --- | --- | --- |
176
+ | App name | `slm-finetune-benchmark` | `slm-gpu-worker` |
177
+ | Container | New per function call | Warm pool, reused |
178
+ | Deploy | Optional | **Required** for cross-terminal reuse |
179
+ | Parallel jobs | `--parallel` (3 GPUs) | Sequential on one GPU |
180
+ | Best for | Full sweep, reproducible batch | Interactive / agent iteration |
181
+ | Entry | `modal run research/modal/finetune_app.py` | `modal deploy` + `modal run research/modal/server_app.py` |
182
+
183
+ ---
184
+
185
+ ## Troubleshooting
186
+
187
+ | Symptom | Fix |
188
+ | ------- | --- |
189
+ | `scaledown_window must be between 2 and 3600` | Already fixed in `_common.py` (3600 max) |
190
+ | Deploy succeeds but ping fails | Wait ~30s for warm pool; check `modal app list` |
191
+ | Command uses cold container | Run `modal deploy` first; confirm app name `slm-gpu-worker` |
192
+ | HF download every run | `hf-cache` volume should mount; first run populates cache |
193
+ | Writes not visible | Paths must be under `/vol/finetuned/`, not `/repo/models/` |
194
+ | GPU still billing overnight | `modal app stop slm-gpu-worker` |
195
+
196
+ ---
197
+
198
+ ## References
199
+
200
+ - [Modal Volumes](https://modal.com/docs/guide/volumes)
201
+ - [Modal Images](https://modal.com/docs/guide/images)
202
+ - [modal run](https://modal.com/docs/reference/cli/run)
203
+ - [modal app stop](https://modal.com/docs/reference/cli/app#modal-app-stop)
204
+ - [modal shell](https://modal.com/docs/reference/cli/shell) — debug: `modal shell research/modal/server_app.py::GpuWorker.finetune`
research/modal/_common.py ADDED
@@ -0,0 +1,520 @@
1
+ """Shared Modal image, volumes, and command builders for finetune + server apps."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import json
6
+ import os
7
+ from pathlib import Path
8
+ from typing import Any
9
+
10
+ import modal
11
+ import yaml
12
+
13
+ _file = Path(__file__).resolve()
14
+ try:
15
+     LOCAL_REPO_ROOT = _file.parents[2]
16
+ except IndexError:
17
+     LOCAL_REPO_ROOT = Path("/repo")
18
+
19
+ if (_file.parent / "experiments.yaml").is_file():
20
+     EXPERIMENTS_PATH = _file.parent / "experiments.yaml"
21
+ else:
22
+     EXPERIMENTS_PATH = Path("/repo/research/modal/experiments.yaml")
23
+
24
+ _EVAL_PROFILES_REL = "research/evals/configs/eval_profiles.yaml"
25
+ if (LOCAL_REPO_ROOT / _EVAL_PROFILES_REL).is_file():
26
+     EVAL_PROFILES_PATH = LOCAL_REPO_ROOT / _EVAL_PROFILES_REL
27
+ else:
28
+     EVAL_PROFILES_PATH = Path("/repo") / _EVAL_PROFILES_REL
29
+
30
+ REPO_ROOT = LOCAL_REPO_ROOT
31
+
32
+ HF_CACHE_PATH = "/root/.cache/huggingface"
33
+ FINETUNE_VOL_PATH = "/vol/finetuned"
34
+ LM_EVAL_OUTPUT = f"{FINETUNE_VOL_PATH}/results/lm_eval"
35
+ BASE_MODEL_ID = "openbmb/MiniCPM5-1B"
36
+
37
+ BASELINE_EXPERIMENT = "minicpm5-1b__modal-baseline"
38
+ BASELINE_RESULTS_JSON = f"{LM_EVAL_OUTPUT}/{BASELINE_EXPERIMENT}/results.json"
39
+
40
+ # Metric keys to prefer when picking a task's "primary" score, in priority
41
+ # order. Covers lm-eval-harness multiple-choice (acc), generation (exact_match),
42
+ # and code (pass@1) tasks so gates and model cards pick a real score, not a stderr.
43
+ _METRIC_PRIORITY = (
44
+ "acc,none",
45
+ "acc_norm,none",
46
+ "exact_match,strict-match",
47
+ "exact_match,flexible-extract",
48
+ "pass_at_1,create_test",
49
+ "pass_at_1,none",
50
+ "f1,none",
51
+ "bleu,none",
52
+ )
53
+
54
+ hf_cache_vol = modal.Volume.from_name("hf-cache", create_if_missing=True)
55
+ finetune_vol = modal.Volume.from_name("slm-finetune", create_if_missing=True)
56
+ hf_secret = modal.Secret.from_name("huggingface")
57
+
58
+ image = (
59
+ modal.Image.debian_slim(python_version="3.12")
60
+ .apt_install("git", "build-essential")
61
+ .pip_install("uv", "pyyaml", "huggingface_hub")
62
+ .add_local_dir(
63
+ str(REPO_ROOT),
64
+ remote_path="/repo",
65
+ copy=True,
66
+ ignore=[
67
+ ".git/**",
68
+ ".venv/**",
69
+ "models/**",
70
+ "results/**",
71
+ "outputs/**",
72
+ "**/__pycache__/**",
73
+ "**/.pytest_cache/**",
74
+ "**/node_modules/**",
75
+ ],
76
+ )
77
+ .run_commands(
78
+ "cd /repo && uv sync --frozen --group finetune --group lm-eval --no-dev"
79
+ )
80
+ )
81
+
82
+ COMMON_ENV = {
83
+ "TRUST_REMOTE_CODE": "true",
84
+ "HF_HOME": HF_CACHE_PATH,
85
+ "PYTORCH_CUDA_ALLOC_CONF": "expandable_segments:True",
86
+ }
87
+
88
+ DEFAULT_GPU = "A10G"
89
+ DEFAULT_KEEPALIVE_HOURS = 4.0
90
+ DEFAULT_SCALEDOWN_WINDOW = 3600 # max allowed by Modal (1h idle before scale-down)
91
+ DEFAULT_WORKER_TIMEOUT = 14400 # 4h per method call
92
+
93
+
94
+ def repo_env() -> dict[str, str]:
95
+ return {**os.environ, **COMMON_ENV}
96
+
97
+
98
+ def reload_volumes() -> None:
99
+ finetune_vol.reload()
100
+ hf_cache_vol.reload()
101
+
102
+
103
+ def commit_volumes() -> None:
104
+ finetune_vol.commit()
105
+ hf_cache_vol.commit()
106
+
107
+
108
+ def load_experiments() -> dict[str, Any]:
109
+ with EXPERIMENTS_PATH.open() as f:
110
+ return yaml.safe_load(f) or {}
111
+
112
+
113
+ def apply_defaults(job: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
114
+ return {**defaults, **job}
115
+
116
+
117
+ def build_finetune_cmd(job: dict[str, Any], out_dir: str) -> list[str]:
118
+ cmd = [
119
+ "uv",
120
+ "run",
121
+ "python",
122
+ "research/finetune.py",
123
+ "--preset",
124
+ job.get("preset", "minicpm5-1b"),
125
+ "--mode",
126
+ job.get("mode", "lora"),
127
+ "--dataset",
128
+ job["dataset"],
129
+ "--format",
130
+ job["format"],
131
+ "--out",
132
+ out_dir,
133
+ ]
134
+ if job.get("max_steps") is not None:
135
+ cmd.extend(["--max_steps", str(int(job["max_steps"]))])
136
+ if job.get("epochs") is not None:
137
+ cmd.extend(["--epochs", str(job["epochs"])])
138
+ if job.get("dataset_config"):
139
+ cmd.extend(["--dataset-config", job["dataset_config"]])
140
+ if job.get("dataset_split"):
141
+ cmd.extend(["--dataset-split", str(job["dataset_split"])])
142
+ if job.get("max_samples") is not None:
143
+ cmd.extend(["--dataset-max-samples", str(int(job["max_samples"]))])
144
+ return cmd
145
+
146
+
147
+ def build_lm_eval_cmd(
148
+ *,
149
+ experiment_name: str,
150
+ config: str,
151
+ preset: str | None = None,
152
+ model_path: str | None = None,
153
+ adapter_path: str | None = None,
154
+ compare_to: str | None = None,
155
+ ) -> list[str]:
156
+ cmd = [
157
+ "uv",
158
+ "run",
159
+ "--package",
160
+ "slm-evals",
161
+ "slm-lm-eval",
162
+ "--config",
163
+ config,
164
+ "--experiment-name",
165
+ experiment_name,
166
+ "--output-dir",
167
+ LM_EVAL_OUTPUT,
168
+ ]
169
+ if preset:
170
+ cmd.extend(["--preset", preset])
171
+ if model_path:
172
+ cmd.extend(["--model", model_path])
173
+ if adapter_path:
174
+ cmd.extend(["--adapter", adapter_path])
175
+ if compare_to:
176
+ cmd.extend(["--compare-to", compare_to])
177
+ return cmd
178
+
179
+
180
+ def prepare_jobs(
181
+ *,
182
+ job: str | None = None,
183
+ category: str | None = None,
184
+ max_steps: int | None = None,
185
+ ) -> tuple[dict[str, Any], list[dict[str, Any]]]:
186
+ spec = load_experiments()
187
+ defaults = spec.get("defaults", {})
188
+ jobs = spec.get("finetune", [])
189
+
190
+ if job:
191
+ jobs = [j for j in jobs if j.get("name") == job]
192
+ if not jobs:
193
+ raise SystemExit(
194
+ f"Unknown job {job!r}; check research/modal/experiments.yaml"
195
+ )
196
+ if category:
197
+ jobs = [j for j in jobs if j.get("category") == category]
198
+ if not jobs:
199
+ raise SystemExit(f"No jobs with category {category!r}")
200
+
201
+ prepared: list[dict[str, Any]] = []
202
+ for raw in jobs:
203
+ merged = apply_defaults(raw, defaults)
204
+ if max_steps is not None:
205
+ merged["max_steps"] = max_steps
206
+ prepared.append(merged)
207
+ return defaults, prepared
208
+
209
+
210
+ def job_gpu(job: dict[str, Any]) -> str:
211
+ return job.get("gpu") or DEFAULT_GPU
212
+
213
+
214
+ def config_for_profile(profile: str) -> str:
215
+ """Map an eval_profiles.yaml profile name to its config path (relative to repo root)."""
216
+ with EVAL_PROFILES_PATH.open() as f:
217
+ catalog = yaml.safe_load(f) or {}
218
+ meta = (catalog.get("profiles") or {}).get(profile)
219
+ if not meta or not meta.get("config"):
220
+ known = ", ".join(sorted((catalog.get("profiles") or {})))
221
+ raise SystemExit(
222
+ f"Unknown eval_profile {profile!r}; check {_EVAL_PROFILES_REL} (known: {known})"
223
+ )
224
+ return f"research/evals/configs/{meta['config']}"
225
+
226
+
227
+ def primary_metric(task_metrics: dict[str, Any]) -> tuple[str, float] | None:
228
+     """Pick a task's headline (metric_name, score), matching slm_evals summary tables."""
229
+     for key in _METRIC_PRIORITY:
230
+         if key in task_metrics and isinstance(task_metrics[key], (int, float)):
231
+             return key, float(task_metrics[key])
232
+     for key, value in task_metrics.items():
233
+         if "stderr" in key:
234
+             continue
235
+         if isinstance(value, (int, float)):
236
+             return key, float(value)
237
+     return None
238
+
239
+
240
+ def evaluate_gate(
241
+ *,
242
+ candidate: dict[str, Any],
243
+ baseline: dict[str, Any] | None,
244
+ goals: dict[str, Any],
245
+ ) -> dict[str, Any]:
246
+ """Check a candidate's lm-eval results dict against `goals` (Hub publish gate).
247
+
248
+ `goals` schema:
249
+ task: <lm-eval task name> # scored via primary_metric(), same as summary.md
250
+ min_score: <float, optional> # candidate score must be >= this
251
+ min_improve: <float, optional> # candidate - baseline must be >= this
252
+ guard_tasks: # optional regression guards
253
+ - task: <lm-eval task name>
254
+ max_regress: <float> # baseline - candidate must be <= this
255
+ """
256
+ cand_tasks = candidate.get("results", {})
257
+ base_tasks = (baseline or {}).get("results", {})
258
+
259
+ def _score(tasks: dict[str, Any], task_name: str) -> float | None:
260
+ metrics = tasks.get(task_name)
261
+ if not metrics:
262
+ return None
263
+ picked = primary_metric(metrics)
264
+ return picked[1] if picked else None
265
+
266
+ checks: list[dict[str, Any]] = []
267
+ passed = True
268
+
269
+ task = goals["task"]
270
+ cand_score = _score(cand_tasks, task)
271
+ base_score = _score(base_tasks, task)
272
+
273
+ if goals.get("min_score") is not None:
274
+ ok = cand_score is not None and cand_score >= goals["min_score"]
275
+ checks.append({"check": f"{task} >= {goals['min_score']}", "value": cand_score, "ok": ok})
276
+ passed = passed and ok
277
+
278
+ if goals.get("min_improve") is not None:
279
+ delta = (
280
+ cand_score - base_score
281
+ if (cand_score is not None and base_score is not None)
282
+ else None
283
+ )
284
+ ok = delta is not None and delta >= goals["min_improve"]
285
+ checks.append(
286
+ {"check": f"{task} improve >= {goals['min_improve']}", "value": delta, "ok": ok}
287
+ )
288
+ passed = passed and ok
289
+
290
+ for guard in goals.get("guard_tasks", []):
291
+ g_task = guard["task"]
292
+ g_cand = _score(cand_tasks, g_task)
293
+ g_base = _score(base_tasks, g_task)
294
+ regress = g_base - g_cand if (g_cand is not None and g_base is not None) else None
295
+ ok = regress is not None and regress <= guard["max_regress"]
296
+ checks.append(
297
+ {"check": f"{g_task} regress <= {guard['max_regress']}", "value": regress, "ok": ok}
298
+ )
299
+ passed = passed and ok
300
+
301
+ if not checks:
302
+ passed = False
303
+ checks.append({"check": "goals defined no checks", "value": None, "ok": False})
304
+
305
+ return {
306
+ "passed": passed,
307
+ "checks": checks,
308
+ "task": task,
309
+ "candidate_score": cand_score,
310
+ "baseline_score": base_score,
311
+ }
312
+
313
+
314
+ def pull_artifacts(job_name: str, exp_name: str, dest: str = "models/finetuned") -> None:
315
+ """Download an adapter and its lm-eval results from the `slm-finetune` Volume (run locally)."""
316
+ import subprocess
317
+
318
+ local_dir = f"{dest}/{job_name}"
319
+ print(f"--- pulling {job_name} -> {local_dir} ---")
320
+ subprocess.run(
321
+ ["modal", "volume", "get", "slm-finetune", job_name, local_dir, "--force"],
322
+ check=False,
323
+ )
324
+
325
+ results_dir = f"results/lm_eval/{exp_name}"
326
+ print(f"--- pulling {results_dir} ---")
327
+ subprocess.run(
328
+ ["modal", "volume", "get", "slm-finetune", results_dir, results_dir, "--force"],
329
+ check=False,
330
+ )
331
+
332
+
333
+ def check_gate_files(
334
+ *,
335
+ candidate_results_path: str,
336
+ baseline_results_path: str | None,
337
+ goals: dict[str, Any],
338
+ ) -> dict[str, Any]:
339
+ """Like evaluate_gate(), but reads results.json files (run inside a volume-mounted function)."""
340
+ cand_path = Path(candidate_results_path)
341
+ if not cand_path.is_file():
342
+ return {"passed": False, "checks": [], "reason": f"missing results file: {cand_path}"}
343
+
344
+ candidate = json.loads(cand_path.read_text())
345
+ baseline = None
346
+ if baseline_results_path and Path(baseline_results_path).is_file():
347
+ baseline = json.loads(Path(baseline_results_path).read_text())
348
+
349
+ return evaluate_gate(candidate=candidate, baseline=baseline, goals=goals)
350
+
351
+
352
+ def render_model_card(
353
+ *,
354
+ job: dict[str, Any],
355
+ gate_result: dict[str, Any],
356
+ candidate: dict[str, Any],
357
+ baseline: dict[str, Any] | None,
358
+ training_payload: dict[str, Any] | None,
359
+ ) -> str:
360
+ def _fmt(v: float | None) -> str:
361
+ return "—" if v is None else f"{v:.4f}"
362
+
363
+ cand_tasks = candidate.get("results", {})
364
+ base_tasks = (baseline or {}).get("results", {})
365
+ base_model = (training_payload or {}).get("model") or BASE_MODEL_ID
366
+
367
+ lines = [
368
+ "---",
369
+ "library_name: peft",
370
+ f"base_model: {base_model}",
371
+ "license: apache-2.0",
372
+ "tags:",
373
+ " - lora",
374
+ " - qlora",
375
+ " - build-small-hackathon",
376
+ " - well-tuned",
377
+ f" - {job.get('category', 'general')}",
378
+ "---",
379
+ "",
380
+ f"# {job['name']}",
381
+ "",
382
+ f"{'QLoRA' if job.get('mode') == 'qlora' else 'LoRA'} adapter for **{job.get('category', 'general')}**, fine-tuned from "
383
+ f"`{base_model}` on `{job['dataset']}` (format: `{job['format']}`).",
384
+ "",
385
+ "Trained, evaluated, and gated on [Modal](https://modal.com/docs/guide) via "
386
+ "`research/modal/` (app `slm-finetune-benchmark`).",
387
+ "",
388
+ "## Benchmark gate",
389
+ "",
390
+ f"- eval profile: `{job.get('eval_profile')}`",
391
+ f"- gate: {'**PASSED**' if gate_result.get('passed') else '**FAILED**'}",
392
+ "",
393
+ "| check | value | result |",
394
+ "| --- | ---: | --- |",
395
+ ]
396
+ for c in gate_result.get("checks", []):
397
+ lines.append(f"| {c['check']} | {_fmt(c['value'])} | {'pass' if c['ok'] else 'fail'} |")
398
+ if not gate_result.get("checks"):
399
+ lines.append("| — | — | — |")
400
+
401
+ lines.extend(
402
+ [
403
+ "",
404
+ "## lm-eval results",
405
+ "",
406
+ "| task | metric | baseline | candidate | delta |",
407
+ "| --- | --- | ---: | ---: | ---: |",
408
+ ]
409
+ )
410
+ for task in sorted(set(cand_tasks) | set(base_tasks)):
411
+ c = primary_metric(cand_tasks.get(task, {}))
412
+ b = primary_metric(base_tasks.get(task, {}))
413
+ metric_name = (c or b or (None, None))[0] or "—"
414
+ c_val = c[1] if c else None
415
+ b_val = b[1] if b else None
416
+ delta = c_val - b_val if (c_val is not None and b_val is not None) else None
417
+ sign = "+" if (delta is not None and delta >= 0) else ""
418
+ delta_str = "—" if delta is None else f"{sign}{delta:.4f}"
419
+ lines.append(f"| {task} | {metric_name} | {_fmt(b_val)} | {_fmt(c_val)} | {delta_str} |")
420
+
421
+ if training_payload:
422
+ lines.extend(
423
+ [
424
+ "",
425
+ "## Training",
426
+ "",
427
+ f"- dataset: `{training_payload.get('dataset')}`",
428
+ f"- mode: `{training_payload.get('mode')}`",
429
+ f"- samples: {training_payload.get('samples')}",
430
+ f"- final train loss: {training_payload.get('metrics', {}).get('final_train_loss')}",
431
+ f"- eval loss: {training_payload.get('metrics', {}).get('eval_loss')}",
432
+ ]
433
+ )
434
+
435
+ lines.extend(
436
+ [
437
+ "",
438
+ "## Load with PEFT",
439
+ "",
440
+ "```python",
441
+ "from peft import PeftModel",
442
+ "from transformers import AutoModelForCausalLM, AutoTokenizer",
443
+ "",
444
+ f'base = "{base_model}"',
445
+ f'adapter = "{job.get("publish", {}).get("hub_repo", "<hub-repo>")}"',
446
+ "",
447
+ "tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)",
448
+ "model = AutoModelForCausalLM.from_pretrained(",
449
+ ' base, torch_dtype="auto", device_map="auto", trust_remote_code=True',
450
+ ")",
451
+ "model = PeftModel.from_pretrained(model, adapter)",
452
+ "```",
453
+ "",
454
+ ]
455
+ )
456
+ return "\n".join(lines) + "\n"
457
+
458
+
459
+ def publish_adapter_files(
460
+ *,
461
+ job: dict[str, Any],
462
+ adapter_dir: str,
463
+ gate_result: dict[str, Any],
464
+ candidate_results_path: str,
465
+ baseline_results_path: str | None,
466
+ ) -> dict[str, Any]:
467
+ """Write a model card and push the adapter to the Hub — only if the gate passed.
468
+
469
+ Run inside a function with `finetune_vol` mounted and `hf_secret` set.
470
+ """
471
+ publish_cfg = job.get("publish")
472
+ if not publish_cfg:
473
+ return {"published": False, "reason": "no publish config for this job"}
474
+
475
+ if not gate_result.get("passed"):
476
+ return {"published": False, "reason": "gate failed", "gate": gate_result}
477
+
478
+ adapter_path = Path(adapter_dir)
479
+ if not adapter_path.is_dir():
480
+ return {"published": False, "reason": f"adapter dir missing: {adapter_dir}"}
481
+
482
+ candidate = {}
483
+ cand_path = Path(candidate_results_path)
484
+ if cand_path.is_file():
485
+ candidate = json.loads(cand_path.read_text())
486
+
487
+ baseline = None
488
+ if baseline_results_path and Path(baseline_results_path).is_file():
489
+ baseline = json.loads(Path(baseline_results_path).read_text())
490
+
491
+ training_payload = None
492
+ training_results_path = adapter_path / "training_results.json"
493
+ if training_results_path.is_file():
494
+ training_payload = json.loads(training_results_path.read_text())
495
+
496
+ card = render_model_card(
497
+ job=job,
498
+ gate_result=gate_result,
499
+ candidate=candidate,
500
+ baseline=baseline,
501
+ training_payload=training_payload,
502
+ )
503
+ (adapter_path / "README.md").write_text(card)
504
+ commit_volumes()
505
+
506
+ from huggingface_hub import HfApi
507
+
508
+ repo_id = publish_cfg["hub_repo"]
509
+ private = publish_cfg.get("private", True)
510
+
511
+ api = HfApi()
512
+ api.create_repo(repo_id=repo_id, repo_type="model", private=private, exist_ok=True)
513
+ api.upload_folder(
514
+ folder_path=str(adapter_path),
515
+ repo_id=repo_id,
516
+ repo_type="model",
517
+ commit_message=f"Publish {job['name']} (gate passed: {gate_result.get('task') or job.get('goals', {}).get('task')})",
518
+ )
519
+
520
+ return {"published": True, "repo_id": repo_id, "url": f"https://huggingface.co/{repo_id}"}
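The results table in `render_model_card` depends on a `primary_metric` helper that is defined elsewhere in `_common.py` and is not shown in this diff. The following is a minimal sketch of what such a helper could look like, assuming lm-eval-style task dicts whose keys look like `acc,none` alongside `acc_stderr,none`. The name `primary_metric_sketch`, the `PREFERRED` order, and `delta_cell` are hypothetical and are not the repository's implementation.

```python
from __future__ import annotations

from typing import Any

# Hypothetical preference order; the real helper may rank metrics differently.
PREFERRED = ("exact_match", "acc_norm", "acc", "pass@1")


def primary_metric_sketch(task_result: dict[str, Any]) -> tuple[str, float] | None:
    """Pick one headline (metric, value) pair from an lm-eval-style task dict."""
    numeric = {
        key: float(value)
        for key, value in task_result.items()
        if isinstance(value, (int, float))
        and not isinstance(value, bool)
        and "stderr" not in key
    }
    for wanted in PREFERRED:
        for key, value in numeric.items():
            # Assumed key shape "acc,none": metric name, then filter name.
            if key.split(",")[0] == wanted:
                return key, value
    # Fall back to the first numeric entry, if any.
    return next(iter(numeric.items()), None)


def delta_cell(candidate: float | None, baseline: float | None) -> str:
    """Format a signed delta with the same rules the model card table uses."""
    if candidate is None or baseline is None:
        return "\u2014"  # the dash placeholder used for empty cells
    delta = candidate - baseline
    sign = "+" if delta >= 0 else ""
    return f"{sign}{delta:.4f}"
```

Returning the full key (for example `acc_norm,none`) rather than the bare metric name keeps the table unambiguous when a task reports the same metric under several filters.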
research/modal/experiments.yaml ADDED
@@ -0,0 +1,131 @@
1
+ # Skill matrix for the Modal finetune + lm-eval + publish pipeline.
2
+ #
3
+ # Each entry trains one QLoRA adapter for a skill/category, evaluates it
4
+ # against the matching slm-lm-eval profile (vs. a per-profile baseline),
5
+ # checks the result against `goals`, and — only if the gate passes —
6
+ # publishes the adapter to `publish.hub_repo` on the Hugging Face Hub.
7
+ #
8
+ # Smoke limits (max_steps, max_samples, eval `limit` in the profile configs)
9
+ # keep hackathon runs affordable; bump them for full runs.
10
+ #
11
+ # publish.private is `false` so passing adapters land on the Hub publicly: the
12
+ # Well-Tuned badge requires a judge-visible, fine-tuned published model.
13
+ #
14
+ # Workflows (see modal/README.md):
15
+ # modal run research/modal/finetune_app.py # full sweep: baselines -> train -> eval -> gate -> publish -> pull
16
+ # modal run research/modal/finetune_app.py --job math-lora # one skill
17
+ # modal run research/modal/finetune_app.py --category math # one category
18
+ # modal run research/modal/finetune_app.py --eval-only --job math-lora
19
+ # modal run research/modal/finetune_app.py --no-publish # train+eval, skip Hub push
20
+ # modal run research/modal/finetune_app.py::publish_only --job math-lora
21
+ # modal run research/modal/finetune_app.py::pull --category math
22
+
+ defaults:
+   preset: minicpm5-1b
+   mode: qlora
+   gpu: A10G  # QLoRA fits on T4 too; override per job with `gpu: T4` for cheaper runs
+   max_steps: 100
+   # Hugging Face namespace for published adapters.
+   hub_org: MSGEncrypted
30
+
31
+ finetune:
32
+ # --- teaching: lesson-planning agent chat data (Well-Tuned primary) ---
33
+ - name: teaching-lora
34
+ category: teaching
35
+ dataset: research/data/education-lesson-chat.jsonl
36
+ format: chat
37
+ description: Lesson-planning agent chat data (local)
38
+ eval_profile: instructions
39
+ goals:
40
+ task: ifeval
41
+ min_score: 0.15
42
+ min_improve: 0.02
43
+ publish:
44
+ hub_repo: MSGEncrypted/minicpm5-1b-teaching-lora
45
+ private: false
46
+
47
+ # --- science: factual + explanatory science tutoring ---
48
+ - name: science-lora
49
+ category: science
50
+ dataset: research/data/science-tutor-chat.jsonl
51
+ format: chat
52
+ description: Science tutor Q&A chat data (local)
53
+ eval_profile: science
54
+ goals:
55
+ task: sciq
56
+ min_score: 0.50
57
+ min_improve: 0.02
58
+ guard_tasks:
59
+ - task: arc_challenge
60
+ max_regress: 0.03
61
+ publish:
62
+ hub_repo: MSGEncrypted/minicpm5-1b-science-lora
63
+ private: false
64
+
65
+ # --- math: grade-school word problems + instruction-style math solutions ---
66
+ - name: math-lora
67
+ category: math
68
+ dataset: TIGER-Lab/MathInstruct
69
+ format: alpaca
70
+ dataset_split: "train[:1000]"
71
+ max_samples: 1000
72
+ description: Math instruction tuning (Hub, instruction/output columns)
73
+ eval_profile: math
74
+ goals:
75
+ task: gsm8k
76
+ min_score: 0.05
77
+ min_improve: 0.02
78
+ guard_tasks:
79
+ - task: arc_challenge
80
+ max_regress: 0.03
81
+ publish:
82
+ hub_repo: MSGEncrypted/minicpm5-1b-math-lora
83
+ private: false
84
+
85
+ # --- coding: Python instruction-following code generation ---
86
+ - name: coding-lora
87
+ category: coding
88
+ dataset: iamtarun/python_code_instructions_18k_alpaca
89
+ format: alpaca
90
+ dataset_split: "train[:1000]"
91
+ max_samples: 1000
92
+ description: Python code instruction tuning (Hub, alpaca columns)
93
+ eval_profile: code
94
+ goals:
95
+ task: mbpp
96
+ min_score: 0.05
97
+ min_improve: 0.01
98
+ publish:
99
+ hub_repo: MSGEncrypted/minicpm5-1b-coding-lora
100
+ private: false
101
+
102
+ # --- reasoning: multi-turn chat with reasoning-heavy conversations ---
103
+ - name: reasoning-lora
104
+ category: reasoning
105
+ dataset: HuggingFaceTB/smoltalk
106
+ format: chat
107
+ dataset_config: all
108
+ dataset_split: "train[:500]"
109
+ max_samples: 500
110
+ description: Multi-turn reasoning/chat subset (Hub)
111
+ eval_profile: reasoning
112
+ goals:
113
+ task: gsm8k
114
+ min_score: 0.05
115
+ min_improve: 0.01
116
+ guard_tasks:
117
+ - task: hellaswag
118
+ max_regress: 0.03
119
+ publish:
120
+ hub_repo: MSGEncrypted/minicpm5-1b-reasoning-lora
121
+ private: false
122
+
123
+ # --- general instructions baseline: no goals/publish -> local-only adapter ---
124
+ - name: alpaca-lora
125
+ category: instructions
126
+ dataset: tatsu-lab/alpaca
127
+ format: alpaca
128
+ dataset_split: train
129
+ max_samples: 200
130
+ description: General instruction tuning baseline (Hub, local-only)
131
+ eval_profile: instructions
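The `goals` blocks above feed the publish gate, but `evaluate_gate` itself lives in `_common.py` outside this excerpt. The sketch below shows one way such a gate could be evaluated. It assumes that `min_score` is an absolute floor on the gated task, that `min_improve` is a required gain over the baseline, that `guard_tasks` is nested under `goals`, and that `max_regress` caps the allowed drop on each guard task. Inputs are simplified to plain task-to-score maps rather than lm-eval `results.json` payloads, and `gate_sketch` is a hypothetical name.

```python
from __future__ import annotations

from typing import Any


def gate_sketch(
    candidate: dict[str, float],
    baseline: dict[str, float] | None,
    goals: dict[str, Any],
) -> dict[str, Any]:
    """Hypothetical gate: every check must pass for the adapter to be published."""
    checks: list[dict[str, Any]] = []
    task = goals["task"]
    score = candidate.get(task)

    # Absolute floor on the gated task.
    if "min_score" in goals:
        ok = score is not None and score >= goals["min_score"]
        checks.append({"check": f"{task} >= {goals['min_score']}", "value": score, "ok": ok})

    # Required improvement over the baseline; fails closed without a baseline.
    if "min_improve" in goals:
        base = (baseline or {}).get(task)
        gain = None if score is None or base is None else score - base
        ok = gain is not None and gain >= goals["min_improve"]
        checks.append({"check": f"{task} gain >= {goals['min_improve']}", "value": gain, "ok": ok})

    # Guard tasks: allow at most `max_regress` drop relative to the baseline.
    for guard in goals.get("guard_tasks", []):
        cand = candidate.get(guard["task"])
        base = (baseline or {}).get(guard["task"])
        drop = None if cand is None or base is None else base - cand
        ok = drop is not None and drop <= guard["max_regress"]
        checks.append(
            {"check": f"{guard['task']} regress <= {guard['max_regress']}", "value": drop, "ok": ok}
        )

    return {"passed": all(c["ok"] for c in checks), "checks": checks}
```

The `check` / `value` / `ok` keys mirror what `render_model_card` reads from each gate check. Failing closed when the baseline is missing is a design choice of this sketch; the real gate may treat that case differently.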
research/modal/finetune_app.py ADDED
@@ -0,0 +1,370 @@
1
+ """
2
+ Modal GPU pipeline for research/finetune.py + slm-lm-eval.
3
+
4
+ Skill-matrix pipeline: train -> eval -> gate -> publish.
5
+ Each job in experiments.yaml fine-tunes one QLoRA adapter for a skill
6
+ (math, science, coding, reasoning, teaching, ...), evaluates it against the
7
+ matching slm-lm-eval profile vs. a per-profile baseline, checks the result
8
+ against `goals`, and (only if the gate passes) publishes the adapter to the
9
+ Hugging Face Hub.
10
+
11
+ Run from repo root:
12
+ modal run research/modal/finetune_app.py
13
+ modal run research/modal/finetune_app.py --eval-only
14
+ modal run research/modal/finetune_app.py --job math-lora --max-steps 20
15
+ modal run research/modal/finetune_app.py --category science
16
+ modal run research/modal/finetune_app.py --no-publish --no-pull
17
+ modal run research/modal/finetune_app.py::publish_only --job math-lora
18
+ modal run research/modal/finetune_app.py::pull --category math
19
+ """
20
+
21
+ from __future__ import annotations
22
+
23
+ import json
24
+ import subprocess
25
+ import sys
26
+ from pathlib import Path
27
+ from typing import Any
28
+
29
+ import modal
30
+
31
+ # Make `_common` importable both locally (sibling file) and in the Modal
32
+ # container, where the entrypoint lands at /root but the repo is baked into the
33
+ # image at /repo (see add_local_dir in _common.py).
34
+ for _candidate in (Path(__file__).resolve().parent, Path("/repo/research/modal")):
35
+ if _candidate.is_dir() and str(_candidate) not in sys.path:
36
+ sys.path.insert(0, str(_candidate))
37
+
38
+ from _common import (
39
+ BASE_MODEL_ID,
40
+ FINETUNE_VOL_PATH,
41
+ HF_CACHE_PATH,
42
+ LM_EVAL_OUTPUT,
43
+ build_finetune_cmd,
44
+ build_lm_eval_cmd,
45
+ check_gate_files,
46
+ commit_volumes,
47
+ config_for_profile,
48
+ finetune_vol,
49
+ hf_cache_vol,
50
+ hf_secret,
51
+ image,
52
+ job_gpu,
53
+ load_experiments,
54
+ prepare_jobs,
55
+ publish_adapter_files,
56
+ pull_artifacts,
57
+ reload_volumes,
58
+ repo_env,
59
+ )
60
+
61
+ APP_NAME = "slm-finetune-benchmark"
62
+
63
+ app = modal.App(APP_NAME, image=image)
64
+
65
+
+ @app.function(
+     gpu="A10G",
+     volumes={
+         HF_CACHE_PATH: hf_cache_vol,
+         FINETUNE_VOL_PATH: finetune_vol,
+     },
+     secrets=[hf_secret],
+     timeout=7200,
+ )
+ def finetune_one(job: dict[str, Any]) -> dict[str, Any]:
+     """Fine-tune one dataset job; persist adapter to Modal Volume."""
+     name = job["name"]
+     out_dir = f"{FINETUNE_VOL_PATH}/{name}"
+     Path(out_dir).mkdir(parents=True, exist_ok=True)
+
+     cmd = build_finetune_cmd(job, out_dir)
+     print("Running:", " ".join(cmd))
+     subprocess.run(cmd, cwd="/repo", check=True, env=repo_env())
+
+     commit_volumes()
+
+     results_path = Path(out_dir) / "training_results.json"
+     payload = json.loads(results_path.read_text())
+     payload["job_name"] = name
+     # main() reads train_payload["output_dir"]; set it here as GpuWorker.finetune does.
+     payload["output_dir"] = out_dir
+     return payload
+
92
+
93
+ @app.function(
94
+ gpu="A10G",
95
+ volumes={
96
+ HF_CACHE_PATH: hf_cache_vol,
97
+ FINETUNE_VOL_PATH: finetune_vol,
98
+ },
99
+ secrets=[hf_secret],
100
+ timeout=3600,
101
+ )
102
+ def run_lm_eval(
103
+ *,
104
+ experiment_name: str,
105
+ config: str = "research/evals/configs/lm_eval_smoke.yaml",
106
+ preset: str | None = None,
107
+ model_path: str | None = None,
108
+ adapter_path: str | None = None,
109
+ compare_to: str | None = None,
110
+ ) -> dict[str, Any]:
111
+ """Run slm-lm-eval on base model or finetuned checkpoint."""
112
+ reload_volumes()
113
+
114
+ if adapter_path:
115
+ adapter_dir = Path(adapter_path)
116
+ adapter_cfg = adapter_dir / "adapter_config.json"
117
+ if not adapter_cfg.is_file():
118
+ raise FileNotFoundError(
119
+ f"LoRA adapter not visible at {adapter_path} "
120
+ f"(missing {adapter_cfg.name}). "
121
+ "If training just finished, retry after volume commit/reload."
122
+ )
123
+
124
+ cmd = build_lm_eval_cmd(
125
+ experiment_name=experiment_name,
126
+ config=config,
127
+ preset=preset,
128
+ model_path=model_path,
129
+ adapter_path=adapter_path,
130
+ compare_to=compare_to,
131
+ )
132
+ print("Running:", " ".join(cmd))
133
+ proc = subprocess.run(cmd, cwd="/repo", check=False, env=repo_env())
134
+
135
+ commit_volumes()
136
+
137
+ out_root = Path(LM_EVAL_OUTPUT) / experiment_name
138
+ results_json = out_root / "results.json"
139
+ summary_md = out_root / "summary.md"
140
+ comparison_md = out_root / "comparison.md"
141
+
142
+ return {
143
+ "experiment_name": experiment_name,
144
+ "config": config,
145
+ "preset": preset,
146
+ "model_path": model_path,
147
+ "adapter_path": adapter_path,
148
+ "compare_to": compare_to,
149
+ "results_json": str(results_json),
150
+ "summary_md": str(summary_md),
151
+ "comparison_md": str(comparison_md) if comparison_md.is_file() else None,
152
+ "exit_code": proc.returncode,
153
+ "ok": proc.returncode == 0 and results_json.is_file(),
154
+ }
155
+
156
+
157
+ @app.function(volumes={FINETUNE_VOL_PATH: finetune_vol}, timeout=300)
158
+ def check_gate(
159
+ *,
160
+ candidate_results_path: str,
161
+ baseline_results_path: str | None,
162
+ goals: dict[str, Any],
163
+ ) -> dict[str, Any]:
164
+ """Check a candidate's lm-eval results against `goals` (Hub publish gate)."""
165
+ reload_volumes()
166
+ return check_gate_files(
167
+ candidate_results_path=candidate_results_path,
168
+ baseline_results_path=baseline_results_path,
169
+ goals=goals,
170
+ )
171
+
172
+
173
+ @app.function(
174
+ volumes={FINETUNE_VOL_PATH: finetune_vol},
175
+ secrets=[hf_secret],
176
+ timeout=900,
177
+ )
178
+ def publish_adapter(
179
+ *,
180
+ job: dict[str, Any],
181
+ adapter_dir: str,
182
+ gate_result: dict[str, Any],
183
+ candidate_results_path: str,
184
+ baseline_results_path: str | None,
185
+ ) -> dict[str, Any]:
186
+ """Write a model card and push the adapter to the Hub, but only if the gate passed."""
187
+ reload_volumes()
188
+ return publish_adapter_files(
189
+ job=job,
190
+ adapter_dir=adapter_dir,
191
+ gate_result=gate_result,
192
+ candidate_results_path=candidate_results_path,
193
+ baseline_results_path=baseline_results_path,
194
+ )
195
+
196
+
197
+ def _print_summary(rows: list[dict[str, Any]]) -> None:
198
+ print("\n--- summary ---")
199
+ print(f"{'skill':<18} {'category':<12} {'gate':<6} {'published':<10} hub_repo")
200
+ for row in rows:
201
+ gate = "PASS" if row.get("gate_passed") else "fail"
202
+ published = "yes" if row.get("published") else "no"
203
+ print(
204
+ f"{row['name']:<18} {row.get('category') or '-':<12} {gate:<6} "
205
+ f"{published:<10} {row.get('hub_repo') or '-'}"
206
+ )
207
+
208
+
209
+ @app.local_entrypoint()
210
+ def main(
211
+ train: bool = True,
212
+ eval_only: bool = False,
213
+ parallel: bool = False,
214
+ job: str | None = None,
215
+ category: str | None = None,
216
+ max_steps: int | None = None,
217
+ publish: bool = True,
218
+ pull: bool = True,
219
+ ):
220
+ """
221
+ Skill-matrix pipeline: per-profile baselines -> train -> eval -> gate -> publish -> pull.
222
+
223
+ Examples:
224
+ modal run research/modal/finetune_app.py
225
+ modal run research/modal/finetune_app.py --job math-lora --max-steps 20
226
+ modal run research/modal/finetune_app.py --category science
227
+ modal run research/modal/finetune_app.py --eval-only --job math-lora
228
+ modal run research/modal/finetune_app.py --no-publish --no-pull
229
+ """
230
+ defaults, prepared = prepare_jobs(job=job, category=category, max_steps=max_steps)
231
+ if not prepared:
232
+ raise SystemExit("No matching jobs; check --job/--category and experiments.yaml")
233
+ preset = defaults.get("preset", "minicpm5-1b")
234
+
235
+ profiles = sorted({j.get("eval_profile", "compare_study") for j in prepared})
236
+
237
+ baselines_ok: dict[str, bool] = {}
238
+ if not eval_only:
239
+ print(f"--- baselines ({', '.join(profiles)}) ---")
240
+ for profile in profiles:
241
+ result = run_lm_eval.remote(
242
+ experiment_name=f"{preset}__baseline__{profile}",
243
+ config=config_for_profile(profile),
244
+ preset=preset,
245
+ )
246
+ print(json.dumps(result, indent=2))
247
+ baselines_ok[profile] = bool(result.get("ok"))
248
+
249
+ train_results: dict[str, dict[str, Any]] = {}
250
+ if train and not eval_only:
251
+ print(f"--- finetune ({len(prepared)} job(s), parallel={parallel}) ---")
252
+ if parallel:
253
+ handles = {
254
+ j["name"]: finetune_one.with_options(gpu=job_gpu(j)).spawn(j)
255
+ for j in prepared
256
+ }
257
+ for name, handle in handles.items():
258
+ train_results[name] = handle.get()
259
+ print(json.dumps(train_results[name], indent=2))
260
+ else:
261
+ for j in prepared:
262
+ print(f"Training {j['name']}...")
263
+ result = finetune_one.with_options(gpu=job_gpu(j)).remote(j)
264
+ train_results[j["name"]] = result
265
+ print(json.dumps(result, indent=2))
266
+
267
+ print("--- post-train lm-eval / gate / publish ---")
268
+ summary: list[dict[str, Any]] = []
269
+ for j in prepared:
270
+ job_name = j["name"]
271
+ profile = j.get("eval_profile", "compare_study")
272
+ train_payload = train_results.get(job_name)
273
+ adapter_path = (
274
+ train_payload["output_dir"] if train_payload else f"{FINETUNE_VOL_PATH}/{job_name}"
275
+ )
276
+
277
+ baseline_path = f"{LM_EVAL_OUTPUT}/{preset}__baseline__{profile}/results.json"
278
+ compare_to = baseline_path if baselines_ok.get(profile) else None
279
+
280
+ exp_name = f"{job_name}__{profile}"
281
+ eval_result = run_lm_eval.remote(
282
+ experiment_name=exp_name,
283
+ config=config_for_profile(profile),
284
+ model_path=BASE_MODEL_ID,
285
+ adapter_path=adapter_path,
286
+ compare_to=compare_to,
287
+ )
288
+ print(json.dumps(eval_result, indent=2))
289
+
290
+ row: dict[str, Any] = {
291
+ "name": job_name,
292
+ "category": j.get("category"),
293
+ "profile": profile,
294
+ }
295
+
296
+ gate_result: dict[str, Any] | None = None
297
+ if j.get("goals"):
298
+ if eval_result.get("ok"):
299
+ gate_result = check_gate.remote(
300
+ candidate_results_path=eval_result["results_json"],
301
+ baseline_results_path=baseline_path,
302
+ goals=j["goals"],
303
+ )
304
+ print(json.dumps(gate_result, indent=2))
305
+ row["gate_passed"] = bool(gate_result and gate_result.get("passed"))
306
+
307
+ if j.get("publish"):
308
+ row["hub_repo"] = j["publish"].get("hub_repo")
309
+ if publish and gate_result is not None:
310
+ publish_result = publish_adapter.remote(
311
+ job=j,
312
+ adapter_dir=adapter_path,
313
+ gate_result=gate_result,
314
+ candidate_results_path=eval_result["results_json"],
315
+ baseline_results_path=baseline_path,
316
+ )
317
+ print(json.dumps(publish_result, indent=2))
318
+ row["published"] = publish_result.get("published")
319
+
320
+ summary.append(row)
321
+
322
+ if pull:
323
+ pull_artifacts(job_name, exp_name)
324
+
325
+ _print_summary(summary)
326
+
327
+
+ @app.local_entrypoint()
+ def publish_only(job: str):
+     """Re-run the gate and Hub publish for a job using already-computed results (no train/eval)."""
+     defaults, prepared = prepare_jobs(job=job)
+     # Guard against an unknown job name instead of failing with an IndexError below.
+     if not prepared:
+         raise SystemExit(f"No job named {job!r} in experiments.yaml")
+     j = prepared[0]
+     if not j.get("goals"):
+         raise SystemExit(f"Job {job!r} has no `goals`; nothing to gate on")
+     if not j.get("publish"):
+         raise SystemExit(f"Job {job!r} has no `publish` config")
+
+     preset = defaults.get("preset", "minicpm5-1b")
+     profile = j.get("eval_profile", "compare_study")
+     adapter_path = f"{FINETUNE_VOL_PATH}/{job}"
+     candidate_results_path = f"{LM_EVAL_OUTPUT}/{job}__{profile}/results.json"
+     baseline_results_path = f"{LM_EVAL_OUTPUT}/{preset}__baseline__{profile}/results.json"
+
+     gate_result = check_gate.remote(
+         candidate_results_path=candidate_results_path,
+         baseline_results_path=baseline_results_path,
+         goals=j["goals"],
+     )
+     print(json.dumps(gate_result, indent=2))
+
+     publish_result = publish_adapter.remote(
+         job=j,
+         adapter_dir=adapter_path,
+         gate_result=gate_result,
+         candidate_results_path=candidate_results_path,
+         baseline_results_path=baseline_results_path,
+     )
+     print(json.dumps(publish_result, indent=2))
359
+
360
+
+ @app.local_entrypoint()
+ def pull(job: str | None = None, category: str | None = None, dest: str = "models/finetuned"):
+     """Download adapters and their lm-eval results from the `slm-finetune` Volume."""
+     _, prepared = prepare_jobs(job=job, category=category)
+     if not prepared:
+         raise SystemExit("No matching jobs; pass --job or --category")
+
+     for j in prepared:
+         profile = j.get("eval_profile", "compare_study")
+         pull_artifacts(j["name"], f"{j['name']}__{profile}", dest)
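All three entrypoints call `prepare_jobs`, which is defined in `_common.py` outside this excerpt. Judging from the equivalent inline logic in `GpuWorker.run_pipeline` in `server_app.py` (filter by name, filter by category, merge defaults, override `max_steps`), it could look roughly like the sketch below. The merge rule (per-job keys override `defaults`) and the choice to pass the parsed spec in as an argument are assumptions made to keep the example self-contained; `prepare_jobs_sketch` is a hypothetical name.

```python
from __future__ import annotations

from typing import Any


def prepare_jobs_sketch(
    spec: dict[str, Any],
    *,
    job: str | None = None,
    category: str | None = None,
    max_steps: int | None = None,
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Filter the job matrix and merge `defaults` into each selected job."""
    defaults = spec.get("defaults", {})
    jobs = spec.get("finetune", [])
    if job:
        jobs = [j for j in jobs if j.get("name") == job]
    if category:
        jobs = [j for j in jobs if j.get("category") == category]

    prepared = []
    for raw in jobs:
        merged = {**defaults, **raw}  # assumption: per-job keys override defaults
        if max_steps is not None:
            merged["max_steps"] = max_steps  # CLI override wins over both
        prepared.append(merged)
    return defaults, prepared
```

Returning an empty list for an unknown name, rather than raising, matches how the entrypoints test `if not prepared` before doing any remote work.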
research/modal/server_app.py ADDED
@@ -0,0 +1,472 @@
1
+ """
2
+ Long-lived Modal GPU worker — reuse one warm container for many finetune / eval runs.
3
+
4
+ Deploy once (enables min_containers warm pool across separate CLI invocations):
5
+ modal deploy research/modal/server_app.py
6
+
7
+ Default: keep a GPU worker alive for several hours (blocks local terminal):
8
+ modal run research/modal/server_app.py
9
+ modal run research/modal/server_app.py --hours 6
10
+
11
+ Detached keep-alive (local terminal free):
12
+ modal run -d research/modal/server_app.py --hours 6
13
+
14
+ Run the skill-matrix pipeline on the warm worker (separate terminal, same
15
+ container when deployed) — per-profile baselines -> finetune -> eval -> gate -> publish:
16
+ modal run research/modal/server_app.py --job math-lora --max-steps 20
17
+ modal run research/modal/server_app.py --category science
18
+ modal run research/modal/server_app.py --pipeline --no-publish
19
+ modal run research/modal/server_app.py --eval-only --job math-lora
20
+ modal run research/modal/server_app.py --publish-only --job math-lora
21
+ modal run research/modal/server_app.py --cmd "uv run python research/finetune.py --help"
22
+
23
+ Stop deployed app:
24
+ modal app stop slm-gpu-worker
25
+ """
26
+
27
+ from __future__ import annotations
28
+
29
+ import json
30
+ import shlex
31
+ import subprocess
32
+ import sys
33
+ import time
34
+ from pathlib import Path
35
+ from typing import Any
36
+
37
+ import modal
38
+
39
+ # Make `_common` importable both locally (sibling file) and in the Modal
40
+ # container, where the entrypoint lands at /root but the repo is baked into the
41
+ # image at /repo (see add_local_dir in _common.py).
42
+ for _candidate in (Path(__file__).resolve().parent, Path("/repo/research/modal")):
43
+ if _candidate.is_dir() and str(_candidate) not in sys.path:
44
+ sys.path.insert(0, str(_candidate))
45
+
46
+ from _common import (
47
+ BASE_MODEL_ID,
48
+ DEFAULT_GPU,
49
+ DEFAULT_KEEPALIVE_HOURS,
50
+ DEFAULT_SCALEDOWN_WINDOW,
51
+ DEFAULT_WORKER_TIMEOUT,
52
+ FINETUNE_VOL_PATH,
53
+ HF_CACHE_PATH,
54
+ LM_EVAL_OUTPUT,
55
+ apply_defaults,
56
+ build_finetune_cmd,
57
+ build_lm_eval_cmd,
58
+ check_gate_files,
59
+ commit_volumes,
60
+ config_for_profile,
61
+ finetune_vol,
62
+ hf_cache_vol,
63
+ hf_secret,
64
+ image,
65
+ load_experiments,
66
+ prepare_jobs,
67
+ publish_adapter_files,
68
+ pull_artifacts,
69
+ reload_volumes,
70
+ repo_env,
71
+ )
72
+
73
+ APP_NAME = "slm-gpu-worker"
74
+
75
+ app = modal.App(APP_NAME, image=image)
76
+
77
+
78
+ @app.cls(
79
+ gpu=DEFAULT_GPU,
80
+ volumes={
81
+ HF_CACHE_PATH: hf_cache_vol,
82
+ FINETUNE_VOL_PATH: finetune_vol,
83
+ },
84
+ secrets=[hf_secret],
85
+ timeout=DEFAULT_WORKER_TIMEOUT,
86
+ scaledown_window=DEFAULT_SCALEDOWN_WINDOW,
87
+ min_containers=1,
88
+ )
89
+ class GpuWorker:
90
+ """Single warm GPU container for sequential finetune / lm-eval / shell commands."""
91
+
92
+ @modal.enter()
93
+ def startup(self) -> None:
94
+ reload_volumes()
95
+ print(
96
+ f"GpuWorker ready (HF cache={HF_CACHE_PATH}, finetune vol={FINETUNE_VOL_PATH})"
97
+ )
98
+
99
+ @modal.method()
100
+ def ping(self) -> dict[str, str]:
101
+ return {"status": "ok", "app": APP_NAME}
102
+
103
+ @modal.method()
104
+ def keep_alive(self, hours: float = DEFAULT_KEEPALIVE_HOURS) -> dict[str, Any]:
105
+ """Hold this container open; cheap heartbeat so scaledown_window stays fresh."""
106
+ deadline = time.time() + hours * 3600
107
+ ticks = 0
108
+ while time.time() < deadline:
109
+ remaining = int(deadline - time.time())
110
+ if ticks % 5 == 0:
111
+ print(f"keep_alive: {remaining}s remaining")
112
+ time.sleep(60)
113
+ ticks += 1
114
+ return {"status": "done", "hours": hours}
115
+
116
+ @modal.method()
117
+ def exec_cmd(self, argv: list[str], cwd: str = "/repo") -> dict[str, Any]:
118
+ """Run an arbitrary command in the repo (same env as finetune.py)."""
119
+ print("Running:", " ".join(argv))
120
+ proc = subprocess.run(
121
+ argv,
122
+ cwd=cwd,
123
+ check=False,
124
+ env=repo_env(),
125
+ capture_output=True,
126
+ text=True,
127
+ )
128
+ commit_volumes()
129
+ return {
130
+ "argv": argv,
131
+ "exit_code": proc.returncode,
132
+ "ok": proc.returncode == 0,
133
+ "stdout": proc.stdout,
134
+ "stderr": proc.stderr,
135
+ }
136
+
137
+ @modal.method()
138
+ def finetune(self, job: dict[str, Any]) -> dict[str, Any]:
139
+ """Fine-tune one dataset job via research/finetune.py."""
140
+ name = job["name"]
141
+ out_dir = f"{FINETUNE_VOL_PATH}/{name}"
142
+ Path(out_dir).mkdir(parents=True, exist_ok=True)
143
+
144
+ cmd = build_finetune_cmd(job, out_dir)
145
+ print("Running:", " ".join(cmd))
146
+ subprocess.run(cmd, cwd="/repo", check=True, env=repo_env())
147
+
148
+ commit_volumes()
149
+
150
+ results_path = Path(out_dir) / "training_results.json"
151
+ payload = json.loads(results_path.read_text())
152
+ payload["job_name"] = name
153
+ payload["output_dir"] = out_dir
154
+ return payload
155
+
156
+ @modal.method()
157
+ def lm_eval(
158
+ self,
159
+ *,
160
+ experiment_name: str,
161
+ config: str = "research/evals/configs/lm_eval_smoke.yaml",
162
+ preset: str | None = None,
163
+ model_path: str | None = None,
164
+ adapter_path: str | None = None,
165
+ compare_to: str | None = None,
166
+ ) -> dict[str, Any]:
167
+ """Run slm-lm-eval on base model or finetuned checkpoint."""
168
+ if adapter_path:
169
+ adapter_dir = Path(adapter_path)
170
+ adapter_cfg = adapter_dir / "adapter_config.json"
171
+ if not adapter_cfg.is_file():
172
+ raise FileNotFoundError(
173
+ f"LoRA adapter not visible at {adapter_path} "
174
+ f"(missing {adapter_cfg.name})."
175
+ )
176
+
177
+ cmd = build_lm_eval_cmd(
178
+ experiment_name=experiment_name,
179
+ config=config,
180
+ preset=preset,
181
+ model_path=model_path,
182
+ adapter_path=adapter_path,
183
+ compare_to=compare_to,
184
+ )
185
+ print("Running:", " ".join(cmd))
186
+ proc = subprocess.run(cmd, cwd="/repo", check=False, env=repo_env())
187
+
188
+ commit_volumes()
189
+
190
+ out_root = Path(LM_EVAL_OUTPUT) / experiment_name
191
+ results_json = out_root / "results.json"
192
+ summary_md = out_root / "summary.md"
193
+ comparison_md = out_root / "comparison.md"
194
+
195
+ return {
196
+ "experiment_name": experiment_name,
197
+ "config": config,
198
+ "preset": preset,
199
+ "model_path": model_path,
200
+ "adapter_path": adapter_path,
201
+ "compare_to": compare_to,
202
+ "results_json": str(results_json),
203
+ "summary_md": str(summary_md),
204
+ "comparison_md": str(comparison_md) if comparison_md.is_file() else None,
205
+ "exit_code": proc.returncode,
206
+ "ok": proc.returncode == 0 and results_json.is_file(),
207
+ }
208
+
209
+ @modal.method()
210
+ def check_gate(
211
+ self,
212
+ *,
213
+ candidate_results_path: str,
214
+ baseline_results_path: str | None,
215
+ goals: dict[str, Any],
216
+ ) -> dict[str, Any]:
217
+ """Check a candidate's lm-eval results against `goals` (Hub publish gate)."""
218
+ return check_gate_files(
219
+ candidate_results_path=candidate_results_path,
220
+ baseline_results_path=baseline_results_path,
221
+ goals=goals,
222
+ )
223
+
224
+ @modal.method()
225
+ def publish_adapter(
226
+ self,
227
+ *,
228
+ job: dict[str, Any],
229
+ adapter_dir: str,
230
+ gate_result: dict[str, Any],
231
+ candidate_results_path: str,
232
+ baseline_results_path: str | None,
233
+ ) -> dict[str, Any]:
234
+ """Write a model card and push the adapter to the Hub, but only if the gate passed."""
235
+ return publish_adapter_files(
236
+ job=job,
237
+ adapter_dir=adapter_dir,
238
+ gate_result=gate_result,
239
+ candidate_results_path=candidate_results_path,
240
+ baseline_results_path=baseline_results_path,
241
+ )
242
+
+     @modal.method()
+     def run_pipeline(
+         self,
+         *,
+         job_names: list[str] | None = None,
+         category: str | None = None,
+         max_steps: int | None = None,
+         train: bool = True,
+         eval_only: bool = False,
+         publish: bool = True,
+     ) -> dict[str, Any]:
+         """Per-profile baselines -> finetune -> eval -> gate -> publish (same container)."""
+         spec = load_experiments()
+         defaults = spec.get("defaults", {})
+         jobs = spec.get("finetune", [])
+
+         if job_names:
+             jobs = [j for j in jobs if j.get("name") in job_names]
+             if not jobs:
+                 raise ValueError(f"No matching jobs in experiments.yaml: {job_names}")
+         if category:
+             jobs = [j for j in jobs if j.get("category") == category]
+             if not jobs:
+                 raise ValueError(f"No jobs with category {category!r}")
+         if not jobs:
+             raise ValueError("No jobs matched job_names/category")
+
+         preset = defaults.get("preset", "minicpm5-1b")
+         prepared: list[dict[str, Any]] = []
+         for raw in jobs:
+             merged = apply_defaults(raw, defaults)
+             if max_steps is not None:
+                 merged["max_steps"] = max_steps
+             prepared.append(merged)
+
+         profiles = sorted({j.get("eval_profile", "compare_study") for j in prepared})
+
+         baselines_ok: dict[str, bool] = {}
+         if not eval_only:
+             for profile in profiles:
+                 result = self.lm_eval.local(
+                     experiment_name=f"{preset}__baseline__{profile}",
+                     config=config_for_profile(profile),
+                     preset=preset,
+                 )
+                 baselines_ok[profile] = bool(result.get("ok"))
+
+         train_results: dict[str, dict[str, Any]] = {}
+         if train and not eval_only:
+             for j in prepared:
+                 train_results[j["name"]] = self.finetune.local(j)
+
+         rows: list[dict[str, Any]] = []
+         for j in prepared:
+             job_name = j["name"]
+             profile = j.get("eval_profile", "compare_study")
+             train_payload = train_results.get(job_name)
+             adapter_path = (
+                 train_payload["output_dir"]
+                 if train_payload
+                 else f"{FINETUNE_VOL_PATH}/{job_name}"
+             )
+
+             baseline_path = f"{LM_EVAL_OUTPUT}/{preset}__baseline__{profile}/results.json"
+             compare_to = baseline_path if baselines_ok.get(profile) else None
+
+             exp_name = f"{job_name}__{profile}"
+             eval_result = self.lm_eval.local(
+                 experiment_name=exp_name,
+                 config=config_for_profile(profile),
+                 model_path=BASE_MODEL_ID,
+                 adapter_path=adapter_path,
+                 compare_to=compare_to,
+             )
+
+             row: dict[str, Any] = {
+                 "name": job_name,
+                 "category": j.get("category"),
+                 "profile": profile,
+                 "eval": eval_result,
+             }
+
+             gate_result: dict[str, Any] | None = None
+             if j.get("goals"):
+                 if eval_result.get("ok"):
+                     gate_result = self.check_gate.local(
+                         candidate_results_path=eval_result["results_json"],
+                         baseline_results_path=baseline_path,
+                         goals=j["goals"],
+                     )
+                 row["gate"] = gate_result
+
+             if j.get("publish") and publish and gate_result is not None:
+                 row["publish"] = self.publish_adapter.local(
+                     job=j,
+                     adapter_dir=adapter_path,
+                     gate_result=gate_result,
+                     candidate_results_path=eval_result["results_json"],
+                     baseline_results_path=baseline_path,
+                 )
+
+             rows.append(row)
+
+         return {"jobs": rows}
+
+
+ def _worker() -> GpuWorker:
+     """Prefer deployed warm worker; fall back to ephemeral cls for first deploy."""
+     try:
+         cls = modal.Cls.from_name(APP_NAME, "GpuWorker")
+         return cls()
+     except modal.exception.NotFoundError:
+         return GpuWorker()
+
+
+ @app.local_entrypoint()
+ def main(
+     serve: bool = True,
+     hours: float = DEFAULT_KEEPALIVE_HOURS,
+     cmd: str | None = None,
+     job: str | None = None,
+     category: str | None = None,
+     max_steps: int | None = None,
+     eval_only: bool = False,
+     pipeline: bool = False,
+     publish: bool = True,
+     publish_only: bool = False,
+     pull: bool = True,
+     ping: bool = False,
+ ):
+     """
+     GPU worker CLI.
+
+     With no task flags, keeps one container alive (default). With --job/--category,
+     --cmd, --eval-only, --pipeline, or --publish-only, runs that task on the warm
+     worker instead. --pipeline (and --job/--category/--eval-only) run the skill-matrix
+     pipeline: per-profile baselines -> finetune -> eval -> gate -> publish.
+
+     Examples:
+       modal deploy research/modal/server_app.py
+       modal run research/modal/server_app.py
+       modal run research/modal/server_app.py --pipeline --job math-lora --max-steps 20
+       modal run research/modal/server_app.py --pipeline --category science --no-publish
+       modal run research/modal/server_app.py --eval-only --job math-lora
+       modal run research/modal/server_app.py --publish-only --job math-lora
+       modal run research/modal/server_app.py --cmd "uv run python research/finetune.py --help"
+     """
+     has_task = bool(cmd or job or category or eval_only or pipeline or publish_only or ping)
+     if has_task:
+         serve = False
+
+     worker = _worker()
+
+     if ping:
+         print(json.dumps(worker.ping.remote(), indent=2))
+         return
+
+     if cmd:
+         argv = shlex.split(cmd)
+         result = worker.exec_cmd.remote(argv)
+         if result.get("stdout"):
+             print(result["stdout"], end="")
+         if result.get("stderr"):
+             print(result["stderr"], end="", file=__import__("sys").stderr)
+         if not result.get("ok"):
+             raise SystemExit(result.get("exit_code", 1))
+         return
+
+     if publish_only:
+         if not job:
+             raise SystemExit("--publish-only requires --job")
+         defaults, prepared = prepare_jobs(job=job)
+         j = prepared[0]
+         if not j.get("goals") or not j.get("publish"):
+             raise SystemExit(f"Job {job!r} needs `goals` and `publish` in experiments.yaml")
+
+         preset = defaults.get("preset", "minicpm5-1b")
+         profile = j.get("eval_profile", "compare_study")
+         adapter_path = f"{FINETUNE_VOL_PATH}/{job}"
+         candidate_results_path = f"{LM_EVAL_OUTPUT}/{job}__{profile}/results.json"
+         baseline_results_path = f"{LM_EVAL_OUTPUT}/{preset}__baseline__{profile}/results.json"
+
+         gate_result = worker.check_gate.remote(
+             candidate_results_path=candidate_results_path,
+             baseline_results_path=baseline_results_path,
+             goals=j["goals"],
+         )
+         print(json.dumps(gate_result, indent=2))
+
+         result = worker.publish_adapter.remote(
+             job=j,
+             adapter_dir=adapter_path,
+             gate_result=gate_result,
+             candidate_results_path=candidate_results_path,
+             baseline_results_path=baseline_results_path,
+         )
+         print(json.dumps(result, indent=2))
+         return
+
+     if pipeline or job or category or eval_only:
+         job_names = [job] if job else None
+         result = worker.run_pipeline.remote(
+             job_names=job_names,
+             category=category,
+             max_steps=max_steps,
+             train=not eval_only,
+             eval_only=eval_only,
+             publish=publish,
+         )
+         print(json.dumps(result, indent=2))
+
+         if pull:
+             for row in result.get("jobs", []):
+                 pull_artifacts(row["name"], f"{row['name']}__{row['profile']}")
+         return
+
+     if serve:
+         print(
+             f"Keeping GpuWorker alive for {hours}h "
+             f"(deploy with `modal deploy` so other terminals reuse this container)"
+         )
+         worker.ping.remote()
+         result = worker.keep_alive.remote(hours=hours)
+         print(json.dumps(result, indent=2))
+         return
+
+     raise SystemExit(
+         "Nothing to do. Use default serve mode, or pass --job, --category, --cmd, "
+         "--pipeline, --eval-only, --publish-only, or --ping."
+     )
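The job-selection and artifact-naming steps of `run_pipeline` can be sketched in isolation, without Modal. This is a minimal sketch under stated assumptions, not the repository's implementation: the `apply_defaults` below is a hypothetical stand-in (the real helper is not shown in this diff), `select_jobs` and `eval_paths` are illustrative names, and the `/vol/lm_eval` root is a made-up value for `LM_EVAL_OUTPUT`. Only the filtering order, the `max_steps` override, and the `{preset}__baseline__{profile}` / `{job}__{profile}` naming come from the code above.

```python
from __future__ import annotations

from typing import Any


def apply_defaults(job: dict[str, Any], defaults: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical stand-in: keys set on the job override the shared defaults."""
    return {**defaults, **job}


def select_jobs(
    spec: dict[str, Any],
    *,
    job_names: list[str] | None = None,
    category: str | None = None,
    max_steps: int | None = None,
) -> list[dict[str, Any]]:
    """Filter by name, then by category, then merge defaults and apply the CLI override."""
    defaults = spec.get("defaults", {})
    jobs = spec.get("finetune", [])
    if job_names:
        jobs = [j for j in jobs if j.get("name") in job_names]
    if category:
        jobs = [j for j in jobs if j.get("category") == category]
    if not jobs:
        raise ValueError("No jobs matched job_names/category")

    prepared = []
    for raw in jobs:
        merged = apply_defaults(raw, defaults)
        if max_steps is not None:
            # A CLI-level max_steps wins over both the job and the defaults.
            merged["max_steps"] = max_steps
        prepared.append(merged)
    return prepared


def eval_paths(output_root: str, preset: str, job_name: str, profile: str) -> dict[str, str]:
    """Build the baseline and candidate results paths using the naming seen in the diff."""
    return {
        "baseline": f"{output_root}/{preset}__baseline__{profile}/results.json",
        "candidate": f"{output_root}/{job_name}__{profile}/results.json",
    }
```

Because baselines are keyed by evaluation profile rather than by job, several jobs that share a profile can be compared against the same baseline file.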
{notebook → research/notebook}/gemma-finetune.ipynb RENAMED
File without changes
research/notebook/minicpm5-modal-finetune.ipynb ADDED
@@ -0,0 +1,216 @@
1
+ {
2
+ "cells": [
3
+ {
4
+ "cell_type": "markdown",
5
+ "metadata": {},
6
+ "source": [
7
+ "# MiniCPM5-1B fine-tune on Modal Notebooks\n",
8
+ "\n",
9
+ "Interactive path for the **Modal** + **Well-Tuned** hackathon tracks.\n",
10
+ "\n",
11
+ "**Setup (sidebar before running cells):**\n",
12
+ "1. [modal.com/notebooks](https://modal.com/notebooks) — upload this `.ipynb`\n",
13
+ "2. **Compute profile** → enable GPU (e.g. A10G)\n",
14
+ "3. **Files** → attach Volume `slm-finetune` (mounts at `/mnt/slm-finetune`)\n",
15
+ "4. **Secrets** → attach `huggingface` (`HF_TOKEN`)\n",
16
+ "\n",
17
+ "Model: [openbmb/MiniCPM5-1B](https://huggingface.co/openbmb/MiniCPM5-1B)\n",
18
+ "\n",
19
+ "Docs: [Modal Notebooks](https://modal.com/docs/guide/notebooks) · [Volumes](https://modal.com/docs/guide/volumes)\n",
20
+ "\n",
21
+ "For reproducible sweeps: `modal run research/modal/finetune_app.py`"
22
+ ]
23
+ },
24
+ {
25
+ "cell_type": "code",
26
+ "metadata": {},
27
+ "source": [
28
+ "# Verify GPU (Modal Notebooks provide CUDA)\n",
29
+ "!nvidia-smi"
30
+ ],
31
+ "execution_count": null,
32
+ "outputs": []
33
+ },
34
+ {
35
+ "cell_type": "code",
36
+ "metadata": {},
37
+ "source": [
38
+ "# Clone repo (replace with your fork URL) or upload files via the Files panel\n",
39
+ "!git clone https://github.com/YOUR_USER/small-model-hackathon.git /root/repo 2>/dev/null || true\n",
40
+ "%cd /root/repo\n",
41
+ "!pwd && ls research/finetune.py models.yaml"
42
+ ],
43
+ "execution_count": null,
44
+ "outputs": [],
45
+ "id": "29c67e82"
46
+ },
47
+ {
48
+ "cell_type": "code",
49
+ "metadata": {},
50
+ "source": [
51
+ "# Install project deps (Modal default image has torch/transformers; add finetune stack)\n",
52
+ "%uv pip install uv peft datasets bitsandbytes accelerate\n",
53
+ "!uv sync --frozen --group finetune --group lm-eval --no-dev"
54
+ ],
55
+ "execution_count": null,
56
+ "outputs": [],
57
+ "id": "42516dfe"
58
+ },
59
+ {
60
+ "cell_type": "code",
61
+ "metadata": {},
62
+ "source": [
63
+ "import os\n",
64
+ "from pathlib import Path\n",
65
+ "\n",
66
+ "os.environ[\"TRUST_REMOTE_CODE\"] = \"true\"\n",
67
+ "os.environ.setdefault(\"PYTORCH_CUDA_ALLOC_CONF\", \"expandable_segments:True\")\n",
68
+ "\n",
69
+ "# Persist on attached Volume (ephemeral container disk is lost on kernel stop)\n",
70
+ "VOL = Path(\"/mnt/slm-finetune\")\n",
71
+ "OUT = VOL / \"lesson-lora-notebook\" if VOL.is_dir() else Path(\"./models/finetuned/minicpm5-1b-lesson-lora\")\n",
72
+ "OUT.mkdir(parents=True, exist_ok=True)\n",
73
+ "print(f\"Checkpoint dir: {OUT}\")"
74
+ ],
75
+ "execution_count": null,
76
+ "outputs": [],
77
+ "id": "52e970ac"
78
+ },
79
+ {
80
+ "cell_type": "markdown",
81
+ "metadata": {},
82
+ "source": [
83
+ "## Smoke fine-tune (LoRA, 20 steps)\n",
84
+ "\n",
85
+ "Uses the lesson-agent chat dataset by default."
86
+ ]
87
+ },
88
+ {
89
+ "cell_type": "code",
90
+ "metadata": {},
91
+ "source": [
92
+ "!uv run python research/finetune.py \\\n",
93
+ " --preset minicpm5-1b \\\n",
94
+ " --mode lora \\\n",
95
+ " --dataset research/data/education-lesson-chat.jsonl \\\n",
96
+ " --format chat \\\n",
97
+ " --out {OUT} \\\n",
98
+ " --max_steps 20"
99
+ ],
100
+ "execution_count": null,
101
+ "outputs": [],
102
+ "id": "dfcb2f58"
103
+ },
104
+ {
105
+ "cell_type": "markdown",
106
+ "metadata": {},
107
+ "source": [
108
+ "## Baseline lm-eval (smoke)"
109
+ ]
110
+ },
111
+ {
112
+ "cell_type": "code",
113
+ "metadata": {},
114
+ "source": [
115
+ "!uv run --package slm-evals slm-lm-eval \\\n",
116
+ " --config research/evals/configs/lm_eval_smoke.yaml \\\n",
117
+ " --preset minicpm5-1b \\\n",
118
+ " --experiment-name minicpm5-1b__notebook-baseline"
119
+ ],
120
+ "execution_count": null,
121
+ "outputs": []
122
+ },
123
+ {
124
+ "cell_type": "markdown",
125
+ "metadata": {},
126
+ "source": [
127
+ "## Post-train lm-eval (adapter)"
128
+ ]
129
+ },
130
+ {
131
+ "cell_type": "code",
132
+ "metadata": {},
133
+ "source": [
134
+ "!uv run --package slm-evals slm-lm-eval \\\n",
135
+ " --config research/evals/configs/lm_eval_smoke.yaml \\\n",
136
+ " --model openbmb/MiniCPM5-1B \\\n",
137
+ " --adapter {OUT} \\\n",
138
+ " --experiment-name minicpm5-1b-lesson-lora__notebook \\\n",
139
+ " --compare-to results/lm_eval/minicpm5-1b__notebook-baseline/results.json"
140
+ ],
141
+ "execution_count": null,
142
+ "outputs": [],
143
+ "id": "d1a14c50"
144
+ },
145
+ {
146
+ "cell_type": "markdown",
147
+ "metadata": {},
148
+ "source": [
149
+ "## Sample generation"
150
+ ]
151
+ },
152
+ {
153
+ "cell_type": "code",
154
+ "metadata": {},
155
+ "source": [
156
+ "import torch\n",
157
+ "from peft import PeftModel\n",
158
+ "from transformers import AutoModelForCausalLM, AutoTokenizer\n",
159
+ "\n",
160
+ "base_id = \"openbmb/MiniCPM5-1B\"\n",
161
+ "adapter_dir = str(OUT)\n",
162
+ "\n",
163
+ "tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)\n",
164
+ "base = AutoModelForCausalLM.from_pretrained(\n",
165
+ " base_id, torch_dtype=torch.bfloat16, device_map=\"auto\", trust_remote_code=True\n",
166
+ ")\n",
167
+ "model = PeftModel.from_pretrained(base, adapter_dir)\n",
168
+ "model.eval()\n",
169
+ "\n",
170
+ "prompt = \"Explain photosynthesis in one short paragraph for a 10-year-old.\"\n",
171
+ "if tokenizer.chat_template:\n",
172
+ " text = tokenizer.apply_chat_template(\n",
173
+ " [{\"role\": \"user\", \"content\": prompt}],\n",
174
+ " tokenize=False,\n",
175
+ " add_generation_prompt=True,\n",
176
+ " )\n",
177
+ "else:\n",
178
+ " text = prompt\n",
179
+ "\n",
180
+ "ids = tokenizer(text, return_tensors=\"pt\").to(model.device)\n",
181
+ "out = model.generate(**ids, max_new_tokens=120, do_sample=True, temperature=0.7)\n",
182
+ "print(tokenizer.decode(out[0][ids[\"input_ids\"].shape[1]:], skip_special_tokens=True))"
183
+ ],
184
+ "execution_count": null,
185
+ "outputs": [],
186
+ "id": "07706c76"
187
+ },
188
+ {
189
+ "cell_type": "markdown",
190
+ "metadata": {},
191
+ "source": [
192
+ "## After training\n",
193
+ "\n",
194
+ "- **Volume attached?** Use the Files panel (⬇) or run locally: `modal volume get slm-finetune lesson-lora-notebook ./models/finetuned/minicpm5-1b-lora`\n",
195
+ "- **Hub:** `huggingface-cli upload your-user/minicpm5-1b-lesson-lora <path-to-OUT> . --repo-type model`\n",
196
+ "- **Share notebook:** Share → public link → \"Can view and run\" for hackathon judges\n",
197
+ "\n",
198
+ "Full docs: `research/modal/README.md` in the repo."
199
+ ],
200
+ "id": "8cd6b7dd"
201
+ }
202
+ ],
203
+ "metadata": {
204
+ "kernelspec": {
205
+ "display_name": "Python 3",
206
+ "language": "python",
207
+ "name": "python3"
208
+ },
209
+ "language_info": {
210
+ "name": "python",
211
+ "version": "3.12.0"
212
+ }
213
+ },
214
+ "nbformat": 4,
215
+ "nbformat_minor": 5
216
+ }
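The notebook's checkpoint-directory cell writes to the attached Volume when it is mounted and falls back to a local directory otherwise. A minimal sketch of that choice as a reusable function follows; `pick_checkpoint_dir` and its parameter names are illustrative and do not appear in the repository.

```python
from pathlib import Path


def pick_checkpoint_dir(
    volume: Path,
    fallback: Path,
    run_name: str = "lesson-lora-notebook",
) -> Path:
    """Return a directory on the mounted volume if present, else the local fallback."""
    # Ephemeral container disk is lost when the kernel stops, so prefer the volume.
    out = volume / run_name if volume.is_dir() else fallback
    out.mkdir(parents=True, exist_ok=True)
    return out
```

In the notebook this would be called with `Path("/mnt/slm-finetune")` as the volume and `Path("./models/finetuned/minicpm5-1b-lesson-lora")` as the fallback, matching the values in the cell above.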
uv.lock CHANGED
@@ -404,6 +404,46 @@ wheels = [
404
  { url = "https://files.pythonhosted.org/packages/d8/ef/e7e485ce5e4ba3843a0a92feb767c7b6098fd6e65ce752918074d175ae71/brotlicffi-1.2.0.1-cp38-abi3-win_amd64.whl", hash = "sha256:da2e82a08e7778b8bc539d27ca03cdd684113e81394bfaaad8d0dfc6a17ddede", size = 379026, upload-time = "2026-03-05T19:54:04.322Z" },
405
  ]
406
 
407
  [[package]]
408
  name = "certifi"
409
  version = "2026.5.20"
@@ -1174,6 +1214,19 @@ wheels = [
1174
  { url = "https://files.pythonhosted.org/packages/28/27/3d6dcadc8a3214d8522c1e7f6a19554e33659be44546d44a2f7572ac7d2a/groovy-0.1.2-py3-none-any.whl", hash = "sha256:7f7975bab18c729a257a8b1ae9dcd70b7cafb1720481beae47719af57c35fa64", size = 14090, upload-time = "2025-02-28T20:24:55.152Z" },
1175
  ]
1176
 
1177
  [[package]]
1178
  name = "h11"
1179
  version = "0.16.0"
@@ -1345,6 +1398,15 @@ wheels = [
1345
  { url = "https://files.pythonhosted.org/packages/1e/5e/d4e9f1a599fb8e573b7b87160658329fbf28d19eac2718f51fc3def3aa5a/idna-3.18-py3-none-any.whl", hash = "sha256:7f952cbe720b688055e3f87de14f5c3e5fdaa8bc3928985c4077ca689de849a2", size = 65455, upload-time = "2026-06-02T14:34:06.319Z" },
1346
  ]
1347
 
1348
  [[package]]
1349
  name = "inference"
1350
  version = "0.1.0"
@@ -1500,6 +1562,15 @@ wheels = [
1500
  { url = "https://files.pythonhosted.org/packages/b5/91/53255615acd2a1eaca307ede3c90eb550bae9c94581f8c00081b6b1c8f44/kiwisolver-1.5.0-graalpy312-graalpy250_312_native-win_amd64.whl", hash = "sha256:1f1489f769582498610e015a8ef2d36f28f505ab3096d0e16b4858a9ec214f57", size = 75987, upload-time = "2026-03-09T13:15:39.65Z" },
1501
  ]
1502
 
1503
  [[package]]
1504
  name = "lazy-loader"
1505
  version = "0.5"
@@ -1606,6 +1677,11 @@ hf = [
1606
  { name = "torch" },
1607
  { name = "transformers" },
1608
  ]
1609
 
1610
  [[package]]
1611
  name = "lxml"
@@ -1854,6 +1930,30 @@ wheels = [
1854
  { url = "https://files.pythonhosted.org/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl", hash = "sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8", size = 9979, upload-time = "2022-08-14T12:40:09.779Z" },
1855
  ]
1856
 
1857
  [[package]]
1858
  name = "more-itertools"
1859
  version = "11.1.0"
@@ -2754,17 +2854,17 @@ wheels = [
2754
 
2755
  [[package]]
2756
  name = "protobuf"
2757
- version = "7.35.1"
2758
  source = { registry = "https://pypi.org/simple" }
2759
- sdist = { url = "https://files.pythonhosted.org/packages/da/01/9ef0afd7999eb9badb3a768b4aedd78c86d4c65cfaf1958ab276199e76b4/protobuf-7.35.1.tar.gz", hash = "sha256:ce115a26fe0c39a2c29973d914d327e516a6455464489fe3cd1e51a1b354f81a", size = 458717, upload-time = "2026-06-11T21:55:40.257Z" }
2760
  wheels = [
2761
- { url = "https://files.pythonhosted.org/packages/10/03/8aeeb7458d22546bf64b5250ca1daeb5ff757d900e8e4a7476c6f0db843e/protobuf-7.35.1-cp310-abi3-macosx_10_9_universal2.whl", hash = "sha256:24f857477359a85c0c235261b8ba905fd51b2562f4a64ca1df5473f29850cbf6", size = 433226, upload-time = "2026-06-11T21:55:31.719Z" },
2762
- { url = "https://files.pythonhosted.org/packages/37/4b/dfb89eb0e652a1ff073c39a59fb5e3a83cfe9b57a2c83fa6d78270101767/protobuf-7.35.1-cp310-abi3-manylinux2014_aarch64.whl", hash = "sha256:11d6b0ec246892d85215b0a13ca6e0233cf5284b68f0ac02646427f4ff88a799", size = 328847, upload-time = "2026-06-11T21:55:34.035Z" },
2763
- { url = "https://files.pythonhosted.org/packages/0f/58/dc12f2cd484951524af6e3382c785869b9b3fb5e52ee95ae23add53ee8f9/protobuf-7.35.1-cp310-abi3-manylinux2014_s390x.whl", hash = "sha256:b73f9489a4b8b1c9cb1f8ed951c736392592edb24b9d6819f36d2e10b171d5b4", size = 344030, upload-time = "2026-06-11T21:55:34.941Z" },
2764
- { url = "https://files.pythonhosted.org/packages/e4/be/5b3cfe508bfab6761414ff944e3366eb13be4fd71efcd69450f89ba39f43/protobuf-7.35.1-cp310-abi3-manylinux2014_x86_64.whl", hash = "sha256:74758715c53d7158fb76caf4f0cfdacc5329a4b1bb994f865d6cf302d413a1c4", size = 327130, upload-time = "2026-06-11T21:55:35.921Z" },
2765
- { url = "https://files.pythonhosted.org/packages/d8/bc/6d6c7ba8709c85f8f2c390b2b118d6fb08a783676a572271851bf45a7d22/protobuf-7.35.1-cp310-abi3-win32.whl", hash = "sha256:353652e4efd0bca5b5fc2656abf8307ef351f0cf938c9eba09f0e09c20a25c30", size = 428945, upload-time = "2026-06-11T21:55:37.034Z" },
2766
- { url = "https://files.pythonhosted.org/packages/0a/19/8d0cb6f20a1ef7b18f1c8986ad5783f22f84cce39c6ce9a6e645ea55192e/protobuf-7.35.1-cp310-abi3-win_amd64.whl", hash = "sha256:230a75ddfc2de4806e56696ce9640c1cdfdb6543b7cfce98d42a4c0a0e7bdb87", size = 439996, upload-time = "2026-06-11T21:55:38.123Z" },
2767
- { url = "https://files.pythonhosted.org/packages/19/c7/5f7c636ec43e0c545e28d1f1db71990108306f7bdcb89f069ba97e428e7f/protobuf-7.35.1-py3-none-any.whl", hash = "sha256:4bc97768d8fe4ad6743c8a19403e314511ed9f6d13205b687e52421c023ac1b9", size = 171659, upload-time = "2026-06-11T21:55:39.155Z" },
2768
  ]
2769
 
2770
  [[package]]
@@ -3581,7 +3681,7 @@ dependencies = [
3581
 
3582
  [package.optional-dependencies]
3583
  lm-eval = [
3584
- { name = "lm-eval", extra = ["hf"] },
3585
  ]
3586
 
3587
  [package.metadata]
@@ -3590,7 +3690,7 @@ requires-dist = [
3590
  { name = "bitsandbytes", specifier = ">=0.43.0" },
3591
  { name = "datasets", specifier = ">=2.19.0" },
3592
  { name = "huggingface-hub", specifier = ">=0.22.0" },
3593
- { name = "lm-eval", extras = ["hf"], marker = "extra == 'lm-eval'", specifier = ">=0.4.9" },
3594
  { name = "pandas", specifier = ">=2.0.0" },
3595
  { name = "peft", specifier = ">=0.14.0" },
3596
  { name = "pyyaml", specifier = ">=6.0" },
@@ -3628,6 +3728,10 @@ finetune = [
3628
  lm-eval = [
3629
  { name = "slm-evals", extra = ["lm-eval"] },
3630
  ]
3631
 
3632
  [package.metadata]
3633
  requires-dist = [
@@ -3650,6 +3754,10 @@ finetune = [
3650
  { name = "peft", specifier = ">=0.14.0" },
3651
  ]
3652
  lm-eval = [{ name = "slm-evals", extras = ["lm-eval"], editable = "research/evals" }]
3653
 
3654
  [[package]]
3655
  name = "socksio"
@@ -3792,6 +3900,18 @@ wheels = [
3792
  { url = "https://files.pythonhosted.org/packages/a2/09/77d55d46fd61b4a135c444fc97158ef34a095e5681d0a6c10b75bf356191/sympy-1.14.0-py3-none-any.whl", hash = "sha256:e091cc3e99d2141a0ba2847328f5479b05d94a6635cb96148ccb3f34671bd8f5", size = 6299353, upload-time = "2025-04-27T18:04:59.103Z" },
3793
  ]
3794
 
3795
  [[package]]
3796
  name = "tabledata"
3797
  version = "1.3.5"
@@ -3867,6 +3987,15 @@ wheels = [
3867
  { url = "https://files.pythonhosted.org/packages/72/f4/0de46cfa12cdcbcd464cc59fde36912af405696f687e53a091fb432f694c/tokenizers-0.22.2-cp39-abi3-win_arm64.whl", hash = "sha256:9ce725d22864a1e965217204946f830c37876eee3b2ba6fc6255e8e903d5fcbc", size = 2612133, upload-time = "2026-01-05T10:45:17.232Z" },
3868
  ]
3869
 
3870
  [[package]]
3871
  name = "tomlkit"
3872
  version = "0.14.0"
@@ -4053,6 +4182,24 @@ wheels = [
4053
  { url = "https://files.pythonhosted.org/packages/3f/f9/2b3ff4e56e5fa7debfaf9eb135d0da96f3e9a1d5b27222223c7296336e5f/typer-0.25.1-py3-none-any.whl", hash = "sha256:75caa44ed46a03fb2dab8808753ffacdbfea88495e74c85a28c5eefcf5f39c89", size = 58409, upload-time = "2026-04-30T19:32:18.271Z" },
4054
  ]
4055
 
4056
  [[package]]
4057
  name = "typing-extensions"
4058
  version = "4.15.0"
@@ -4117,6 +4264,92 @@ wheels = [
4117
  { url = "https://files.pythonhosted.org/packages/88/fa/e1388bbcf24ef3274f45c0c1c7b501fd14971037c1b6ee23610553307497/uvicorn-0.49.0-py3-none-any.whl", hash = "sha256:ba3d14c3ee7e41c6c654c46c9eb489d33213cdd30aa1696eab1374337c13f68f", size = 71376, upload-time = "2026-06-03T22:01:29.037Z" },
4118
  ]
4119
 
4120
  [[package]]
4121
  name = "word2number"
4122
  version = "1.1"
 
404
  { url = "https://files.pythonhosted.org/packages/d8/ef/e7e485ce5e4ba3843a0a92feb767c7b6098fd6e65ce752918074d175ae71/brotlicffi-1.2.0.1-cp38-abi3-win_amd64.whl", hash = "sha256:da2e82a08e7778b8bc539d27ca03cdd684113e81394bfaaad8d0dfc6a17ddede", size = 379026, upload-time = "2026-03-05T19:54:04.322Z" },
405
  ]
406
 
407
+ [[package]]
408
+ name = "cbor2"
409
+ version = "6.1.2"
410
+ source = { registry = "https://pypi.org/simple" }
411
+ sdist = { url = "https://files.pythonhosted.org/packages/75/af/473c241e41c142ea06ebef8d1f660fa6ff928fb97210e7bec8ee5974f8cd/cbor2-6.1.2.tar.gz", hash = "sha256:6b43037a66947dee5af0abb1a4c3a13b3abac5a4a3f32f9771efbbcd030fd909", size = 86760, upload-time = "2026-06-02T19:01:29.333Z" }
412
+ wheels = [
413
+ { url = "https://files.pythonhosted.org/packages/5e/0c/a857b6ca032282b564cf25de18ad92fe0614e8b3fa3422eb10e32a873939/cbor2-6.1.2-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:92b158d3ff9d9dce70eeb09786a6e518e3cb0ecb927fd23e9a0f7fc4b175c01a", size = 409592, upload-time = "2026-06-02T19:00:44.556Z" },
414
+ { url = "https://files.pythonhosted.org/packages/29/db/e0518153b3228159d9373f3b5785d7ea2d68898e27ee1bce7d03f0b5f7aa/cbor2-6.1.2-cp312-cp312-manylinux_2_28_aarch64.whl", hash = "sha256:d29a11044b07048e19f39a87fe8fea7ea865eb0ace50dc4c29513d52d40e2ddf", size = 454598, upload-time = "2026-06-02T19:00:45.784Z" },
415
+ { url = "https://files.pythonhosted.org/packages/29/67/62127b22edc6011ba55b76a28ab7c2219a45d01871a8199532e0978b26d1/cbor2-6.1.2-cp312-cp312-manylinux_2_28_x86_64.whl", hash = "sha256:a106f174eda34d8937a621c7f3e6044586cb209170cdc8da0ffbea89d1d6e385", size = 467380, upload-time = "2026-06-02T19:00:47.196Z" },
416
+ { url = "https://files.pythonhosted.org/packages/7c/95/7992d8ec904c116ad547abb4960cc3fde695d5853c66596b1465d14d2f7b/cbor2-6.1.2-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:2ea16a25cc457a92879ff7a36cc50b587bddba09d8176bf1a94803eec5aa27eb", size = 521672, upload-time = "2026-06-02T19:00:48.656Z" },
417
+ { url = "https://files.pythonhosted.org/packages/cb/cf/80cc4be132a523f0c92fb4c71813577bb393abea9e27990ca74605e0e930/cbor2-6.1.2-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:2652a94224980d47f2a3866dd35b1afe532ecdfaf91f8cfcec39a026c457a844", size = 534402, upload-time = "2026-06-02T19:00:50.064Z" },
418
+ { url = "https://files.pythonhosted.org/packages/b1/ea/99e466d8bef61a0775a1d8538ae6c9d95f4533fadc01f8f7814cb7ab80ad/cbor2-6.1.2-cp312-cp312-win32.whl", hash = "sha256:618666292900487db4a5abcade3150105c9c9fdd22576e6ff297c9a72eef0c6a", size = 283225, upload-time = "2026-06-02T19:00:51.406Z" },
419
+ { url = "https://files.pythonhosted.org/packages/14/13/e6a677bdc499e43049006cb54fe605b0f7aef621402d31354cc42ef293c9/cbor2-6.1.2-cp312-cp312-win_amd64.whl", hash = "sha256:c61c0b2e2cee64497e6c62d1976bc212f62ac0cd2b5b903613610d79b8b06b60", size = 300844, upload-time = "2026-06-02T19:00:52.628Z" },
420
+ { url = "https://files.pythonhosted.org/packages/77/4a/08bd8461f8e2e1ce1de5ae2768f2b7ca39a090e3156c1ee0d9b5fd86e70d/cbor2-6.1.2-cp312-cp312-win_arm64.whl", hash = "sha256:c871e7266ddc545b258e6f8e5300396985dc485d7ccf8bb4777385782f302153", size = 289040, upload-time = "2026-06-02T19:00:53.971Z" },
421
+ { url = "https://files.pythonhosted.org/packages/2b/dc/bc045c8f36317e4e5f7a60d94b36833139909fc32e3a65f44bc61a36def0/cbor2-6.1.2-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:f1aa38c422d87ea61849b2a823b10b64053fb4da8763f19ac78ea9a69d682b2a", size = 408846, upload-time = "2026-06-02T19:00:55.476Z" },
422
+ { url = "https://files.pythonhosted.org/packages/2b/36/d66f5f0dd98ecbdcfc7da1fbd423f7b3782a27719f0062a560476f00b334/cbor2-6.1.2-cp313-cp313-manylinux_2_28_aarch64.whl", hash = "sha256:ff7d0bd8ff432832338a8d2430aee34f8a082342480ff537c0ba90e2b8ff7894", size = 454624, upload-time = "2026-06-02T19:00:56.744Z" },
423
+ { url = "https://files.pythonhosted.org/packages/38/6b/4884b9cf03db14dc5007825d5d1bf8678a75c49d4268d8e0c1c6e9580104/cbor2-6.1.2-cp313-cp313-manylinux_2_28_x86_64.whl", hash = "sha256:c1eedf3290d88a5f663bd8b4b8f0f0e2103d0594c293fa5f4e62e53100972309", size = 466585, upload-time = "2026-06-02T19:00:58.209Z" },
424
+ { url = "https://files.pythonhosted.org/packages/50/f6/36a15beb3915f56a79d6e9213c6d40c0f5cb90cd3462923f555d78068847/cbor2-6.1.2-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:3049b04bddf9a5a2d0e5bb25dccdaf4552fcaf607b404e249d4f78f010fcc7d0", size = 521678, upload-time = "2026-06-02T19:00:59.524Z" },
425
+ { url = "https://files.pythonhosted.org/packages/c6/3f/e899313371ebeb7a191d751de97ccd8242abc24bbc9d8e2c58e04475cfb0/cbor2-6.1.2-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:96eb687a62040401668f06a85de8f47361ef44574de1493899e0ec678109fc04", size = 534044, upload-time = "2026-06-02T19:01:00.875Z" },
426
+ { url = "https://files.pythonhosted.org/packages/1e/5e/1a872acdeb1ab9a884ec3460f73a43e02154dc20d8ccb627bbd60f4c0ea1/cbor2-6.1.2-cp313-cp313-win32.whl", hash = "sha256:03440b505882280023db1fedcee6844804e9968bb50f9eb4ff12aaf27777fcfe", size = 282328, upload-time = "2026-06-02T19:01:02.347Z" },
427
+ { url = "https://files.pythonhosted.org/packages/70/79/29721bc15d38889e7bec214ede2346ee15970bedcc5e6ce1fa30f21e9a4e/cbor2-6.1.2-cp313-cp313-win_amd64.whl", hash = "sha256:d2c8da2c0f821827dcc9eb59a5c9351791a8aa3b389a2ea7ca64c4f97bcb94cf", size = 300313, upload-time = "2026-06-02T19:01:03.69Z" },
428
+ { url = "https://files.pythonhosted.org/packages/07/98/a13b424fb2f14fe332b57f71f479953b2f291a051f797d42ddab9fcd2027/cbor2-6.1.2-cp313-cp313-win_arm64.whl", hash = "sha256:8e1478d3b980ddfcaf56e27cecbfe13057e0f67d5e8240fe8a398815acb9c4bf", size = 288725, upload-time = "2026-06-02T19:01:04.933Z" },
429
+ { url = "https://files.pythonhosted.org/packages/62/72/949bdc7422acd868a2355ae032561a104973fb5de284b36a237b85780dc9/cbor2-6.1.2-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:b0b65314a0b18c47651e17792447171a858dd77e3f161c451ad850d63f8718a9", size = 407436, upload-time = "2026-06-02T19:01:06.259Z" },
430
+ { url = "https://files.pythonhosted.org/packages/2f/bd/5969f9263102d1c15aa370b39802e4a87b1d1703fdb51588daf38b5fbe7e/cbor2-6.1.2-cp314-cp314-manylinux_2_28_aarch64.whl", hash = "sha256:8904deb2849bae40cea970e114398a19da371e1048ae1409e64f167a1205daf6", size = 453507, upload-time = "2026-06-02T19:01:07.795Z" },
431
+ { url = "https://files.pythonhosted.org/packages/93/a5/227b785692a8374e3dbdf1fe76d1a9af48239855abd68a4111a1458fd81b/cbor2-6.1.2-cp314-cp314-manylinux_2_28_x86_64.whl", hash = "sha256:b29d58d8ce00535354d873df170a3e9f0f0a02af65d12102d2552e2129c65dc8", size = 464875, upload-time = "2026-06-02T19:01:09.222Z" },
432
+ { url = "https://files.pythonhosted.org/packages/6d/48/a06527c3fbed4c32816abba4540e432fe9cd7e739a37fef0f205bd0f1e44/cbor2-6.1.2-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:27be1cc0abc42f154a48a315c92feb2bfb50397e51c70860460438ea172198a5", size = 519940, upload-time = "2026-06-02T19:01:10.795Z" },
433
+ { url = "https://files.pythonhosted.org/packages/31/1b/0e3f0dac7140d4b94ffbcef765fa4cce0caa1d942060101149de998fa7be/cbor2-6.1.2-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:b8d87fb8a33ff1971cb01511e74b044767cbba1ba536d3dc0b0c48f0d1b62237", size = 532612, upload-time = "2026-06-02T19:01:12.363Z" },
434
+ { url = "https://files.pythonhosted.org/packages/35/2f/5af245e7667b65c6e4a714bb5d89c84de5573b857eba9137533d54bc2e4f/cbor2-6.1.2-cp314-cp314-win32.whl", hash = "sha256:72ba0ea913ca1a8d916867f1b7d414f140982d2873e5d92f8f51de437e08979e", size = 285886, upload-time = "2026-06-02T19:01:13.658Z" },
435
+ { url = "https://files.pythonhosted.org/packages/d9/0a/6303f3e19730450c5a82b97cd2c0ed54855f9108502041305b4c641116cd/cbor2-6.1.2-cp314-cp314-win_amd64.whl", hash = "sha256:c02b7d94fe9914798a346a2f089f0f7f85be71d120d40080916d131fa0bd0442", size = 308808, upload-time = "2026-06-02T19:01:14.944Z" },
436
+ { url = "https://files.pythonhosted.org/packages/cd/61/48f9c5545223dad9d2ea2061a76da739b4047a461297b621fc80ce0f65c0/cbor2-6.1.2-cp314-cp314-win_arm64.whl", hash = "sha256:2af1309865000c401755fd4fdd5550f74ac34c3f79eb7db15f3956714769a5a9", size = 299522, upload-time = "2026-06-02T19:01:16.393Z" },
437
+ { url = "https://files.pythonhosted.org/packages/b2/2b/efcc6578b4e6142fb8ec9212c0dee5030345db2092f26aa960236067e717/cbor2-6.1.2-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:9f26e08dd78ee77d103065543a65cfb838948fa8735180ad4d81d939950a1420", size = 402925, upload-time = "2026-06-02T19:01:17.979Z" },
438
+ { url = "https://files.pythonhosted.org/packages/58/f6/58c86aa6246b3e7de473d8ff79ac8cc986e95cafe208899a70d6916012d7/cbor2-6.1.2-cp314-cp314t-manylinux_2_28_aarch64.whl", hash = "sha256:596e238f24bf9ede11a1ad08d2115fe78105ed6dda42ce1dd35872e7e91974fd", size = 446201, upload-time = "2026-06-02T19:01:19.481Z" },
+ { url = "https://files.pythonhosted.org/packages/c8/12/3b90820583e9860e35cb5e91f3b2cd2ab1bbdf1c57fc63aa572952f5f75f/cbor2-6.1.2-cp314-cp314t-manylinux_2_28_x86_64.whl", hash = "sha256:08a62f69fe0f0ee1428d901423853b56bb5c775430f798401f8fac4b9affdecc", size = 460193, upload-time = "2026-06-02T19:01:20.876Z" },
+ { url = "https://files.pythonhosted.org/packages/ed/88/c1e841ffb39a8e7163d7d432f7ea0e59b812c5134a449c75b6b8eb8aad08/cbor2-6.1.2-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:6ca0080e4d8ab0d67c0518ac995d03151a1274b5c295c9e619fb6057c91ae49e", size = 511446, upload-time = "2026-06-02T19:01:22.18Z" },
+ { url = "https://files.pythonhosted.org/packages/db/0a/f1ede587a388f127b9fc3d8ecb2f5d948654fed9fc7698f8b05fd90986bf/cbor2-6.1.2-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:b44eb2f3ea1c8d9cb3e39c345204ec4d9489f8149b78eb5e058b13b14a8c7b07", size = 527683, upload-time = "2026-06-02T19:01:23.639Z" },
+ { url = "https://files.pythonhosted.org/packages/1a/89/e3210ea45855a8d6173821f712a71a90d23dea0c134c4017c6f666a04fdf/cbor2-6.1.2-cp314-cp314t-win32.whl", hash = "sha256:f93179b4b1ba958b5c37b56969b8f07b4fcf44a83319f47559c59f28a1c564a4", size = 280419, upload-time = "2026-06-02T19:01:25.365Z" },
+ { url = "https://files.pythonhosted.org/packages/96/84/b555de26cc01108a72ed1df8eb7ca1d63495a3727045f0f93318dc5f99a8/cbor2-6.1.2-cp314-cp314t-win_amd64.whl", hash = "sha256:3c6c3d6598c268abf7068ae75b23b19f708e7a4aa294341b356deb65cb2664f1", size = 302514, upload-time = "2026-06-02T19:01:26.782Z" },
+ { url = "https://files.pythonhosted.org/packages/d4/6e/5556939414c0d2bffed7c7a53cf2b32181b55a795944d19835d513a7bc88/cbor2-6.1.2-cp314-cp314t-win_arm64.whl", hash = "sha256:8c2202fd1906f978bff3f97b21351815753dd9a8fcf4612a5113b6b257089059", size = 290058, upload-time = "2026-06-02T19:01:28.077Z" },
+ ]
+
  [[package]]
  name = "certifi"
  version = "2026.5.20"
 
  { url = "https://files.pythonhosted.org/packages/28/27/3d6dcadc8a3214d8522c1e7f6a19554e33659be44546d44a2f7572ac7d2a/groovy-0.1.2-py3-none-any.whl", hash = "sha256:7f7975bab18c729a257a8b1ae9dcd70b7cafb1720481beae47719af57c35fa64", size = 14090, upload-time = "2025-02-28T20:24:55.152Z" },
  ]
 
+ [[package]]
+ name = "grpclib"
+ version = "0.4.9"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "h2" },
+ { name = "multidict" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/5b/28/5a2c299ec82a876a252c5919aa895a6f1d1d35c96417c5ce4a4660dc3a80/grpclib-0.4.9.tar.gz", hash = "sha256:cc589c330fa81004c6400a52a566407574498cb5b055fa927013361e21466c46", size = 84798, upload-time = "2025-12-14T22:23:14.349Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/5c/90/b0cbbd9efcc82816c58f31a34963071aa19fb792a212a5d9caf8e0fc3097/grpclib-0.4.9-py3-none-any.whl", hash = "sha256:7762ec1c8ed94dfad597475152dd35cbd11aecaaca2f243e29702435ca24cf0e", size = 77063, upload-time = "2025-12-14T22:23:13.224Z" },
+ ]
+
  [[package]]
  name = "h11"
  version = "0.16.0"
 
  { url = "https://files.pythonhosted.org/packages/1e/5e/d4e9f1a599fb8e573b7b87160658329fbf28d19eac2718f51fc3def3aa5a/idna-3.18-py3-none-any.whl", hash = "sha256:7f952cbe720b688055e3f87de14f5c3e5fdaa8bc3928985c4077ca689de849a2", size = 65455, upload-time = "2026-06-02T14:34:06.319Z" },
  ]
 
+ [[package]]
+ name = "immutabledict"
+ version = "4.3.1"
+ source = { registry = "https://pypi.org/simple" }
+ sdist = { url = "https://files.pythonhosted.org/packages/1d/e6/718471048fea0366c3e3d1df3acfd914ca66d571cdffcf6d37bbcd725708/immutabledict-4.3.1.tar.gz", hash = "sha256:f844a669106cfdc73f47b1a9da003782fb17dc955a54c80972e0d93d1c63c514", size = 7806, upload-time = "2026-02-15T10:32:34.668Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/a3/ce/f9018bf69ae91b273b6391a095e7c93fa5e1617f25b6ba81ad4b20c9df10/immutabledict-4.3.1-py3-none-any.whl", hash = "sha256:c9facdc0ff30fdb8e35bd16532026cac472a549e182c94fa201b51b25e4bf7bf", size = 5000, upload-time = "2026-02-15T10:32:33.672Z" },
+ ]
+
  [[package]]
  name = "inference"
  version = "0.1.0"
 
  { url = "https://files.pythonhosted.org/packages/b5/91/53255615acd2a1eaca307ede3c90eb550bae9c94581f8c00081b6b1c8f44/kiwisolver-1.5.0-graalpy312-graalpy250_312_native-win_amd64.whl", hash = "sha256:1f1489f769582498610e015a8ef2d36f28f505ab3096d0e16b4858a9ec214f57", size = 75987, upload-time = "2026-03-09T13:15:39.65Z" },
  ]
 
+ [[package]]
+ name = "langdetect"
+ version = "1.0.9"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "six" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/0e/72/a3add0e4eec4eb9e2569554f7c70f4a3c27712f40e3284d483e88094cc0e/langdetect-1.0.9.tar.gz", hash = "sha256:cbc1fef89f8d062739774bd51eda3da3274006b3661d199c2655f6b3f6d605a0", size = 981474, upload-time = "2021-05-07T07:54:13.562Z" }
+
  [[package]]
  name = "lazy-loader"
  version = "0.5"
 
  { name = "torch" },
  { name = "transformers" },
  ]
+ ifeval = [
+ { name = "immutabledict" },
+ { name = "langdetect" },
+ { name = "nltk" },
+ ]
 
  [[package]]
  name = "lxml"
 
  { url = "https://files.pythonhosted.org/packages/b3/38/89ba8ad64ae25be8de66a6d463314cf1eb366222074cfda9ee839c56a4b4/mdurl-0.1.2-py3-none-any.whl", hash = "sha256:84008a41e51615a49fc9966191ff91509e3c40b939176e643fd50a5c2196b8f8", size = 9979, upload-time = "2022-08-14T12:40:09.779Z" },
  ]
 
+ [[package]]
+ name = "modal"
+ version = "1.5.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "aiohttp" },
+ { name = "cbor2" },
+ { name = "certifi" },
+ { name = "click" },
+ { name = "grpclib" },
+ { name = "protobuf" },
+ { name = "rich" },
+ { name = "synchronicity" },
+ { name = "toml" },
+ { name = "types-certifi" },
+ { name = "types-toml" },
+ { name = "typing-extensions" },
+ { name = "watchfiles" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/59/f9/87425e60db2a8597b248417772b409c49ca3a05ff6b1282a21cd7d856f09/modal-1.5.0.tar.gz", hash = "sha256:15033cf84f5f4f9f8a3dcf47a768cfcca36d1ad38ab7b3459fd3cbc29aa84a77", size = 771722, upload-time = "2026-06-09T22:37:27.5Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/5c/71/85e476e7d32c0a648d5aa97c4335ac02357d059c2bb734cf175b08446597/modal-1.5.0-py3-none-any.whl", hash = "sha256:9c5687eff775d1372bd70b87e43499e40777a1de160f23786c00807bf342fcb6", size = 882122, upload-time = "2026-06-09T22:37:24.608Z" },
+ ]
+
  [[package]]
  name = "more-itertools"
  version = "11.1.0"
 
 
  [[package]]
  name = "protobuf"
+ version = "6.33.6"
  source = { registry = "https://pypi.org/simple" }
+ sdist = { url = "https://files.pythonhosted.org/packages/66/70/e908e9c5e52ef7c3a6c7902c9dfbb34c7e29c25d2f81ade3856445fd5c94/protobuf-6.33.6.tar.gz", hash = "sha256:a6768d25248312c297558af96a9f9c929e8c4cee0659cb07e780731095f38135", size = 444531, upload-time = "2026-03-18T19:05:00.988Z" }
  wheels = [
+ { url = "https://files.pythonhosted.org/packages/fc/9f/2f509339e89cfa6f6a4c4ff50438db9ca488dec341f7e454adad60150b00/protobuf-6.33.6-cp310-abi3-win32.whl", hash = "sha256:7d29d9b65f8afef196f8334e80d6bc1d5d4adedb449971fefd3723824e6e77d3", size = 425739, upload-time = "2026-03-18T19:04:48.373Z" },
+ { url = "https://files.pythonhosted.org/packages/76/5d/683efcd4798e0030c1bab27374fd13a89f7c2515fb1f3123efdfaa5eab57/protobuf-6.33.6-cp310-abi3-win_amd64.whl", hash = "sha256:0cd27b587afca21b7cfa59a74dcbd48a50f0a6400cfb59391340ad729d91d326", size = 437089, upload-time = "2026-03-18T19:04:50.381Z" },
+ { url = "https://files.pythonhosted.org/packages/5c/01/a3c3ed5cd186f39e7880f8303cc51385a198a81469d53d0fdecf1f64d929/protobuf-6.33.6-cp39-abi3-macosx_10_9_universal2.whl", hash = "sha256:9720e6961b251bde64edfdab7d500725a2af5280f3f4c87e57c0208376aa8c3a", size = 427737, upload-time = "2026-03-18T19:04:51.866Z" },
+ { url = "https://files.pythonhosted.org/packages/ee/90/b3c01fdec7d2f627b3a6884243ba328c1217ed2d978def5c12dc50d328a3/protobuf-6.33.6-cp39-abi3-manylinux2014_aarch64.whl", hash = "sha256:e2afbae9b8e1825e3529f88d514754e094278bb95eadc0e199751cdd9a2e82a2", size = 324610, upload-time = "2026-03-18T19:04:53.096Z" },
+ { url = "https://files.pythonhosted.org/packages/9b/ca/25afc144934014700c52e05103c2421997482d561f3101ff352e1292fb81/protobuf-6.33.6-cp39-abi3-manylinux2014_s390x.whl", hash = "sha256:c96c37eec15086b79762ed265d59ab204dabc53056e3443e702d2681f4b39ce3", size = 339381, upload-time = "2026-03-18T19:04:54.616Z" },
+ { url = "https://files.pythonhosted.org/packages/16/92/d1e32e3e0d894fe00b15ce28ad4944ab692713f2e7f0a99787405e43533a/protobuf-6.33.6-cp39-abi3-manylinux2014_x86_64.whl", hash = "sha256:e9db7e292e0ab79dd108d7f1a94fe31601ce1ee3f7b79e0692043423020b0593", size = 323436, upload-time = "2026-03-18T19:04:55.768Z" },
+ { url = "https://files.pythonhosted.org/packages/c4/72/02445137af02769918a93807b2b7890047c32bfb9f90371cbc12688819eb/protobuf-6.33.6-py3-none-any.whl", hash = "sha256:77179e006c476e69bf8e8ce866640091ec42e1beb80b213c3900006ecfba6901", size = 170656, upload-time = "2026-03-18T19:04:59.826Z" },
  ]
 
  [[package]]
 
 
  [package.optional-dependencies]
  lm-eval = [
+ { name = "lm-eval", extra = ["hf", "ifeval"] },
  ]
 
  [package.metadata]
 
  { name = "bitsandbytes", specifier = ">=0.43.0" },
  { name = "datasets", specifier = ">=2.19.0" },
  { name = "huggingface-hub", specifier = ">=0.22.0" },
+ { name = "lm-eval", extras = ["hf", "ifeval"], marker = "extra == 'lm-eval'", specifier = ">=0.4.9" },
  { name = "pandas", specifier = ">=2.0.0" },
  { name = "peft", specifier = ">=0.14.0" },
  { name = "pyyaml", specifier = ">=6.0" },
 
  lm-eval = [
  { name = "slm-evals", extra = ["lm-eval"] },
  ]
+ modal = [
+ { name = "modal" },
+ { name = "pyyaml" },
+ ]
 
  [package.metadata]
  requires-dist = [
 
  { name = "peft", specifier = ">=0.14.0" },
  ]
  lm-eval = [{ name = "slm-evals", extras = ["lm-eval"], editable = "research/evals" }]
+ modal = [
+ { name = "modal", specifier = ">=0.73.0" },
+ { name = "pyyaml", specifier = ">=6.0" },
+ ]
 
  [[package]]
  name = "socksio"
 
  { url = "https://files.pythonhosted.org/packages/a2/09/77d55d46fd61b4a135c444fc97158ef34a095e5681d0a6c10b75bf356191/sympy-1.14.0-py3-none-any.whl", hash = "sha256:e091cc3e99d2141a0ba2847328f5479b05d94a6635cb96148ccb3f34671bd8f5", size = 6299353, upload-time = "2025-04-27T18:04:59.103Z" },
  ]
 
+ [[package]]
+ name = "synchronicity"
+ version = "0.12.3"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "typing-extensions" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/ec/d5/e96e6082790c92480380f28aa53e111844cdac7b0f75846f4772cb535a43/synchronicity-0.12.3.tar.gz", hash = "sha256:0d4228b85eaf2805f23b4615b2039a9d24ea811646e2d9f8d0c033094eb85841", size = 60261, upload-time = "2026-05-28T12:33:50.206Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/57/ea/531a6ea751cbd989da386144810b1b8f529b0aae8c1a9beda8b40966c9c2/synchronicity-0.12.3-py3-none-any.whl", hash = "sha256:e476818cd14102136f41622c619de548f0000c024485fc18521c8fe908ea7574", size = 40982, upload-time = "2026-05-28T12:33:49.125Z" },
+ ]
+
  [[package]]
  name = "tabledata"
  version = "1.3.5"
 
  { url = "https://files.pythonhosted.org/packages/72/f4/0de46cfa12cdcbcd464cc59fde36912af405696f687e53a091fb432f694c/tokenizers-0.22.2-cp39-abi3-win_arm64.whl", hash = "sha256:9ce725d22864a1e965217204946f830c37876eee3b2ba6fc6255e8e903d5fcbc", size = 2612133, upload-time = "2026-01-05T10:45:17.232Z" },
  ]
 
+ [[package]]
+ name = "toml"
+ version = "0.10.2"
+ source = { registry = "https://pypi.org/simple" }
+ sdist = { url = "https://files.pythonhosted.org/packages/be/ba/1f744cdc819428fc6b5084ec34d9b30660f6f9daaf70eead706e3203ec3c/toml-0.10.2.tar.gz", hash = "sha256:b3bda1d108d5dd99f4a20d24d9c348e91c4db7ab1b749200bded2f839ccbe68f", size = 22253, upload-time = "2020-11-01T01:40:22.204Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/44/6f/7120676b6d73228c96e17f1f794d8ab046fc910d781c8d151120c3f1569e/toml-0.10.2-py2.py3-none-any.whl", hash = "sha256:806143ae5bfb6a3c6e736a764057db0e6a0e05e338b5630894a5f779cabb4f9b", size = 16588, upload-time = "2020-11-01T01:40:20.672Z" },
+ ]
+
  [[package]]
  name = "tomlkit"
  version = "0.14.0"
 
  { url = "https://files.pythonhosted.org/packages/3f/f9/2b3ff4e56e5fa7debfaf9eb135d0da96f3e9a1d5b27222223c7296336e5f/typer-0.25.1-py3-none-any.whl", hash = "sha256:75caa44ed46a03fb2dab8808753ffacdbfea88495e74c85a28c5eefcf5f39c89", size = 58409, upload-time = "2026-04-30T19:32:18.271Z" },
  ]
 
+ [[package]]
+ name = "types-certifi"
+ version = "2021.10.8.3"
+ source = { registry = "https://pypi.org/simple" }
+ sdist = { url = "https://files.pythonhosted.org/packages/52/68/943c3aeaf14624712a0357c4a67814dba5cea36d194f5c764dad7959a00c/types-certifi-2021.10.8.3.tar.gz", hash = "sha256:72cf7798d165bc0b76e1c10dd1ea3097c7063c42c21d664523b928e88b554a4f", size = 2095, upload-time = "2022-06-09T15:19:05.244Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/b5/63/2463d89481e811f007b0e1cd0a91e52e141b47f9de724d20db7b861dcfec/types_certifi-2021.10.8.3-py3-none-any.whl", hash = "sha256:b2d1e325e69f71f7c78e5943d410e650b4707bb0ef32e4ddf3da37f54176e88a", size = 2136, upload-time = "2022-06-09T15:19:03.127Z" },
+ ]
+
+ [[package]]
+ name = "types-toml"
+ version = "0.10.8.20260518"
+ source = { registry = "https://pypi.org/simple" }
+ sdist = { url = "https://files.pythonhosted.org/packages/4b/11/6ece999e91f2ccb848ab4420f3f4816e78ac0541f739e6864affdaaa5737/types_toml-0.10.8.20260518.tar.gz", hash = "sha256:80e10facd24fdeda9d5c672187d72be3ac284843788d67f5aae59e3e016db6fe", size = 9419, upload-time = "2026-05-18T06:02:16.719Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/91/25/489751806bf5c95e4007f8e17409199c54d31e49ffbea07c5729b1286c8e/types_toml-0.10.8.20260518-py3-none-any.whl", hash = "sha256:0e564ab05f6fde62a315b3b5a9b6624fda569399795d30a37e64705a70459303", size = 9669, upload-time = "2026-05-18T06:02:15.86Z" },
+ ]
+
  [[package]]
  name = "typing-extensions"
  version = "4.15.0"
 
  { url = "https://files.pythonhosted.org/packages/88/fa/e1388bbcf24ef3274f45c0c1c7b501fd14971037c1b6ee23610553307497/uvicorn-0.49.0-py3-none-any.whl", hash = "sha256:ba3d14c3ee7e41c6c654c46c9eb489d33213cdd30aa1696eab1374337c13f68f", size = 71376, upload-time = "2026-06-03T22:01:29.037Z" },
  ]
 
+ [[package]]
+ name = "watchfiles"
+ version = "1.2.0"
+ source = { registry = "https://pypi.org/simple" }
+ dependencies = [
+ { name = "anyio" },
+ ]
+ sdist = { url = "https://files.pythonhosted.org/packages/cd/41/5e1a4bb12aac5f1493fa1bdc11154eca3b258ca4eba65d39c473fe19d8e9/watchfiles-1.2.0.tar.gz", hash = "sha256:c995fba777f1ea992f090f9236e9284cf7a5d1a0130dd5a3d82c598cacd76838", size = 108252, upload-time = "2026-05-18T04:32:04.251Z" }
+ wheels = [
+ { url = "https://files.pythonhosted.org/packages/b8/2f/e42c992d2afda3108ea1c02acecc991b9f31d05c14adc2a7cee9ee211fc4/watchfiles-1.2.0-cp312-cp312-macosx_10_12_x86_64.whl", hash = "sha256:bc13eb17538be00c874699dc0abe4ee2bc8d50bb1166a6b9e175ef3fd7eb8f26", size = 400115, upload-time = "2026-05-18T04:32:02.06Z" },
+ { url = "https://files.pythonhosted.org/packages/5f/8f/6af2ea19065c91d8b0ea3516fdfc8c0d349f407e8e9fbf4e5a17360de8ad/watchfiles-1.2.0-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:2d95ddc1eb6914154253d239089900813f6a767e174b8e6a50e7fdacb7e4236c", size = 393659, upload-time = "2026-05-18T04:30:50.951Z" },
+ { url = "https://files.pythonhosted.org/packages/13/01/b32a967c56fb3e3e5be3db52c3d3b87fa4513aa367d8ed1ad96d42952e5f/watchfiles-1.2.0-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8f70d8b291ef6e88d19b1f297a6905ddb978888d9272b0d05e6f53309856bcfc", size = 453207, upload-time = "2026-05-18T04:31:04.231Z" },
+ { url = "https://files.pythonhosted.org/packages/04/98/97557a812180338cb1abd32e1cffcc4588f59b5f23e0cb006b2ba95ba64a/watchfiles-1.2.0-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:56d8641cf834c2836922899105bd3ce3d0dfc69291d52edf0b4d0436829b34c0", size = 459273, upload-time = "2026-05-18T04:31:50.377Z" },
+ { url = "https://files.pythonhosted.org/packages/e8/a8/b4b08dcb7653b8087c6586f7ce649505900e866bbcfe40dc9587af02e686/watchfiles-1.2.0-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:2581a94056e55d7d0a31a823ea92bf73749c489ca2285bfdc0fbe6b2bb49d50c", size = 489927, upload-time = "2026-05-18T04:31:42.485Z" },
+ { url = "https://files.pythonhosted.org/packages/50/94/3dceea03545d2e5ddfd839f0ddd5e1cecbf1697b5a428d5ba11cef6af95d/watchfiles-1.2.0-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:41bc1199f7523b3f82843c88cbb979180c949caef0342cf90968f178e5d49b01", size = 570476, upload-time = "2026-05-18T04:31:03.071Z" },
+ { url = "https://files.pythonhosted.org/packages/cc/f2/d39a5450c3532092b91f81d274360e613c2371bc874a89c7a1a3c5e8d138/watchfiles-1.2.0-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:7571e4464cb6e434958f867f7f730b8ab0b75e3f8e5eac0499168486ab3c33a8", size = 465650, upload-time = "2026-05-18T04:30:12.701Z" },
+ { url = "https://files.pythonhosted.org/packages/22/24/ed72f68cbc1333ca9b9f2200aa048bb6658ae41709bc1caad4310f4bdffd/watchfiles-1.2.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e53a384f76b631c3ae5334ce6a52f0baa3a911eb94a4eac7f160079868b716d5", size = 456398, upload-time = "2026-05-18T04:30:13.784Z" },
+ { url = "https://files.pythonhosted.org/packages/0d/64/982ef4a4e5bab5b6e5b6becc8cd5e732f6130a78b855f0abec6439a9a135/watchfiles-1.2.0-cp312-cp312-manylinux_2_31_riscv64.whl", hash = "sha256:d20029a60a71a052a24c4db7673bc4de39ab89adbaccbfb5d67987c5d73f424d", size = 465140, upload-time = "2026-05-18T04:31:52.111Z" },
+ { url = "https://files.pythonhosted.org/packages/a0/0c/95282abf4ed680b6096010bcfc30c5fa7a041fc5aa5a2ad17a2cc6c75bba/watchfiles-1.2.0-cp312-cp312-musllinux_1_1_aarch64.whl", hash = "sha256:2cb93af48550faf1cea04c303107c8b75833de7013e57ce27d3b8d21d8d0f58c", size = 630259, upload-time = "2026-05-18T04:31:25.676Z" },
+ { url = "https://files.pythonhosted.org/packages/30/45/607c1de1530c4bdcf2cf1d1ecc2505ddba5d96bd43ba9f2b0e79876f850f/watchfiles-1.2.0-cp312-cp312-musllinux_1_1_x86_64.whl", hash = "sha256:2995c176de7692b86a2e4c58d9ec718f753150a979cb4a754e2b4ffa38e70906", size = 659859, upload-time = "2026-05-18T04:30:24.333Z" },
+ { url = "https://files.pythonhosted.org/packages/fa/08/d9e2e0f9e8e6791d33aefc694ad7eefa7f901f63caff84a81ded38692f9c/watchfiles-1.2.0-cp312-cp312-win32.whl", hash = "sha256:7a2cffd17d27d2ecbb310c2b1d8174f222a5495b1a721894afa88ec11e25b898", size = 275480, upload-time = "2026-05-18T04:30:31.307Z" },
+ { url = "https://files.pythonhosted.org/packages/1c/e6/9d42569c0102645cc8cea5d8c7d8a1e9d4ada2cb7f05f75e554b8aa2202a/watchfiles-1.2.0-cp312-cp312-win_amd64.whl", hash = "sha256:f155b3a1b2a5fc89cdc70d47ee5d54e3b75e88efa34982028a35daef9ba00379", size = 288718, upload-time = "2026-05-18T04:32:10.745Z" },
+ { url = "https://files.pythonhosted.org/packages/0a/26/88e0dc6ee3898169d7fa22bb6a69cabf2502d2ee25cb8c876d1262d204f8/watchfiles-1.2.0-cp312-cp312-win_arm64.whl", hash = "sha256:8fa585ede612ee9f9e91b18bebf9ba11b9ae29a4e3a0d0cf6fca3e382133f0d5", size = 281026, upload-time = "2026-05-18T04:30:22.23Z" },
+ { url = "https://files.pythonhosted.org/packages/d1/4d/70a7feced9f87e2ff26dba42667290f41694fc64646c67261fbb8cab5d5c/watchfiles-1.2.0-cp313-cp313-macosx_10_12_x86_64.whl", hash = "sha256:01ea8d66f0693b9b60a6541c8d10263091ca9a9060d242f3c1f3143f9aad2c98", size = 399730, upload-time = "2026-05-18T04:31:38.162Z" },
+ { url = "https://files.pythonhosted.org/packages/31/3a/0da302f2307aee316922806ebd5726c542cbd787c938271cf14a074c7daf/watchfiles-1.2.0-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:7ba0480b9a74af058f43b337e937a451e109295c420916d68ad24e3dc02f5e44", size = 392842, upload-time = "2026-05-18T04:30:27.051Z" },
+ { url = "https://files.pythonhosted.org/packages/db/ef/d5bdb705c224dbc256aa0c1ec47bf4e61ec52558f2afb44a71a1fe4d7015/watchfiles-1.2.0-cp313-cp313-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4f34e26a19f91f710c08e0183429f0d1d15df734e6bc78c31e77b9ea9c433658", size = 452989, upload-time = "2026-05-18T04:31:11.945Z" },
+ { url = "https://files.pythonhosted.org/packages/71/29/5495f2c1661949ef7a35e4d71111d129cfe7606414a26887a919d0a55406/watchfiles-1.2.0-cp313-cp313-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:b4e77f6a55f858504069abd35d336a637555c09bca453dde1ee1e5ada8a6a1fb", size = 458978, upload-time = "2026-05-18T04:30:52.606Z" },
+ { url = "https://files.pythonhosted.org/packages/d5/8c/7f9c07c433811c2fffd93e13fdfb7135de9aab5f2ae41be08960fa0047dc/watchfiles-1.2.0-cp313-cp313-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0cb4d80e212f116474a545c21c912b445f16bb0cef9e6a73a498164223e14e2f", size = 490248, upload-time = "2026-05-18T04:31:36.003Z" },
+ { url = "https://files.pythonhosted.org/packages/3c/11/d93632febc52fbc21be90231bb7c17fd5387f46c9076fd40a5f9c2ae6910/watchfiles-1.2.0-cp313-cp313-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b974946a10af379d425e2eef5b62f5c6ebeaccf91d45eaad6f5b27ecd4f91aa0", size = 571847, upload-time = "2026-05-18T04:31:10.862Z" },
+ { url = "https://files.pythonhosted.org/packages/55/b4/383173e73aabb07ad1d9c7aa859d95437ac46a6d6a1e11005facda0c9d19/watchfiles-1.2.0-cp313-cp313-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:86bc13c25a8d1fcd70b51d0ce7c9b65e90de5666fcbfd3e34957cc73ee19aeb5", size = 465974, upload-time = "2026-05-18T04:30:17.006Z" },
+ { url = "https://files.pythonhosted.org/packages/a7/6c/89b1a230a78f57c52dd8893adb1f92f94411721b6ec12596c56d98c74356/watchfiles-1.2.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ca148d73dea36c9763aaa351e4d7a51780ec1584217c45276f4fe8239c768b71", size = 454782, upload-time = "2026-05-18T04:30:35.656Z" },
+ { url = "https://files.pythonhosted.org/packages/24/62/1732118367cfff0a9fce3bf62ff4bfded09ef5df21d9d446b858b3f70a96/watchfiles-1.2.0-cp313-cp313-manylinux_2_31_riscv64.whl", hash = "sha256:c525543d91961c6955b2636b308569e84a1d1c5f5f2932041ab9ef46422f43e3", size = 465182, upload-time = "2026-05-18T04:30:20.846Z" },
+ { url = "https://files.pythonhosted.org/packages/28/96/716f7e5f51339bf22963f3345f9f27d7f3b30e2eadc597e257c881dd3c53/watchfiles-1.2.0-cp313-cp313-musllinux_1_1_aarch64.whl", hash = "sha256:a204794696ffb8f9b10fba6f7cb5216d42f3b2b71860ccac6b6e42f5f10973b0", size = 629841, upload-time = "2026-05-18T04:31:05.397Z" },
+ { url = "https://files.pythonhosted.org/packages/4c/fe/c40783950fd771ccf66ab3ec2722d188a9af1c7f96c6e811f36e40c6e03f/watchfiles-1.2.0-cp313-cp313-musllinux_1_1_x86_64.whl", hash = "sha256:10d86db20695afe7997ac9e1717637d6714a8d0220458c33f3d2061f54cec427", size = 658028, upload-time = "2026-05-18T04:31:48.22Z" },
+ { url = "https://files.pythonhosted.org/packages/71/72/4508db1856d1d87fcbb3b63f4839bab1b5682cb0e8d224d122263c09654a/watchfiles-1.2.0-cp313-cp313-win32.whl", hash = "sha256:eb283ee99e21ad6443c8cdb06ac5b34b1308c329cbdf03fa02b445363714c799", size = 275183, upload-time = "2026-05-18T04:30:59.57Z" },
+ { url = "https://files.pythonhosted.org/packages/f9/36/14b76ca57652e5cc5fd1c11f32a261292c08a0d19a00351013c2549cbfb2/watchfiles-1.2.0-cp313-cp313-win_amd64.whl", hash = "sha256:a0f27f01bee51861392bb6b7c4fdb290b27d1eb194e9e28788d68102a0e898d9", size = 288059, upload-time = "2026-05-18T04:32:07.937Z" },
+ { url = "https://files.pythonhosted.org/packages/1b/8d/0a85e395398d8d20fadfe5c5d32c726eee17a519e78fb356f2cf7531bffe/watchfiles-1.2.0-cp313-cp313-win_arm64.whl", hash = "sha256:3651aa7058595e9cfb75d35dd5ada2bf9f48a5b8a0f3562821d3e210c507e077", size = 280186, upload-time = "2026-05-18T04:31:54.484Z" },
+ { url = "https://files.pythonhosted.org/packages/37/68/36db056f1fdcc5f07302f56e631774d6835bcd6fa3ace402304621d5f9e5/watchfiles-1.2.0-cp313-cp313t-macosx_10_12_x86_64.whl", hash = "sha256:faea288b6f0ab1902ef08f4ca6de005dccf856c4e0c4f21b8c5fce02d90a1b08", size = 399031, upload-time = "2026-05-18T04:30:44.576Z" },
+ { url = "https://files.pythonhosted.org/packages/c1/64/01a9d6f66a82a5c101ce939274106cc72759d62427e153f01edd2b9f87c2/watchfiles-1.2.0-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:01859b11fd9fbca670f4d5da00fbac282cfea9bd67a2125d8b2833a3b5617ea9", size = 391205, upload-time = "2026-05-18T04:30:25.413Z" },
+ { url = "https://files.pythonhosted.org/packages/84/2c/0a44fe058cb4bb7b8ede6b6670698bbb7c0400740e378d00022189b7b31d/watchfiles-1.2.0-cp313-cp313t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fff610d7bb2256a317bb1e96f0d7862c7aa8076733ee5df0fd41bbe76a24a4f4", size = 451892, upload-time = "2026-05-18T04:32:14.005Z" },
+ { url = "https://files.pythonhosted.org/packages/67/a1/351e0d56cd35e6488b5c8b4fb11a809a5bc923e8fe8fed9faf8920be0c89/watchfiles-1.2.0-cp313-cp313t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:b141a4891c995a039cd89e9a49e62df1dc8a559a5d1a6e4c7106d16c12777a55", size = 458867, upload-time = "2026-05-18T04:31:22.279Z" },
+ { url = "https://files.pythonhosted.org/packages/d5/7d/9d09605187f1b838998624049fcf8bf47b73c1a3b76901fcac1782f62277/watchfiles-1.2.0-cp313-cp313t-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:f22943b7770483f6ea0721c6b11d022947a98eb0acae14694de034f4d0d38925", size = 490217, upload-time = "2026-05-18T04:31:43.657Z" },
+ { url = "https://files.pythonhosted.org/packages/60/5d/a17a16eccb182f04188cd308ec24b1a71a9b5c4e7098269cf35d9fa56d02/watchfiles-1.2.0-cp313-cp313t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1bc6195825b7dcd217968bb1f801a60fd4c16e8eeab5bedc7fe917d7d5995ab4", size = 571458, upload-time = "2026-05-18T04:32:11.875Z" },
+ { url = "https://files.pythonhosted.org/packages/d3/3d/4dd457062083ab1938e5dfd45032eb425cee2ac817287ca8ff4356183e5d/watchfiles-1.2.0-cp313-cp313t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:d4a4b147f5dca2a5d325a06a832fb43f345751adfbc63204aec30e0d9ca965a2", size = 464707, upload-time = "2026-05-18T04:30:43.492Z" },
+ { url = "https://files.pythonhosted.org/packages/c6/71/ea8c57b128f5383de74d0c7d2d9c57ad7c9a65a930c451bd25d524b295b7/watchfiles-1.2.0-cp313-cp313t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:4543579a9bdb0c9560039b4ffddbdb39545707659fbc430ce4c10f3f68d557f9", size = 454663, upload-time = "2026-05-18T04:30:16.061Z" },
+ { url = "https://files.pythonhosted.org/packages/53/fd/2e812bf938406d7db351f0703ddd3fc6c061cf30d96153a77bc79a943a44/watchfiles-1.2.0-cp313-cp313t-manylinux_2_31_riscv64.whl", hash = "sha256:20aa0e708b920bde876a4aa82dc7dd6ebea228a63a67cda6632c2fc87b787efa", size = 463537, upload-time = "2026-05-18T04:31:44.9Z" },
+ { url = "https://files.pythonhosted.org/packages/86/56/d17a7f1dd1bc3035f1072694a551301272f1739c2d8e319c927cb9e29b38/watchfiles-1.2.0-cp313-cp313t-musllinux_1_1_aarch64.whl", hash = "sha256:d413349d565dab74297f2a63e84a097936be69bf8f3b3801f27f380e32040f44", size = 629194, upload-time = "2026-05-18T04:31:14.141Z" },
+ { url = "https://files.pythonhosted.org/packages/be/06/f1ff66bf5cae50aa4062779a0ecd0bbaf15e466195719074078947d9a17d/watchfiles-1.2.0-cp313-cp313t-musllinux_1_1_x86_64.whl", hash = "sha256:f28b2725eb8cce327b9b3ab02415c853011dc55c95832fe90de6bc56f5315f72", size = 656194, upload-time = "2026-05-18T04:31:47.14Z" },
+ { url = "https://files.pythonhosted.org/packages/e7/54/a9c7ea9a82a4ac65e7004c0a03920b5cdd2f9c3b678757d9cd425aa51d53/watchfiles-1.2.0-cp314-cp314-macosx_10_12_x86_64.whl", hash = "sha256:b8c8358484d5fa12ef34f05b7f4168eaf1932f408725ff6d023c33ec17bd79d4", size = 400205, upload-time = "2026-05-18T04:32:05.153Z" },
+ { url = "https://files.pythonhosted.org/packages/aa/5d/c9ab3534374a4a67450696905d6ef16a04405448b8dc52bd752ae50423d4/watchfiles-1.2.0-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:9f04b092229ad2c50126dd3c922c8822e51e605993764a33058d4a791ab42281", size = 392508, upload-time = "2026-05-18T04:30:54.849Z" },
+ { url = "https://files.pythonhosted.org/packages/26/ca/1ad30103535cf0cecd7b993e8d50edc5351b1820e38f2d22e3df58962feb/watchfiles-1.2.0-cp314-cp314-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:7a7ce236284f002a156f70add88efe5c70879cccbb658be0822c54b1306fc09d", size = 452448, upload-time = "2026-05-18T04:30:53.727Z" },
+ { url = "https://files.pythonhosted.org/packages/37/a1/ceee2cdf2afbd715fa07758d39c9859513eae411b23196f7fd039e5feedd/watchfiles-1.2.0-cp314-cp314-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:b9909cc2b48468b575eefa944919e1fe8a36c5849d5c7c168f80a8c1db69398e", size = 459605, upload-time = "2026-05-18T04:30:23.312Z" },
+ { url = "https://files.pythonhosted.org/packages/e8/f6/421e30fd1cb3907a84ed92ab3f1983e37ba2dca015e9a894a048418417a2/watchfiles-1.2.0-cp314-cp314-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:0a37faaed405c67e28e6be45a1fa4f206ef5a2860f27c237db9fa30704c38242", size = 490757, upload-time = "2026-05-18T04:30:47.358Z" },
+ { url = "https://files.pythonhosted.org/packages/41/b0/55ed1b97ed08be7bba6f9a541cac15f2a858e1d74d2b07b6da70a82aab00/watchfiles-1.2.0-cp314-cp314-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:9649193aa27bd9ff2e80ff29bfaa93085496c7a3a377592823cc58b77ee88add", size = 568672, upload-time = "2026-05-18T04:30:38.915Z" },
+ { url = "https://files.pythonhosted.org/packages/d1/cf/d8ae8a80dd7bafab395ea7681c10237311bbf34d37704a8c744e7cf31fc7/watchfiles-1.2.0-cp314-cp314-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:4e4ff8e37f99cf1da89e255e07c9c4b37c214038c4283707bdec308cb1b0ea1f", size = 464197, upload-time = "2026-05-18T04:30:09.914Z" },
+ { url = "https://files.pythonhosted.org/packages/7c/8a/3076c496ca8dafe0e8cd03fcebdfc47be4b1174b4e5b24ff6e396e6b3af2/watchfiles-1.2.0-cp314-cp314-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:054dc20fd2e3132b4c3883b4a00d72fd6e1f56fdaf89fccd12e8057d74cd74d7", size = 453181, upload-time = "2026-05-18T04:30:14.829Z" },
+ { url = "https://files.pythonhosted.org/packages/e5/10/9745e17c98e7b8a86454df0a3c7b5686bd650383f1e9f26e4ebcbd6cc0c0/watchfiles-1.2.0-cp314-cp314-manylinux_2_31_riscv64.whl", hash = "sha256:e140ed30ebde76796b686e67c182cff10ea2fbab186fafd1560f74bb5a473a6e", size = 465109, upload-time = "2026-05-18T04:30:28.123Z" },
+ { url = "https://files.pythonhosted.org/packages/8f/95/8ef4a95481d3e0cb52d62a06fa6e972e81424be2d9698b91a2fecca9904c/watchfiles-1.2.0-cp314-cp314-musllinux_1_1_aarch64.whl", hash = "sha256:bb7e52ecf68ba46d22df23467b87cffeb2146908aa523ebfe803019618cfda06", size = 630653, upload-time = "2026-05-18T04:31:49.304Z" },
+ { url = "https://files.pythonhosted.org/packages/fd/e4/3b3bf36b0f829b50c6ebcb8d031583863c59f923d6a6af3d485e470d0fac/watchfiles-1.2.0-cp314-cp314-musllinux_1_1_x86_64.whl", hash = "sha256:23282a321c8baf9b3a3c4afff673f9fe65eb7fdc2338d765ccad9d3d1916a5ba", size = 657838, upload-time = "2026-05-18T04:31:06.497Z" },
+ { url = "https://files.pythonhosted.org/packages/21/b1/6cbbb50c1f3002ab568777d44aa21206dfb8807a840990c4037523b51812/watchfiles-1.2.0-cp314-cp314-win32.whl", hash = "sha256:c0db965c5f79aa49fe672d297cf1febc5ad149b658594944f49a54a2b96270a7", size = 275108, upload-time = "2026-05-18T04:30:06.891Z" },
+ { url = "https://files.pythonhosted.org/packages/92/45/190ce6db8dcb4536682cf75d3889ff1a27182a58cb519d343cb6d9ea63d8/watchfiles-1.2.0-cp314-cp314-win_amd64.whl", hash = "sha256:71283b39fd17e5408eb123bd37aeecfd9d54c81fc184421943208aadb879d103", size = 288441, upload-time = "2026-05-18T04:32:12.901Z" },
+ { url = "https://files.pythonhosted.org/packages/74/0d/3eae1c2313ab08378431d907c3f8095ecca00f3eda33111cf4f0f2591799/watchfiles-1.2.0-cp314-cp314-win_arm64.whl", hash = "sha256:c5c19526f4e54a00f2666a6c0e9e40d582c09e865055ea7378bf0009aab857b3", size = 280684, upload-time = "2026-05-18T04:31:26.902Z" },
+ { url = "https://files.pythonhosted.org/packages/b1/75/fb64e6c25d6b5ca636d03df34ffb1c6e9873303e76d27967e045f8df088f/watchfiles-1.2.0-cp314-cp314t-macosx_10_12_x86_64.whl", hash = "sha256:d73a585accffa5ae39c17264c36ec3166d2fad7000c780f5ef83b2722afb9dd2", size = 398857, upload-time = "2026-05-18T04:32:17.108Z" },
+ { url = "https://files.pythonhosted.org/packages/73/4e/9f7adf01754cbf81843722ccfec169d8f26c69778281a302855cecd2ee08/watchfiles-1.2.0-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:ae99b14c5f21e026e0e9d96f40e07d8570ebee6cafd9d8fc318354606daa7a28", size = 392413, upload-time = "2026-05-18T04:31:07.911Z" },
+ { url = "https://files.pythonhosted.org/packages/47/c8/bec626bcc2d69f44b9acb24ce7d60ed7b16b73628eea747fcbd169d8edda/watchfiles-1.2.0-cp314-cp314t-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4429f3b105524a10b72c3a819b091c495d2811d419c1e1e8df773a5a5974f831", size = 452409, upload-time = "2026-05-18T04:31:20.142Z" },
+ { url = "https://files.pythonhosted.org/packages/00/b7/b6362068e81e7c556d155a34c35d40ac3ef42d747b06d7f6e5bf58e359c2/watchfiles-1.2.0-cp314-cp314t-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:43d818978d06062d9b22c4fab2ebe44cf5213d42dc8e62bda8c2760cfa2eeb33", size = 458827, upload-time = "2026-05-18T04:32:06.219Z" },
+ { url = "https://files.pythonhosted.org/packages/67/f8/9a813fa42afb1e0b4625e75f0479826644d3ee8dc287e093799bc01f390c/watchfiles-1.2.0-cp314-cp314t-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:b9f732dc58b2dbe69e464ccf8fff7a03b0dd0be439da4c0720d3558527d3d6b4", size = 490104, upload-time = "2026-05-18T04:31:56.034Z" },
+ { url = "https://files.pythonhosted.org/packages/2f/bf/27dfb6094ca4c9aad21298b5525b6c53cb36121ee454331d05161e58d130/watchfiles-1.2.0-cp314-cp314t-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8f200104103feb097de4cab8fe4f5dd18a2026934c7dea98c55a2f5fd6d5a33b", size = 571360, upload-time = "2026-05-18T04:31:57.133Z" },
+ { url = "https://files.pythonhosted.org/packages/fb/39/44a096d67270ea93df91d33877dbe91fbda3aa4f8ec2edf799d93eda8736/watchfiles-1.2.0-cp314-cp314t-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:63ac26eefbf4af1741247d6fb68b11c49a25b2f7413fbd318a83a12aaa9cf666", size = 464644, upload-time = "2026-05-18T04:30:57.33Z" },
+ { url = "https://files.pythonhosted.org/packages/0e/80/c7472203bad6268e3ef1ad260739704847898938ad7ea8b63a5131f46b50/watchfiles-1.2.0-cp314-cp314t-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0c4997d4e4a55f0d02b6cde327322daf3a0400e5df6c6b15948994bf72497925", size = 454771, upload-time = "2026-05-18T04:30:48.736Z" },
+ { url = "https://files.pythonhosted.org/packages/51/cf/3b10b268b4b7f0fc26e9debb5eef1998b515887840f444cd3ec80c688755/watchfiles-1.2.0-cp314-cp314t-manylinux_2_31_riscv64.whl", hash = "sha256:4c887eba18b7945ac73067a8b4a66f21cd46c2539b2bc68588f7be6c7eb6d26b", size = 463494, upload-time = "2026-05-18T04:31:33.826Z" },
+ { url = "https://files.pythonhosted.org/packages/3d/3e/a4302545cd589262a0dc7d140e86f7688eba3f9c72776c27f7e23b8864c4/watchfiles-1.2.0-cp314-cp314t-musllinux_1_1_aarch64.whl", hash = "sha256:3416ff151bb6b5a8d8d11664974fbef4d9305b9b2957839ab5a270468fd8df30", size = 629383, upload-time = "2026-05-18T04:31:15.596Z" },
+ { url = "https://files.pythonhosted.org/packages/db/99/d5649df0a9a410d45b7c882304d0b790903ac9b6e8f2cfd12114e0c6b9f2/watchfiles-1.2.0-cp314-cp314t-musllinux_1_1_x86_64.whl", hash = "sha256:0e831a271c035d89789cffc386b6aa1375f39f1cd25eb7ca0997e4970d152fc5", size = 656093, upload-time = "2026-05-18T04:31:58.707Z" },
+ { url = "https://files.pythonhosted.org/packages/92/b9/362702539275019a54dd2e94511b31a9b89c5f9e6a21966de7eb692549fc/watchfiles-1.2.0-cp315-cp315-macosx_10_12_x86_64.whl", hash = "sha256:37a6721cdf3f65dbb13aa9503510ccb4451603ac837e44d265d7992a597e1374", size = 400109, upload-time = "2026-05-18T04:31:16.879Z" },
+ { url = "https://files.pythonhosted.org/packages/8f/75/71d5ba62db781e5587bded1d944c675374bc4aa37ff33d5018d98e8b6538/watchfiles-1.2.0-cp315-cp315-macosx_11_0_arm64.whl", hash = "sha256:2b37d10b5a63bd4d87e18472d80fa525bd670586fae62e5dd580452764879b65", size = 392167, upload-time = "2026-05-18T04:31:28.058Z" },
+ { url = "https://files.pythonhosted.org/packages/3c/01/c66dd95d0423fe30d31820e2d1d5bda773764131bbb6ac0cb1cf303ac328/watchfiles-1.2.0-cp315-cp315-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0a105bc2283f67e8fbec74253ec2d94925de92ed72c0393f1206bf326b7b7b69", size = 452372, upload-time = "2026-05-18T04:31:00.836Z" },
+ { url = "https://files.pythonhosted.org/packages/91/15/2fe99557e72f85627c6a8eed50d889e8d101623e060a22ad75b875cb932d/watchfiles-1.2.0-cp315-cp315-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:5327989a465505f05cfe06f04fa9d0c2fd5432bb243e10e6f012b1bdca3c8579", size = 459596, upload-time = "2026-05-18T04:31:34.96Z" },
+ { url = "https://files.pythonhosted.org/packages/ed/23/d4acfa0023367428ed48351b3b9b267893037b6cadae55620c61c24bcfd4/watchfiles-1.2.0-cp315-cp315-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ecb47f183a8025b2aa18b546725c3657e542112ae9c0613a2af79b4fa8d04ad7", size = 490869, upload-time = "2026-05-18T04:31:59.923Z" },
+ { url = "https://files.pythonhosted.org/packages/a4/5f/3164cbdce06c9fb95c4f7b9e2f9760b5e2797af43a9ecc317ef42a23a278/watchfiles-1.2.0-cp315-cp315-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8520a4ab0e37f770afc34459c4f8f7019e153f9124dc101c15538365875d1ab2", size = 571641, upload-time = "2026-05-18T04:32:00.948Z" },
+ { url = "https://files.pythonhosted.org/packages/41/e6/85d3731c55e65cd7690f3f803d24c139588aaf863e4bf2148fe7a7fa1a19/watchfiles-1.2.0-cp315-cp315-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:71cd71740ed2c15211ebb237ced4e39a1cdf6f80566e5fe95428da1626f4fde6", size = 464444, upload-time = "2026-05-18T04:30:34.298Z" },
+ { url = "https://files.pythonhosted.org/packages/f4/7d/562641012b8b09872742c3b8adf9629ec479fd78f8d68ae4a0c13da8add6/watchfiles-1.2.0-cp315-cp315-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f88af53d6ddaf72179ef613ddc905e6f4785f712b49b80b3bef9f3525e6194b4", size = 453593, upload-time = "2026-05-18T04:31:23.464Z" },
+ { url = "https://files.pythonhosted.org/packages/56/fe/cb8ef3d6f929d14158fdaaad9925985b7310abc9384dcd4d82dd0016fb59/watchfiles-1.2.0-cp315-cp315-manylinux_2_31_riscv64.whl", hash = "sha256:cee9d5efd929efdac5f7e58f72b3376f676b64050a91c5b99a7094c5b2317488", size = 465096, upload-time = "2026-05-18T04:31:30.384Z" },
+ { url = "https://files.pythonhosted.org/packages/25/91/80908e835e100527a9267147b08c0eee1fa6ab0ffec15edc04d1d44885f7/watchfiles-1.2.0-cp315-cp315-musllinux_1_1_aarch64.whl", hash = "sha256:b718bf356bbc15e559bd8ef41782b573b8ae0e3f177ab244b440568d7ea02cfb", size = 630638, upload-time = "2026-05-18T04:30:49.89Z" },
+ { url = "https://files.pythonhosted.org/packages/46/4b/95ab2f256bb4af3cb2eb23b9317bda984ee6e0f11733a5c004a6c95b06e3/watchfiles-1.2.0-cp315-cp315-musllinux_1_1_x86_64.whl", hash = "sha256:922c0e019fe68b3ae392965a766b02a71ba1168c932cebc3733cd52c5fe5b377", size = 657684, upload-time = "2026-05-18T04:31:32.027Z" },
+ ]
+
  [[package]]
  name = "word2number"
  version = "1.1"