---
name: Skill Agent PPTX
overview: Add a Hermes-style skill agent library on top of your existing TransformersBackend, with one education PowerPoint task as the first tab in a multi-tab Gradio Space structured so more task tabs can be added later.
todos:
  - id: agent-lib
    content: "Create libs/agent: SkillRegistry, ToolRegistry, AgentRunner, TraceRecorder, pydantic outline models"
    status: completed
  - id: pptx-skill
    content: Add skills/education-pptx/SKILL.md and create_pptx tool (python-pptx)
    status: completed
  - id: gradio-tabs
    content: "Refactor Gradio app into Tabs: Education PPTX (primary) + Chat (debug)"
    status: completed
  - id: docker-workspace
    content: Wire agent package into uv workspace, Dockerfile, models.yaml active_model
    status: completed
  - id: trace-demo
    content: Add trace JSON export + README demo script for hackathon submission
    status: completed
isProject: false
---
# Skill Agent + Education PowerPoint (Phase 1)
## Goal
Replace the generic chat demo with a **local skill-based agent** that uses your **transformers** presets (default: `minicpm5-1b` or `openbmb/MiniCPM5-1B` from [`models.yaml`](models.yaml)) to run a real workflow: **topic in → slide outline → downloadable `.pptx` out**.
This follows the **agentskills.io / Hermes SKILL.md pattern** without embedding the full Hermes runtime (too heavy for HF Space Docker).
**Hackathon alignment:** Backyard AI (teacher/tutor you know), **Best Agent**, **Tiny Titan** (≤4B), **OpenBMB** (MiniCPM), **Sharing is Caring** (publish trace JSON to Hub), optional **Well-Tuned** if you ship a fine-tuned preset later.
---
## Architecture
```mermaid
flowchart TB
  subgraph ui [apps/gradio-space]
    Tabs[gr.Tabs]
    EduTab[EducationPptxTab]
    ChatTab[ChatTab optional later]
    Tabs --> EduTab
  end
  subgraph agent [libs/agent]
    Runner[AgentRunner]
    Skills[SkillRegistry]
    Tools[ToolRegistry]
    Trace[TraceRecorder]
    Runner --> Skills
    Runner --> Tools
    Runner --> Trace
  end
  subgraph inference [libs/inference]
    Factory[factory.get_backend]
    TF[TransformersBackend]
    Factory --> TF
  end
  EduTab --> Runner
  Runner --> Factory
  Tools --> PptxTool[python-pptx]
```
**Agent loop (simple, reliable for small models):**
1. Load skill `education-pptx` from `skills/education-pptx/SKILL.md`
2. User provides: topic, grade level, number of slides (3–8)
3. **Step A — LLM:** generate structured slide outline (JSON schema in prompt; parse with fallback regex)
4. **Step B — Tool:** `create_pptx(outline)` writes file via `python-pptx` (deterministic; no LLM needed for file bytes)
5. **Step C — LLM (optional):** one-sentence “teacher notes” per slide
6. Return: trace steps + `gr.File` download + markdown preview

Small models are unreliable at emitting multi-step tool-call JSON, so v1 keeps the loop a **fixed two-step** pipeline (outline → tool) rather than open-ended ReAct.

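
The fixed loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual `AgentRunner`: `run_fixed_loop`, `parse_outline`, and the `llm` / `tool` callables are hypothetical names, the prompt wording is invented, and the real backend call signature may differ. Stub callables stand in for the model and the tool so the control flow can be read in isolation.

```python
import json
import re
from typing import Callable


def parse_outline(raw: str) -> dict:
    """Parse model output as JSON, falling back to the outermost {...} span."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Greedy match: from the first "{" to the last "}" in the output.
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if match is None:
            raise ValueError("no JSON object found in model output")
        return json.loads(match.group(0))


def run_fixed_loop(llm: Callable[[str], str], tool: Callable[[dict], str],
                   topic: str, grade: str, n_slides: int) -> dict:
    """Step A (LLM outline), then Step B (deterministic tool). No ReAct loop."""
    steps = []
    prompt = (f"Return only JSON with keys title and slides for a "
              f"{n_slides}-slide lesson on {topic!r} for grade {grade}.")
    raw = llm(prompt)
    steps.append({"type": "llm", "output": raw})
    outline = parse_outline(raw)
    artifact = tool(outline)
    steps.append({"type": "tool", "name": "create_pptx", "result": artifact})
    return {"outline": outline, "artifact": artifact, "steps": steps}


# Hypothetical stubs: a chatty model reply and a tool that returns a fixed path.
def fake_llm(prompt: str) -> str:
    return ('Sure! {"title": "Photosynthesis", '
            '"slides": [{"title": "Intro", "bullets": ["Light"]}]} Done.')


result = run_fixed_loop(fake_llm, lambda outline: "/tmp/demo.pptx",
                        "photosynthesis", "6", 3)
```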
---
## New package: `libs/agent`
Add workspace member `libs/agent` with:
| Module | Responsibility |
|--------|----------------|
| `skills.py` | Load `SKILL.md` (YAML frontmatter + body), list skills by `task` tag |
| `tools.py` | Register callable tools with name, description, JSON schema |
| `runner.py` | `AgentRunner.run(skill_id, user_input, backend)` — orchestrates LLM + tools |
| `trace.py` | Append-only step log (`thought`, `tool_call`, `tool_result`, `artifact`) |
| `prompts.py` | Skill-specific system prompts and JSON outline template |
**Dependencies:** `inference` (workspace), `python-pptx`, `pydantic` (outline validation)
**Use the existing interface in [`libs/inference/src/inference/base.py`](libs/inference/src/inference/base.py) as-is.** No changes to `TransformersBackend` are required beyond optionally raising `max_tokens` for outline generation via env or per-call kwargs (already supported in [`transformers.py`](libs/inference/src/inference/transformers.py)).
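
One possible shape for the tool registration described in the table is sketched below. It is an assumption about the design, not the package's confirmed API: the `Tool` dataclass, the `register` decorator, and the `call` / `describe` methods are hypothetical, and `create_pptx_stub` is a placeholder rather than the real tool. A decorator keeps the name, description, and JSON schema next to the callable they describe.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    description: str
    schema: dict[str, Any]  # JSON schema for the tool arguments
    fn: Callable[..., Any]


@dataclass
class ToolRegistry:
    _tools: dict[str, Tool] = field(default_factory=dict)

    def register(self, name: str, description: str, schema: dict[str, Any]):
        """Decorator that records a callable under a unique tool name."""
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._tools:
                raise ValueError(f"tool {name!r} already registered")
            self._tools[name] = Tool(name, description, schema, fn)
            return fn
        return decorator

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool {name!r}")
        return self._tools[name].fn(**kwargs)

    def describe(self) -> list[dict[str, Any]]:
        """Tool metadata suitable for embedding in a system prompt."""
        return [{"name": t.name, "description": t.description,
                 "parameters": t.schema} for t in self._tools.values()]


registry = ToolRegistry()


@registry.register(
    name="create_pptx",
    description="Write a slide outline to a .pptx file",
    schema={"type": "object",
            "properties": {"outline": {"type": "object"}},
            "required": ["outline"]},
)
def create_pptx_stub(outline: dict) -> str:
    # Placeholder body; the real tool would call python-pptx.
    return f"{outline['title']}.pptx"
```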
---
## First skill: `skills/education-pptx/SKILL.md`
```markdown
---
name: education-pptx
description: Create a short lesson PowerPoint from a topic and grade level
task: education
tools:
- create_pptx
model_hints:
- minicpm5-1b
- qwen3b-gguf
---
## Workflow
1. Ask for topic, audience grade, slide count.
2. Produce JSON outline: title, slides[{title, bullets[], speaker_note}].
3. Call create_pptx with validated outline.
4. Return download link and preview.
```
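
To illustrate how a loader might split this file into metadata and body, here is a small standard-library sketch. It handles only flat `key: value` pairs and simple `- item` lists, which is enough for the frontmatter shown above. The `Skill` dataclass and `load_skill` function are hypothetical names, and a real loader would more likely delegate to a YAML library.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    meta: dict = field(default_factory=dict)
    body: str = ""


def load_skill(text: str) -> Skill:
    """Split '---' frontmatter from the body; parse flat keys and simple lists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return Skill(body=text)
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:  # unterminated frontmatter: treat everything as body
        return Skill(body=text)

    meta: dict = {}
    current_key = None
    for line in lines[1:end]:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(meta.get(current_key), list):
            meta[current_key].append(stripped[2:].strip())
        else:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            # A key with no inline value starts a list on the following lines.
            meta[current_key] = value.strip() if value.strip() else []
    return Skill(meta=meta, body="\n".join(lines[end + 1:]).strip())


SAMPLE = """---
name: education-pptx
description: Create a short lesson PowerPoint from a topic and grade level
task: education
tools:
- create_pptx
model_hints:
- minicpm5-1b
- qwen3b-gguf
---
## Workflow
1. Ask for topic, audience grade, slide count.
"""

skill = load_skill(SAMPLE)
```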
---
## First tool: `create_pptx`
Implement in `libs/agent/src/agent/tools/pptx.py`:
- Input: Pydantic model `SlideOutline` (title, slides list)
- Output: path under `/tmp/agent_outputs/{run_id}.pptx` (HF Space writable temp)
- Simple template: title slide + bullet slides + optional speaker notes in notes field
- No images in v1 (keeps scope shippable by June 15)
---
## Gradio UI changes: [`apps/gradio-space/src/gradio_space/app.py`](apps/gradio-space/src/gradio_space/app.py)
Refactor into:
```
gradio_space/
  app.py                 # build_demo(), launch
  tabs/
    __init__.py
    education_pptx.py    # first task tab
    chat.py              # move existing ChatInterface here (secondary tab)
```
**Tab 1 — Lesson Slides (primary submission UI):**
- Inputs: Topic, Grade (dropdown), Slides (slider 3–8)
- Button: Generate
- Outputs: Markdown outline preview, File download, Agent trace (accordion)
- Status line: model name + device from existing `warmup()` / `model_status()`
**Tab 2 — Chat (keep for debugging):** existing chat wired to same `ACTIVE_MODEL`
Use `gr.Tabs()` at top level; only Tab 1 needs polish for demo video.
**Model default for Space:** set `active_model: minicpm5-1b` in [`models.yaml`](models.yaml) (OpenBMB + Tiny Titan). Space hardware: upgrade to GPU basic if transformers inference on CPU is too slow.
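
The two-tab layout could look roughly like the sketch below. It is illustrative only: `generate` is a placeholder callback standing in for a call into `AgentRunner`, the grade choices (1 to 12) and all component labels are assumptions, and Gradio is imported inside `build_demo()` so that the callback can be exercised without the UI dependency installed.

```python
def generate(topic: str, grade: str, n_slides: float):
    """Placeholder callback; the real tab would run the agent here."""
    n = int(n_slides)  # sliders may deliver a float
    preview = f"# {topic} (grade {grade})\n" + "\n".join(
        f"- Slide {i}" for i in range(1, n + 1))
    trace = {"skill": "education-pptx",
             "input": {"topic": topic, "grade": grade, "slides": n}}
    return preview, None, trace  # None: no file is produced by this stub


def build_demo():
    import gradio as gr  # lazy import keeps generate() testable on its own

    with gr.Blocks(title="Lesson Agent") as demo:
        with gr.Tabs():
            with gr.Tab("Lesson Slides"):
                topic = gr.Textbox(label="Topic")
                grade = gr.Dropdown(choices=[str(g) for g in range(1, 13)],
                                    value="6", label="Grade")
                n_slides = gr.Slider(minimum=3, maximum=8, step=1, value=5,
                                     label="Slides")
                button = gr.Button("Generate")
                preview = gr.Markdown()
                download = gr.File(label="Download .pptx")
                with gr.Accordion("Agent trace", open=False):
                    trace = gr.JSON()
                button.click(generate,
                             inputs=[topic, grade, n_slides],
                             outputs=[preview, download, trace])
            with gr.Tab("Chat"):
                gr.Markdown("Existing chat UI goes here.")
    return demo
```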
---
## Trace export (Sharing is Caring badge)
After each run, write the trace JSON to `outputs/traces/{run_id}.json`. Expose a “Copy trace” action in the UI and, optionally, a Hub dataset upload script:
- `scripts/upload_trace.py` — pushes latest trace to a HF dataset repo (manual one-time setup; not required for v1 demo)
Trace schema (minimal):
```json
{
  "skill": "education-pptx",
  "model": "minicpm5-1b",
  "input": {"topic": "...", "grade": "6", "slides": 5},
  "steps": [
    {"type": "llm", "prompt_hash": "...", "output": "..."},
    {"type": "tool", "name": "create_pptx", "result": "..."}
  ],
  "artifact": "lesson_photosynthesis.pptx"
}
```
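
A recorder that produces this shape might look like the following sketch. The class layout and method names are hypothetical, and SHA-256 is an assumed choice for `prompt_hash` since the plan does not name a hash function. The example writes to a temporary directory instead of `outputs/traces/`.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class TraceRecorder:
    """Append-only step log serialized to the minimal trace schema."""

    def __init__(self, skill: str, model: str, user_input: dict):
        self._trace = {"skill": skill, "model": model, "input": user_input,
                       "steps": [], "artifact": None}

    def llm(self, prompt: str, output: str) -> None:
        # Store a hash rather than the full prompt to keep traces small.
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        self._trace["steps"].append(
            {"type": "llm", "prompt_hash": digest, "output": output})

    def tool(self, name: str, result: str) -> None:
        self._trace["steps"].append(
            {"type": "tool", "name": name, "result": result})

    def save(self, out_dir: Path, run_id: str, artifact: str) -> Path:
        self._trace["artifact"] = artifact
        out_dir.mkdir(parents=True, exist_ok=True)
        path = out_dir / f"{run_id}.json"
        path.write_text(json.dumps(self._trace, indent=2), encoding="utf-8")
        return path


recorder = TraceRecorder("education-pptx", "minicpm5-1b",
                         {"topic": "photosynthesis", "grade": "6", "slides": 5})
recorder.llm("outline prompt", '{"title": "Photosynthesis"}')
recorder.tool("create_pptx", "ok")
trace_path = recorder.save(Path(tempfile.mkdtemp()), "demo",
                           "lesson_photosynthesis.pptx")
```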
---
## Docker / workspace updates
- [`pyproject.toml`](pyproject.toml): add `agent` workspace member + root dep
- [`Dockerfile`](Dockerfile): COPY `libs/agent`, `skills/`, update `uv sync`
- [`apps/gradio-space/pyproject.toml`](apps/gradio-space/pyproject.toml): depend on `agent`
- Update root [`README.md`](README.md) hackathon story: “Lesson slide builder for a teacher you know”
---
## Phase 2 (after v1 ships — not in first PR)
- New tabs: `tabs/quiz_maker.py`, `tabs/worksheet.py` — each maps to a new `skills/*/SKILL.md`
- `SkillRegistry` already supports multiple skills; tabs just call `AgentRunner.run(skill_id=...)`
- **Off-Brand:** custom layout via `gr.Blocks` theming or `gr.Server` if time allows
- Fine-tuned Gemma preset from [`notebook/gemma-finetune.ipynb`](notebook/gemma-finetune.ipynb) for **Well-Tuned** badge
---
## Demo video script (for submission)
1. Introduce real user (teacher/tutor) and problem: “building a 5-slide lesson takes 30+ minutes”
2. Enter topic + grade in Tab 1, click Generate (~30–90s on GPU)
3. Show outline preview + download `.pptx`, open in LibreOffice/Google Slides
4. Show agent trace proving local model + tool pipeline (no cloud LLM API)
---
## Risks and mitigations
| Risk | Mitigation |
|------|------------|
| Small model outputs invalid JSON | Pydantic validate + one repair retry with “fix JSON only” prompt |
| CPU Space too slow | Pin `minicpm5-1b`, use GPU basic, or fall back to `qwen3b-gguf` + llama.cpp for the outline-only step |
| pptx dependency size | `python-pptx` is lightweight (a few MB) |
| Scope creep (many tabs) | Ship Tab 1 only; stub Tab 2 chat for dev |
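
The first mitigation in the table (validate, then one repair retry) could be sketched as below. This is an assumption about how the retry might be wired, not the runner's actual code: `outline_with_repair` and the repair prompt wording are hypothetical, and the `llm` callable stands in for the backend. A second failure is allowed to propagate so the caller can surface the error instead of looping.

```python
import json
from typing import Callable

from pydantic import BaseModel, ValidationError


class Slide(BaseModel):
    title: str
    bullets: list[str] = []
    speaker_note: str = ""


class SlideOutline(BaseModel):
    title: str
    slides: list[Slide]


def outline_with_repair(llm: Callable[[str], str], prompt: str) -> SlideOutline:
    """Validate the model output; on failure, retry once with a repair prompt."""
    raw = llm(prompt)
    try:
        return SlideOutline(**json.loads(raw))
    except (json.JSONDecodeError, ValidationError, TypeError) as err:
        repair_prompt = (
            "Fix JSON only. Return valid JSON matching the outline schema, "
            f"no prose.\nError: {err}\nInvalid output:\n{raw}"
        )
        # No further retry: a second failure raises to the caller.
        return SlideOutline(**json.loads(llm(repair_prompt)))


# Hypothetical stub model: a truncated first reply, then a valid one.
replies = iter([
    '{"title": "Photosynthesis", "slides": [{"title": "Intro"',
    '{"title": "Photosynthesis", "slides": '
    '[{"title": "Intro", "bullets": ["Light"]}]}',
])
outline = outline_with_repair(lambda prompt: next(replies),
                              "Outline photosynthesis")
```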
---
## Files to create / modify (summary)
**Create:**
- `libs/agent/` (package + runner, tools, skills loader, trace)
- `skills/education-pptx/SKILL.md`
- `apps/gradio-space/src/gradio_space/tabs/education_pptx.py`
- `apps/gradio-space/src/gradio_space/tabs/chat.py`
**Modify:**
- [`apps/gradio-space/src/gradio_space/app.py`](apps/gradio-space/src/gradio_space/app.py) — Tabs shell
- [`models.yaml`](models.yaml) — `active_model: minicpm5-1b` for Space
- [`Dockerfile`](Dockerfile), [`pyproject.toml`](pyproject.toml), workspace lockfile
- [`README.md`](README.md) — product story + agent docs