name: Skill Agent PPTX
overview: >-
  Add a Hermes-style skill agent library on top of your existing
  TransformersBackend, with one education PowerPoint task as the first tab in a
  multi-tab Gradio Space, structured so more task tabs can be added later.
todos:
  - id: agent-lib
    content: >-
      Create libs/agent: SkillRegistry, ToolRegistry, AgentRunner,
      TraceRecorder, pydantic outline models
    status: completed
  - id: pptx-skill
    content: Add skills/education-pptx/SKILL.md and create_pptx tool (python-pptx)
    status: completed
  - id: gradio-tabs
    content: 'Refactor Gradio app into Tabs: Education PPTX (primary) + Chat (debug)'
    status: completed
  - id: docker-workspace
    content: Wire agent package into uv workspace, Dockerfile, models.yaml active_model
    status: completed
  - id: trace-demo
    content: Add trace JSON export + README demo script for hackathon submission
    status: completed
isProject: false
Skill Agent + Education PowerPoint (Phase 1)
Goal
Replace the generic chat demo with a local skill-based agent that uses your transformers presets (default: minicpm5-1b or openbmb/MiniCPM5-1B from models.yaml) to run a real workflow: topic in → slide outline → downloadable .pptx out.
This follows the agentskills.io / Hermes SKILL.md pattern without embedding the full Hermes runtime (too heavy for HF Space Docker).
Hackathon alignment: Backyard AI (teacher/tutor you know), Best Agent, Tiny Titan (≤4B), OpenBMB (MiniCPM), Sharing is Caring (publish trace JSON to Hub), optional Well-Tuned if you ship a fine-tuned preset later.
Architecture
flowchart TB
subgraph ui [apps/gradio-space]
Tabs[gr.Tabs]
EduTab[EducationPptxTab]
ChatTab[ChatTab optional later]
Tabs --> EduTab
end
subgraph agent [libs/agent]
Runner[AgentRunner]
Skills[SkillRegistry]
Tools[ToolRegistry]
Trace[TraceRecorder]
Runner --> Skills
Runner --> Tools
Runner --> Trace
end
subgraph inference [libs/inference]
Factory[factory.get_backend]
TF[TransformersBackend]
Factory --> TF
end
EduTab --> Runner
Runner --> Factory
Tools --> PptxTool[python-pptx]
Agent loop (simple, reliable for small models):
- Load skill `education-pptx` from `skills/education-pptx/SKILL.md`
- User provides: topic, grade level, number of slides (3–8)
- Step A (LLM): generate structured slide outline (JSON schema in prompt; parse with fallback regex)
- Step B (Tool): `create_pptx(outline)` writes file via `python-pptx` (deterministic; no LLM needed for file bytes)
- Step C (LLM, optional): one-sentence “teacher notes” per slide
- Return: trace steps + `gr.File` download + markdown preview
Small models are weak at multi-hop tool-call JSON, so v1 keeps the loop fixed at two steps (outline → tool) rather than open-ended ReAct.
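The "parse with fallback regex" part of Step A could look roughly like the sketch below. The function name `extract_json_object` and the order of the fallbacks are assumptions made for illustration, not part of the plan; it only uses the standard library.

```python
import json
import re
from typing import Any, Optional

# Built indirectly so this example nests safely inside a Markdown code fence.
FENCE = "`" * 3


def extract_json_object(text: str) -> Optional[dict[str, Any]]:
    """Return the first JSON object found in a model reply, or None."""
    # 1) Happy path: the whole reply is already valid JSON.
    try:
        parsed = json.loads(text)
        if isinstance(parsed, dict):
            return parsed
    except json.JSONDecodeError:
        pass

    candidates: list[str] = []
    # 2) Fallback: the model wrapped the JSON in a fenced block.
    fenced = re.search(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))
    # 3) Last resort: everything from the first "{" to the last "}".
    braces = re.search(r"\{.*\}", text, re.DOTALL)
    if braces:
        candidates.append(braces.group(0))

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

A `None` result is the signal to fall through to whatever repair or retry step the runner uses; the extracted dict still needs schema validation before it reaches the tool.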
New package: libs/agent
Add workspace member libs/agent with:
| Module | Responsibility |
|---|---|
| `skills.py` | Load SKILL.md (YAML frontmatter + body), list skills by task tag |
| `tools.py` | Register callable tools with name, description, JSON schema |
| `runner.py` | `AgentRunner.run(skill_id, user_input, backend)`: orchestrates LLM + tools |
| `trace.py` | Append-only step log (thought, tool_call, tool_result, artifact) |
| `prompts.py` | Skill-specific system prompts and JSON outline template |
Dependencies: inference (workspace), python-pptx, pydantic (outline validation)
Use the existing interface in libs/inference/src/inference/base.py as-is. TransformersBackend needs no changes beyond optionally raising max_tokens for outline generation via env or per-call kwargs (already supported in transformers.py).
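As a rough illustration of the `tools.py` responsibility (callables registered with a name, description, and JSON schema), a registry might be sketched as follows. The `Tool` dataclass, the decorator-style `register`, and the `call`/`describe` methods are all hypothetical names chosen for this example; the stub tool body is a placeholder.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    parameters: dict[str, Any]  # JSON Schema for the tool's keyword arguments
    fn: Callable[..., Any]


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, name: str, description: str, parameters: dict[str, Any]):
        """Decorator that records a callable under `name`."""
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._tools:
                raise ValueError(f"tool already registered: {name}")
            self._tools[name] = Tool(name, description, parameters, fn)
            return fn
        return decorator

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name].fn(**kwargs)

    def describe(self) -> list[dict[str, Any]]:
        """Tool metadata in a form that can be embedded in a system prompt."""
        return [
            {"name": t.name, "description": t.description, "parameters": t.parameters}
            for t in self._tools.values()
        ]


registry = ToolRegistry()


@registry.register(
    name="create_pptx",
    description="Write a slide outline to a .pptx file",
    parameters={
        "type": "object",
        "properties": {"outline": {"type": "object"}},
        "required": ["outline"],
    },
)
def create_pptx_stub(outline: dict) -> str:
    # Placeholder body for illustration; the real tool would call python-pptx.
    return f"/tmp/agent_outputs/{outline['title']}.pptx"
```

Keeping the schema next to the callable means the runner can build the tool section of a prompt from `describe()` without a second source of truth.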
First skill: skills/education-pptx/SKILL.md
---
name: education-pptx
description: Create a short lesson PowerPoint from a topic and grade level
task: education
tools:
- create_pptx
model_hints:
- minicpm5-1b
- qwen3b-gguf
---
## Workflow
1. Ask for topic, audience grade, slide count.
2. Produce JSON outline: title, slides[{title, bullets[], speaker_note}].
3. Call create_pptx with validated outline.
4. Return download link and preview.
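A skill file shaped like the one above could be loaded with something like the sketch below. It is a stdlib-only stand-in that handles only flat `key: value` pairs and simple string lists; a real loader would more likely use a YAML library. The `Skill` dataclass, `parse_skill`, and `skills_for_task` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Skill:
    name: str
    description: str
    task: str
    tools: list[str] = field(default_factory=list)
    model_hints: list[str] = field(default_factory=list)
    body: str = ""


def parse_skill(text: str) -> Skill:
    """Parse a SKILL.md string: '---' frontmatter (flat keys, string lists) + body."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("expected frontmatter starting with '---'")
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("frontmatter is not closed with '---'")

    meta: dict = {}
    current_list: Optional[list] = None
    for raw in lines[1:end]:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- "):
            # A list item must follow a "key:" line with no inline value.
            if current_list is None:
                raise ValueError(f"list item without a key: {line!r}")
            current_list.append(line[2:].strip())
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            meta[key.strip()] = value
            current_list = None
        else:
            current_list = meta.setdefault(key.strip(), [])

    return Skill(
        name=meta["name"],
        description=meta.get("description", ""),
        task=meta.get("task", ""),
        tools=meta.get("tools", []),
        model_hints=meta.get("model_hints", []),
        body="\n".join(lines[end + 1:]).strip(),
    )


def skills_for_task(skills: list[Skill], task: str) -> list[Skill]:
    """Filter loaded skills by their `task` tag."""
    return [s for s in skills if s.task == task]
```

Parsing from a string rather than a path keeps the function easy to test; reading `skills/*/SKILL.md` from disk can wrap it.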
First tool: create_pptx
Implement in libs/agent/src/agent/tools/pptx.py:
- Input: Pydantic model `SlideOutline(title, slides list)`
- Output: path under `/tmp/agent_outputs/{run_id}.pptx` (HF Space writable temp)
- Simple template: title slide + bullet slides + optional speaker notes in notes field
- No images in v1 (keeps scope shippable by June 15)
Gradio UI changes: apps/gradio-space/src/gradio_space/app.py
Refactor into:
gradio_space/
    app.py                  # build_demo(), launch
    tabs/
        __init__.py
        education_pptx.py   # first task tab
        chat.py             # move existing ChatInterface here (secondary tab)
Tab 1 — Lesson Slides (primary submission UI):
- Inputs: Topic, Grade (dropdown), Slides (slider 3–8)
- Button: Generate
- Outputs: Markdown outline preview, File download, Agent trace (accordion)
- Status line: model name + device from existing `warmup()`/`model_status()`
Tab 2 — Chat (keep for debugging): existing chat wired to same ACTIVE_MODEL
Use gr.Tabs() at top level; only Tab 1 needs polish for demo video.
Model default for Space: set active_model: minicpm5-1b in models.yaml (OpenBMB + Tiny Titan). Space hardware: GPU basic if transformers on CPU is too slow.
Trace export (Sharing is Caring badge)
After each run, write trace JSON to outputs/traces/{run_id}.json and expose a “Copy trace” / optional Hub dataset upload script:
- `scripts/upload_trace.py`: pushes latest trace to a HF dataset repo (manual one-time setup; not required for v1 demo)
Trace schema (minimal):
{
"skill": "education-pptx",
"model": "minicpm5-1b",
"input": {"topic": "...", "grade": "6", "slides": 5},
"steps": [{"type": "llm", "prompt_hash": "...", "output": "..."}, {"type": "tool", "name": "create_pptx", "result": "..."}],
"artifact": "lesson_photosynthesis.pptx"
}
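A recorder that emits this schema could be sketched as below. The method names are hypothetical, and hashing the prompt with SHA-256 is an assumption; the plan only names a `prompt_hash` field.

```python
import hashlib
import json
from pathlib import Path
from typing import Any


class TraceRecorder:
    """Append-only step log that serializes to the minimal trace schema."""

    def __init__(self, skill: str, model: str, user_input: dict[str, Any]) -> None:
        self._trace: dict[str, Any] = {
            "skill": skill,
            "model": model,
            "input": user_input,
            "steps": [],
            "artifact": None,
        }

    def llm_step(self, prompt: str, output: str) -> None:
        # Store a hash instead of the full prompt to keep traces small.
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        self._trace["steps"].append({"type": "llm", "prompt_hash": digest, "output": output})

    def tool_step(self, name: str, result: str) -> None:
        self._trace["steps"].append({"type": "tool", "name": name, "result": result})

    def set_artifact(self, filename: str) -> None:
        self._trace["artifact"] = filename

    def save(self, run_id: str, out_dir: str = "outputs/traces") -> Path:
        """Write the trace to {out_dir}/{run_id}.json and return the path."""
        path = Path(out_dir) / f"{run_id}.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(
            json.dumps(self._trace, indent=2, ensure_ascii=False), encoding="utf-8"
        )
        return path
```

Steps are only ever appended, so the saved file reflects the order in which the runner actually called the model and the tool.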
Docker / workspace updates
- `pyproject.toml`: add `agent` workspace member + root dep
- `Dockerfile`: COPY `libs/agent`, `skills/`, update `uv sync`
- `apps/gradio-space/pyproject.toml`: depend on `agent`
- Update root `README.md` hackathon story: “Lesson slide builder for a teacher you know”
Phase 2 (after v1 ships — not in first PR)
- New tabs: `tabs/quiz_maker.py`, `tabs/worksheet.py`; each maps to a new `skills/*/SKILL.md`
- `SkillRegistry` already supports multiple skills; tabs just call `AgentRunner.run(skill_id=...)`
- Off-Brand: custom layout via `gr.Blocks` theming or `gr.Server` if time allows
- Fine-tuned Gemma preset from `notebook/gemma-finetune.ipynb` for Well-Tuned badge
Demo video script (for submission)
- Introduce real user (teacher/tutor) and problem: “building a 5-slide lesson takes 30+ minutes”
- Enter topic + grade in Tab 1, click Generate (~30–90s on GPU)
- Show outline preview + download `.pptx`, open in LibreOffice/Google Slides
- Show agent trace proving local model + tool pipeline (no cloud LLM API)
Risks and mitigations
| Risk | Mitigation |
|---|---|
| Small model outputs invalid JSON | Pydantic validate + one repair retry with “fix JSON only” prompt |
| CPU Space too slow | Pin minicpm5-1b, use GPU basic, or fallback qwen3b-gguf + llama.cpp for outline-only step |
| pptx dependency size | python-pptx is lightweight (~few MB) |
| Scope creep (many tabs) | Ship Tab 1 only; stub Tab 2 chat for dev |
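The first mitigation (validate, then one repair retry) might look like this sketch. `generate` is a hypothetical stand-in for whatever callable wraps the backend, and the repair prompt wording is illustrative only.

```python
import json
from typing import Callable, List, Optional

from pydantic import BaseModel, ValidationError


class Slide(BaseModel):
    title: str
    bullets: List[str]
    speaker_note: Optional[str] = None


class SlideOutline(BaseModel):
    title: str
    slides: List[Slide]


def outline_with_repair(generate: Callable[[str], str], prompt: str) -> SlideOutline:
    """Validate the model's JSON; on failure, retry exactly once with a repair prompt."""
    raw = generate(prompt)
    try:
        return SlideOutline(**json.loads(raw))
    except (json.JSONDecodeError, ValidationError, TypeError) as err:
        # TypeError covers valid JSON that is not an object (e.g. a bare list).
        repair_prompt = (
            "Fix JSON only. Return valid JSON matching the outline schema, no prose.\n"
            f"Error: {err}\n"
            f"Invalid output:\n{raw}"
        )
        # A second failure propagates to the caller; there is no further retry.
        return SlideOutline(**json.loads(generate(repair_prompt)))
```

Capping the retry at one keeps worst-case latency at two generations, which matters on the slower hardware options listed in the table.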
Files to create / modify (summary)
Create:
- `libs/agent/` (package + runner, tools, skills loader, trace)
- `skills/education-pptx/SKILL.md`
- `apps/gradio-space/src/gradio_space/tabs/education_pptx.py`
- `apps/gradio-space/src/gradio_space/tabs/chat.py`
Modify:
- `apps/gradio-space/src/gradio_space/app.py`: Tabs shell
- `models.yaml`: `active_model: minicpm5-1b` for Space
- `Dockerfile`, `pyproject.toml`, workspace lockfile
- `README.md`: product story + agent docs