name: Skill Agent PPTX
overview: >-
  Add a Hermes-style skill agent library on top of your existing
  TransformersBackend, with one education PowerPoint task as the first tab in a
  multi-tab Gradio Space — structured so more task tabs can be added later.
todos:
  - id: agent-lib
    content: >-
      Create libs/agent: SkillRegistry, ToolRegistry, AgentRunner,
      TraceRecorder, pydantic outline models
    status: completed
  - id: pptx-skill
    content: Add skills/education-pptx/SKILL.md and create_pptx tool (python-pptx)
    status: completed
  - id: gradio-tabs
    content: 'Refactor Gradio app into Tabs: Education PPTX (primary) + Chat (debug)'
    status: completed
  - id: docker-workspace
    content: Wire agent package into uv workspace, Dockerfile, models.yaml active_model
    status: completed
  - id: trace-demo
    content: Add trace JSON export + README demo script for hackathon submission
    status: completed
isProject: false

Skill Agent + Education PowerPoint (Phase 1)

Goal

Replace the generic chat demo with a local skill-based agent that uses your transformers presets (default: minicpm5-1b or openbmb/MiniCPM5-1B from models.yaml) to run a real workflow: topic in → slide outline → downloadable .pptx out.

This follows the agentskills.io / Hermes SKILL.md pattern without embedding the full Hermes runtime (too heavy for HF Space Docker).

Hackathon alignment: Backyard AI (teacher/tutor you know), Best Agent, Tiny Titan (≤4B), OpenBMB (MiniCPM), Sharing is Caring (publish trace JSON to Hub), optional Well-Tuned if you ship a fine-tuned preset later.

Architecture

flowchart TB
  subgraph ui [apps/gradio-space]
    Tabs[gr.Tabs]
    EduTab[EducationPptxTab]
    ChatTab[ChatTab optional later]
    Tabs --> EduTab
  end

  subgraph agent [libs/agent]
    Runner[AgentRunner]
    Skills[SkillRegistry]
    Tools[ToolRegistry]
    Trace[TraceRecorder]
    Runner --> Skills
    Runner --> Tools
    Runner --> Trace
  end

  subgraph inference [libs/inference]
    Factory[factory.get_backend]
    TF[TransformersBackend]
    Factory --> TF
  end

  EduTab --> Runner
  Runner --> Factory
  Tools --> PptxTool[python-pptx]

Agent loop (simple, reliable for small models):

Load skill education-pptx from skills/education-pptx/SKILL.md
User provides: topic, grade level, number of slides (3–8)
Step A — LLM: generate structured slide outline (JSON schema in prompt; parse with fallback regex)
Step B — Tool: create_pptx(outline) writes file via python-pptx (deterministic; no LLM needed for file bytes)
Step C — LLM (optional): one-sentence “teacher notes” per slide
Return: trace steps + gr.File download + markdown preview

Small models are weak at multi-hop tool-calling JSON, so keep the v1 loop to a fixed two steps (outline → tool) rather than open-ended ReAct.
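The fixed loop can be sketched as below. This is a minimal illustration, not the planned AgentRunner: `extract_json` and `run_fixed_loop` are hypothetical names, `generate` stands in for whatever call the backend exposes, and `tool` stands in for create_pptx. The regex fallback simply grabs the outermost `{...}` span when the model wraps its JSON in prose.

```python
import json
import re
from typing import Callable


def extract_json(text: str) -> dict:
    """Parse model output as JSON, falling back to the outermost {...} span."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback: small models often wrap the JSON in prose or code fences.
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match is None:
            raise ValueError("no JSON object found in model output")
        return json.loads(match.group(0))


def run_fixed_loop(generate: Callable[[str], str], tool: Callable[[dict], str],
                   topic: str, grade: str, n_slides: int) -> dict:
    """Step A (LLM outline) then Step B (deterministic tool); no planning loop."""
    if not 3 <= n_slides <= 8:
        raise ValueError("slide count must be between 3 and 8")
    # Hypothetical prompt wording; the real template would live in prompts.py.
    prompt = (
        f"Return only JSON with keys title and slides for a {n_slides}-slide "
        f"lesson on {topic!r} for grade {grade}."
    )
    raw = generate(prompt)        # Step A: LLM produces the outline
    outline = extract_json(raw)
    artifact = tool(outline)      # Step B: tool writes the file
    return {"outline": outline, "artifact": artifact}
```

Because the control flow is hard-coded, the model never has to choose a tool or emit tool-call syntax; it only has to produce one JSON object.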

New package: `libs/agent`

Add workspace member libs/agent with:

| Module | Responsibility |
| --- | --- |
| `skills.py` | Load `SKILL.md` (YAML frontmatter + body), list skills by `task` tag |
| `tools.py` | Register callable tools with name, description, JSON schema |
| `runner.py` | `AgentRunner.run(skill_id, user_input, backend)`: orchestrates LLM + tools |
| `trace.py` | Append-only step log (`thought`, `tool_call`, `tool_result`, `artifact`) |
| `prompts.py` | Skill-specific system prompts and JSON outline template |
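As one possible shape for `tools.py`, the sketch below registers callables together with a name, description, and JSON schema. The plan only names the ToolRegistry class; the `Tool` dataclass and the `register`, `call`, and `describe` methods are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    schema: dict[str, Any]      # JSON Schema describing the tool's arguments
    fn: Callable[..., Any]


class ToolRegistry:
    """Hypothetical registry; method names are illustrative only."""

    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, name: str, description: str, schema: dict[str, Any]):
        """Decorator that registers a callable under `name`."""
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            if name in self._tools:
                raise ValueError(f"tool {name!r} already registered")
            self._tools[name] = Tool(name, description, schema, fn)
            return fn
        return decorator

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool {name!r}")
        return self._tools[name].fn(**kwargs)

    def describe(self) -> list[dict[str, Any]]:
        """Tool metadata in a form that can be embedded in a prompt."""
        return [
            {"name": t.name, "description": t.description, "parameters": t.schema}
            for t in self._tools.values()
        ]
```

Keeping the schema next to the callable means the same registry entry can feed both the prompt and the dispatch step.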

Dependencies: inference (workspace), python-pptx, pydantic (outline validation)

Use the existing interface in libs/inference/src/inference/base.py as-is. TransformersBackend needs no changes, apart from optionally raising max_tokens for outline generation via an environment variable or per-call kwargs (already supported in transformers.py).

First skill: `skills/education-pptx/SKILL.md`

---
name: education-pptx
description: Create a short lesson PowerPoint from a topic and grade level
task: education
tools:
  - create_pptx
model_hints:
  - minicpm5-1b
  - qwen3b-gguf
---

## Workflow
1. Ask for topic, audience grade, slide count.
2. Produce JSON outline: title, slides[{title, bullets[], speaker_note}].
3. Call create_pptx with validated outline.
4. Return download link and preview.
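To show how the loader in `skills.py` might split this file, here is a small stdlib-only sketch. It handles only the flat `key: value` pairs and `- item` lists used in the frontmatter above; it is not a general YAML parser, and a real implementation would more likely use a YAML library. The function name `parse_skill_md` is hypothetical.

```python
def parse_skill_md(text: str) -> tuple[dict, str]:
    """Split a SKILL.md into (frontmatter dict, markdown body).

    Supports flat `key: value` pairs and `- item` lists only.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter fence")
    try:
        end = next(i for i, ln in enumerate(lines[1:], start=1)
                   if ln.strip() == "---")
    except StopIteration:
        raise ValueError("unterminated frontmatter") from None

    meta: dict = {}
    current_key = None
    for ln in lines[1:end]:
        stripped = ln.strip()
        if not stripped:
            continue
        if stripped.startswith("- ") and isinstance(meta.get(current_key), list):
            # List item belonging to the most recent key with an empty value.
            meta[current_key].append(stripped[2:].strip())
        else:
            key, _, value = stripped.partition(":")
            current_key = key.strip()
            # An empty value opens a list filled by the following "- item" lines.
            meta[current_key] = value.strip() or []
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

Listing skills by `task` tag then reduces to filtering the parsed frontmatter dicts on `meta["task"]`.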

First tool: `create_pptx`

Implement in libs/agent/src/agent/tools/pptx.py:

Input: Pydantic model SlideOutline (title, slides list)
Output: path under /tmp/agent_outputs/{run_id}.pptx (HF Space writable temp)
Simple template: title slide + bullet slides + optional speaker notes in notes field
No images in v1 (keeps scope shippable by June 15)
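A rough sketch of the tool follows. Two simplifications are worth flagging: stdlib dataclasses stand in for the Pydantic SlideOutline model so the validation part has no extra dependency, and the slide layouts assume python-pptx's default template, where layout 0 is the title slide and layout 1 is title plus content. `outline_from_dict` is a hypothetical helper name.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Slide:
    title: str
    bullets: list[str] = field(default_factory=list)
    speaker_note: str = ""


@dataclass
class SlideOutline:
    title: str
    slides: list[Slide]


def outline_from_dict(data: dict) -> SlideOutline:
    """Build a SlideOutline from parsed JSON; raise ValueError on a bad shape."""
    try:
        slides = [
            Slide(title=str(s["title"]),
                  bullets=[str(b) for b in s.get("bullets", [])],
                  speaker_note=str(s.get("speaker_note", "")))
            for s in data["slides"]
        ]
        return SlideOutline(title=str(data["title"]), slides=slides)
    except (KeyError, TypeError, AttributeError) as exc:
        raise ValueError(f"invalid outline: {exc!r}") from exc


def create_pptx(outline: SlideOutline, run_id: str,
                out_dir: str = "/tmp/agent_outputs") -> Path:
    """Write a title slide plus bullet slides (with notes) and return the path."""
    from pptx import Presentation  # python-pptx, imported lazily

    prs = Presentation()
    cover = prs.slides.add_slide(prs.slide_layouts[0])      # title slide layout
    cover.shapes.title.text = outline.title

    for item in outline.slides:
        slide = prs.slides.add_slide(prs.slide_layouts[1])  # title + content
        slide.shapes.title.text = item.title
        frame = slide.placeholders[1].text_frame
        frame.clear()
        for i, bullet in enumerate(item.bullets):
            # A cleared frame keeps one empty paragraph; reuse it for bullet 0.
            para = frame.paragraphs[0] if i == 0 else frame.add_paragraph()
            para.text = bullet
        if item.speaker_note:
            slide.notes_slide.notes_text_frame.text = item.speaker_note

    path = Path(out_dir) / f"{run_id}.pptx"
    path.parent.mkdir(parents=True, exist_ok=True)
    prs.save(str(path))
    return path
```

Since the file bytes come entirely from python-pptx, the same outline always yields the same deck structure regardless of which model produced it.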

Gradio UI changes: `apps/gradio-space/src/gradio_space/app.py`

Refactor into:

gradio_space/
  app.py              # build_demo(), launch
  tabs/
    __init__.py
    education_pptx.py # first task tab
    chat.py           # move existing ChatInterface here (secondary tab)

Tab 1 — Lesson Slides (primary submission UI):

Inputs: Topic, Grade (dropdown), Slides (slider 3–8)
Button: Generate
Outputs: Markdown outline preview, File download, Agent trace (accordion)
Status line: model name + device from existing warmup() / model_status()

Tab 2 — Chat (keep for debugging): existing chat wired to same ACTIVE_MODEL

Use gr.Tabs() at top level; only Tab 1 needs polish for demo video.
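The tab shell described above could look roughly like this. The sketch assumes a hypothetical `generate_fn(topic, grade, n_slides)` callback that returns a Markdown preview, a file path, and a trace dict; the grade choices (1 to 12) and the app title are placeholders, and the chat tab is stubbed rather than wired to the existing ChatInterface.

```python
def outline_to_markdown(outline: dict) -> str:
    """Render an outline dict as the Markdown preview for Tab 1."""
    parts = [f"# {outline['title']}"]
    for n, slide in enumerate(outline["slides"], start=1):
        parts.append(f"## {n}. {slide['title']}")
        parts.extend(f"- {b}" for b in slide.get("bullets", []))
    return "\n".join(parts)


def build_demo(generate_fn):
    """Assemble the two-tab shell around a hypothetical generate_fn callback."""
    import gradio as gr  # imported lazily so the helper above stays stdlib-only

    with gr.Blocks(title="Lesson Agent") as demo:
        with gr.Tabs():
            with gr.Tab("Lesson Slides"):
                topic = gr.Textbox(label="Topic")
                grade = gr.Dropdown(choices=[str(g) for g in range(1, 13)],
                                    value="6", label="Grade")
                n_slides = gr.Slider(minimum=3, maximum=8, step=1, value=5,
                                     label="Slides")
                button = gr.Button("Generate")
                preview = gr.Markdown()
                download = gr.File(label="Download .pptx")
                with gr.Accordion("Agent trace", open=False):
                    trace = gr.JSON()
                button.click(generate_fn,
                             inputs=[topic, grade, n_slides],
                             outputs=[preview, download, trace])
            with gr.Tab("Chat"):
                # Placeholder; the existing chat UI would be mounted here.
                gr.Markdown("Chat (debug) goes here.")
    return demo
```

Passing the callback in, rather than importing the runner inside the tab module, keeps the UI testable without loading a model.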

Model default for Space: set active_model: minicpm5-1b in models.yaml (OpenBMB + Tiny Titan). Space hardware: switch to GPU basic if transformers inference on CPU is too slow.

Trace export (Sharing is Caring badge)

After each run, write the trace JSON to outputs/traces/{run_id}.json and expose a “Copy trace” control in the UI. Uploading traces to a Hub dataset is optional and handled by a separate script:

scripts/upload_trace.py — pushes latest trace to a HF dataset repo (manual one-time setup; not required for v1 demo)

Trace schema (minimal):

{
  "skill": "education-pptx",
  "model": "minicpm5-1b",
  "input": {"topic": "...", "grade": "6", "slides": 5},
  "steps": [{"type": "llm", "prompt_hash": "...", "output": "..."}, {"type": "tool", "name": "create_pptx", "result": "..."}],
  "artifact": "lesson_photosynthesis.pptx"
}
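A recorder that emits this schema might be sketched as follows. The class and method names are illustrative, and SHA-256 for `prompt_hash` is an assumption, since the plan does not say which hash is used.

```python
import hashlib
import json
from pathlib import Path


class TraceRecorder:
    """Append-only step log that serializes to the minimal trace schema."""

    def __init__(self, skill: str, model: str, user_input: dict) -> None:
        self._doc = {"skill": skill, "model": model, "input": user_input,
                     "steps": [], "artifact": None}

    def llm_step(self, prompt: str, output: str) -> None:
        # Store a hash instead of the full prompt to keep traces small.
        # SHA-256 is an assumption; any stable digest would do.
        digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        self._doc["steps"].append(
            {"type": "llm", "prompt_hash": digest, "output": output})

    def tool_step(self, name: str, result: str) -> None:
        self._doc["steps"].append(
            {"type": "tool", "name": name, "result": result})

    def set_artifact(self, filename: str) -> None:
        self._doc["artifact"] = filename

    def save(self, run_id: str, out_dir: str = "outputs/traces") -> Path:
        path = Path(out_dir) / f"{run_id}.json"
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(self._doc, indent=2, ensure_ascii=False),
                        encoding="utf-8")
        return path
```

Steps are only ever appended, so the order in the file matches the order of execution.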

Docker / workspace updates

pyproject.toml: add agent workspace member + root dep
Dockerfile: COPY libs/agent, skills/, update uv sync
apps/gradio-space/pyproject.toml: depend on agent
Update root README.md hackathon story: “Lesson slide builder for a teacher you know”

Phase 2 (after v1 ships — not in first PR)

New tabs: tabs/quiz_maker.py, tabs/worksheet.py — each maps to a new skills/*/SKILL.md
SkillRegistry already supports multiple skills; tabs just call AgentRunner.run(skill_id=...)
Off-Brand: custom layout via gr.Blocks theming or gr.Server if time allows
Fine-tuned Gemma preset from notebook/gemma-finetune.ipynb for Well-Tuned badge

Demo video script (for submission)

Introduce real user (teacher/tutor) and problem: “building a 5-slide lesson takes 30+ minutes”
Enter topic + grade in Tab 1, click Generate (~30–90s on GPU)
Show outline preview + download .pptx, open in LibreOffice/Google Slides
Show agent trace proving local model + tool pipeline (no cloud LLM API)

Risks and mitigations

| Risk | Mitigation |
| --- | --- |
| Small model outputs invalid JSON | Pydantic validation + one repair retry with a “fix JSON only” prompt |
| CPU Space too slow | Pin `minicpm5-1b`, use GPU basic, or fall back to `qwen3b-gguf` + llama.cpp for the outline-only step |
| pptx dependency size | `python-pptx` is lightweight (a few MB) |
| Scope creep (many tabs) | Ship Tab 1 only; stub Tab 2 chat for dev |
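The first mitigation, validate then retry once with a repair prompt, can be sketched as below. `generate` and `validate` are hypothetical callables: `generate` sends a prompt to the model and returns text, and `validate` is expected to raise ValueError on a bad outline. The repair prompt wording is illustrative.

```python
import json
from typing import Any, Callable


def generate_valid_outline(generate: Callable[[str], str],
                           validate: Callable[[dict], Any],
                           prompt: str, max_repairs: int = 1) -> Any:
    """Call the model, validate the JSON, and retry with a repair prompt."""
    raw = generate(prompt)
    for attempt in range(max_repairs + 1):
        try:
            # json.JSONDecodeError is a ValueError subclass, so parse and
            # validation failures are handled by the same branch.
            return validate(json.loads(raw))
        except ValueError as exc:
            if attempt == max_repairs:
                raise
            raw = generate(
                "Fix JSON only. Return valid JSON and nothing else.\n"
                f"Error: {exc}\nInput:\n{raw}"
            )
```

Capping repairs at one bounds the worst-case latency at two extra-short generations and surfaces the error to the UI instead of looping.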

Files to create / modify (summary)

Create:

libs/agent/ (package + runner, tools, skills loader, trace)
skills/education-pptx/SKILL.md
apps/gradio-space/src/gradio_space/tabs/education_pptx.py
apps/gradio-space/src/gradio_space/tabs/chat.py

Modify:

apps/gradio-space/src/gradio_space/app.py — Tabs shell
models.yaml — active_model: minicpm5-1b for Space
Dockerfile, pyproject.toml, workspace lockfile
README.md — product story + agent docs