MSG committed on
Commit 1e52a1f · 1 Parent(s): bd75839

Feat/monday 4 sprint fast (#21)


* quizz makers wip

* skills quizz

* quizz harness

* quizz run skills ui

* index html quizz

* quizz maker skills

* fix common

* fix common and modal app

* experimental wip

* experiment wip

* common check and multiple publish

.cursor/plans/quiz_maker_skill_52f29d14.plan.md CHANGED
@@ -4,19 +4,19 @@ overview: "Sprint 1 (teaching loop): ship a quiz-maker skill mirroring education
4
  todos:
5
  - id: quiz-skill-backend
6
  content: Create quiz-maker skill, QuizOutline models, prompts, create_quiz tool, iter_quiz_maker runner
7
- status: pending
8
  - id: quiz-tests
9
  content: "Agent tests: JSON repair, fallback_quiz, docx/html smoke"
10
- status: pending
11
  - id: quiz-classic-tab
12
  content: Add tabs/quiz_maker.py with source modes + wire Classic Gradio tab
13
- status: pending
14
  - id: quiz-studio-ui
15
  content: Add api_generate_quiz + Studio Quiz sidebar view with DOCX/HTML downloads
16
- status: pending
17
  - id: quiz-teaching-cta
18
  content: "Slides view CTA: Create quiz on this topic (pre-fill topic/grade/session)"
19
- status: pending
20
  isProject: false
21
  ---
22
 
 
4
  todos:
5
  - id: quiz-skill-backend
6
  content: Create quiz-maker skill, QuizOutline models, prompts, create_quiz tool, iter_quiz_maker runner
7
+ status: completed
8
  - id: quiz-tests
9
  content: "Agent tests: JSON repair, fallback_quiz, docx/html smoke"
10
+ status: completed
11
  - id: quiz-classic-tab
12
  content: Add tabs/quiz_maker.py with source modes + wire Classic Gradio tab
13
+ status: completed
14
  - id: quiz-studio-ui
15
  content: Add api_generate_quiz + Studio Quiz sidebar view with DOCX/HTML downloads
16
+ status: completed
17
  - id: quiz-teaching-cta
18
  content: "Slides view CTA: Create quiz on this topic (pre-fill topic/grade/session)"
19
+ status: completed
20
  isProject: false
21
  ---
22
 
apps/gradio-space/README.md CHANGED
@@ -31,6 +31,7 @@ This package uses **Gradio 6 Server mode** (`gradio.Server`):
31
  - `discover_sources`, `auto_search_ingest`, `ingest_sources`, `ingest_url`, `ingest_files`
32
  - `research_chat`, `generate_slides` (supports `source_mode`: none / web / rag)
33
  - `generate_slides_from_conversation` — build a deck from Research, Language lessons, or Chat history
 
34
 
35
  **Voice & coach**
36
 
@@ -61,8 +62,11 @@ Set `ALLOW_MODEL_SWITCH=true` in `.env` (see [USAGE.md](../../USAGE.md)). The Se
61
  1. Open `/` — **Small Model Finetuning** project workspace
62
  2. **Research** — ingest a PDF or URL on your topic → ask 2 RAG questions with citations
63
  3. Tap **Generate slides from chat** → switch to **Slides** → preview deck → **Present** (fullscreen, arrow keys)
64
- 4. Download **PPTX** and expand **Agent trace**
65
- 5. Optional: **Language lessons** → French voice turn → **Slides from chat** on the same topic
 
 
 
66
 
67
  ### Language lessons + Cohere stack (voice demo)
68
 
 
31
  - `discover_sources`, `auto_search_ingest`, `ingest_sources`, `ingest_url`, `ingest_files`
32
  - `research_chat`, `generate_slides` (supports `source_mode`: none / web / rag)
33
  - `generate_slides_from_conversation` — build a deck from Research, Language lessons, or Chat history
34
+ - `generate_quiz` — printable MCQ worksheet (DOCX + HTML) with optional RAG / web sources
35
 
36
  **Voice & coach**
37
 
 
62
  1. Open `/` — **Small Model Finetuning** project workspace
63
  2. **Research** — ingest a PDF or URL on your topic → ask 2 RAG questions with citations
64
  3. Tap **Generate slides from chat** → switch to **Slides** → preview deck → **Present** (fullscreen, arrow keys)
65
+ 4. Tap **Create quiz on this topic** → **Quiz** view → generate worksheet → download **DOCX** (answer key included)
66
+ 5. Download **PPTX** and expand **Agent trace**
67
+ 6. Optional: **Language lessons** → French voice turn → **Slides from chat** on the same topic
68
+
69
+ Classic UI (`/classic`) adds a **Quiz maker** tab after **Lesson slides** with the same agent pipeline.
70
 
71
  ### Language lessons + Cohere stack (voice demo)
72
 
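For readers scripting against the new `generate_quiz` endpoint, here is a minimal sketch of assembling its keyword arguments. The defaults mirror the `_generate_quiz` signature registered in this commit. The helper name `build_quiz_request`, the clamping to 5 to 10 questions (borrowed from the UI slider range), and the errors raised on bad input are assumptions for illustration and are not part of the repository.

```python
from typing import Any

# Grade choices and question bounds as shown in the quiz UIs of this commit.
GRADES = ["K"] + [str(n) for n in range(1, 13)] + ["Adult"]
MIN_QUESTIONS, MAX_QUESTIONS = 5, 10


def build_quiz_request(
    topic: str,
    grade: str = "6",
    question_count: int = 5,
    **overrides: Any,
) -> dict[str, Any]:
    """Return keyword arguments for the generate_quiz endpoint (hypothetical helper)."""
    topic = topic.strip()
    if not topic:
        raise ValueError("topic must not be empty")
    if grade not in GRADES:
        raise ValueError(f"unknown grade: {grade!r}")

    # Clamp to the slider range; the endpoint itself only calls int() on the value.
    count = max(MIN_QUESTIONS, min(MAX_QUESTIONS, int(question_count)))

    payload: dict[str, Any] = {
        "topic": topic,
        "grade": grade,
        "question_count": count,
        "session_id": "",
        "use_rag": True,
        "doc_ids": None,
        "source_mode": "",
        "search_workflow": "two_step",
        "urls_text": "",
        "selected_urls": None,
        "file_paths": None,
    }
    unknown = set(overrides) - set(payload)
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")
    payload.update(overrides)
    return payload
```

How the resulting dictionary is sent (HTTP client, Gradio client, or a direct Python call) depends on the deployment and is left open here.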
apps/gradio-space/src/gradio_space/api/studio.py CHANGED
@@ -39,6 +39,7 @@ from gradio_space.research_helpers import (
39
  )
40
  from gradio_space.conversation_helpers import format_conversation_context
41
  from gradio_space.tabs.education_pptx import SOURCE_MODES, SEARCH_WORKFLOWS, generate_lesson_slides
 
42
  from gradio_space.tabs.research_mind import (
43
  ask_question,
44
  auto_search_ingest,
@@ -642,6 +643,132 @@ def api_generate_slides(
642
  )
643
 
644
 
645
  def api_generate_slides_from_conversation(
646
  history: list | None,
647
  history_kind: str,
@@ -1226,6 +1353,34 @@ def register_studio_apis(server: gr.Server) -> None:
1226
  file_paths,
1227
  )
1228
 
1229
  @server.api(name="language_lesson_turn")
1230
  def _language_lesson_turn(
1231
  message: str = "",
 
39
  )
40
  from gradio_space.conversation_helpers import format_conversation_context
41
  from gradio_space.tabs.education_pptx import SOURCE_MODES, SEARCH_WORKFLOWS, generate_lesson_slides
42
+ from gradio_space.tabs.quiz_maker import generate_quiz
43
  from gradio_space.tabs.research_mind import (
44
  ask_question,
45
  auto_search_ingest,
 
643
  )
644
 
645
 
646
+ def _build_quiz_api_response(
647
+ last: tuple,
648
+ *,
649
+ topic: str,
650
+ sid: str,
651
+ rag_notice: str = "",
652
+ ) -> dict[str, Any]:
653
+ (
654
+ outline_md,
655
+ preview_html,
656
+ docx,
657
+ html_export,
658
+ processing_log,
659
+ trace_sum,
660
+ trace_json,
661
+ status,
662
+ ) = last
663
+
664
+ if preview_html and "form-error" in preview_html:
665
+ return err(status or "Generation failed.", status=status, progress_log=processing_log)
666
+
667
+ if rag_notice:
668
+ status = f"{rag_notice}\n\n{status or 'Quiz generated.'}".strip()
669
+
670
+ downloads = {
671
+ "docx": docx,
672
+ "html": html_export,
673
+ }
674
+ trace_str = trace_json if isinstance(trace_json, str) else ""
675
+ return ok(
676
+ topic=topic,
677
+ session_id=sid,
678
+ outline_md=outline_md,
679
+ preview_html=preview_html,
680
+ downloads=downloads,
681
+ status=status,
682
+ rag_fallback=bool(rag_notice),
683
+ progress_log=processing_log,
684
+ trace_summary=trace_sum,
685
+ trace_json=trace_str,
686
+ trace_html=render_trace_details(
687
+ trace_summary=trace_sum,
688
+ trace_json=trace_str,
689
+ progress_log=processing_log,
690
+ ),
691
+ elapsed_seconds=_elapsed_seconds_from_log(processing_log),
692
+ progress=_progress_from_trace(trace_str),
693
+ )
694
+
695
+
696
+ def _run_quiz_generation(**kwargs) -> dict[str, Any]:
697
+ topic = kwargs.pop("topic")
698
+ sid = kwargs.pop("sid", "")
699
+ rag_notice = kwargs.pop("rag_notice", "")
700
+
701
+ gen = generate_quiz(topic, **kwargs)
702
+ last: tuple | None = None
703
+ for item in gen:
704
+ last = item
705
+ if last is None:
706
+ return err("Generation failed before producing output.")
707
+ return _build_quiz_api_response(last, topic=topic, sid=sid, rag_notice=rag_notice)
708
+
709
+
710
+ def api_generate_quiz(
711
+ topic: str,
712
+ grade: str = "6",
713
+ question_count: int = 5,
714
+ session_id: str = "",
715
+ use_rag: bool = True,
716
+ doc_ids: list[str] | None = None,
717
+ source_mode: str = "",
718
+ search_workflow: str = "two_step",
719
+ urls_text: str = "",
720
+ selected_urls: list[str] | None = None,
721
+ file_paths: list[str] | None = None,
722
+ ) -> dict[str, Any]:
723
+ rag_docs = doc_ids or []
724
+ sid = (session_id or "").strip()
725
+ if not (source_mode or "").strip() and use_rag and not sid:
726
+ sid = _pick_session(topic)
727
+
728
+ source_label, workflow_label, effective_sid, effective_docs = _resolve_source_labels(
729
+ source_mode,
730
+ search_workflow,
731
+ use_rag,
732
+ sid,
733
+ rag_docs,
734
+ )
735
+
736
+ rag_notice = ""
737
+ if (source_mode or "").strip().lower() == "rag" or (
738
+ not (source_mode or "").strip() and use_rag
739
+ ):
740
+ has_sources = _session_has_rag_sources(sid, rag_docs)
741
+ if use_rag and not has_sources and source_label == _SOURCE_LABELS["rag"]:
742
+ rag_notice = (
743
+ "Cross-Reference Sources is on, but this session has no indexed documents — "
744
+ "generated from model knowledge only. Ingest sources in Step 1 to enable RAG."
745
+ )
746
+ source_label = _SOURCE_LABELS["none"]
747
+ effective_sid = ""
748
+ effective_docs = []
749
+
750
+ upload_files = file_paths if file_paths else None
751
+
752
+ return _run_quiz_generation(
753
+ topic=topic,
754
+ sid=sid,
755
+ rag_notice=rag_notice,
756
+ grade=grade,
757
+ question_count=int(question_count),
758
+ source_mode_label=source_label,
759
+ search_workflow_label=workflow_label,
760
+ urls_text=urls_text or "",
761
+ selected_urls=selected_urls or [],
762
+ upload_files=upload_files,
763
+ session_id=effective_sid,
764
+ doc_ids=effective_docs,
765
+ workspace_topic=topic,
766
+ workspace_session=effective_sid,
767
+ workspace_doc_ids=effective_docs,
768
+ progress=_NoopProgress(),
769
+ )
770
+
771
+
772
  def api_generate_slides_from_conversation(
773
  history: list | None,
774
  history_kind: str,
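The RAG fallback in `api_generate_quiz` can be read as a small decision function: when RAG is requested but the session has no indexed documents, generation falls back to model-only mode and a notice is attached. The sketch below is a simplified restatement under assumptions. It works on raw mode strings rather than the `_SOURCE_LABELS` values, and the function name `resolve_rag_mode` is hypothetical.

```python
def resolve_rag_mode(source_mode: str, use_rag: bool, has_sources: bool) -> tuple[str, str]:
    """Return (effective_mode, notice) for a quiz request (simplified sketch)."""
    mode = (source_mode or "").strip().lower()
    # RAG is wanted when asked for explicitly, or when no mode is set and the toggle is on.
    wants_rag = mode == "rag" or (not mode and use_rag)

    if wants_rag and use_rag and not has_sources:
        notice = (
            "No indexed documents in this session; "
            "generated from model knowledge only."
        )
        return "none", notice
    if wants_rag:
        return "rag", ""
    return mode or "none", ""
```

Returning the notice next to the effective mode keeps the caller free to prepend it to the final status, which is what `_build_quiz_api_response` does with `rag_notice`.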
 
1353
  file_paths,
1354
  )
1355
 
1356
+ @server.api(name="generate_quiz")
1357
+ def _generate_quiz(
1358
+ topic: str,
1359
+ grade: str = "6",
1360
+ question_count: int = 5,
1361
+ session_id: str = "",
1362
+ use_rag: bool = True,
1363
+ doc_ids: list[str] | None = None,
1364
+ source_mode: str = "",
1365
+ search_workflow: str = "two_step",
1366
+ urls_text: str = "",
1367
+ selected_urls: list[str] | None = None,
1368
+ file_paths: list[str] | None = None,
1369
+ ) -> dict[str, Any]:
1370
+ return api_generate_quiz(
1371
+ topic,
1372
+ grade,
1373
+ question_count,
1374
+ session_id,
1375
+ use_rag,
1376
+ doc_ids,
1377
+ source_mode,
1378
+ search_workflow,
1379
+ urls_text,
1380
+ selected_urls,
1381
+ file_paths,
1382
+ )
1383
+
1384
  @server.api(name="language_lesson_turn")
1385
  def _language_lesson_turn(
1386
  message: str = "",
apps/gradio-space/src/gradio_space/app.py CHANGED
@@ -9,6 +9,7 @@ from gradio_space.tabs import (
9
  build_chat_tab,
10
  build_education_pptx_tab,
11
  build_echo_coach_tab,
 
12
  build_research_mind_tab,
13
  build_teacher_voice_tab,
14
  )
@@ -63,6 +64,8 @@ def build_demo() -> gr.Blocks:
63
  with gr.Tabs():
64
  with gr.Tab("Lesson slides"):
65
  build_education_pptx_tab(workspace)
 
 
66
  with gr.Tab("ResearchMind"):
67
  build_research_mind_tab(workspace)
68
  with gr.Tab("EchoCoach"):
 
9
  build_chat_tab,
10
  build_education_pptx_tab,
11
  build_echo_coach_tab,
12
+ build_quiz_maker_tab,
13
  build_research_mind_tab,
14
  build_teacher_voice_tab,
15
  )
 
64
  with gr.Tabs():
65
  with gr.Tab("Lesson slides"):
66
  build_education_pptx_tab(workspace)
67
+ with gr.Tab("Quiz maker"):
68
+ build_quiz_maker_tab(workspace)
69
  with gr.Tab("ResearchMind"):
70
  build_research_mind_tab(workspace)
71
  with gr.Tab("EchoCoach"):
apps/gradio-space/src/gradio_space/server.py CHANGED
@@ -23,7 +23,7 @@ from gradio_space.ui.theme import get_theme, load_css
23
  _PKG_ROOT = Path(__file__).resolve().parent
24
  _APP_ROOT = _PKG_ROOT.parents[1]
25
  _STATIC_DIR = _APP_ROOT / "static" / "studio"
26
- _STUDIO_ASSET_VERSION = "20260615c"
27
  _STUDIO_INDEX_HTML = _STATIC_DIR / "index.html"
28
 
29
 
 
23
  _PKG_ROOT = Path(__file__).resolve().parent
24
  _APP_ROOT = _PKG_ROOT.parents[1]
25
  _STATIC_DIR = _APP_ROOT / "static" / "studio"
26
+ _STUDIO_ASSET_VERSION = "20260615d"
27
  _STUDIO_INDEX_HTML = _STATIC_DIR / "index.html"
28
 
29
 
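The bump of `_STUDIO_ASSET_VERSION` from `20260615c` to `20260615d` looks like a cache-busting token for the Studio static files. How `server.py` applies it is not shown in this diff, so the following is only a sketch of one common approach: appending the version as a `v` query parameter. The function name `with_asset_version` and the example path are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_asset_version(url: str, version: str) -> str:
    """Return url with its `v` query parameter set to the given version."""
    parts = urlsplit(url)
    # Drop any earlier version token, keep every other query parameter.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "v"
    ]
    query.append(("v", version))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

With this scheme, browsers treat each bumped version as a new URL and refetch the changed `index.html` assets.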
apps/gradio-space/src/gradio_space/tabs/__init__.py CHANGED
@@ -1,6 +1,7 @@
1
  from gradio_space.tabs.chat import build_chat_tab
2
  from gradio_space.tabs.education_pptx import build_education_pptx_tab
3
  from gradio_space.tabs.echo_coach import build_echo_coach_tab
 
4
  from gradio_space.tabs.research_mind import build_research_mind_tab
5
  from gradio_space.tabs.teacher_voice import build_teacher_voice_tab
6
 
@@ -8,6 +9,7 @@ __all__ = [
8
  "build_chat_tab",
9
  "build_education_pptx_tab",
10
  "build_echo_coach_tab",
 
11
  "build_research_mind_tab",
12
  "build_teacher_voice_tab",
13
  ]
 
1
  from gradio_space.tabs.chat import build_chat_tab
2
  from gradio_space.tabs.education_pptx import build_education_pptx_tab
3
  from gradio_space.tabs.echo_coach import build_echo_coach_tab
4
+ from gradio_space.tabs.quiz_maker import build_quiz_maker_tab
5
  from gradio_space.tabs.research_mind import build_research_mind_tab
6
  from gradio_space.tabs.teacher_voice import build_teacher_voice_tab
7
 
 
9
  "build_chat_tab",
10
  "build_education_pptx_tab",
11
  "build_echo_coach_tab",
12
+ "build_quiz_maker_tab",
13
  "build_research_mind_tab",
14
  "build_teacher_voice_tab",
15
  ]
apps/gradio-space/src/gradio_space/tabs/quiz_maker.py ADDED
@@ -0,0 +1,430 @@
1
+ from html import escape
2
+ from pathlib import Path
3
+
4
+ import gradio as gr
5
+
6
+ from agent.progress import QuizGenerationProgress
7
+ from agent.runner import AgentRunner, QuizAgentResult
8
+ from gradio_space.model_loading import ensure_model_loaded, get_active_model_key
9
+ from gradio_space.research_helpers import (
10
+ list_session_choices,
11
+ merge_lesson_urls,
12
+ refresh_doc_choices,
13
+ refresh_sessions,
14
+ resolve_doc_ids,
15
+ resolve_session,
16
+ resolve_topic,
17
+ )
18
+ from gradio_space.spaces_runtime import gpu_task
19
+ from gradio_space.tabs.education_pptx import (
20
+ SEARCH_WORKFLOWS,
21
+ SOURCE_MODES,
22
+ discover_lesson_sources,
23
+ strip_md_inline,
24
+ update_source_visibility,
25
+ )
26
+ from gradio_space.ui.components import build_advanced_panel, DOC_CHOICE_LIST_CLASSES, WorkspaceWidgets
27
+ from inference.factory import get_backend
28
+
29
+ _SOURCE_LABEL_TO_VALUE = {label: value for label, value in SOURCE_MODES}
30
+ _WORKFLOW_LABEL_TO_VALUE = {label: value for label, value in SEARCH_WORKFLOWS}
31
+
32
+
33
+ def _source_mode_value(label: str) -> str:
34
+ return _SOURCE_LABEL_TO_VALUE.get(label, "none")
35
+
36
+
37
+ def _search_workflow_value(label: str) -> str:
38
+ return _WORKFLOW_LABEL_TO_VALUE.get(label, "two_step")
39
+
40
+
41
+ def _error_html(message: str) -> str:
42
+ safe = (
43
+ message.replace("&", "&amp;")
44
+ .replace("<", "&lt;")
45
+ .replace(">", "&gt;")
46
+ )
47
+ return (
48
+ f'<div style="padding:12px;border:1px solid #c44;border-radius:8px;'
49
+ f'background:#fff5f5;color:#8a1f1f;">{safe}</div>'
50
+ )
51
+
52
+
53
+ def _empty_outputs(message: str) -> tuple:
54
+ log_html = (
55
+ f'<div class="slide-gen-log"><div class="slide-gen-log-banner error">'
56
+ f"{message}</div></div>"
57
+ )
58
+ return (
59
+ message,
60
+ _error_html(message),
61
+ None,
62
+ None,
63
+ log_html,
64
+ message,
65
+ message,
66
+ message,
67
+ )
68
+
69
+
70
+ def _running_preview_html(step_label: str = "Generating quiz…") -> str:
71
+ safe = (
72
+ step_label.replace("&", "&amp;")
73
+ .replace("<", "&lt;")
74
+ .replace(">", "&gt;")
75
+ )
76
+ return (
77
+ '<div class="lesson-running-preview">'
78
+ '<div class="lesson-running-spinner" aria-hidden="true"></div>'
79
+ f"<p><strong>{safe}</strong></p>"
80
+ "<p class=\"lesson-running-hint\">Local models can take 30–90s on CPU. "
81
+ "Steps update live below.</p>"
82
+ "</div>"
83
+ )
84
+
85
+
86
+ def _interim_outputs(
87
+ quiz_progress: QuizGenerationProgress,
88
+ *,
89
+ status: str = "_Generating quiz…_",
90
+ step_label: str = "Generating quiz…",
91
+ ) -> tuple:
92
+ log_html = quiz_progress.format_log_html(running=True)
93
+ return (
94
+ "",
95
+ _running_preview_html(step_label),
96
+ None,
97
+ None,
98
+ log_html,
99
+ "",
100
+ "",
101
+ status,
102
+ )
103
+
104
+
105
+ def _format_processing_log(
106
+ progress: QuizGenerationProgress,
107
+ *,
108
+ trace_summary: str = "",
109
+ source_status: str = "",
110
+ ) -> str:
111
+ footer_parts: list[str] = []
112
+ if source_status:
113
+ footer_parts.append(
114
+ f"<p><strong>Sources:</strong> {escape(strip_md_inline(source_status))}</p>"
115
+ )
116
+ if trace_summary:
117
+ footer_parts.append(
118
+ f'<pre class="slide-gen-log-trace">{escape(trace_summary)}</pre>'
119
+ )
120
+ footer_html = "".join(footer_parts)
121
+ return progress.format_log_html(running=False, footer_html=footer_html)
122
+
123
+
124
+ @gpu_task(duration=300)
125
+ def generate_quiz(
126
+ topic: str,
127
+ grade: str,
128
+ question_count: int,
129
+ source_mode_label: str,
130
+ search_workflow_label: str,
131
+ urls_text: str,
132
+ selected_urls: list[str],
133
+ upload_files: list[str] | None,
134
+ session_id: str,
135
+ doc_ids: list[str] | None,
136
+ workspace_topic: str = "",
137
+ workspace_session: str = "",
138
+ workspace_doc_ids: list[str] | None = None,
139
+ progress: gr.Progress = gr.Progress(),
140
+ ):
141
+ topic = resolve_topic(topic, workspace_topic)
142
+ session_id = resolve_session(session_id, workspace_session)
143
+ doc_ids = resolve_doc_ids(doc_ids, workspace_doc_ids)
144
+ quiz_progress = QuizGenerationProgress(
145
+ on_update=lambda fraction, desc: progress(fraction, desc=desc),
146
+ )
147
+ quiz_progress.begin("load_model", "Load language model")
148
+
149
+ model_key = get_active_model_key()
150
+ load_error = ensure_model_loaded(model_key)
151
+ if load_error:
152
+ yield _empty_outputs(load_error)
153
+ return
154
+
155
+ if not topic.strip():
156
+ message = "Please enter a quiz topic."
157
+ yield _empty_outputs(message)
158
+ return
159
+
160
+ source_mode = _source_mode_value(source_mode_label)
161
+ search_workflow = _search_workflow_value(search_workflow_label)
162
+ merged_urls = merge_lesson_urls(urls_text, selected_urls)
163
+ files = [Path(p) for p in (upload_files or [])]
164
+
165
+ current_step = "Load language model"
166
+ yield _interim_outputs(quiz_progress, step_label=current_step)
167
+
168
+ result = None
169
+ try:
170
+ runner = AgentRunner()
171
+ for item in runner.iter_quiz_maker(
172
+ topic=topic,
173
+ grade=grade,
174
+ question_count=int(question_count),
175
+ model_key=model_key,
176
+ backend=get_backend(model_key),
177
+ source_mode=source_mode, # type: ignore[arg-type]
178
+ search_workflow=search_workflow, # type: ignore[arg-type]
179
+ urls=merged_urls,
180
+ files=files,
181
+ session_id=session_id or None,
182
+ doc_ids=doc_ids or [],
183
+ progress=quiz_progress,
184
+ ):
185
+ if isinstance(item, QuizAgentResult):
186
+ result = item
187
+ break
188
+ current_step = item.steps[-1].label if item.steps else current_step
189
+ yield _interim_outputs(quiz_progress, step_label=current_step)
190
+ except Exception as exc: # noqa: BLE001
191
+ message = f"Agent error: {exc}"
192
+ quiz_progress.finish()
193
+ yield (
194
+ message,
195
+ _error_html(message),
196
+ None,
197
+ None,
198
+ quiz_progress.format_log_html(running=False),
199
+ message,
200
+ message,
201
+ message,
202
+ )
203
+ return
204
+
205
+ if result is None:
206
+ message = "Agent error: generation finished without a result."
207
+ yield _empty_outputs(message)
208
+ return
209
+
210
+ progress(1.0, desc="Done")
211
+ trace_summary = (
212
+ f"Run `{result.trace.run_id}` · skill `{result.trace.skill}` · "
213
+ f"model `{result.trace.model}`\n\n"
214
+ f"Trace saved: `{result.trace_path}`"
215
+ )
216
+ source_status = result.source_summary or "_No external sources used (model only)._"
217
+ processing_log = _format_processing_log(
218
+ quiz_progress,
219
+ trace_summary=trace_summary,
220
+ source_status=source_status,
221
+ )
222
+ yield (
223
+ result.markdown_preview,
224
+ result.html_preview,
225
+ str(Path(result.docx_path).resolve()),
226
+ str(Path(result.html_export_path).resolve()),
227
+ processing_log,
228
+ trace_summary,
229
+ result.trace.to_json(),
230
+ source_status,
231
+ )
232
+
233
+
234
+ def build_quiz_maker_tab(workspace: WorkspaceWidgets) -> None:
235
+ gr.Markdown("### Quiz maker", elem_classes=["lesson-tab-heading"])
236
+ gr.HTML(
237
+ '<p class="tab-subtitle">Create a printable multiple-choice quiz with answer key '
238
+ "from your topic and optional research sources.</p>"
239
+ )
240
+
241
+ with gr.Column(elem_classes=["lesson-form-primary"]):
242
+ topic = gr.Textbox(
243
+ label="Quiz topic",
244
+ placeholder="e.g. Photosynthesis, Fractions, The water cycle…",
245
+ lines=2,
246
+ max_lines=3,
247
+ elem_classes=["lesson-topic-input"],
248
+ )
249
+
250
+ with gr.Row(elem_classes=["lesson-form-secondary"]):
251
+ grade = gr.Dropdown(
252
+ label="Grade",
253
+ choices=["K", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "Adult"],
254
+ value="6",
255
+ scale=1,
256
+ min_width=100,
257
+ )
258
+ question_count = gr.Slider(
259
+ minimum=5,
260
+ maximum=10,
261
+ step=1,
262
+ value=5,
263
+ label="Questions",
264
+ scale=2,
265
+ )
266
+
267
+ with gr.Accordion("Research sources (optional)", open=False, elem_classes=["lesson-optional-accordion"]):
268
+ source_mode = gr.Radio(
269
+ label="Source mode",
270
+ choices=[m[0] for m in SOURCE_MODES],
271
+ value=SOURCE_MODES[0][0],
272
+ )
273
+ search_workflow = gr.Radio(
274
+ label="Web search workflow",
275
+ choices=[m[0] for m in SEARCH_WORKFLOWS],
276
+ value=SEARCH_WORKFLOWS[0][0],
277
+ visible=False,
278
+ )
279
+ discover_btn = gr.Button("Discover sources", variant="secondary", visible=False)
280
+ with gr.Row():
281
+ session_dd = gr.Dropdown(
282
+ label="ResearchMind session",
283
+ choices=list_session_choices(),
284
+ value="",
285
+ visible=False,
286
+ )
287
+ refresh_sess_btn = gr.Button("↻", size="sm", visible=False, min_width=40)
288
+ url_choices = gr.CheckboxGroup(
289
+ label="Suggested URLs to use",
290
+ choices=[],
291
+ visible=False,
292
+ elem_classes=DOC_CHOICE_LIST_CLASSES,
293
+ )
294
+ urls_text = gr.Textbox(
295
+ label="URLs (one per line, optional)",
296
+ lines=3,
297
+ placeholder="https://en.wikipedia.org/wiki/...",
298
+ visible=False,
299
+ )
300
+ upload_files = gr.File(
301
+ label="Upload PDF or DOCX",
302
+ file_count="multiple",
303
+ file_types=[".pdf", ".docx"],
304
+ visible=False,
305
+ )
306
+ doc_dd = gr.CheckboxGroup(
307
+ label="Documents in session (RAG scope)",
308
+ choices=[],
309
+ value=[],
310
+ visible=False,
311
+ elem_classes=DOC_CHOICE_LIST_CLASSES,
312
+ )
313
+
314
+ with gr.Row(elem_classes=["lesson-generate-row"]):
315
+ generate_btn = gr.Button(
316
+ "Generate quiz",
317
+ variant="primary",
318
+ elem_classes=["primary-cta"],
319
+ scale=1,
320
+ )
321
+
322
+ source_status = gr.Markdown(value="_Ready to generate._", elem_classes=["lesson-status"])
323
+ processing_log = gr.HTML(
324
+ value=(
325
+ '<div class="slide-gen-log slide-gen-log-idle">'
326
+ "<p>Generation steps and timings appear here when you run.</p>"
327
+ "</div>"
328
+ ),
329
+ elem_classes=["lesson-processing-log"],
330
+ )
331
+
332
+ with gr.Tabs():
333
+ with gr.Tab("Worksheet preview"):
334
+ quiz_preview = gr.HTML(label="Quiz preview")
335
+ with gr.Tab("Outline"):
336
+ outline_preview = gr.Markdown(label="Outline (markdown)")
337
+
338
+ with gr.Row():
339
+ docx_file = gr.File(label="Download worksheet (.docx)", interactive=False)
340
+ html_file = gr.File(label="Download HTML preview", interactive=False)
341
+
342
+ with gr.Accordion("Agent trace", open=False):
343
+ trace_summary = gr.Markdown()
344
+ trace_json = gr.Code(language="json", label="Trace JSON")
345
+
346
+ advanced = build_advanced_panel()
347
+
348
+ source_controls = [
349
+ search_workflow,
350
+ discover_btn,
351
+ url_choices,
352
+ urls_text,
353
+ upload_files,
354
+ session_dd,
355
+ refresh_sess_btn,
356
+ doc_dd,
357
+ generate_btn,
358
+ ]
359
+
360
+ def _refresh_visibility(mode_label: str, workflow_label: str):
361
+ return update_source_visibility(mode_label, workflow_label)
362
+
363
+ source_mode.change(
364
+ fn=_refresh_visibility,
365
+ inputs=[source_mode, search_workflow],
366
+ outputs=source_controls,
367
+ )
368
+ search_workflow.change(
369
+ fn=_refresh_visibility,
370
+ inputs=[source_mode, search_workflow],
371
+ outputs=source_controls,
372
+ )
373
+
374
+ refresh_sess_btn.click(fn=refresh_sessions, inputs=[session_dd], outputs=[session_dd])
375
+ session_dd.change(
376
+ fn=refresh_doc_choices,
377
+ inputs=[session_dd, doc_dd],
378
+ outputs=[doc_dd],
379
+ )
380
+
381
+ discover_btn.click(
382
+ fn=discover_lesson_sources,
383
+ inputs=[topic, session_dd, workspace.topic, workspace.session_dd],
384
+ outputs=[source_status, url_choices, session_dd],
385
+ )
386
+
387
+ generate_btn.click(
388
+ fn=generate_quiz,
389
+ inputs=[
390
+ topic,
391
+ grade,
392
+ question_count,
393
+ source_mode,
394
+ search_workflow,
395
+ urls_text,
396
+ url_choices,
397
+ upload_files,
398
+ session_dd,
399
+ doc_dd,
400
+ workspace.topic,
401
+ workspace.session_dd,
402
+ workspace.doc_dd,
403
+ ],
404
+ outputs=[
405
+ outline_preview,
406
+ quiz_preview,
407
+ docx_file,
408
+ html_file,
409
+ processing_log,
410
+ trace_summary,
411
+ trace_json,
412
+ source_status,
413
+ ],
414
+ show_progress="hidden",
415
+ )
416
+
417
+ def _sync_session_from_workspace(ws_session: str, local_session: str):
418
+ if ws_session and ws_session != local_session:
419
+ return gr.update(value=ws_session)
420
+ return gr.update()
421
+
422
+ workspace.session_dd.change(
423
+ fn=_sync_session_from_workspace,
424
+ inputs=[workspace.session_dd, session_dd],
425
+ outputs=[session_dd],
426
+ ).then(
427
+ fn=refresh_doc_choices,
428
+ inputs=[session_dd, doc_dd],
429
+ outputs=[doc_dd],
430
+ )
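`quiz_maker.py` imports `escape` from `html` but escapes `_error_html` and `_running_preview_html` by hand with chained `replace` calls. A sketch of the same escaping through the standard library follows. Note also that `_build_quiz_api_response` in `studio.py` detects failures by looking for `form-error` in the preview HTML, while `_error_html` in this diff emits only inline styles. Carrying that marker as a class, as done below, is an assumption about the intended behavior and not something the commit confirms.

```python
from html import escape


def error_banner(message: str) -> str:
    """Render an escaped error banner (hypothetical variant of _error_html)."""
    # quote=False escapes &, < and >, matching the manual replace chain in the diff.
    safe = escape(message, quote=False)
    return f'<div class="form-error">{safe}</div>'
```

Usage: `error_banner("a < b & c")` escapes the message before it is placed into the preview, which also covers messages built from exception text.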
apps/gradio-space/static/studio/index.html CHANGED
@@ -36,6 +36,7 @@
36
  <nav class="sidebar-nav">
37
  <button type="button" class="nav-item" data-view="research"><span class="material-symbols-outlined">search</span>Research</button>
38
  <button type="button" class="nav-item active" data-view="slides"><span class="material-symbols-outlined">present_to_all</span>Slides</button>
 
39
  <button type="button" class="nav-item" data-view="language-lessons"><span class="material-symbols-outlined">translate</span>Language lessons</button>
40
  <button type="button" class="nav-item" data-view="debug"><span class="material-symbols-outlined">chat</span>Chat</button>
41
  <button type="button" id="btn-open-settings" class="nav-item"><span class="material-symbols-outlined">settings</span>Settings</button>
@@ -313,6 +314,10 @@
313
  <div id="slide-outline" class="slide-outline"></div>
314
  </details>
315
  <div id="downloads" class="downloads hidden"></div>
 
 
 
 
316
  <details class="slide-export-help">
317
  <summary>Export help — open in Google Docs</summary>
318
  <p class="status-text">Download the <strong>.docx</strong> file, upload it to <a href="https://drive.google.com" target="_blank" rel="noopener">Google Drive</a>, then choose <strong>Open with → Google Docs</strong>. You can also upload the <strong>.html</strong> file via Google Docs → File → Open → Upload.</p>
@@ -320,6 +325,119 @@
320
  </div>
321
  </section>
322
 
323
  <section class="col col-studio">
324
  <div class="lessons-layout view-lessons-only">
325
  <aside class="lessons-rail">
 
36
  <nav class="sidebar-nav">
37
  <button type="button" class="nav-item" data-view="research"><span class="material-symbols-outlined">search</span>Research</button>
38
  <button type="button" class="nav-item active" data-view="slides"><span class="material-symbols-outlined">present_to_all</span>Slides</button>
39
+ <button type="button" class="nav-item" data-view="quiz"><span class="material-symbols-outlined">quiz</span>Quiz</button>
40
  <button type="button" class="nav-item" data-view="language-lessons"><span class="material-symbols-outlined">translate</span>Language lessons</button>
41
  <button type="button" class="nav-item" data-view="debug"><span class="material-symbols-outlined">chat</span>Chat</button>
42
  <button type="button" id="btn-open-settings" class="nav-item"><span class="material-symbols-outlined">settings</span>Settings</button>
 
314
  <div id="slide-outline" class="slide-outline"></div>
315
  </details>
316
  <div id="downloads" class="downloads hidden"></div>
317
+ <button type="button" id="btn-slides-to-quiz" class="btn btn-ghost btn-block slides-to-quiz hidden">
318
+ <span class="material-symbols-outlined">quiz</span>
319
+ Create quiz on this topic
320
+ </button>
321
  <details class="slide-export-help">
322
  <summary>Export help — open in Google Docs</summary>
323
  <p class="status-text">Download the <strong>.docx</strong> file, upload it to <a href="https://drive.google.com" target="_blank" rel="noopener">Google Drive</a>, then choose <strong>Open with → Google Docs</strong>. You can also upload the <strong>.html</strong> file via Google Docs → File → Open → Upload.</p>
 
325
  </div>
326
  </section>
327
 
328
+ <section class="col col-quiz view-quiz-only">
329
+ <div class="card card-tall">
330
+ <div class="card-header">
331
+ <div class="step-badge">3</div>
332
+ <h2>Quiz maker</h2>
333
+ </div>
334
+ <div class="controls-panel">
335
+ <div class="controls-grid">
336
+ <label class="field">
337
+ <span>Topic override (optional)</span>
338
+ <input id="quiz-topic" type="text" class="input" placeholder="Uses workspace topic when empty" />
339
+ </label>
340
+ <label class="field">
341
+ <span>Grade</span>
342
+ <select id="quiz-grade" class="input">
343
+ <option value="K">K</option>
344
+ <option value="1">1</option>
345
+ <option value="2">2</option>
346
+ <option value="3">3</option>
347
+ <option value="4">4</option>
348
+ <option value="5">5</option>
349
+ <option value="6" selected>6</option>
350
+ <option value="7">7</option>
351
+ <option value="8">8</option>
352
+ <option value="9">9</option>
353
+ <option value="10">10</option>
354
+ <option value="11">11</option>
355
+ <option value="12">12</option>
356
+ <option value="Adult">Adult</option>
357
+ </select>
358
+ </label>
359
+ <label class="field field-wide">
360
+ <span>Questions: <strong id="quiz-count-val">5</strong></span>
361
+ <input id="quiz-count" type="range" min="5" max="10" value="5" />
362
+ </label>
363
+ </div>
364
+ <details class="slide-source-details" id="quiz-source-details">
365
+ <summary>Research sources (optional)</summary>
366
+ <label class="field">
367
+ <span>Source mode</span>
368
+ <select id="quiz-source-mode" class="input">
369
+ <option value="">Auto (RAG toggle)</option>
370
+ <option value="none">None (model only)</option>
371
+ <option value="web">Web search</option>
372
+ <option value="rag">RAG (indexed sources)</option>
373
+ </select>
374
+ </label>
375
+ <label class="field slide-web-workflow hidden" id="quiz-web-workflow-wrap">
376
+ <span>Web search workflow</span>
377
+ <select id="quiz-search-workflow" class="input">
378
+ <option value="two_step">Discover &amp; confirm</option>
379
+ <option value="auto">Auto search &amp; ingest</option>
380
+ </select>
381
+ </label>
382
+ <div class="slide-web-discover hidden" id="quiz-web-discover-wrap">
383
+ <button type="button" id="btn-quiz-discover" class="btn btn-secondary btn-block">Discover sources</button>
384
+ <div id="quiz-url-choices-panel" class="url-choices-panel hidden">
385
+ <div id="quiz-url-choices-list" class="url-choices-list"></div>
386
+ </div>
387
+ <label class="field">
388
+ <span>URLs (one per line)</span>
389
+ <textarea id="quiz-urls-text" class="input" rows="2" placeholder="https://…"></textarea>
390
+ </label>
391
+ </div>
392
+ <label class="upload-zone upload-zone-compact">
393
+ <input id="quiz-source-files" type="file" accept=".pdf,.docx" multiple hidden />
394
+ <span class="material-symbols-outlined">upload_file</span>
395
+ <span>Upload PDF or Doc for generation</span>
396
+ </label>
397
+ </details>
398
+ <div class="controls-actions">
399
+ <button type="button" id="btn-generate-quiz" class="btn btn-primary">
400
+ <span class="material-symbols-outlined">auto_awesome</span>
401
+ Generate quiz
402
+ </button>
403
+ </div>
404
+ <p id="quiz-generate-status" class="status-text">Ready to generate.</p>
405
+ <div id="quiz-progress-panel" class="progress-panel hidden">
406
+ <div class="progress-panel-head">
407
+ <span id="quiz-progress-elapsed" class="progress-elapsed">Elapsed: 0s</span>
408
+ <span id="quiz-progress-eta" class="progress-eta"></span>
409
+ </div>
410
+ <div class="progress-bar-track" aria-hidden="true">
411
+ <div id="quiz-progress-bar-fill" class="progress-bar-fill" style="width: 0%"></div>
412
+ </div>
413
+ <p id="quiz-progress-current" class="progress-current">Idle</p>
414
+ <ol id="quiz-progress-steps" class="progress-steps"></ol>
415
+ <div id="quiz-progress-log" class="progress-log hidden" aria-live="polite"></div>
416
+ <details class="studio-debug-trace" id="quiz-trace-details">
417
+ <summary>Agent trace</summary>
418
+ <div id="quiz-trace-panel"></div>
419
+ </details>
420
+ </div>
421
+ </div>
422
+ <div id="quiz-preview" class="slide-canvas">
423
+ <div id="quiz-preview-overlay" class="region-loading hidden" aria-live="polite">
424
+ <div class="region-loading-inner">
425
+ <span class="studio-spinner" aria-hidden="true"></span>
426
+ <p class="region-loading-text">Generating quiz…</p>
427
+ </div>
428
+ </div>
429
+ <div id="quiz-preview-content" class="slide-canvas-content">
430
+ <div class="studio-canvas-empty"><p>Generate a quiz to preview the worksheet here.</p></div>
431
+ </div>
432
+ </div>
433
+ <details class="slide-outline-details hidden" id="quiz-outline-details">
434
+ <summary>Outline (markdown)</summary>
435
+ <div id="quiz-outline" class="slide-outline"></div>
436
+ </details>
437
+ <div id="quiz-downloads" class="downloads hidden"></div>
438
+ </div>
439
+ </section>
440
+
441
  <section class="col col-studio">
442
  <div class="lessons-layout view-lessons-only">
443
  <aside class="lessons-rail">
apps/gradio-space/static/studio/studio.css CHANGED
@@ -879,7 +879,8 @@ body.sidebar-open {
879
  .studio-coach-hint { margin: 0; opacity: 0.8; line-height: 1.45; }
880
 
881
  .workspace[data-view="research"] .col-slides,
882
- .workspace[data-view="research"] .col-studio { display: none; }
 
883
  .workspace[data-view="research"] {
884
  grid-template-columns: 1fr;
885
  max-width: 1120px;
@@ -1184,7 +1185,45 @@ body.sidebar-open {
1184
  }
1185
 
1186
  .workspace[data-view="language-lessons"] .col-research,
1187
- .workspace[data-view="language-lessons"] .col-slides { display: none; }
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1188
 
1189
  .workspace[data-view="language-lessons"] .col-debug { display: none; }
1190
 
@@ -1388,6 +1427,7 @@ body.sidebar-open {
1388
  }
1389
 
1390
  .workspace[data-view="slides"] .col-studio,
 
1391
  .workspace[data-view="research"] .col-debug { display: none; }
1392
 
1393
  .coach-card-head {
@@ -1889,7 +1929,8 @@ body.sidebar-open {
1889
 
1890
  .workspace[data-view="debug"] .col-research,
1891
  .workspace[data-view="debug"] .col-slides,
1892
- .workspace[data-view="debug"] .col-studio { display: none; }
 
1893
 
1894
  .workspace[data-view="debug"] {
1895
  grid-template-columns: 1fr;
 
879
  .studio-coach-hint { margin: 0; opacity: 0.8; line-height: 1.45; }
880
 
881
  .workspace[data-view="research"] .col-slides,
882
+ .workspace[data-view="research"] .col-studio,
883
+ .workspace[data-view="research"] .col-quiz { display: none; }
884
  .workspace[data-view="research"] {
885
  grid-template-columns: 1fr;
886
  max-width: 1120px;
 
1185
  }
1186
 
1187
  .workspace[data-view="language-lessons"] .col-research,
1188
+ .workspace[data-view="language-lessons"] .col-slides,
1189
+ .workspace[data-view="language-lessons"] .col-quiz { display: none; }
1190
+
1191
+ .view-quiz-only { display: none; }
1192
+
1193
+ .workspace[data-view="quiz"] .col-research,
1194
+ .workspace[data-view="quiz"] .col-slides,
1195
+ .workspace[data-view="quiz"] .col-studio,
1196
+ .workspace[data-view="quiz"] .col-debug { display: none; }
1197
+
1198
+ .workspace[data-view="quiz"] {
1199
+ grid-template-columns: minmax(0, 1fr);
1200
+ max-width: 960px;
1201
+ gap: 1.25rem;
1202
+ }
1203
+
1204
+ .workspace[data-view="quiz"] .col-quiz {
1205
+ display: block;
1206
+ grid-column: 1 / -1;
1207
+ width: 100%;
1208
+ min-width: 0;
1209
+ }
1210
+
1211
+ .workspace[data-view="quiz"] .quiz-preview-inner {
1212
+ font-size: 0.92rem;
1213
+ }
1214
+
1215
+ .quiz-preview-frame {
1216
+ width: 100%;
1217
+ min-height: 520px;
1218
+ border: 1px solid var(--border-subtle, #ddd);
1219
+ border-radius: 8px;
1220
+ background: #fff;
1221
+ }
1222
+
1223
+ .slides-to-quiz {
1224
+ margin-top: 0.75rem;
1225
+ text-align: left;
1226
+ }
1227
 
1228
  .workspace[data-view="language-lessons"] .col-debug { display: none; }
1229
 
 
1427
  }
1428
 
1429
  .workspace[data-view="slides"] .col-studio,
1430
+ .workspace[data-view="slides"] .col-quiz,
1431
  .workspace[data-view="research"] .col-debug { display: none; }
1432
 
1433
  .coach-card-head {
 
1929
 
1930
  .workspace[data-view="debug"] .col-research,
1931
  .workspace[data-view="debug"] .col-slides,
1932
+ .workspace[data-view="debug"] .col-studio,
1933
+ .workspace[data-view="debug"] .col-quiz { display: none; }
1934
 
1935
  .workspace[data-view="debug"] {
1936
  grid-template-columns: 1fr;
apps/gradio-space/static/studio/studio.js CHANGED
@@ -50,6 +50,13 @@ const SLIDE_PIPELINE_STEPS = [
50
  "Build PPTX, DOCX, and HTML exports",
51
  ];
52
 
 
 
 
 
 
 
 
53
  const state = {
54
  workspaceTopic: "small model finetuning",
55
  workspaceSessionId: "",
@@ -58,6 +65,8 @@ const state = {
58
  selectedUrls: [],
59
  slideDiscoveredUrls: [],
60
  slideSelectedUrls: [],
 
 
61
  lessonsDiscoveredUrls: [],
62
  lessonsSelectedUrls: [],
63
  researchChatHistory: [],
@@ -65,6 +74,9 @@ const state = {
65
  lessonsMode: "lesson",
66
  history: [],
67
  downloads: null,
 
 
 
68
  client: null,
69
  progressTimer: null,
70
  progressStartedAt: null,
@@ -171,9 +183,25 @@ function syncSlideSourceUi() {
171
  }
172
  }
173
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
174
  function syncResearchLayout() {
175
  syncIngestWorkflowUi();
176
  syncSlideSourceUi();
 
177
  updateResearchDocCount(state.workspaceDocIds?.length || 0);
178
  }
179
 
@@ -376,6 +404,13 @@ function renderSlideGenerationResult(data, { scrollToCanvas = false, pulsePresen
376
 
377
  setTracePanel("#slides-trace-panel", data);
378
 
 
 
 
 
 
 
 
379
  if (scrollToCanvas) {
380
  $("#slide-canvas")?.scrollIntoView({ behavior: "smooth", block: "nearest" });
381
  }
@@ -732,6 +767,19 @@ function renderSlideUrlChoices(urls, selected) {
732
  syncSlideSourceUi();
733
  }
734
 
 
 
 
 
 
 
 
 
 
 
 
 
 
735
  function syncUrlSelectAll() {
736
  const boxes = [...document.querySelectorAll("#url-choices-list input[type=checkbox]")];
737
  const selectAll = $("#url-select-all");
@@ -785,6 +833,18 @@ async function discoverSlideSources() {
785
  });
786
  }
787
 
 
 
 
 
 
 
 
 
 
 
 
 
788
  async function autoSearchIngest() {
789
  const topic = effectiveTopic("");
790
  if (!topic) {
@@ -1558,6 +1618,238 @@ async function generateSlidesFromConversation(kind) {
1558
  );
1559
  }
1560
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1561
  function renderLessonsReply(data) {
1562
  state.history = data.history ?? state.history;
1563
  if (state.history.length) {
@@ -1776,6 +2068,9 @@ function bindUi() {
1776
  $("#slide-count").addEventListener("input", (e) => {
1777
  $("#slide-count-val").textContent = e.target.value;
1778
  });
 
 
 
1779
 
1780
  document.querySelectorAll(".nav-item[data-view]").forEach((btn) => {
1781
  btn.addEventListener("click", () => {
@@ -1835,6 +2130,10 @@ function bindUi() {
1835
  $("#slide-search-workflow")?.addEventListener("change", syncSlideSourceUi);
1836
  $("#btn-slide-discover")?.addEventListener("click", () => discoverSlideSources().catch(() => {}));
1837
 
 
 
 
 
1838
  $("#btn-research-ask").addEventListener("click", () => askResearchQuestion().catch(() => {}));
1839
  $("#research-question")?.addEventListener("keydown", (e) => {
1840
  if (e.key === "Enter" && !e.shiftKey) {
@@ -1844,6 +2143,8 @@ function bindUi() {
1844
  });
1845
 
1846
  $("#btn-generate").addEventListener("click", () => generateSlides().catch(() => {}));
 
 
1847
  $("#btn-present")?.addEventListener("click", () => openPresenter());
1848
  $("#btn-research-to-slides")?.addEventListener("click", () =>
1849
  generateSlidesFromConversation("research").catch(() => {})
 
50
  "Build PPTX, DOCX, and HTML exports",
51
  ];
52
 
53
+ const QUIZ_PIPELINE_STEPS = [
54
+ "Load language model",
55
+ "Gather lesson sources",
56
+ "Generate quiz outline",
57
+ "Build DOCX and HTML quiz exports",
58
+ ];
59
+
60
  const state = {
61
  workspaceTopic: "small model finetuning",
62
  workspaceSessionId: "",
 
65
  selectedUrls: [],
66
  slideDiscoveredUrls: [],
67
  slideSelectedUrls: [],
68
+ quizDiscoveredUrls: [],
69
+ quizSelectedUrls: [],
70
  lessonsDiscoveredUrls: [],
71
  lessonsSelectedUrls: [],
72
  researchChatHistory: [],
 
74
  lessonsMode: "lesson",
75
  history: [],
76
  downloads: null,
77
+ quizDownloads: null,
78
+ lastSlideTopic: "",
79
+ lastSlideGrade: "6",
80
  client: null,
81
  progressTimer: null,
82
  progressStartedAt: null,
 
183
  }
184
  }
185
 
186
+ function syncQuizSourceUi() {
187
+ const mode = $("#quiz-source-mode")?.value || "";
188
+ const isWeb = mode === "web";
189
+ $("#quiz-web-workflow-wrap")?.classList.toggle("hidden", !isWeb);
190
+ $("#quiz-web-discover-wrap")?.classList.toggle("hidden", !isWeb);
191
+ if (isWeb && $("#quiz-search-workflow")?.value === "two_step") {
192
+ $("#quiz-url-choices-panel")?.classList.toggle(
193
+ "hidden",
194
+ !state.quizDiscoveredUrls.length
195
+ );
196
+ } else {
197
+ $("#quiz-url-choices-panel")?.classList.add("hidden");
198
+ }
199
+ }
200
+
201
  function syncResearchLayout() {
202
  syncIngestWorkflowUi();
203
  syncSlideSourceUi();
204
+ syncQuizSourceUi();
205
  updateResearchDocCount(state.workspaceDocIds?.length || 0);
206
  }
207
 
 
404
 
405
  setTracePanel("#slides-trace-panel", data);
406
 
407
+ const cta = $("#btn-slides-to-quiz");
408
+ if (cta) {
409
+ state.lastSlideTopic = data.topic || effectiveTopic($("#lesson-topic")?.value);
410
+ state.lastSlideGrade = $("#lesson-grade")?.value || "6";
411
+ cta.classList.remove("hidden");
412
+ }
413
+
414
  if (scrollToCanvas) {
415
  $("#slide-canvas")?.scrollIntoView({ behavior: "smooth", block: "nearest" });
416
  }
 
767
  syncSlideSourceUi();
768
  }
769
 
770
+ function renderQuizUrlChoices(urls, selected) {
771
+ state.quizDiscoveredUrls = urls || [];
772
+ state.quizSelectedUrls = selected?.length ? selected : [...state.quizDiscoveredUrls];
773
+ renderUrlChoices(
774
+ urls,
775
+ selected,
776
+ "#quiz-url-choices-list",
777
+ "#quiz-url-choices-panel",
778
+ { discovered: state.quizDiscoveredUrls, selected: state.quizSelectedUrls }
779
+ );
780
+ syncQuizSourceUi();
781
+ }
782
+
783
  function syncUrlSelectAll() {
784
  const boxes = [...document.querySelectorAll("#url-choices-list input[type=checkbox]")];
785
  const selectAll = $("#url-select-all");
 
833
  });
834
  }
835
 
836
+ async function discoverQuizSources() {
837
+ const topic = effectiveTopic($("#quiz-topic")?.value);
838
+ if (!topic) {
839
+ showError("Set a topic before discovering sources.");
840
+ return;
841
+ }
842
+ await withRegionLoading($(".col-quiz .controls-panel"), "Discovering sources…", async () => {
843
+ const data = await callApi("discover_sources", [topic, state.workspaceSessionId]);
844
+ renderQuizUrlChoices(data.urls || [], data.selected_urls || data.urls || []);
845
+ });
846
+ }
847
+
848
  async function autoSearchIngest() {
849
  const topic = effectiveTopic("");
850
  if (!topic) {
 
1618
  );
1619
  }
1620
 
1621
+ async function collectQuizGenerationParams() {
1622
+ const topic = effectiveTopic($("#quiz-topic")?.value);
1623
+ const grade = $("#quiz-grade")?.value;
1624
+ const questionCount = Number($("#quiz-count")?.value || 5);
1625
+ const useRag = Boolean($("#lessons-use-rag")?.checked);
1626
+ const docIds = effectiveDocIds([]);
1627
+ const sourceMode = $("#quiz-source-mode")?.value || "";
1628
+ const searchWorkflow = $("#quiz-search-workflow")?.value || "two_step";
1629
+ const urlsText = $("#quiz-urls-text")?.value.trim() || "";
1630
+ const selectedUrls = getSelectedDiscoveredUrls("#quiz-url-choices-list");
1631
+ const filePaths = [];
1632
+ const quizFiles = $("#quiz-source-files")?.files;
1633
+ if (quizFiles?.length) {
1634
+ for (const file of quizFiles) {
1635
+ filePaths.push(await uploadFile(file));
1636
+ }
1637
+ }
1638
+ return {
1639
+ topic,
1640
+ grade,
1641
+ questionCount,
1642
+ sessionId: state.workspaceSessionId,
1643
+ useRag,
1644
+ docIds,
1645
+ sourceMode,
1646
+ searchWorkflow,
1647
+ urlsText,
1648
+ selectedUrls,
1649
+ filePaths,
1650
+ };
1651
+ }
1652
+
1653
+ function startQuizProgressPanel() {
1654
+ const panel = $("#quiz-progress-panel");
1655
+ const stepsEl = $("#quiz-progress-steps");
1656
+ panel?.classList.remove("hidden");
1657
+ state.progressStartedAt = Date.now();
1658
+ if (stepsEl) {
1659
+ stepsEl.innerHTML = QUIZ_PIPELINE_STEPS.map(
1660
+ (label, index) =>
1661
+ `<li data-step="${index}" class="progress-step pending">${label}</li>`
1662
+ ).join("");
1663
+ }
1664
+ $("#quiz-progress-log")?.classList.add("hidden");
1665
+ if ($("#quiz-progress-log")) $("#quiz-progress-log").textContent = "";
1666
+ if ($("#quiz-progress-eta")) $("#quiz-progress-eta").textContent = "Est. remaining: calculating…";
1667
+ updateQuizProgressElapsed();
1668
+ if (state.progressTimer) clearInterval(state.progressTimer);
1669
+ state.progressTimer = setInterval(updateQuizProgressElapsed, 500);
1670
+ }
1671
+
1672
+ function updateQuizProgressElapsed() {
1673
+ if (!state.progressStartedAt) return;
1674
+ const elapsed = (Date.now() - state.progressStartedAt) / 1000;
1675
+ if ($("#quiz-progress-elapsed")) {
1676
+ $("#quiz-progress-elapsed").textContent = `Elapsed: ${elapsed.toFixed(1)}s`;
1677
+ }
1678
+ const eta = estimateQuizRemaining(elapsed);
1679
+ if ($("#quiz-progress-eta")) {
1680
+ $("#quiz-progress-eta").textContent =
1681
+ eta !== null ? `Est. remaining: ~${Math.max(0, Math.round(eta))}s` : "";
1682
+ }
1683
+ }
1684
+
1685
+ function estimateQuizRemaining(elapsed) {
1686
+ if (elapsed < 3) return null;
1687
+ const stepNodes = [...document.querySelectorAll("#quiz-progress-steps .progress-step")];
1688
+ const activeIndex = stepNodes.findIndex((node) => node.classList.contains("active"));
1689
+ const doneCount = stepNodes.filter((node) => node.classList.contains("done")).length;
1690
+ const progress = Math.max((doneCount + (activeIndex >= 0 ? 0.35 : 0)) / stepNodes.length, 0.15);
1691
+ return elapsed / progress - elapsed;
1692
+ }
1693
+
1694
+ function advanceQuizProgressWhileWaiting() {
1695
+ let current = 0;
1696
+ const mark = (index, status) => {
1697
+ const node = document.querySelector(`#quiz-progress-steps [data-step="${index}"]`);
1698
+ if (!node) return;
1699
+ node.classList.remove("pending", "active", "done");
1700
+ node.classList.add(status);
1701
+ };
1702
+ mark(current, "active");
1703
+ const timer = setInterval(() => {
1704
+ if (!$("#quiz-progress-panel") || $("#quiz-progress-panel").classList.contains("hidden")) {
1705
+ clearInterval(timer);
1706
+ return;
1707
+ }
1708
+ if (current < QUIZ_PIPELINE_STEPS.length - 1) {
1709
+ mark(current, "done");
1710
+ current += 1;
1711
+ mark(current, "active");
1712
+ }
1713
+ }, 9000);
1714
+ return timer;
1715
+ }
1716
+
1717
+ function finishQuizProgressPanel(data) {
1718
+ if (state.progressTimer) {
1719
+ clearInterval(state.progressTimer);
1720
+ state.progressTimer = null;
1721
+ }
1722
+ const stepsEl = $("#quiz-progress-steps");
1723
+ const traceSteps = data?.progress?.steps || [];
1724
+ if (stepsEl) {
1725
+ if (traceSteps.length) {
1726
+ stepsEl.innerHTML = traceSteps
1727
+ .map((step) => {
1728
+ const duration = step.duration_s != null ? ` (${step.duration_s}s)` : "";
1729
+ const detail = step.detail ? ` — ${step.detail}` : "";
1730
+ return `<li class="progress-step done">${step.label}${duration}${detail}</li>`;
1731
+ })
1732
+ .join("");
1733
+ } else {
1734
+ document.querySelectorAll("#quiz-progress-steps .progress-step").forEach((node) => {
1735
+ node.classList.remove("pending", "active");
1736
+ node.classList.add("done");
1737
+ });
1738
+ }
1739
+ }
1740
+ if (data?.progress_log) {
1741
+ const logEl = $("#quiz-progress-log");
1742
+ const log = data.progress_log;
1743
+ if (logEl) {
1744
+ if (/<[a-z][\s\S]*>/i.test(log)) logEl.innerHTML = log;
1745
+ else logEl.textContent = stripMd(log);
1746
+ logEl.classList.remove("hidden");
1747
+ }
1748
+ }
1749
+ if (data?.elapsed_seconds != null && $("#quiz-progress-elapsed")) {
1750
+ $("#quiz-progress-elapsed").textContent = `Elapsed: ${Number(data.elapsed_seconds).toFixed(1)}s`;
1751
+ }
1752
+ if ($("#quiz-progress-eta")) $("#quiz-progress-eta").textContent = "Complete";
1753
+ setTracePanel("#quiz-trace-panel", data);
1754
+ }
1755
+
1756
+ async function runQuizGenerationApi(apiArgs) {
1757
+ startQuizProgressPanel();
1758
+ const waitTimer = advanceQuizProgressWhileWaiting();
1759
+ try {
1760
+ return await callApi("generate_quiz", apiArgs);
1761
+ } finally {
1762
+ clearInterval(waitTimer);
1763
+ if (state.progressTimer) {
1764
+ clearInterval(state.progressTimer);
1765
+ state.progressTimer = null;
1766
+ }
1767
+ }
1768
+ }
1769
+
1770
+ function renderQuizGenerationResult(data, { scrollToPreview = false } = {}) {
1771
+ finishQuizProgressPanel(data);
1772
+ $("#quiz-generate-status").textContent = stripMd(data.status || "Quiz generated.");
1773
+ const contentEl = $("#quiz-preview-content");
1774
+ if (data.preview_html && contentEl) {
1775
+ const blob = new Blob([data.preview_html], { type: "text/html;charset=utf-8" });
1776
+ const url = URL.createObjectURL(blob);
1777
+ contentEl.innerHTML = `<iframe class="quiz-preview-frame" src="${url}" title="Quiz preview"></iframe>`;
1778
+ } else if (contentEl) {
1779
+ contentEl.innerHTML = '<div class="studio-canvas-empty"><p>Preview unavailable.</p></div>';
1780
+ }
1781
+
1782
+ state.quizDownloads = data.downloads;
1783
+ const dl = $("#quiz-downloads");
1784
+ if (data.downloads?.docx) {
1785
+ dl.classList.remove("hidden");
1786
+ dl.innerHTML = `
1787
+ <a href="${fileUrl(data.downloads.docx)}" download>DOCX worksheet</a>
1788
+ <a href="${fileUrl(data.downloads.html)}" download>HTML preview</a>`;
1789
+ } else {
1790
+ dl.classList.add("hidden");
1791
+ dl.innerHTML = "";
1792
+ }
1793
+
1794
+ const outlineDetails = $("#quiz-outline-details");
1795
+ const outlineEl = $("#quiz-outline");
1796
+ if (data.outline_md) {
1797
+ outlineEl.innerHTML = renderMarkdownLite(data.outline_md);
1798
+ outlineDetails?.classList.remove("hidden");
1799
+ } else {
1800
+ outlineEl.innerHTML = "";
1801
+ outlineDetails?.classList.add("hidden");
1802
+ }
1803
+
1804
+ setTracePanel("#quiz-trace-panel", data);
1805
+
1806
+ if (scrollToPreview) {
1807
+ $("#quiz-preview")?.scrollIntoView({ behavior: "smooth", block: "nearest" });
1808
+ }
1809
+ }
1810
+
1811
+ async function generateQuiz() {
1812
+ const params = await collectQuizGenerationParams();
1813
+
1814
+ await withRegionLoading(
1815
+ $("#quiz-preview"),
1816
+ "Generating quiz…",
1817
+ async () => {
1818
+ let data;
1819
+ try {
1820
+ data = await runQuizGenerationApi([
1821
+ params.topic,
1822
+ params.grade,
1823
+ params.questionCount,
1824
+ params.sessionId,
1825
+ params.useRag,
1826
+ params.docIds,
1827
+ params.sourceMode,
1828
+ params.searchWorkflow,
1829
+ params.urlsText,
1830
+ params.selectedUrls,
1831
+ params.filePaths,
1832
+ ]);
1833
+ } catch (_err) {
1834
+ if ($("#quiz-progress-eta")) $("#quiz-progress-eta").textContent = "Failed";
1835
+ throw _err;
1836
+ }
1837
+
1838
+ renderQuizGenerationResult(data, { scrollToPreview: true });
1839
+ },
1840
+ { overlayEl: $("#quiz-preview-overlay") }
1841
+ );
1842
+ }
1843
+
1844
+ function openQuizFromSlides() {
1845
+ const topic = state.lastSlideTopic || effectiveTopic($("#lesson-topic")?.value);
1846
+ const grade = state.lastSlideGrade || $("#lesson-grade")?.value || "6";
1847
+ if ($("#quiz-topic")) $("#quiz-topic").value = topic;
1848
+ if ($("#quiz-grade")) $("#quiz-grade").value = grade;
1849
+ setWorkspaceView("quiz");
1850
+ window.setTimeout(() => $("#quiz-topic")?.focus(), 80);
1851
+ }
1852
+
1853
  function renderLessonsReply(data) {
1854
  state.history = data.history ?? state.history;
1855
  if (state.history.length) {
 
2068
  $("#slide-count").addEventListener("input", (e) => {
2069
  $("#slide-count-val").textContent = e.target.value;
2070
  });
2071
+ $("#quiz-count")?.addEventListener("input", (e) => {
2072
+ $("#quiz-count-val").textContent = e.target.value;
2073
+ });
2074
 
2075
  document.querySelectorAll(".nav-item[data-view]").forEach((btn) => {
2076
  btn.addEventListener("click", () => {
 
2130
  $("#slide-search-workflow")?.addEventListener("change", syncSlideSourceUi);
2131
  $("#btn-slide-discover")?.addEventListener("click", () => discoverSlideSources().catch(() => {}));
2132
 
2133
+ $("#quiz-source-mode")?.addEventListener("change", syncQuizSourceUi);
2134
+ $("#quiz-search-workflow")?.addEventListener("change", syncQuizSourceUi);
2135
+ $("#btn-quiz-discover")?.addEventListener("click", () => discoverQuizSources().catch(() => {}));
2136
+
2137
  $("#btn-research-ask").addEventListener("click", () => askResearchQuestion().catch(() => {}));
2138
  $("#research-question")?.addEventListener("keydown", (e) => {
2139
  if (e.key === "Enter" && !e.shiftKey) {
 
2143
  });
2144
 
2145
  $("#btn-generate").addEventListener("click", () => generateSlides().catch(() => {}));
2146
+ $("#btn-generate-quiz")?.addEventListener("click", () => generateQuiz().catch(() => {}));
2147
+ $("#btn-slides-to-quiz")?.addEventListener("click", () => openQuizFromSlides());
2148
  $("#btn-present")?.addEventListener("click", () => openPresenter());
2149
  $("#btn-research-to-slides")?.addEventListener("click", () =>
2150
  generateSlidesFromConversation("research").catch(() => {})
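
The remaining-time estimate in `estimateQuizRemaining` is a simple linear extrapolation, and the arithmetic is easier to inspect outside the DOM. The following is a minimal Python port, not part of the commit: the function name and parameters are hypothetical, and it assumes `total_steps` is greater than zero.

```python
def estimate_remaining(elapsed_s, done_count, has_active, total_steps):
    """Port of the JS heuristic: returns None for the first 3 seconds."""
    if elapsed_s < 3:
        return None
    # An active step is counted as 35% complete; overall progress is floored at 15%.
    partial = 0.35 if has_active else 0.0
    progress = max((done_count + partial) / total_steps, 0.15)
    # Projected total time (elapsed / progress) minus the time already spent.
    return elapsed_s / progress - elapsed_s


if __name__ == "__main__":
    # First of four steps active after 10 s: the 15% floor applies.
    print(round(estimate_remaining(10, 0, True, 4), 1))
    # Two steps done and a third active after 20 s.
    print(round(estimate_remaining(20, 2, True, 4), 1))
```

Because the step markers advance on a fixed 9-second timer while the request is in flight, this figure appears to track the timer rather than real backend progress until the response arrives.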
libs/agent/src/agent/models.py CHANGED
@@ -17,6 +17,32 @@ class SlideOutline(BaseModel):
17
  slides: list[SlideSpec] = Field(min_length=1)
18
 
19
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
20
  class EducationPptxInput(BaseModel):
21
  topic: str
22
  grade: str
 
17
  slides: list[SlideSpec] = Field(min_length=1)
18
 
19
 
20
+ class QuizQuestion(BaseModel):
21
+ prompt: str
22
+ choices: list[str] = Field(min_length=4, max_length=4)
23
+ correct_index: int = Field(ge=0, le=3)
24
+ explanation: str = ""
25
+
26
+
27
+ class QuizOutline(BaseModel):
28
+ title: str
29
+ instructions: str = ""
30
+ questions: list[QuizQuestion] = Field(min_length=3, max_length=12)
31
+
32
+
33
+ class QuizMakerInput(BaseModel):
34
+ topic: str
35
+ grade: str
36
+ question_count: int = Field(ge=5, le=10, default=5)
37
+ source_mode: Literal["none", "web", "rag"] = "none"
38
+ search_workflow: Literal["two_step", "auto"] = "two_step"
39
+ urls: list[str] = Field(default_factory=list)
40
+ files: list[Path] = Field(default_factory=list)
41
+ session_id: str | None = None
42
+ doc_ids: list[str] = Field(default_factory=list)
43
+ conversation_context: str = ""
44
+
45
+
46
  class EducationPptxInput(BaseModel):
47
  topic: str
48
  grade: str
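
The new quiz models encode three constraints: a `QuizOutline` holds 3 to 12 questions, each `QuizQuestion` has exactly 4 choices, and `correct_index` lies in 0 to 3. As a rough illustration of what those `Field` bounds reject, here is a standard-library sketch that checks a raw JSON payload against the same limits. The helper `check_quiz_payload` is hypothetical and does not appear in the commit; the real validation is done by the Pydantic models.

```python
import json


def check_quiz_payload(raw: str) -> list[str]:
    """Return constraint violations; an empty list means the payload fits."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]

    problems = []
    if not isinstance(data.get("title"), str):
        problems.append("title must be a string")

    questions = data.get("questions", [])
    if not isinstance(questions, list) or not 3 <= len(questions) <= 12:
        return problems + ["questions must be a list of 3 to 12 items"]

    for i, q in enumerate(questions, start=1):
        if not isinstance(q, dict):
            problems.append(f"question {i}: must be an object")
            continue
        choices = q.get("choices", [])
        if not isinstance(choices, list) or len(choices) != 4:
            problems.append(f"question {i}: needs exactly 4 choices")
        idx = q.get("correct_index")
        if not isinstance(idx, int) or not 0 <= idx <= 3:
            problems.append(f"question {i}: correct_index must be 0-3")
    return problems
```

Note that `QuizMakerInput.question_count` is bounded to 5 through 10, which is narrower than the 3 through 12 that `QuizOutline` accepts, so a model response that returns fewer questions than requested can still validate.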
libs/agent/src/agent/progress.py CHANGED
@@ -204,3 +204,62 @@ class SlideGenerationProgress:
204
  fraction = min(self._completed_weight / total_weight, 0.98)
205
  desc = label if not detail else f"{label} — {detail}"
206
  self.on_update(fraction, desc)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
204
  fraction = min(self._completed_weight / total_weight, 0.98)
205
  desc = label if not detail else f"{label} — {detail}"
206
  self.on_update(fraction, desc)
207
+
208
+
209
+ @dataclass
210
+ class QuizGenerationProgress(SlideGenerationProgress):
211
+ """Quiz generation progress tracker (same steps, quiz-specific banner text)."""
212
+
213
+ def format_log_html(
214
+ self,
215
+ *,
216
+ running: bool = False,
217
+ footer_html: str = "",
218
+ ) -> str:
219
+ elapsed = self.elapsed_s()
220
+ eta = self.estimate_remaining_s() if running else None
221
+ banner = (
222
+ '<div class="slide-gen-log-banner running">Generating quiz…</div>'
223
+ if running
224
+ else '<div class="slide-gen-log-banner done">Quiz generation complete</div>'
225
+ )
226
+ eta_html = (
227
+ f'<div class="slide-gen-log-meta">Est. remaining: ~{int(eta)}s</div>'
228
+ if eta is not None and running
229
+ else ""
230
+ )
231
+ steps_html: list[str] = []
232
+ for step in self.steps:
233
+ done = step.ended_at is not None
234
+ status = "done" if done else "active"
235
+ icon = "✓" if done else "●"
236
+ duration = (
237
+ f' <span class="slide-gen-log-dur">({step.duration_s:.1f}s)</span>'
238
+ if step.duration_s is not None
239
+ else ""
240
+ )
241
+ detail = (
242
+ f' <span class="slide-gen-log-detail">— {escape(step.detail)}</span>'
243
+ if step.detail
244
+ else ""
245
+ )
246
+ steps_html.append(
247
+ f'<li class="slide-gen-log-step {status}">'
248
+ f'<span class="slide-gen-log-icon">{icon}</span>'
249
+ f'<span class="slide-gen-log-label">{escape(step.label)}</span>'
250
+ f"{duration}{detail}</li>"
251
+ )
252
+ steps_block = (
253
+ f'<ol class="slide-gen-log-steps">{"".join(steps_html)}</ol>'
254
+ if steps_html
255
+ else '<p class="slide-gen-log-empty">Waiting for first step…</p>'
256
+ )
257
+ return (
258
+ f'<div class="slide-gen-log">'
259
+ f"{banner}"
260
+ f'<div class="slide-gen-log-meta">Elapsed: {elapsed:.1f}s</div>'
261
+ f"{eta_html}"
262
+ f"{steps_block}"
263
+ f"{footer_html}"
264
+ f"</div>"
265
+ )
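
`format_log_html` passes step labels and details through `escape` before interpolating them into markup. The sketch below shows that pattern in isolation, assuming `escape` is the standard library's `html.escape` (the import is outside this hunk, so that is a guess). The helper `step_item` is hypothetical and produces a simplified `<li>`, not the exact markup above.

```python
from html import escape


def step_item(label, detail="", duration_s=None, done=True):
    """Build one log entry, escaping any text that may contain markup."""
    status = "done" if done else "active"
    duration = f" ({duration_s:.1f}s)" if duration_s is not None else ""
    extra = f" - {escape(detail)}" if detail else ""
    return (
        f'<li class="slide-gen-log-step {status}">'
        f"{escape(label)}{duration}{extra}</li>"
    )


if __name__ == "__main__":
    print(step_item("Load <model>", "a & b", 1.5))
```

The Studio client, by contrast, interpolates `step.label` and `step.detail` into `innerHTML` without escaping in `finishQuizProgressPanel`, so the two renderers do not treat untrusted text the same way.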
libs/agent/src/agent/prompts.py CHANGED
@@ -2,7 +2,7 @@ from __future__ import annotations
2
 
3
  import json
4
 
5
- from agent.models import EducationPptxInput, SlideOutline, SlideSpec
6
 
7
 
8
  def education_outline_system(skill_body: str) -> str:
@@ -182,3 +182,151 @@ def outline_json_example(slide_count: int) -> str:
182
  ],
183
  }
184
  return json.dumps(example, indent=2)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2
 
3
  import json
4
 
5
+ from agent.models import EducationPptxInput, QuizMakerInput, QuizOutline, QuizQuestion, SlideOutline, SlideSpec
6
 
7
 
8
  def education_outline_system(skill_body: str) -> str:
 
182
  ],
183
  }
184
  return json.dumps(example, indent=2)
185
+
186
+
187
+ def quiz_max_tokens(question_count: int) -> int:
188
+ count = max(3, min(int(question_count), 12))
189
+ return min(1536, 120 + count * 180)
190
+
191
+
192
+ def quiz_outline_system(skill_body: str) -> str:
193
+ return f"""You are an expert teacher writing multiple-choice quizzes.
194
+ Follow the skill workflow below and output ONLY valid JSON (no markdown fences).
195
+
196
+ Skill workflow:
197
+ {skill_body}
198
+
199
+ Required JSON shape:
200
+ {{
201
+ "title": "Photosynthesis Quiz — Grade 6",
202
+ "instructions": "Read each question. Circle the best answer.",
203
+ "questions": [
204
+ {{
205
+ "prompt": "What do plants use to make food?",
206
+ "choices": ["Sunlight", "Rocks", "Plastic", "Metal"],
207
+ "correct_index": 0,
208
+ "explanation": "Plants use sunlight in photosynthesis."
209
+ }}
210
+ ]
211
+ }}
212
+
213
+ Rules:
214
+ - Each question has exactly 4 choices; correct_index is 0-3.
215
+ - Grade-appropriate vocabulary and plausible distractors.
216
+ - Output compact JSON only — no preamble, no markdown fences.
217
+ - When source excerpts are provided, ground questions in those sources.
218
+ """
219
+
220
+
221
+ def quiz_outline_user(req: QuizMakerInput, *, source_context: str = "") -> str:
222
+ base = (
223
+ f"Topic: {req.topic}\n"
224
+ f"Grade level: {req.grade}\n"
225
+ f"Number of questions: {req.question_count}\n"
226
+ )
227
+ if source_context.strip():
228
+ base += (
229
+ "\nUse the following retrieved source excerpts as factual grounding. "
230
+ "Prefer these over general knowledge when they apply.\n\n"
231
+ f"{source_context}\n"
232
+ )
233
+ if req.conversation_context.strip():
234
+ base += (
235
+ "\nBase the quiz on this conversation transcript when relevant.\n\n"
236
+ f"{req.conversation_context.strip()}\n"
237
+ )
238
+ return base + "\nReturn JSON only."
239
+
240
+
241
+ def quiz_outline_repair(
242
+ invalid_output: str,
243
+ error: str,
244
+ *,
245
+ expected_questions: int | None = None,
246
+ ) -> str:
247
+ count_line = ""
248
+ if expected_questions is not None:
249
+ count_line = f"\nYou must include exactly {expected_questions} items in the questions array.\n"
250
+ return (
251
+ "The previous response was invalid JSON or did not match the QuizOutline schema.\n"
252
+ f"Validation error: {error}\n"
253
+ f"{count_line}"
254
+ f"Previous output:\n{invalid_output}\n\n"
255
+ "Return corrected JSON only, no explanation."
256
+ )
257
+
258
+
259
+ def quiz_outline_retry_user(req: QuizMakerInput, *, example_json: str) -> str:
260
+ return (
261
+ f"Topic: {req.topic}\n"
262
+ f"Grade level: {req.grade}\n"
263
+ f"Number of questions: {req.question_count}\n\n"
264
+ "Your previous response was empty or invalid. "
265
+ "Write real quiz content for the topic. "
266
+ "Return ONLY valid JSON matching this structure:\n"
267
+ f"{example_json}"
268
+ )
269
+
270
+
271
+ def quiz_json_example(question_count: int) -> str:
272
+ example = {
273
+ "title": "Example Quiz",
274
+ "instructions": "Circle the best answer for each question.",
275
+ "questions": [
276
+ {
277
+ "prompt": f"Question {i}?",
278
+ "choices": ["Correct answer", "Distractor A", "Distractor B", "Distractor C"],
279
+ "correct_index": 0,
280
+ "explanation": "Brief teacher note.",
281
+ }
282
+ for i in range(1, question_count + 1)
283
+ ],
284
+ }
285
+ return json.dumps(example, indent=2)
286
+
287
+
288
+ def fallback_quiz(req: QuizMakerInput) -> QuizOutline:
289
+ """Deterministic quiz when the model returns empty or unparseable JSON."""
290
+ topic = req.topic.strip() or "Lesson"
291
+ grade = req.grade
292
+ n = req.question_count
293
+ questions: list[QuizQuestion] = []
294
+ for i in range(1, n + 1):
295
+ questions.append(
296
+ QuizQuestion(
297
+ prompt=f"What is an important idea about {topic} (question {i})?",
298
+ choices=[
299
+ f"A key fact about {topic}",
300
+ "An unrelated detail",
301
+ "A common misconception",
302
+ "None of these",
303
+ ],
304
+ correct_index=0,
305
+ explanation="Template question — edit using your lesson sources.",
306
+ )
307
+ )
308
+ return QuizOutline(
309
+ title=f"{topic[:1].upper() + topic[1:]} Quiz — Grade {grade}",
310
+ instructions="Read each question carefully. Circle the best answer.",
311
+ questions=questions,
312
+ )
313
+
314
+
315
+ def quiz_to_markdown(outline: QuizOutline) -> str:
316
+ lines = [f"# {outline.title}", ""]
317
+ if outline.instructions.strip():
318
+ lines.extend([outline.instructions.strip(), ""])
319
+ for i, q in enumerate(outline.questions, start=1):
320
+ lines.append(f"## Question {i}")
321
+ lines.append("")
322
+ lines.append(q.prompt)
323
+ lines.append("")
324
+ for label, choice in zip("ABCD", q.choices, strict=True):
325
+ lines.append(f"- **{label}.** {choice}")
326
+ correct = "ABCD"[q.correct_index]
327
+ lines.append("")
328
+ lines.append(f"**Answer:** {correct}")
329
+ if q.explanation.strip():
330
+ lines.append(f"*{q.explanation.strip()}*")
331
+ lines.append("")
332
+ return "\n".join(lines).strip() + "\n"
libs/agent/src/agent/runner.py CHANGED
@@ -17,6 +17,9 @@ from researchmind.retrieve import retrieve
17
  from agent.models import (
18
  Citation,
19
  EducationPptxInput,
 
 
 
20
  ResearchChatInput,
21
  ResearchChatResult,
22
  ResearchDiscoverResult,
@@ -25,17 +28,25 @@ from agent.models import (
25
  SlideSpec,
26
  )
27
  from agent.preview import outline_to_html, render_slide_images
28
- from agent.progress import SlideGenerationProgress
29
  from agent.prompts import (
30
  education_outline_repair,
31
  education_outline_retry_user,
32
  education_outline_system,
33
  education_outline_user,
34
  fallback_outline,
 
35
  outline_json_example,
36
  outline_looks_like_schema_echo,
37
  outline_max_tokens,
38
  outline_to_markdown,
 
 
 
 
 
 
 
39
  )
40
  from agent.skills import SkillRegistry
41
  from agent.tools.docx import create_docx, create_html_export
@@ -43,8 +54,11 @@ from agent.tools_registry import ToolRegistry
43
  from agent.trace import TraceRecorder
44
 
45
  EDUCATION_PPTX_SKILL = "education-pptx"
 
46
  RESEARCH_MIND_SKILL = "research-mind"
47
 
 
 
48
 
49
  @dataclass
50
  class AgentResult:
@@ -60,6 +74,18 @@ class AgentResult:
60
  source_summary: str = ""
61
 
62
 
 
 
 
 
 
 
 
 
 
 
 
 
63
  class AgentRunner:
64
  def __init__(
65
  self,
@@ -322,9 +348,217 @@ class AgentRunner:
322
  source_summary=source_summary,
323
  )
324
 
325
  def _gather_lesson_source_context(
326
  self,
327
- req: EducationPptxInput,
328
  backend: InferenceBackend,
329
  model_key: str,
330
  trace: TraceRecorder,
@@ -519,6 +753,194 @@ class AgentRunner:
519
  )
520
  return fallback_outline(req)
521
 
522
  def _parse_outline_or_error(
523
  self,
524
  raw: str,
@@ -675,7 +1097,7 @@ class AgentRunner:
675
  def _lesson_doc_ids(
676
  store: Any,
677
  session_id: str | None,
678
- req: EducationPptxInput,
679
  ingest: ResearchIngestResult | None,
680
  ) -> list[str]:
681
  if req.doc_ids:
@@ -711,7 +1133,7 @@ class AgentRunner:
711
  def _lesson_retrieve_scope(
712
  store: Any,
713
  session_id: str | None,
714
- req: EducationPptxInput,
715
  ingest: ResearchIngestResult | None,
716
  ) -> tuple[str | None, list[str] | None]:
717
  from researchmind.scope import resolve_retrieve_scope
 
17
  from agent.models import (
18
  Citation,
19
  EducationPptxInput,
20
+ QuizMakerInput,
21
+ QuizOutline,
22
+ QuizQuestion,
23
  ResearchChatInput,
24
  ResearchChatResult,
25
  ResearchDiscoverResult,
 
28
  SlideSpec,
29
  )
30
  from agent.preview import outline_to_html, render_slide_images
31
+ from agent.progress import QuizGenerationProgress, SlideGenerationProgress
32
  from agent.prompts import (
33
  education_outline_repair,
34
  education_outline_retry_user,
35
  education_outline_system,
36
  education_outline_user,
37
  fallback_outline,
38
+ fallback_quiz,
39
  outline_json_example,
40
  outline_looks_like_schema_echo,
41
  outline_max_tokens,
42
  outline_to_markdown,
43
+ quiz_json_example,
44
+ quiz_max_tokens,
45
+ quiz_outline_repair,
46
+ quiz_outline_retry_user,
47
+ quiz_outline_system,
48
+ quiz_outline_user,
49
+ quiz_to_markdown,
50
  )
51
  from agent.skills import SkillRegistry
52
  from agent.tools.docx import create_docx, create_html_export
 
54
  from agent.trace import TraceRecorder
55
 
56
  EDUCATION_PPTX_SKILL = "education-pptx"
57
+ QUIZ_MAKER_SKILL = "quiz-maker"
58
  RESEARCH_MIND_SKILL = "research-mind"
59
 
60
+ LessonSourceInput = EducationPptxInput | QuizMakerInput
61
+
62
 
63
  @dataclass
64
  class AgentResult:
 
74
  source_summary: str = ""
75
 
76
 
77
+ @dataclass
78
+ class QuizAgentResult:
79
+ markdown_preview: str
80
+ html_preview: str
81
+ docx_path: str
82
+ html_export_path: str
83
+ trace: TraceRecorder
84
+ trace_path: str
85
+ outline: QuizOutline
86
+ source_summary: str = ""
87
+
88
+
89
  class AgentRunner:
90
  def __init__(
91
  self,
 
348
  source_summary=source_summary,
349
  )
350
 
351
+ def run_quiz_maker(
352
+ self,
353
+ *,
354
+ topic: str,
355
+ grade: str,
356
+ question_count: int = 5,
357
+ model_key: str,
358
+ backend: InferenceBackend,
359
+ source_mode: Literal["none", "web", "rag"] = "none",
360
+ search_workflow: Literal["two_step", "auto"] = "two_step",
361
+ urls: list[str] | None = None,
362
+ files: list[Path] | None = None,
363
+ session_id: str | None = None,
364
+ doc_ids: list[str] | None = None,
365
+ conversation_context: str = "",
366
+ progress: QuizGenerationProgress | None = None,
367
+ ) -> QuizAgentResult:
368
+ result: QuizAgentResult | None = None
369
+ for item in self.iter_quiz_maker(
370
+ topic=topic,
371
+ grade=grade,
372
+ question_count=question_count,
373
+ model_key=model_key,
374
+ backend=backend,
375
+ source_mode=source_mode,
376
+ search_workflow=search_workflow,
377
+ urls=urls,
378
+ files=files,
379
+ session_id=session_id,
380
+ doc_ids=doc_ids,
381
+ conversation_context=conversation_context,
382
+ progress=progress,
383
+ ):
384
+ if isinstance(item, QuizAgentResult):
385
+ result = item
386
+ if result is None:
387
+ raise RuntimeError("Quiz generation did not return a result")
388
+ return result
389
+
390
+ def iter_quiz_maker(
391
+ self,
392
+ *,
393
+ topic: str,
394
+ grade: str,
395
+ question_count: int = 5,
396
+ model_key: str,
397
+ backend: InferenceBackend,
398
+ source_mode: Literal["none", "web", "rag"] = "none",
399
+ search_workflow: Literal["two_step", "auto"] = "two_step",
400
+ urls: list[str] | None = None,
401
+ files: list[Path] | None = None,
402
+ session_id: str | None = None,
403
+ doc_ids: list[str] | None = None,
404
+ conversation_context: str = "",
405
+ progress: QuizGenerationProgress | None = None,
406
+ ) -> Iterator[QuizGenerationProgress | QuizAgentResult]:
407
+ skill = self._skills.get(QUIZ_MAKER_SKILL)
408
+ req = QuizMakerInput(
409
+ topic=topic.strip(),
410
+ grade=grade,
411
+ question_count=question_count,
412
+ source_mode=source_mode,
413
+ search_workflow=search_workflow,
414
+ urls=urls or [],
415
+ files=files or [],
416
+ session_id=session_id,
417
+ doc_ids=doc_ids or [],
418
+ conversation_context=(conversation_context or "").strip(),
419
+ )
420
+
421
+ trace = TraceRecorder(
422
+ skill=skill.name,
423
+ model=model_key,
424
+ user_input=req.model_dump(mode="json"),
425
+ )
426
+
427
+ try:
428
+ yield from self._iter_quiz_maker_steps(
429
+ req=req,
430
+ skill=skill,
431
+ model_key=model_key,
432
+ backend=backend,
433
+ trace=trace,
434
+ progress=progress,
435
+ )
436
+ except Exception as exc:
437
+ trace.log_note("Run failed", error=str(exc))
438
+ try:
439
+ trace.save()
440
+ except OSError:
441
+ pass
442
+ raise
443
+
444
+ def _iter_quiz_maker_steps(
445
+ self,
446
+ *,
447
+ req: QuizMakerInput,
448
+ skill: Any,
449
+ model_key: str,
450
+ backend: InferenceBackend,
451
+ trace: TraceRecorder,
452
+ progress: QuizGenerationProgress | None,
453
+ ) -> Iterator[QuizGenerationProgress | QuizAgentResult]:
454
+ if req.conversation_context.strip():
455
+ trace.log_note(
456
+ "Conversation grounding",
457
+ chars=len(req.conversation_context.strip()),
458
+ )
459
+ if progress is not None:
460
+ progress.begin("load_model", "Load language model")
461
+ yield progress
462
+ load_started = monotonic()
463
+ backend.load()
464
+ load_ms = int((monotonic() - load_started) * 1000)
465
+ trace.log_step("load_model", "Load language model", duration_ms=load_ms)
466
+
467
+ if progress is not None:
468
+ progress.begin(
469
+ "gather_sources",
470
+ "Gather lesson sources",
471
+ detail=req.source_mode,
472
+ )
473
+ yield progress
474
+ source_started = monotonic()
475
+ source_context, source_summary, active_session = self._gather_lesson_source_context(
476
+ req, backend, model_key, trace
477
+ )
478
+ source_ms = int((monotonic() - source_started) * 1000)
479
+ trace.log_step(
480
+ "gather_sources",
481
+ "Gather lesson sources",
482
+ duration_ms=source_ms,
483
+ source_mode=req.source_mode,
484
+ )
485
+ if active_session:
486
+ req = req.model_copy(update={"session_id": active_session})
487
+ if progress is not None:
488
+ yield progress
489
+
490
+ if progress is not None:
491
+ progress.begin(
492
+ "generate_outline",
493
+ "Generate quiz outline",
494
+ detail=f"{req.question_count} questions · grade {req.grade}",
495
+ )
496
+ yield progress
497
+ outline_started = monotonic()
498
+ outline = self._generate_quiz_outline(
499
+ skill, req, backend, trace, source_context=source_context, progress=progress
500
+ )
501
+ outline_ms = int((monotonic() - outline_started) * 1000)
502
+ trace.log_step(
503
+ "generate_outline",
504
+ "Generate quiz outline",
505
+ duration_ms=outline_ms,
506
+ question_count=len(outline.questions),
507
+ )
508
+ for step in trace.steps:
509
+ if step.get("type") == "note" and step.get("phase") == "outline_fallback":
510
+ note = str(step.get("message") or "")
511
+ source_summary = f"{source_summary}\n\n_{note}_".strip() if source_summary else f"_{note}_"
512
+
513
+ if progress is not None:
514
+ yield progress
515
+
516
+ if progress is not None:
517
+ progress.begin("create_exports", "Build DOCX and HTML quiz exports")
518
+ yield progress
519
+ export_started = monotonic()
520
+ tool = self._tools.get("create_quiz")
521
+ export_paths = tool.handler(outline, run_id=trace.run_id)
522
+ trace.log_tool(
523
+ "create_quiz",
524
+ {"title": outline.title, "question_count": len(outline.questions)},
525
+ json.dumps(export_paths),
526
+ )
527
+ docx_path = export_paths["docx"]
528
+ html_export_path = export_paths["html"]
529
+ export_ms = int((monotonic() - export_started) * 1000)
530
+ trace.log_step(
531
+ "create_exports",
532
+ "Build DOCX and HTML quiz exports",
533
+ duration_ms=export_ms,
534
+ )
535
+
536
+ trace.set_artifact(docx_path)
537
+
538
+ markdown = quiz_to_markdown(outline)
539
+ html_preview_path = Path(html_export_path)
540
+ html_preview = html_preview_path.read_text(encoding="utf-8")
541
+
542
+ if progress is not None:
543
+ progress.finish()
544
+ yield progress
545
+
546
+ trace_path = trace.save()
547
+
548
+ yield QuizAgentResult(
549
+ markdown_preview=markdown,
550
+ html_preview=html_preview,
551
+ docx_path=docx_path,
552
+ html_export_path=html_export_path,
553
+ trace=trace,
554
+ trace_path=str(trace_path),
555
+ outline=outline,
556
+ source_summary=source_summary,
557
+ )
558
+
559
  def _gather_lesson_source_context(
560
  self,
561
+ req: LessonSourceInput,
562
  backend: InferenceBackend,
563
  model_key: str,
564
  trace: TraceRecorder,
 
753
  )
754
  return fallback_outline(req)
755
 
756
+ def _generate_quiz_outline(
757
+ self,
758
+ skill: Any,
759
+ req: QuizMakerInput,
760
+ backend: InferenceBackend,
761
+ trace: TraceRecorder,
762
+ *,
763
+ source_context: str = "",
764
+ progress: QuizGenerationProgress | None = None,
765
+ ) -> QuizOutline:
766
+ system = quiz_outline_system(skill.body)
767
+ user = quiz_outline_user(req, source_context=source_context)
768
+ messages = [
769
+ {"role": "system", "content": system},
770
+ {"role": "user", "content": user},
771
+ ]
772
+ prompt_text = system + "\n\n" + user
773
+ token_budget = quiz_max_tokens(req.question_count)
774
+
775
+ raw = self._normalize_outline_llm_text(
776
+ backend.chat(messages, max_tokens=token_budget, temperature=0.0)
777
+ )
778
+ trace.log_llm(prompt_text, raw)
779
+
780
+ if not raw:
781
+ trace.log_note(
782
+ "Empty quiz outline response; retrying with JSON example",
783
+ phase="outline_retry",
784
+ )
785
+ example = quiz_json_example(req.question_count)
786
+ retry_user = quiz_outline_retry_user(req, example_json=example)
787
+ retry_messages = [
788
+ {"role": "system", "content": system},
789
+ {"role": "user", "content": retry_user},
790
+ ]
791
+ retry_prompt = system + "\n\n" + retry_user
792
+ raw = self._normalize_outline_llm_text(
793
+ backend.chat(retry_messages, max_tokens=token_budget, temperature=0.0)
794
+ )
795
+ trace.log_llm(retry_prompt, raw)
796
+
797
+ outline, parse_error = self._parse_quiz_outline_or_error(
798
+ raw, req.question_count, trace
799
+ )
800
+ if outline is not None:
801
+ return outline
802
+
803
+ if progress is not None:
804
+ progress.begin(
805
+ "repair_outline",
806
+ "Repair quiz JSON",
807
+ detail=(parse_error or "invalid JSON")[:80],
808
+ )
809
+ repair_started = monotonic()
810
+ repair_user = quiz_outline_repair(
811
+ raw,
812
+ parse_error or "invalid JSON",
813
+ expected_questions=req.question_count,
814
+ )
815
+ repair_messages = messages + [
816
+ {"role": "assistant", "content": raw},
817
+ {"role": "user", "content": repair_user},
818
+ ]
819
+ repaired = self._normalize_outline_llm_text(
820
+ backend.chat(
821
+ repair_messages,
822
+ max_tokens=min(768, token_budget),
823
+ temperature=0.0,
824
+ )
825
+ )
826
+ trace.log_llm(repair_user, repaired)
827
+ outline, repair_error = self._parse_quiz_outline_or_error(
828
+ repaired, req.question_count, trace
829
+ )
830
+ repair_ms = int((monotonic() - repair_started) * 1000)
831
+ if outline is not None:
832
+ trace.log_step(
833
+ "repair_outline",
834
+ "Repair quiz JSON",
835
+ duration_ms=repair_ms,
836
+ )
837
+ return outline
838
+
839
+ trace.log_step(
840
+ "repair_outline",
841
+ "Repair quiz JSON",
842
+ duration_ms=repair_ms,
843
+ error=repair_error or parse_error,
844
+ )
845
+ trace.log_note(
846
+ "Model quiz outline invalid after repair; using template questions.",
847
+ phase="outline_fallback",
848
+ )
849
+ if progress is not None:
850
+ progress.begin(
851
+ "fallback_outline",
852
+ "Use template quiz",
853
+ detail=(repair_error or parse_error or "invalid JSON")[:80],
854
+ )
855
+ return fallback_quiz(req)
856
+
857
+ def _parse_quiz_outline_or_error(
858
+ self,
859
+ raw: str,
860
+ expected_questions: int,
861
+ trace: TraceRecorder | None,
862
+ ) -> tuple[QuizOutline | None, str]:
863
+ if not raw.strip():
864
+ return None, "Model returned empty output (no JSON)"
865
+ try:
866
+ return self._parse_quiz_outline(raw, expected_questions, trace), ""
867
+ except (json.JSONDecodeError, ValueError) as exc:
868
+ return None, str(exc)
869
+
870
+ def _parse_quiz_outline(
871
+ self,
872
+ raw: str,
873
+ expected_questions: int,
874
+ trace: TraceRecorder | None = None,
875
+ ) -> QuizOutline:
876
+ data = self._sanitize_quiz_data(self._extract_json(raw))
877
+ outline = QuizOutline.model_validate(data)
878
+ original_count = len(outline.questions)
879
+ outline = self._normalize_question_count(outline, expected_questions)
880
+ if trace and original_count != expected_questions:
881
+ trace.log_note(
882
+ "Adjusted question count to match request",
883
+ requested=expected_questions,
884
+ model_returned=original_count,
885
+ final=len(outline.questions),
886
+ )
887
+ return outline
888
+
889
+ @staticmethod
890
+ def _sanitize_quiz_data(data: dict[str, Any]) -> dict[str, Any]:
891
+ title = str(data.get("title") or "Quiz").strip() or "Quiz"
892
+ instructions = str(data.get("instructions") or "").strip()
893
+ questions_in = data.get("questions") or []
894
+ questions_out: list[dict[str, Any]] = []
895
+ for index, question in enumerate(questions_in):
896
+ if not isinstance(question, dict):
897
+ continue
898
+ prompt = str(question.get("prompt") or f"Question {index + 1}?").strip()
899
+ choices_raw = question.get("choices") or []
900
+ if isinstance(choices_raw, str):
901
+ choices_raw = [choices_raw]
902
+ choices = [str(c).strip() for c in choices_raw if str(c).strip()]
903
+ while len(choices) < 4:
904
+ choices.append(f"Option {len(choices) + 1}")
905
+ choices = choices[:4]
906
+ correct_index = int(question.get("correct_index", 0))
907
+ correct_index = max(0, min(3, correct_index))
908
+ questions_out.append(
909
+ {
910
+ "prompt": prompt or f"Question {index + 1}?",
911
+ "choices": choices,
912
+ "correct_index": correct_index,
913
+ "explanation": str(question.get("explanation") or ""),
914
+ }
915
+ )
916
+ if not questions_out:
917
+ questions_out.append(
918
+ {
919
+ "prompt": "Sample question?",
920
+ "choices": ["Answer A", "Answer B", "Answer C", "Answer D"],
921
+ "correct_index": 0,
922
+ "explanation": "",
923
+ }
924
+ )
925
+ return {"title": title, "instructions": instructions, "questions": questions_out}
926
+
927
+ @staticmethod
928
+ def _normalize_question_count(outline: QuizOutline, expected: int) -> QuizOutline:
929
+ questions = list(outline.questions)
930
+ if len(questions) > expected:
931
+ questions = questions[:expected]
932
+ while len(questions) < expected:
933
+ number = len(questions) + 1
934
+ questions.append(
935
+ QuizQuestion(
936
+ prompt=f"Additional question {number} about {outline.title}?",
937
+ choices=["Correct", "Distractor A", "Distractor B", "Distractor C"],
938
+ correct_index=0,
939
+ explanation="",
940
+ )
941
+ )
942
+ return outline.model_copy(update={"questions": questions})
943
+
944
  def _parse_outline_or_error(
945
  self,
946
  raw: str,
 
1097
  def _lesson_doc_ids(
1098
  store: Any,
1099
  session_id: str | None,
1100
+ req: LessonSourceInput,
1101
  ingest: ResearchIngestResult | None,
1102
  ) -> list[str]:
1103
  if req.doc_ids:
 
1133
  def _lesson_retrieve_scope(
1134
  store: Any,
1135
  session_id: str | None,
1136
+ req: LessonSourceInput,
1137
  ingest: ResearchIngestResult | None,
1138
  ) -> tuple[str | None, list[str] | None]:
1139
  from researchmind.scope import resolve_retrieve_scope
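Review note: `_sanitize_quiz_data` calls `int(question.get("correct_index", 0))`, while `_parse_quiz_outline_or_error` catches only `json.JSONDecodeError` and `ValueError`. As far as this diff shows, a model response containing `"correct_index": null` would raise `TypeError` from `int(None)` and skip the repair and fallback path. A minimal sketch of a more tolerant coercion follows; `coerce_correct_index` is a hypothetical helper, not part of the commit.

```python
from typing import Any


def coerce_correct_index(value: Any, n_choices: int = 4) -> int:
    """Return a valid 0-based choice index, falling back to 0 on unusable input."""
    try:
        index = int(value)
    except (TypeError, ValueError, OverflowError):
        # int(None) -> TypeError, int("B") -> ValueError, int(float("inf")) -> OverflowError
        return 0
    # Clamp into range, mirroring the max(0, min(3, ...)) step in the diff
    return max(0, min(n_choices - 1, index))
```

Falling back to index 0 keeps the run alive but may mark the wrong choice as correct, so a trace note at that point would probably be worth adding.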
libs/agent/src/agent/tools/quiz.py ADDED
@@ -0,0 +1,134 @@
1
+ """Quiz export: DOCX worksheet + HTML preview."""
2
+
3
+ from __future__ import annotations
4
+
5
+ import html
6
+ from pathlib import Path
7
+
8
+ from docx import Document
9
+
10
+ from agent.models import QuizOutline, QuizQuestion
11
+
12
+ _CHOICE_LABELS = ("A", "B", "C", "D")
13
+
14
+
15
+ def _add_question_docx(doc: Document, index: int, question: QuizQuestion) -> None:
16
+ doc.add_paragraph(f"{index}. {question.prompt}")
17
+ for label, choice in zip(_CHOICE_LABELS, question.choices, strict=True):
18
+ doc.add_paragraph(f" {label}. {choice}")
19
+ doc.add_paragraph("")
20
+
21
+
22
+ def create_quiz_docx(outline: QuizOutline, path: Path) -> Path:
23
+ """Student worksheet with numbered questions; answer key on final page."""
24
+ path = Path(path)
25
+ path.parent.mkdir(parents=True, exist_ok=True)
26
+
27
+ doc = Document()
28
+ doc.add_heading(outline.title, level=0)
29
+ if outline.instructions.strip():
30
+ doc.add_paragraph(outline.instructions.strip())
31
+ doc.add_paragraph("")
32
+
33
+ for i, question in enumerate(outline.questions, start=1):
34
+ _add_question_docx(doc, i, question)
35
+
36
+ doc.add_page_break()
37
+ doc.add_heading("Answer Key", level=1)
38
+ for i, question in enumerate(outline.questions, start=1):
39
+ label = _CHOICE_LABELS[question.correct_index]
40
+ answer = question.choices[question.correct_index]
41
+ p = doc.add_paragraph()
42
+ run = p.add_run(f"{i}. {label}. {answer}")
43
+ run.bold = True
44
+ if question.explanation.strip():
45
+ doc.add_paragraph(question.explanation.strip(), style="List Bullet")
46
+
47
+ doc.save(str(path))
48
+ return path
49
+
50
+
51
+ def create_quiz_html(outline: QuizOutline, path: Path) -> Path:
52
+ """Printable HTML worksheet with collapsible answer key."""
53
+ path = Path(path)
54
+ path.parent.mkdir(parents=True, exist_ok=True)
55
+
56
+ title = html.escape(outline.title)
57
+ instructions = html.escape(outline.instructions.strip()) if outline.instructions.strip() else ""
58
+
59
+ question_blocks: list[str] = []
60
+ answer_rows: list[str] = []
61
+
62
+ for i, question in enumerate(outline.questions, start=1):
63
+ prompt = html.escape(question.prompt)
64
+ choices_html = "\n".join(
65
+ f'<li><span class="choice-label">{label}.</span> {html.escape(choice)}</li>'
66
+ for label, choice in zip(_CHOICE_LABELS, question.choices, strict=True)
67
+ )
68
+ question_blocks.append(
69
+ f'<section class="question"><h3>{i}. {prompt}</h3><ol class="choices">{choices_html}</ol></section>'
70
+ )
71
+ correct_label = _CHOICE_LABELS[question.correct_index]
72
+ correct_text = html.escape(question.choices[question.correct_index])
73
+ expl = html.escape(question.explanation.strip()) if question.explanation.strip() else ""
74
+ expl_html = f'<p class="explanation">{expl}</p>' if expl else ""
75
+ answer_rows.append(
76
+ f"<tr><td>{i}</td><td><strong>{correct_label}. {correct_text}</strong></td>"
77
+ f"<td>{expl}</td></tr>"
78
+ if expl
79
+ else f"<tr><td>{i}</td><td><strong>{correct_label}. {correct_text}</strong></td><td></td></tr>"
80
+ )
81
+
82
+ body = f"""<!DOCTYPE html>
83
+ <html lang="en">
84
+ <head>
85
+ <meta charset="utf-8">
86
+ <title>{title}</title>
87
+ <style>
88
+ body {{ font-family: Georgia, serif; max-width: 720px; margin: 2rem auto; padding: 0 1rem; line-height: 1.5; }}
89
+ h1 {{ font-size: 1.5rem; margin-bottom: 0.5rem; }}
90
+ .instructions {{ margin-bottom: 1.5rem; color: #333; }}
91
+ .question {{ margin-bottom: 1.25rem; page-break-inside: avoid; }}
92
+ .question h3 {{ font-size: 1rem; font-weight: 600; margin: 0 0 0.5rem; }}
93
+ .choices {{ list-style: none; padding-left: 0; margin: 0; }}
94
+ .choices li {{ margin: 0.25rem 0; }}
95
+ .choice-label {{ font-weight: 600; margin-right: 0.35rem; }}
96
+ details.answer-key {{ margin-top: 2rem; border-top: 2px solid #ccc; padding-top: 1rem; }}
97
+ table {{ width: 100%; border-collapse: collapse; font-size: 0.9rem; }}
98
+ th, td {{ border: 1px solid #ccc; padding: 0.4rem 0.6rem; text-align: left; vertical-align: top; }}
99
+ th {{ background: #f5f5f5; }}
100
+ @media print {{
101
+ details.answer-key {{ display: block; }}
102
+ details.answer-key summary {{ display: none; }}
103
+ }}
104
+ </style>
105
+ </head>
106
+ <body>
107
+ <h1>{title}</h1>
108
+ {"<p class='instructions'>" + instructions + "</p>" if instructions else ""}
109
+ {"".join(question_blocks)}
110
+ <details class="answer-key">
111
+ <summary>Answer key (click to expand)</summary>
112
+ <table>
113
+ <thead><tr><th>#</th><th>Answer</th><th>Explanation</th></tr></thead>
114
+ <tbody>
115
+ {"".join(answer_rows)}
116
+ </tbody>
117
+ </table>
118
+ </details>
119
+ </body>
120
+ </html>
121
+ """
122
+ path.write_text(body, encoding="utf-8")
123
+ return path
124
+
125
+
126
+ def create_quiz(outline: QuizOutline, output_dir: Path, stem: str = "quiz") -> dict[str, Path]:
127
+ """Write DOCX and HTML exports; return paths keyed by format."""
128
+ output_dir = Path(output_dir)
129
+ output_dir.mkdir(parents=True, exist_ok=True)
130
+ docx_path = output_dir / f"{stem}.docx"
131
+ html_path = output_dir / f"{stem}.html"
132
+ create_quiz_docx(outline, docx_path)
133
+ create_quiz_html(outline, html_path)
134
+ return {"docx": docx_path, "html": html_path}
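Review note: in `create_quiz_html`, both branches of the `answer_rows.append(...)` conditional produce the same markup when the explanation is empty, and `expl_html` is assigned but not used in the code shown. A minimal sketch of a single-expression row builder follows; `answer_row` is a hypothetical helper name, and it keeps the same `html.escape` treatment of model-supplied text.

```python
import html


def answer_row(number: int, label: str, answer: str, explanation: str = "") -> str:
    """Build one answer-key table row, escaping all model-supplied text."""
    return (
        f"<tr><td>{number}</td>"
        f"<td><strong>{label}. {html.escape(answer)}</strong></td>"
        f"<td>{html.escape(explanation.strip())}</td></tr>"
    )
```

An empty explanation simply yields an empty third cell, so no separate branch is needed.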
libs/agent/src/agent/tools_registry.py CHANGED
@@ -4,8 +4,9 @@ from collections.abc import Callable
4
  from dataclasses import dataclass
5
  from typing import Any
6
 
7
- from agent.models import SlideOutline
8
  from agent.tools.pptx import create_pptx
 
9
  from agent.tools.research_tools import (
10
  tool_extract_and_index,
11
  tool_research_answer,
@@ -30,6 +31,11 @@ class ToolRegistry:
30
  "Create a PowerPoint file from a validated SlideOutline",
31
  self._handle_create_pptx,
32
  )
 
 
 
 
 
33
  self.register(
34
  "suggest_urls",
35
  "Suggest research URLs for a topic using the local LLM",
@@ -72,3 +78,14 @@ class ToolRegistry:
72
  def _handle_create_pptx(self, outline: SlideOutline, run_id: str | None = None) -> str:
73
  path = create_pptx(outline, run_id=run_id)
74
  return str(path)
 
 
 
 
 
 
 
 
 
 
 
 
4
  from dataclasses import dataclass
5
  from typing import Any
6
 
7
+ from agent.models import QuizOutline, SlideOutline
8
  from agent.tools.pptx import create_pptx
9
+ from agent.tools.quiz import create_quiz
10
  from agent.tools.research_tools import (
11
  tool_extract_and_index,
12
  tool_research_answer,
 
31
  "Create a PowerPoint file from a validated SlideOutline",
32
  self._handle_create_pptx,
33
  )
34
+ self.register(
35
+ "create_quiz",
36
+ "Create DOCX and HTML quiz exports from a validated QuizOutline",
37
+ self._handle_create_quiz,
38
+ )
39
  self.register(
40
  "suggest_urls",
41
  "Suggest research URLs for a topic using the local LLM",
 
78
  def _handle_create_pptx(self, outline: SlideOutline, run_id: str | None = None) -> str:
79
  path = create_pptx(outline, run_id=run_id)
80
  return str(path)
81
+
82
+ def _handle_create_quiz(
83
+ self,
84
+ outline: QuizOutline,
85
+ run_id: str | None = None,
86
+ ) -> dict[str, str]:
87
+ from agent.tools.pptx import get_outputs_dir
88
+
89
+ output_dir = get_outputs_dir() / (run_id or "quiz")
90
+ paths = create_quiz(outline, output_dir, stem="quiz")
91
+ return {fmt: str(path) for fmt, path in paths.items()}
libs/agent/tests/test_quiz_maker.py ADDED
@@ -0,0 +1,123 @@
1
+ """Tests for quiz-maker skill: JSON parse, fallback, and export smoke."""
2
+
3
+ from pathlib import Path
4
+
5
+ from agent.models import QuizMakerInput, QuizOutline, QuizQuestion
6
+ from agent.prompts import fallback_quiz, quiz_max_tokens, quiz_to_markdown
7
+ from agent.runner import AgentRunner
8
+ from agent.tools.quiz import create_quiz, create_quiz_docx, create_quiz_html
9
+
10
+
11
+ def test_quiz_max_tokens_scales_with_question_count():
12
+ assert quiz_max_tokens(5) == 1020
13
+ assert quiz_max_tokens(3) == 660
14
+ assert quiz_max_tokens(12) == 1536
15
+
16
+
17
+ def test_parse_quiz_outline_normalizes_count():
18
+ runner = AgentRunner()
19
+ raw = (
20
+ '{"title": "Science Quiz", "instructions": "Circle one.", "questions": ['
21
+ '{"prompt": "Q1?", "choices": ["a", "b", "c", "d"], "correct_index": 0, "explanation": "e1"},'
22
+ '{"prompt": "Q2?", "choices": ["a", "b", "c", "d"], "correct_index": 1, "explanation": "e2"},'
23
+ '{"prompt": "Q3?", "choices": ["a", "b", "c", "d"], "correct_index": 2, "explanation": "e3"}'
24
+ "]}"
25
+ )
26
+ outline = runner._parse_quiz_outline(raw, expected_questions=5)
27
+ assert len(outline.questions) == 5
28
+ assert outline.title == "Science Quiz"
29
+
30
+
31
+ def test_parse_quiz_outline_trims_extra_questions():
32
+ runner = AgentRunner()
33
+ questions = ",".join(
34
+ f'{{"prompt": "Q{i}?", "choices": ["a","b","c","d"], "correct_index": 0, "explanation": ""}}'
35
+ for i in range(1, 8)
36
+ )
37
+ raw = f'{{"title": "Long", "questions": [{questions}]}}'
38
+ outline = runner._parse_quiz_outline(raw, expected_questions=5)
39
+ assert len(outline.questions) == 5
40
+
41
+
42
+ def test_parse_quiz_outline_or_error_empty():
43
+ runner = AgentRunner()
44
+ outline, err = runner._parse_quiz_outline_or_error("", 5, None)
45
+ assert outline is None
46
+ assert "empty" in err.lower()
47
+
48
+
49
+ def test_fallback_quiz_has_requested_count():
50
+ req = QuizMakerInput(topic="Fractions", grade="5", question_count=7)
51
+ outline = fallback_quiz(req)
52
+ assert len(outline.questions) == 7
53
+ assert "Fractions" in outline.title
54
+ assert all(len(q.choices) == 4 for q in outline.questions)
55
+
56
+
57
+ def test_quiz_to_markdown_includes_answers():
58
+ outline = QuizOutline(
59
+ title="Test",
60
+ instructions="Read carefully.",
61
+ questions=[
62
+ QuizQuestion(
63
+ prompt="2+2?",
64
+ choices=["4", "3", "5", "6"],
65
+ correct_index=0,
66
+ explanation="Basic addition.",
67
+ ),
68
+ QuizQuestion(
69
+ prompt="3+3?",
70
+ choices=["6", "5", "7", "8"],
71
+ correct_index=0,
72
+ explanation="Also addition.",
73
+ ),
74
+ QuizQuestion(
75
+ prompt="1+1?",
76
+ choices=["2", "1", "3", "4"],
77
+ correct_index=0,
78
+ explanation="Easy.",
79
+ ),
80
+ ],
81
+ )
82
+ md = quiz_to_markdown(outline)
83
+ assert "2+2?" in md
84
+ assert "**Answer:** A" in md
85
+
86
+
87
+ def test_create_quiz_docx_and_html(tmp_path: Path):
88
+ outline = QuizOutline(
89
+ title="Smoke Quiz",
90
+ instructions="Circle the best answer.",
91
+ questions=[
92
+ QuizQuestion(
93
+ prompt="Sample?",
94
+ choices=["Yes", "No", "Maybe", "Sometimes"],
95
+ correct_index=0,
96
+ explanation="Because.",
97
+ ),
98
+ QuizQuestion(
99
+ prompt="Another?",
100
+ choices=["A", "B", "C", "D"],
101
+ correct_index=2,
102
+ explanation="C is correct.",
103
+ ),
104
+ QuizQuestion(
105
+ prompt="Third?",
106
+ choices=["1", "2", "3", "4"],
107
+ correct_index=1,
108
+ explanation="Two.",
109
+ ),
110
+ ],
111
+ )
112
+ docx_path = tmp_path / "quiz.docx"
113
+ html_path = tmp_path / "quiz.html"
114
+ create_quiz_docx(outline, docx_path)
115
+ create_quiz_html(outline, html_path)
116
+ assert docx_path.stat().st_size > 100
117
+ html_text = html_path.read_text(encoding="utf-8")
118
+ assert "Smoke Quiz" in html_text
119
+ assert "Answer key" in html_text
120
+
121
+ paths = create_quiz(outline, tmp_path / "out", stem="worksheet")
122
+ assert paths["docx"].exists()
123
+ assert paths["html"].exists()
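Review note: the two count tests above exercise the trim and pad behaviour of `_normalize_question_count`. The same rule can be sketched on plain dicts as below; `normalize_count` is a hypothetical stand-in and the negative-count guard is an added assumption, not something the commit does.

```python
def normalize_count(questions: list[dict], expected: int, title: str = "Quiz") -> list[dict]:
    """Trim extra questions, then pad with placeholders until `expected` remain."""
    if expected < 0:
        raise ValueError("expected must be non-negative")
    result = list(questions[:expected])
    while len(result) < expected:
        number = len(result) + 1
        result.append(
            {
                "prompt": f"Additional question {number} about {title}?",
                "choices": ["Correct", "Distractor A", "Distractor B", "Distractor C"],
                "correct_index": 0,
                "explanation": "",
            }
        )
    return result
```

Padded items are placeholders with the first choice marked correct, so a test asserting on their prompt text would make it obvious when a quiz still contains unedited filler.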
research/evals/configs/lm_eval_reasoning.yaml CHANGED
@@ -10,8 +10,8 @@ tasks:
10
  - arc_challenge
11
  - hellaswag
12
 
13
- num_fewshot: 5
14
- limit: 100
15
  seed: 42
16
  batch_size: auto
17
  device: auto
 
10
  - arc_challenge
11
  - hellaswag
12
 
13
+ num_fewshot: null # per-task canonical fewshot (gsm8k 5-shot, MC tasks 0-shot)
14
+ limit: 200 # larger sample -> tighter stderr for gate decisions
15
  seed: 42
16
  batch_size: auto
17
  device: auto
research/evals/configs/lm_eval_science.yaml CHANGED
@@ -9,8 +9,8 @@ tasks:
9
  - openbookqa
10
  - arc_challenge
11
 
12
- num_fewshot: 0
13
- limit: 100
14
  seed: 42
15
  batch_size: auto
16
  device: auto
 
9
  - openbookqa
10
  - arc_challenge
11
 
12
+ num_fewshot: null # per-task canonical fewshot (sciq/openbookqa/arc 0-shot)
13
+ limit: 200 # larger sample -> tighter stderr for gate decisions
14
  seed: 42
15
  batch_size: auto
16
  device: auto
research/modal/_common.py CHANGED
@@ -291,6 +291,31 @@ def primary_metric(task_metrics: dict[str, Any]) -> tuple[str, float] | None:
291
  return None
292
 
293
 
294
  def evaluate_gate(
295
  *,
296
  candidate: dict[str, Any],
@@ -429,6 +454,16 @@ def render_model_card(
429
  base_tasks = (baseline or {}).get("results", {})
430
  base_model = (training_payload or {}).get("model") or BASE_MODEL_ID
431
 
 
 
 
 
 
 
 
 
 
 
432
  lines = [
433
  "---",
434
  "library_name: peft",
@@ -445,7 +480,7 @@ def render_model_card(
445
  f"# {job['name']}",
446
  "",
447
  f"QLoRA adapter for **{job.get('category', 'general')}**, fine-tuned from "
448
- f"`{base_model}` on `{job['dataset']}` (format: `{job['format']}`).",
449
  "",
450
  "Trained, evaluated, and gated on [Modal](https://modal.com/docs/guide) via "
451
  "`research/modal/` (app `slm-finetune-benchmark`).",
@@ -570,16 +605,24 @@ def publish_adapter_files(
570
 
571
  from huggingface_hub import HfApi
572
 
573
- repo_id = publish_cfg["hub_repo"]
574
  private = publish_cfg.get("private", True)
575
 
576
  api = HfApi()
577
- api.create_repo(repo_id=repo_id, repo_type="model", private=private, exist_ok=True)
578
- api.upload_folder(
579
- folder_path=str(adapter_path),
580
- repo_id=repo_id,
581
- repo_type="model",
582
- commit_message=f"Publish {job['name']} (gate passed: {gate_result.get('task')})",
583
- )
 
 
 
584
 
585
- return {"published": True, "repo_id": repo_id, "url": f"https://huggingface.co/{repo_id}"}
 
 
 
 
 
 
291
  return None
292
 
293
 
294
+ def baseline_is_cached(experiment_name: str, config_path: str) -> bool:
295
+ """True if a baseline results.json exists AND its run_meta still matches the
296
+ profile config's tasks/limit/num_fewshot. Config changes (e.g. new guard
297
+ tasks or a higher limit) therefore correctly force a fresh baseline."""
298
+ results = Path(LM_EVAL_OUTPUT) / experiment_name / "results.json"
299
+ if not results.is_file():
300
+ return False
301
+ candidates = [Path(config_path)]
302
+ if not Path(config_path).is_absolute():
303
+ candidates += [REPO_ROOT / config_path, Path("/repo") / config_path]
304
+ cfg_file = next((p for p in candidates if p.is_file()), None)
305
+ if cfg_file is None:
306
+ return False
307
+ try:
308
+ meta = json.loads(results.read_text()).get("run_meta", {})
309
+ cfg = yaml.safe_load(cfg_file.read_text()) or {}
310
+ except Exception:
311
+ return False
312
+ return (
313
+ sorted(meta.get("tasks") or []) == sorted(cfg.get("tasks") or [])
314
+ and meta.get("limit") == cfg.get("limit")
315
+ and meta.get("num_fewshot") == cfg.get("num_fewshot", 0)
316
+ )
317
+
318
+
319
  def evaluate_gate(
320
  *,
321
  candidate: dict[str, Any],
 
454
  base_tasks = (baseline or {}).get("results", {})
455
  base_model = (training_payload or {}).get("model") or BASE_MODEL_ID
456
 
457
+ # A job is either a single dataset (`dataset`/`format`) or a `mix:` of sources.
458
+ if job.get("mix"):
459
+ dataset_desc = " + ".join(
460
+ f"`{s.get('dataset', '?')}`" for s in job["mix"]
461
+ )
462
+ format_desc = "mix"
463
+ else:
464
+ dataset_desc = f"`{job.get('dataset', '?')}`"
465
+ format_desc = job.get("format", "?")
466
+
467
  lines = [
468
  "---",
469
  "library_name: peft",
 
480
  f"# {job['name']}",
481
  "",
482
  f"QLoRA adapter for **{job.get('category', 'general')}**, fine-tuned from "
483
+ f"`{base_model}` on {dataset_desc} (format: `{format_desc}`).",
484
  "",
485
  "Trained, evaluated, and gated on [Modal](https://modal.com/docs/guide) via "
486
  "`research/modal/` (app `slm-finetune-benchmark`).",
 
605
 
606
  from huggingface_hub import HfApi
607
 
608
+ repo_ids = [publish_cfg["hub_repo"], *(publish_cfg.get("mirror_repos") or [])]
609
  private = publish_cfg.get("private", True)
610
 
611
  api = HfApi()
612
+ uploads = []
613
+ for repo_id in dict.fromkeys(repo_ids):
614
+ api.create_repo(repo_id=repo_id, repo_type="model", private=private, exist_ok=True)
615
+ api.upload_folder(
616
+ folder_path=str(adapter_path),
617
+ repo_id=repo_id,
618
+ repo_type="model",
619
+ commit_message=f"Publish {job['name']} (gate passed: {gate_result.get('task')})",
620
+ )
621
+ uploads.append({"repo_id": repo_id, "url": f"https://huggingface.co/{repo_id}"})
622
 
623
+ return {
624
+ "published": True,
625
+ "repo_id": uploads[0]["repo_id"],
626
+ "url": uploads[0]["url"],
627
+ "uploads": uploads,
628
+ }
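The updated `publish_adapter_files` builds its upload list from `hub_repo` plus any `mirror_repos`, then iterates over `dict.fromkeys(repo_ids)`. A minimal sketch of that ordering and de-duplication step follows. The helper name `resolve_repo_ids` and the repo IDs are hypothetical, and no Hub calls are made:

```python
from typing import Any


def resolve_repo_ids(publish_cfg: dict[str, Any]) -> list[str]:
    """Hypothetical helper: primary repo first, then mirrors, duplicates dropped."""
    repo_ids = [publish_cfg["hub_repo"], *(publish_cfg.get("mirror_repos") or [])]
    # dict.fromkeys keeps first-seen order (dicts are insertion-ordered in Python 3.7+),
    # so the primary repo stays at index 0 even if it is repeated among the mirrors.
    return list(dict.fromkeys(repo_ids))


# Made-up config for illustration only.
example_cfg = {
    "hub_repo": "example-org/demo-lora",
    "mirror_repos": ["example-mirror/demo-lora", "example-org/demo-lora"],
}
print(resolve_repo_ids(example_cfg))
```

Because the primary repo is always first, the `uploads[0]` entries in the return value keep the pre-existing `repo_id` and `url` keys pointing at `hub_repo`.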
research/modal/experiments.yaml CHANGED
@@ -30,11 +30,26 @@ defaults:
30
 
31
  finetune:
32
  # --- teaching: lesson-planning agent chat data (Well-Tuned primary) ---
 
 
33
  - name: teaching-lora
34
  category: teaching
35
- dataset: research/data/education-lesson-chat.jsonl
36
- format: chat
37
- description: Lesson-planning agent chat data (local)
38
  eval_profile: instructions
39
  goals:
40
  task: ifeval
@@ -42,6 +57,8 @@ finetune:
42
  min_improve: 0.02
43
  publish:
44
  hub_repo: MSGEncrypted/minicpm5-1b-teaching-lora
 
 
45
  private: false
46
 
47
  # --- science: factual + explanatory science tutoring ---
@@ -49,6 +66,9 @@ finetune:
49
  category: science
50
  dataset: research/data/science-tutor-chat.jsonl
51
  format: chat
 
 
 
52
  description: Science tutor Q&A chat data (local)
53
  eval_profile: science
54
  goals:
@@ -60,6 +80,8 @@ finetune:
60
  max_regress: 0.03
61
  publish:
62
  hub_repo: MSGEncrypted/minicpm5-1b-science-lora
 
 
63
  private: false
64
 
65
  # --- math: GSM8K/MATH natural-language CoT augmentation (MetaMathQA) ---
@@ -102,6 +124,8 @@ finetune:
102
  max_regress: 0.03
103
  publish:
104
  hub_repo: MSGEncrypted/minicpm5-1b-math-lora
 
 
105
  private: false
106
 
107
  # --- coding: Python instruction-following code generation ---
@@ -111,6 +135,9 @@ finetune:
111
  format: alpaca
112
  dataset_split: "train[:1000]"
113
  max_samples: 1000
 
 
 
114
  description: Python code instruction tuning (Hub, alpaca columns)
115
  eval_profile: code
116
  goals:
@@ -124,6 +151,8 @@ finetune:
124
  max_regress: 0.03
125
  publish:
126
  hub_repo: MSGEncrypted/minicpm5-1b-coding-lora
 
 
127
  private: false
128
 
129
  # --- reasoning: multi-turn chat with reasoning-heavy conversations ---
@@ -134,6 +163,9 @@ finetune:
134
  dataset_config: all
135
  dataset_split: "train[:500]"
136
  max_samples: 500
 
 
 
137
  description: Multi-turn reasoning/chat subset (Hub)
138
  eval_profile: reasoning
139
  goals:
@@ -145,6 +177,8 @@ finetune:
145
  max_regress: 0.03
146
  publish:
147
  hub_repo: MSGEncrypted/minicpm5-1b-reasoning-lora
 
 
148
  private: false
149
 
150
  # --- general instructions baseline: no goals/publish -> local-only adapter ---
 
30
 
31
  finetune:
32
  # --- teaching: lesson-planning agent chat data (Well-Tuned primary) ---
33
+ # 8 local lesson chats overfit easily; mix in alpaca replay + NEFTune so
34
+ # IFEval clears without washing out the lesson skill signal.
35
  - name: teaching-lora
36
  category: teaching
37
+ max_steps: 150
38
+ mix:
39
+ - dataset: research/data/education-lesson-chat.jsonl
40
+ format: chat
41
+ weight: 20 # ~8 samples → ~160 examples
42
+ - dataset: tatsu-lab/alpaca # instruction-following replay for ifeval
43
+ format: alpaca
44
+ dataset_split: "train[:600]"
45
+ max_samples: 600
46
+ args:
47
+ lora_r: 32
48
+ lora_alpha: 64
49
+ neftune_noise_alpha: 5
50
+ early_stopping_patience: 2 # keep best eval_loss checkpoint, not the last
51
+ val_split: 0.05
52
+ description: Lesson-planning chat + alpaca replay, r=32 + NEFTune
53
  eval_profile: instructions
54
  goals:
55
  task: ifeval
 
57
  min_improve: 0.02
58
  publish:
59
  hub_repo: MSGEncrypted/minicpm5-1b-teaching-lora
60
+ mirror_repos:
61
+ - build-small-hackathon/minicpm5-1b-teaching-lora
62
  private: false
63
 
64
  # --- science: factual + explanatory science tutoring ---
 
66
  category: science
67
  dataset: research/data/science-tutor-chat.jsonl
68
  format: chat
69
+ args:
70
+ early_stopping_patience: 2 # keep best eval_loss checkpoint, not the last
71
+ val_split: 0.05
72
  description: Science tutor Q&A chat data (local)
73
  eval_profile: science
74
  goals:
 
80
  max_regress: 0.03
81
  publish:
82
  hub_repo: MSGEncrypted/minicpm5-1b-science-lora
83
+ mirror_repos:
84
+ - build-small-hackathon/minicpm5-1b-science-lora
85
  private: false
86
 
87
  # --- math: GSM8K/MATH natural-language CoT augmentation (MetaMathQA) ---
 
124
  max_regress: 0.03
125
  publish:
126
  hub_repo: MSGEncrypted/minicpm5-1b-math-lora
127
+ mirror_repos:
128
+ - build-small-hackathon/minicpm5-1b-math-lora
129
  private: false
130
 
131
  # --- coding: Python instruction-following code generation ---
 
135
  format: alpaca
136
  dataset_split: "train[:1000]"
137
  max_samples: 1000
138
+ args:
139
+ early_stopping_patience: 2 # keep best eval_loss checkpoint, not the last
140
+ val_split: 0.05
141
  description: Python code instruction tuning (Hub, alpaca columns)
142
  eval_profile: code
143
  goals:
 
151
  max_regress: 0.03
152
  publish:
153
  hub_repo: MSGEncrypted/minicpm5-1b-coding-lora
154
+ mirror_repos:
155
+ - build-small-hackathon/minicpm5-1b-coding-lora
156
  private: false
157
 
158
  # --- reasoning: multi-turn chat with reasoning-heavy conversations ---
 
163
  dataset_config: all
164
  dataset_split: "train[:500]"
165
  max_samples: 500
166
+ args:
167
+ early_stopping_patience: 2 # keep best eval_loss checkpoint, not the last
168
+ val_split: 0.05
169
  description: Multi-turn reasoning/chat subset (Hub)
170
  eval_profile: reasoning
171
  goals:
 
177
  max_regress: 0.03
178
  publish:
179
  hub_repo: MSGEncrypted/minicpm5-1b-reasoning-lora
180
+ mirror_repos:
181
+ - build-small-hackathon/minicpm5-1b-reasoning-lora
182
  private: false
183
 
184
  # --- general instructions baseline: no goals/publish -> local-only adapter ---
research/modal/server_app.py CHANGED
@@ -53,6 +53,7 @@ from _common import (
53
  HF_CACHE_PATH,
54
  LM_EVAL_OUTPUT,
55
  apply_defaults,
 
56
  build_finetune_cmd,
57
  build_lm_eval_cmd,
58
  check_gate_files,
@@ -285,9 +286,15 @@ class GpuWorker:
285
  baselines_ok: dict[str, bool] = {}
286
  if not eval_only:
287
  for profile in profiles:
 
 
 
 
 
 
288
  result = self.lm_eval.local(
289
- experiment_name=f"{preset}__baseline__{profile}",
290
- config=config_for_profile(profile),
291
  preset=preset,
292
  )
293
  baselines_ok[profile] = bool(result.get("ok"))
 
53
  HF_CACHE_PATH,
54
  LM_EVAL_OUTPUT,
55
  apply_defaults,
56
+ baseline_is_cached,
57
  build_finetune_cmd,
58
  build_lm_eval_cmd,
59
  check_gate_files,
 
286
  baselines_ok: dict[str, bool] = {}
287
  if not eval_only:
288
  for profile in profiles:
289
+ exp = f"{preset}__baseline__{profile}"
290
+ cfg_path = config_for_profile(profile)
291
+ if baseline_is_cached(exp, cfg_path):
292
+ print(f"baseline {exp}: reusing cached results (config unchanged)")
293
+ baselines_ok[profile] = True
294
+ continue
295
  result = self.lm_eval.local(
296
+ experiment_name=exp,
297
+ config=cfg_path,
298
  preset=preset,
299
  )
300
  baselines_ok[profile] = bool(result.get("ok"))
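The cache check wired in here depends on the comparison at the end of `baseline_is_cached`. The sketch below copies that comparison into a standalone function so its behaviour can be tried without files on disk. The name `run_meta_matches` is hypothetical, and the sample dictionaries assume that `run_meta` stores `num_fewshot` as `None` when the config says `null`, which the diff does not confirm:

```python
from typing import Any


def run_meta_matches(meta: dict[str, Any], cfg: dict[str, Any]) -> bool:
    """Hypothetical standalone copy of the comparison in baseline_is_cached."""
    return (
        sorted(meta.get("tasks") or []) == sorted(cfg.get("tasks") or [])
        and meta.get("limit") == cfg.get("limit")
        and meta.get("num_fewshot") == cfg.get("num_fewshot", 0)
    )


# Assumed shapes for illustration; real run_meta contents may differ.
cached = {"tasks": ["hellaswag", "arc_challenge"], "limit": 200, "num_fewshot": None}
same = {"tasks": ["arc_challenge", "hellaswag"], "limit": 200, "num_fewshot": None}
bumped = {"tasks": ["arc_challenge", "hellaswag"], "limit": 400, "num_fewshot": None}
no_key = {"tasks": ["arc_challenge", "hellaswag"], "limit": 200}

print(run_meta_matches(cached, same))    # task order is ignored
print(run_meta_matches(cached, bumped))  # a changed limit forces a fresh baseline
print(run_meta_matches(cached, no_key))  # missing key defaults to 0, not None
```

One detail worth noting: `cfg.get("num_fewshot", 0)` only falls back to `0` when the key is absent. A config with an explicit `num_fewshot: null` yields `None`, so it matches a cached `None` but not a cached `0`.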
skills/quiz-maker/SKILL.md ADDED
@@ -0,0 +1,29 @@
1
+ ---
2
+ name: quiz-maker
3
+ description: Create a multiple-choice quiz from a topic and grade level
4
+ task: education
5
+ tools:
6
+ - create_quiz
7
+ model_hints:
8
+ - minicpm5-1b
9
+ ---
10
+
11
+ # Quiz maker
12
+
13
+ Generate a printable multiple-choice quiz (worksheet + answer key) for a topic and grade level.
14
+
15
+ ## Workflow
16
+
17
+ 1. Gather optional source context (web URLs, uploaded files, or session RAG).
18
+ 2. Produce a `QuizOutline` JSON object with 3–12 questions (typically 5–10).
19
+ 3. Export DOCX (student worksheet + answer key page) and HTML preview via `create_quiz`.
20
+
21
+ ## Output rules
22
+
23
+ - Each question has exactly **4** choices labeled A–D.
24
+ - Exactly one correct answer per question (`correct_index` 0–3).
25
+ - Include a short explanation for each answer (teacher reference).
26
+ - Grade-appropriate vocabulary and distractors.
27
+ - Ground content in provided sources when available.
28
+
29
+ See `references/mcq-format.md` for MCQ structure details.
skills/quiz-maker/references/mcq-format.md ADDED
@@ -0,0 +1,22 @@
1
+ # Multiple-choice format
2
+
3
+ ## Question structure
4
+
5
+ Each question must include:
6
+
7
+ - `prompt`: clear stem (one sentence or short paragraph)
8
+ - `choices`: array of exactly **4** strings (no A/B/C/D prefixes in JSON)
9
+ - `correct_index`: integer 0–3 (index into `choices`)
10
+ - `explanation`: one or two sentences for teachers (why the answer is correct)
11
+
12
+ ## Distractors
13
+
14
+ - All four options should be plausible at the target grade level.
15
+ - Avoid "all of the above" / "none of the above" unless topic-specific.
16
+ - Keep choice length similar within a question.
17
+
18
+ ## Quiz outline
19
+
20
+ - `title`: quiz title (include topic and grade when helpful)
21
+ - `instructions`: student-facing directions (e.g. "Circle the best answer.")
22
+ - `questions`: 3–12 items; default request is 5–10
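The format rules above (exactly 4 choices, `correct_index` 0–3, 3–12 questions) can be checked mechanically before export. The following is a minimal sketch that works on a plain dictionary rather than the repository's `QuizOutline` model. The function name `validate_quiz_outline` and the sample data are hypothetical and not part of this commit:

```python
from typing import Any


def validate_quiz_outline(outline: dict[str, Any]) -> list[str]:
    """Return a list of rule violations; an empty list means the outline passes."""
    problems: list[str] = []
    questions = outline.get("questions") or []
    if not 3 <= len(questions) <= 12:
        problems.append(f"expected 3-12 questions, got {len(questions)}")
    for i, q in enumerate(questions, start=1):
        choices = q.get("choices") or []
        if len(choices) != 4:
            problems.append(f"question {i}: expected 4 choices, got {len(choices)}")
        idx = q.get("correct_index")
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(idx, int) or isinstance(idx, bool) or not 0 <= idx <= 3:
            problems.append(f"question {i}: correct_index must be an integer 0-3")
        if not str(q.get("prompt") or "").strip():
            problems.append(f"question {i}: prompt is empty")
    return problems


# Made-up outline: the third question breaks two rules on purpose.
sample = {
    "title": "Sample quiz",
    "instructions": "Circle the best answer.",
    "questions": [
        {"prompt": "2+2?", "choices": ["4", "3", "5", "6"], "correct_index": 0},
        {"prompt": "3+3?", "choices": ["6", "5", "7", "8"], "correct_index": 0},
        {"prompt": "1+1?", "choices": ["2", "1", "3"], "correct_index": 4},
    ],
}
for problem in validate_quiz_outline(sample):
    print(problem)
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem at once, which may be useful when the outline comes from a small model that tends to break several rules in one response.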