GitHub Actions committed on
Commit
7362ed8
·
0 Parent(s):

Deploy from 7a84bfa

Files changed (11)
  1. README.md +98 -0
  2. app.py +1232 -0
  3. director.py +86 -0
  4. examples/brand_brief.txt +1 -0
  5. examples/demo_product.png +0 -0
  6. frames.py +59 -0
  7. model_runtime.py +138 -0
  8. requirements.txt +3 -0
  9. schemas.py +120 -0
  10. test_app_e2e.py +69 -0
  11. test_smoke.py +47 -0
README.md ADDED
@@ -0,0 +1,98 @@
+ ---
+ title: ShotCraft
+ emoji: 🎬
+ colorFrom: purple
+ colorTo: pink
+ sdk: gradio
+ app_file: app.py
+ license: apache-2.0
+ short_description: AI art director for e-commerce product shots
+ tags:
+ - backyard-ai
+ - modal
+ ---
+
+ # 🎬 ShotCraft - AI Shot Director
+
+ Upload a product photo → an 8B vision model designs 5 marketing shot
+ concepts grounded in your *actual* product → a 12B image model renders
+ them as production-ready stills.
+
+ **Build Small Hackathon 2026** | Models: [MiniCPM-V-2_6](https://huggingface.co/openbmb/MiniCPM-V-2_6) (8B) + [FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell) (12B) - both ≤32B ✅
+
+ ## Pipeline
+ 1. Upload product photo + brand description + category + style preset
+ 2. MiniCPM-V-2_6 analyzes the photo → 5 shot concepts (scene, camera, lighting, palette, props, marketing angle, optimized prompt)
+ 3. Review & edit concepts
+ 4. FLUX.1-schnell renders 5 stills → pick hero frames → export ZIP
+
+ Team: Pawel (Stage 1) · Rafal (Stage 2) · Stefan (integration)
+
+ ## Architecture - powered by Modal ⚡
+
+ This Space is the Gradio interface; **all model inference runs on
+ [Modal](https://modal.com)** (`modal_backend/shotcraft_inference.py` in the
+ [repo](https://github.com/rafalbog/nice-day-pink-panther)):
+
+ | Stage | Model | Modal GPU |
+ |---|---|---|
+ | 1 - Shot Director | MiniCPM-V-2_6 (8B, bf16) | A10G |
+ | 2 - Frame Generator | FLUX.1-schnell (12B, bf16, 4-step) | L40S |
+
+ - One HTTP call per stage; the whole 5-frame reel renders in a single
+   batched call, per-frame regen is a 1-frame call (seeded, FR-2.3/2.4).
+ - Weights cached on a Modal Volume; containers stay warm 5 min between calls.
+ - The Space needs `SHOTCRAFT_API_URL` → the Modal endpoint. The deploy
+   workflow can set it automatically from `SHOTCRAFT_MODAL_WORKSPACE`,
+   `SHOTCRAFT_MODAL_APP`, and `SHOTCRAFT_MODAL_FUNCTION`.
+
+ ## Dev notes
+ - **Backend deploy:** `modal deploy modal_backend/shotcraft_inference.py`
+   (or `modal serve` for a dev loop). Health check: `GET /health`.
+ - **Tests:** `python test_smoke.py` (pure logic, no backend) and
+   `python test_app_e2e.py` (real end-to-end against the deployed backend -
+   spends GPU time).
+ - **Plan:** see `docs/shotcraft-IMPLEMENTATION_PLAN.md` for the slice-by-slice build order.
+
+ ## Deployment configuration
+
+ Keep personal tokens in GitHub Secrets, not in code. This shared repo deploys
+ one shared Hugging Face Space, but can point that Space at Rafal's or Pawel's
+ Modal backend. Manual workflow runs ask for `deployment_profile` (`rafal` or
+ `pawel`). Pushes to `main` still deploy the default `rafal` profile.
+
+ Rafal uses the existing GitHub Secrets:
+ - `HF_TOKEN` - Hugging Face write token for the target Space.
+ - `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET` - Modal token for the target workspace.
+
+ Pawel uses prefixed GitHub Secrets for Modal only:
+ - `PAWEL_MODAL_TOKEN_ID`
+ - `PAWEL_MODAL_TOKEN_SECRET`
+
+ Rafal/default GitHub Variables:
+ - `HF_USERNAME` - username used for HF git auth. Defaults to `rafalbog`.
+ - `HF_SPACE_REPO` - Space repo id, for example `build-small-hackathon/ai-video-generation`.
+ - `SHOTCRAFT_MODAL_WORKSPACE` - Modal workspace slug. Defaults to `rafalbogusdxc`.
+ - `SHOTCRAFT_MODAL_APP` - Modal app name. Defaults to `shotcraft-inference`.
+ - `SHOTCRAFT_MODAL_FUNCTION` - Modal ASGI function name. Defaults to `api`.
+ - `SHOTCRAFT_API_URL` - full backend URL override. If unset, the workflow builds
+   `https://<workspace>--<app>-<function>.modal.run`.
+ - `SHOTCRAFT_HF_SECRET_NAME` - Modal secret that contains the HF token for model
+   downloads. Defaults to `huggingface-secret`.
+ - `SHOTCRAFT_HF_CACHE_VOLUME` - Modal volume for cached weights. Defaults to
+   `shotcraft-hf-cache`.
+
+ Pawel-specific GitHub Variables use the same names with a `PAWEL_` prefix:
+ - `PAWEL_SHOTCRAFT_MODAL_WORKSPACE`
+ - optional: `PAWEL_SHOTCRAFT_API_URL`
+ - optional: `PAWEL_SHOTCRAFT_MODAL_APP`
+ - optional: `PAWEL_SHOTCRAFT_MODAL_FUNCTION`
+ - optional: `PAWEL_SHOTCRAFT_HF_SECRET_NAME`
+ - optional: `PAWEL_SHOTCRAFT_HF_CACHE_VOLUME`
+
+ For Rafal, the existing repository-level secrets/variables can stay as-is. For
+ Pawel, add the prefixed Modal secrets/variables under the same repository
+ Actions settings page, then run either deploy workflow manually and choose
+ `deployment_profile=pawel`. The frontend workflow still uses the existing
+ `HF_TOKEN` to update the shared Space at
+ `build-small-hackathon/ai-video-generation`.
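
Reviewer sketch (not part of the commit): the README's backend URL rule, an explicit `SHOTCRAFT_API_URL` override or else a URL built from workspace, app, and function, might look roughly like the Python below. Only the variable names, the documented defaults, and the URL pattern come from the README. The function name `resolve_api_url`, the `variables` mapping, and the choice to fall back to the documented defaults for unset optional `PAWEL_` variables are assumptions made for illustration. The real deploy workflow may behave differently.

```python
from __future__ import annotations

# Defaults quoted from the README's "Rafal/default GitHub Variables" list.
DOCUMENTED_DEFAULTS = {
    "SHOTCRAFT_MODAL_WORKSPACE": "rafalbogusdxc",
    "SHOTCRAFT_MODAL_APP": "shotcraft-inference",
    "SHOTCRAFT_MODAL_FUNCTION": "api",
}


def resolve_api_url(variables: dict[str, str], profile: str = "rafal") -> str:
    """Hypothetical helper: pick the backend URL for a deployment profile."""
    if profile not in ("rafal", "pawel"):
        raise ValueError(f"unknown deployment_profile: {profile!r}")
    prefix = "PAWEL_" if profile == "pawel" else ""

    # A full URL override wins over the constructed URL.
    override = variables.get(prefix + "SHOTCRAFT_API_URL", "").strip()
    if override:
        return override.rstrip("/")

    parts = {}
    for name, default in DOCUMENTED_DEFAULTS.items():
        value = variables.get(prefix + name, "").strip()
        if not value:
            # Assumption: the README lists the Pawel workspace without
            # "optional", so this sketch treats it as required.
            if profile == "pawel" and name == "SHOTCRAFT_MODAL_WORKSPACE":
                raise ValueError("PAWEL_SHOTCRAFT_MODAL_WORKSPACE must be set")
            value = default
        parts[name] = value

    # Pattern from the README: https://<workspace>--<app>-<function>.modal.run
    return "https://{}--{}-{}.modal.run".format(
        parts["SHOTCRAFT_MODAL_WORKSPACE"],
        parts["SHOTCRAFT_MODAL_APP"],
        parts["SHOTCRAFT_MODAL_FUNCTION"],
    )
```

Called with an empty mapping and the default profile, the sketch builds the URL from the three documented defaults. Called with `profile="pawel"`, it reads only the `PAWEL_`-prefixed names.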
app.py ADDED
@@ -0,0 +1,1232 @@
+ """ShotCraft - AI Shot Director for e-commerce.
+ HF x Gradio Build Small Hackathon 2026.
+ Stage 1: MiniCPM-V-2_6 (8B) | Stage 2: FLUX.1-schnell (12B)
+ """
+ from __future__ import annotations
+ import io, json, os, tempfile, zipfile
+ import gradio as gr
+ from PIL import Image
+ from schemas import CATEGORIES, STYLE_PRESETS
+ from director import generate_concepts
+ from frames import generate_frame, generate_frames, seed_for
+ from model_runtime import API_URL, BackendError, health
+
+ EXAMPLES_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), "examples")
+
+ # Force Gradio's light palette so stock widgets match the light design system.
+ # Plain <script> in head= runs before the app mounts (launch(js=...) proved unreliable).
+ FORCE_THEME_HEAD = """
+ <script>
+ (function () {
+   var url = new URL(window.location.href);
+   if (url.searchParams.get('__theme') !== 'light') {
+     url.searchParams.set('__theme', 'light');
+     window.location.replace(url.href);
+   }
+ })();
+ </script>
+ <script>
+ // Stage rail = primary navigation. The tab pill bar is parked offscreen
+ // (NOT display:none - Gradio 6 builds tab buttons from measured width, so a
+ // zero-width tablist renders no buttons at all). Rail clicks forward to the
+ // real tab buttons; delegated so it survives rail re-renders.
+ document.addEventListener('click', function (e) {
+   var card = e.target.closest('.stage-card');
+   if (!card || !card.dataset.target) return;
+   var map = { direct: 0, render: 1, export: 1 };
+   var idx = map[card.dataset.target];
+   if (idx === undefined) return;
+   var tabs = document.querySelectorAll('#shotcraft-tabs [role="tab"]');
+   if (tabs[idx]) tabs[idx].click();
+ });
+ document.addEventListener('keydown', function (e) {
+   if (e.key !== 'Enter' && e.key !== ' ') return;
+   var card = e.target.closest && e.target.closest('.stage-card');
+   if (card) { e.preventDefault(); card.click(); }
+ });
+ </script>
+ """
+
+ # Native theme base: lets Gradio's own variables style internals custom CSS
+ # can't reach (sliders, checkboxes, focus rings) and loads Inter without a
+ # CSS @import (no flash of unstyled text). cyan-600 #0891b2 is the single
+ # accent - closest Tailwind hue to the original brand teal #0fa3b1.
+ THEME = gr.themes.Base(
+     primary_hue="cyan",
+     secondary_hue="cyan",
+     neutral_hue="slate",
+     radius_size="md",
+     font=gr.themes.GoogleFont("Inter"),
+     font_mono=gr.themes.GoogleFont("JetBrains Mono"),
+ ).set(
+     button_primary_background_fill="*primary_600",
+     button_primary_background_fill_hover="*primary_500",
+     button_primary_text_color="white",
+     slider_color="*primary_600",
+ )
+
+ # ShotCraft design system - premium light theme, single cyan accent.
+ # 1px borders for separation, layered soft shadows, Inter with tight heading
+ # tracking, 200ms ease-out micro-interactions, skeleton shimmer on pending
+ # outputs, visible dual-ring focus. Lines marked KEEP are load-bearing
+ # Gradio fixes - do not remove.
+ SHOTCRAFT_CSS = """
74
+ :root {
75
+ --sc-bg: #f8fafc;
76
+ --sc-card: #ffffff;
77
+ --sc-ink: #0f172a;
78
+ --sc-body: #475569;
79
+ --sc-muted: #94a3b8;
80
+ --sc-border: rgba(15, 23, 42, 0.08);
81
+ --sc-border-strong: rgba(15, 23, 42, 0.16);
82
+ --sc-accent: #0891b2;
83
+ --sc-accent-strong: #0e7490;
84
+ --sc-accent-soft: rgba(8, 145, 178, 0.10);
85
+ --sc-accent-ring: rgba(8, 145, 178, 0.22);
86
+ --sc-danger: #dc2626;
87
+ --sc-shadow-sm: 0 1px 2px rgba(2, 8, 23, 0.05);
88
+ --sc-shadow-md: 0 1px 2px rgba(2, 8, 23, 0.05), 0 10px 30px rgba(2, 8, 23, 0.07);
89
+ --sc-shadow-lg: 0 2px 4px rgba(2, 8, 23, 0.06), 0 18px 50px rgba(2, 8, 23, 0.10);
90
+ --sc-ease: cubic-bezier(0.22, 1, 0.36, 1);
91
+ }
92
+
93
+ /* KEEP: prevent horizontal scroll on small screens */
94
+ /* clip, not hidden: overflow-x:hidden computes overflow-y to auto, turning
95
+ body into a non-scrolling scroll container that captures position:sticky. */
96
+ html, body {
97
+ overflow-x: clip !important;
98
+ }
99
+
100
+ body,
101
+ .gradio-container {
102
+ background:
103
+ radial-gradient(1100px 480px at 85% -8%, rgba(8, 145, 178, 0.07), transparent 60%),
104
+ linear-gradient(180deg, #fbfcfe 0%, var(--sc-bg) 100%) !important;
105
+ color: var(--sc-body) !important;
106
+ font-family: 'Inter', ui-sans-serif, system-ui, sans-serif !important;
107
+ }
108
+
109
+ .gradio-container {
110
+ max-width: 1180px !important;
111
+ margin: 0 auto !important;
112
+ padding: 28px 22px 48px !important;
113
+ }
114
+
115
+ /* KEEP: let the flex/grid chain shrink below content min-width on small screens */
116
+ .gradio-container, .gradio-container .main, .gradio-container .app,
117
+ .gradio-container .contain, .gradio-container .column, .gradio-container .row,
118
+ .gradio-container .block, .gradio-container .form,
119
+ .gradio-container .tab-container, .gradio-container [role="tablist"] {
120
+ min-width: 0 !important;
121
+ max-width: 100% !important;
122
+ }
123
+
124
+ .prose h1, .prose h2, .prose h3 {
125
+ color: var(--sc-ink) !important;
126
+ letter-spacing: -0.02em;
127
+ font-weight: 700;
128
+ }
129
+
130
+ /* ---------- Interaction primitives ---------- */
131
+ button, [role="tab"], .thumbnail-item, .label-wrap, tbody tr {
132
+ cursor: pointer !important;
133
+ }
134
+
135
+ button {
136
+ transition: transform 150ms var(--sc-ease), box-shadow 200ms var(--sc-ease),
137
+ background 200ms var(--sc-ease), border-color 200ms var(--sc-ease),
138
+ color 200ms var(--sc-ease) !important;
139
+ }
140
+
141
+ button:active {
142
+ transform: scale(0.98);
143
+ }
144
+
145
+ button:focus-visible, [role="tab"]:focus-visible {
146
+ outline: none !important;
147
+ box-shadow: 0 0 0 2px #ffffff, 0 0 0 4px var(--sc-accent) !important;
148
+ }
149
+
150
+ input[type="checkbox"], input[type="radio"] {
151
+ accent-color: var(--sc-accent);
152
+ }
153
+
154
+ /* ---------- Hero ---------- */
155
+ .shotcraft-hero {
156
+ position: relative;
157
+ overflow: hidden;
158
+ padding: 36px 36px 30px;
159
+ border: 1px solid var(--sc-border);
160
+ border-radius: 18px;
161
+ background: rgba(255, 255, 255, 0.82);
162
+ backdrop-filter: blur(12px); /* safe here: hero contains no dropdowns */
163
+ box-shadow: var(--sc-shadow-md);
164
+ }
165
+
166
+ .shotcraft-hero::before {
167
+ content: "";
168
+ position: absolute;
169
+ top: 0; left: 0; right: 0;
170
+ height: 3px;
171
+ background: linear-gradient(90deg, var(--sc-accent), rgba(8, 145, 178, 0.25));
172
+ }
173
+
174
+ .shotcraft-kicker {
175
+ color: var(--sc-accent);
176
+ font-size: 11px;
177
+ font-weight: 600;
178
+ letter-spacing: 0.14em;
179
+ text-transform: uppercase;
180
+ margin-bottom: 14px;
181
+ }
182
+
183
+ .shotcraft-hero h1 {
184
+ max-width: 720px;
185
+ margin: 0;
186
+ color: var(--sc-ink);
187
+ font-weight: 700;
188
+ font-size: clamp(30px, 4.4vw, 46px);
189
+ line-height: 1.08;
190
+ letter-spacing: -0.03em;
191
+ }
192
+
193
+ .shotcraft-hero p {
194
+ max-width: 640px;
195
+ margin: 14px 0 22px;
196
+ color: var(--sc-body);
197
+ font-size: 15px;
198
+ line-height: 1.65;
199
+ }
200
+
201
+ .hero-chips {
202
+ display: flex;
203
+ flex-wrap: wrap;
204
+ gap: 8px;
205
+ }
206
+
207
+ .hero-chip {
208
+ display: inline-flex;
209
+ align-items: center;
210
+ gap: 6px;
211
+ padding: 6px 13px;
212
+ border: 1px solid var(--sc-border);
213
+ border-radius: 999px;
214
+ background: #ffffff;
215
+ color: var(--sc-body);
216
+ font-size: 12px;
217
+ font-weight: 500;
218
+ letter-spacing: 0.01em;
219
+ transition: border-color 200ms var(--sc-ease), box-shadow 200ms var(--sc-ease);
220
+ }
221
+
222
+ .hero-chip:hover {
223
+ border-color: var(--sc-accent-ring);
224
+ box-shadow: var(--sc-shadow-sm);
225
+ }
226
+
227
+ .hero-chip b {
228
+ color: var(--sc-accent);
229
+ font-weight: 700;
230
+ }
231
+
232
+ /* ---------- Status bar ---------- */
233
+ .shotcraft-status {
234
+ display: flex;
235
+ align-items: center;
236
+ gap: 10px;
237
+ margin: 14px 0 16px;
238
+ padding: 10px 14px;
239
+ border-radius: 10px;
240
+ font-size: 12px;
241
+ font-weight: 600;
242
+ letter-spacing: 0.02em;
243
+ }
244
+
245
+ .shotcraft-status.ok {
246
+ border: 1px solid var(--sc-accent-ring);
247
+ background: var(--sc-accent-soft);
248
+ color: var(--sc-accent-strong);
249
+ }
250
+
251
+ .shotcraft-status.fail {
252
+ border: 1px solid rgba(220, 38, 38, 0.30);
253
+ background: rgba(220, 38, 38, 0.07);
254
+ color: var(--sc-danger);
255
+ }
256
+
257
+ .shotcraft-status.checking {
258
+ border: 1px dashed var(--sc-border-strong);
259
+ background: #ffffff;
260
+ color: var(--sc-muted);
261
+ }
262
+
263
+ .status-dot {
264
+ width: 8px;
265
+ height: 8px;
266
+ border-radius: 50%;
267
+ background: currentColor;
268
+ animation: dotPulse 1.8s var(--sc-ease) infinite;
269
+ }
270
+
271
+ @keyframes dotPulse {
272
+ 0%, 100% { opacity: 1; }
273
+ 50% { opacity: 0.45; }
274
+ }
275
+
276
+ /* ---------- Stage rail ---------- */
277
+ .stage-rail {
278
+ display: grid;
279
+ grid-template-columns: repeat(3, 1fr);
280
+ gap: 12px;
281
+ margin: 16px 0 20px;
282
+ }
283
+
284
+ .stage-card {
285
+ position: relative;
286
+ overflow: hidden;
287
+ padding: 16px 18px 14px;
288
+ border: 1px solid var(--sc-border);
289
+ border-radius: 14px;
290
+ background: var(--sc-card);
291
+ box-shadow: var(--sc-shadow-sm);
292
+ transition: transform 200ms var(--sc-ease), box-shadow 200ms var(--sc-ease),
293
+ border-color 200ms var(--sc-ease), opacity 200ms var(--sc-ease);
294
+ }
295
+
296
+ .stage-card::before {
297
+ content: "";
298
+ position: absolute;
299
+ top: 0; left: 0; right: 0;
300
+ height: 3px;
301
+ background: transparent;
302
+ transition: background 200ms var(--sc-ease);
303
+ }
304
+
305
+ .stage-card.pending { opacity: 0.55; }
306
+
307
+ .stage-card.done::before { background: var(--sc-accent-ring); }
308
+ .stage-card.done .stage-label { color: var(--sc-accent-strong); }
309
+
310
+ .stage-card.active {
311
+ transform: translateY(-2px);
312
+ border-color: var(--sc-accent-ring);
313
+ box-shadow: var(--sc-shadow-md), 0 8px 24px rgba(8, 145, 178, 0.12);
314
+ opacity: 1;
315
+ }
316
+
317
+ .stage-card.active::before { background: var(--sc-accent); }
318
+ .stage-card.active .stage-label { color: var(--sc-accent); }
319
+
320
+ .stage-label {
321
+ display: block;
322
+ color: var(--sc-muted);
323
+ font-size: 10px;
324
+ font-weight: 600;
325
+ letter-spacing: 0.12em;
326
+ text-transform: uppercase;
327
+ }
328
+
329
+ .stage-card strong {
330
+ display: block;
331
+ margin: 7px 0 3px;
332
+ color: var(--sc-ink);
333
+ font-size: 16px;
334
+ font-weight: 700;
335
+ letter-spacing: -0.01em;
336
+ }
337
+
338
+ .stage-card span:last-child {
339
+ color: var(--sc-muted);
340
+ font-size: 13px;
341
+ line-height: 1.5;
342
+ }
343
+
344
+ /* ---------- Tabs: segmented control ---------- */
345
+ #shotcraft-tabs [role="tablist"] {
346
+ gap: 4px !important;
347
+ border: 1px solid var(--sc-border) !important;
348
+ border-radius: 12px !important;
349
+ background: #eef2f6 !important;
350
+ padding: 4px !important;
351
+ box-shadow: none !important;
352
+ }
353
+
354
+ #shotcraft-tabs button[role="tab"] {
355
+ border-radius: 9px !important;
356
+ color: var(--sc-muted) !important;
357
+ font-weight: 600 !important;
358
+ font-size: 13px !important;
359
+ letter-spacing: 0 !important;
360
+ border: 1px solid transparent !important;
361
+ transition: all 200ms var(--sc-ease) !important;
362
+ }
363
+
364
+ #shotcraft-tabs button[role="tab"]:hover {
365
+ color: var(--sc-ink) !important;
366
+ background: rgba(255, 255, 255, 0.6) !important;
367
+ }
368
+
369
+ #shotcraft-tabs button[role="tab"][aria-selected="true"] {
370
+ background: #ffffff !important;
371
+ color: var(--sc-ink) !important;
372
+ border-color: var(--sc-border) !important;
373
+ box-shadow: var(--sc-shadow-sm) !important;
374
+ }
375
+
376
+ .stage-tab {
377
+ /* KEEP opacity-only: transform/filter (even identity, kept by fill-mode)
378
+ make the tab a containing block for fixed elements -> dropdown menus
379
+ misplaced. */
380
+ animation: tabReveal 320ms var(--sc-ease);
381
+ }
382
+
383
+ @keyframes tabReveal {
384
+ from { opacity: 0; }
385
+ to { opacity: 1; }
386
+ }
387
+
388
+ /* ---------- Panels & cards ---------- */
389
+ .glass-panel,
390
+ .concept-card,
391
+ .render-panel {
392
+ border: 1px solid var(--sc-border) !important;
393
+ border-radius: 16px !important;
394
+ background: var(--sc-card) !important;
395
+ box-shadow: var(--sc-shadow-md) !important;
396
+ /* KEEP: NO backdrop-filter here - it turns the panel into a containing
397
+ block for position:fixed children, which teleports Gradio dropdown
398
+ menus. */
399
+ }
400
+
401
+ .glass-panel, .render-panel {
402
+ padding: 20px !important;
403
+ }
404
+
405
+ .concept-card {
406
+ margin-bottom: 10px !important;
407
+ box-shadow: var(--sc-shadow-sm) !important;
408
+ transition: transform 200ms var(--sc-ease), border-color 200ms var(--sc-ease),
409
+ box-shadow 200ms var(--sc-ease);
410
+ }
411
+
412
+ .concept-card:hover {
413
+ transform: translateY(-1px);
414
+ border-color: var(--sc-border-strong) !important;
415
+ box-shadow: var(--sc-shadow-md) !important;
416
+ }
417
+
418
+ .concept-card > div:first-child {
419
+ border-radius: 14px !important;
420
+ color: var(--sc-ink) !important;
421
+ font-weight: 600 !important;
422
+ }
423
+
424
+ .analysis-card {
425
+ border: 1px solid var(--sc-accent-ring) !important;
426
+ border-left: 3px solid var(--sc-accent) !important;
427
+ border-radius: 12px !important;
428
+ background: var(--sc-accent-soft) !important;
429
+ padding: 14px 16px !important;
430
+ color: var(--sc-body) !important;
431
+ }
432
+
433
+ /* KEEP: Gradio group internals - no gray block backgrounds inside cards */
434
+ .glass-panel .block, .glass-panel .form, .glass-panel .styler,
435
+ .render-panel .block, .render-panel .form, .render-panel .styler,
436
+ .concept-card .block, .concept-card .form, .concept-card .styler {
437
+ background: transparent !important;
438
+ }
439
+
440
+ /* ---------- Forms ---------- */
441
+ label,
442
+ .block label,
443
+ .form label {
444
+ color: var(--sc-body) !important;
445
+ font-size: 12px !important;
446
+ font-weight: 600 !important;
447
+ letter-spacing: 0 !important;
448
+ text-transform: none !important;
449
+ }
450
+
451
+ textarea,
452
+ input,
453
+ select {
454
+ border-color: rgba(15, 23, 42, 0.12) !important;
455
+ border-radius: 8px !important;
456
+ background: #ffffff !important;
457
+ color: var(--sc-ink) !important;
458
+ transition: border-color 150ms var(--sc-ease), box-shadow 150ms var(--sc-ease) !important;
459
+ }
460
+
461
+ textarea:focus,
462
+ input:focus {
463
+ border-color: var(--sc-accent) !important;
464
+ box-shadow: 0 0 0 3px var(--sc-accent-ring) !important;
465
+ }
466
+
467
+ /* KEEP scoped to ul.options only - [role="listbox"] also matches the dropdown
468
+ INPUT wrap and painted a stray pill border inside the field. */
469
+ ul.options {
470
+ z-index: 9999 !important;
471
+ border: 1px solid var(--sc-border) !important;
472
+ border-radius: 10px !important;
473
+ background: #ffffff !important;
474
+ box-shadow: var(--sc-shadow-lg) !important;
475
+ }
476
+
477
+ ul.options [role="option"], ul.options li {
478
+ background: transparent !important;
479
+ color: var(--sc-ink) !important;
480
+ }
481
+
482
+ ul.options [role="option"]:hover, ul.options li:hover,
483
+ ul.options [role="option"][aria-selected="true"], ul.options li.selected {
484
+ background: var(--sc-accent-soft) !important;
485
+ color: var(--sc-accent-strong) !important;
486
+ }
487
+
488
+ /* Examples table */
489
+ .table-wrap, table, thead, tbody, tr, th, td {
490
+ background: #ffffff !important;
491
+ color: var(--sc-body) !important;
492
+ border-color: var(--sc-border) !important;
493
+ }
494
+
495
+ tbody tr { transition: background 150ms var(--sc-ease) !important; }
496
+ tbody tr:hover, tbody tr:hover td { background: var(--sc-accent-soft) !important; }
497
+
498
+ /* ---------- Buttons ---------- */
499
+ button.primary,
500
+ .shotcraft-primary button,
501
+ button.shotcraft-primary {
502
+ border: 0 !important;
503
+ border-radius: 10px !important;
504
+ background: linear-gradient(180deg, #0aa3c4, var(--sc-accent)) !important;
505
+ color: #ffffff !important;
506
+ font-weight: 600 !important;
507
+ letter-spacing: 0 !important;
508
+ box-shadow: var(--sc-shadow-sm), inset 0 1px 0 rgba(255, 255, 255, 0.18) !important;
509
+ }
510
+
511
+ button.primary:hover,
512
+ .shotcraft-primary button:hover,
513
+ button.shotcraft-primary:hover {
514
+ transform: translateY(-1px);
515
+ background: linear-gradient(180deg, #0db0d2, #0a9cc0) !important;
516
+ box-shadow: 0 8px 22px rgba(8, 145, 178, 0.28), inset 0 1px 0 rgba(255, 255, 255, 0.18) !important;
517
+ }
518
+
519
+ button.primary:active,
520
+ .shotcraft-primary button:active,
521
+ button.shotcraft-primary:active {
522
+ transform: translateY(0) scale(0.98);
523
+ }
524
+
525
+ .shotcraft-secondary button,
526
+ button.shotcraft-secondary {
527
+ border: 1px solid rgba(15, 23, 42, 0.14) !important;
528
+ border-radius: 10px !important;
529
+ background: #ffffff !important;
530
+ color: var(--sc-ink) !important;
531
+ font-weight: 600 !important;
532
+ box-shadow: var(--sc-shadow-sm) !important;
533
+ }
534
+
535
+ .shotcraft-secondary button:hover,
536
+ button.shotcraft-secondary:hover {
537
+ transform: translateY(-1px);
538
+ border-color: var(--sc-accent-ring) !important;
539
+ background: #fbfdfe !important;
540
+ color: var(--sc-accent-strong) !important;
541
+ }
542
+
543
+ .continue-cta {
544
+ margin-top: 16px !important;
545
+ }
546
+
547
+ .continue-cta button,
548
+ button.continue-cta {
549
+ width: 100% !important;
550
+ font-size: 15px !important;
551
+ padding: 13px 22px !important;
552
+ animation: ctaGlow 2.4s var(--sc-ease) infinite;
553
+ }
554
+
555
+ @keyframes ctaGlow {
556
+ 0%, 100% { box-shadow: 0 4px 14px rgba(8, 145, 178, 0.18); }
557
+ 50% { box-shadow: 0 6px 26px rgba(8, 145, 178, 0.38); }
558
+ }
559
+
560
+ /* ---------- Gallery & media ---------- */
561
+ .shotcraft-gallery, .source-preview {
562
+ border-radius: 14px !important;
563
+ background: #ffffff !important;
564
+ }
565
+
566
+ .shotcraft-gallery .thumbnail-item {
567
+ border-radius: 10px !important;
568
+ transition: transform 200ms var(--sc-ease), box-shadow 200ms var(--sc-ease) !important;
569
+ }
570
+
571
+ .shotcraft-gallery .thumbnail-item:hover {
572
+ transform: translateY(-2px);
573
+ box-shadow: var(--sc-shadow-md) !important;
574
+ }
575
+
576
+ .shotcraft-gallery .thumbnail-item.selected {
577
+ outline: 2px solid var(--sc-accent) !important;
578
+ outline-offset: 2px;
579
+ }
580
+
581
+ /* Empty states: dashed, quiet - never a bare gray void */
582
+ .gradio-container .empty {
583
+ border: 1.5px dashed rgba(15, 23, 42, 0.14) !important;
584
+ border-radius: 12px !important;
585
+ background: rgba(248, 250, 252, 0.7) !important;
586
+ color: var(--sc-muted) !important;
587
+ }
588
+
589
+ /* ---------- Pending state: skeleton shimmer + status pill ----------
590
+ Hooks Gradio's built-in .generating pending class - pure CSS, no JS.
591
+ position:relative is safe for the fixed-position pill (only transform/
592
+ filter/backdrop-filter create containing blocks for fixed children). */
593
+ .generating {
594
+ position: relative !important;
595
+ overflow: hidden !important;
596
+ border-color: var(--sc-accent-ring) !important;
597
+ }
598
+
599
+ .generating::after {
600
+ content: "";
601
+ position: absolute;
602
+ inset: 0;
603
+ z-index: 5;
604
+ pointer-events: none;
605
+ background: linear-gradient(100deg, transparent 30%, rgba(255, 255, 255, 0.55) 50%, transparent 70%);
606
+ background-size: 200% 100%;
607
+ animation: shimmer 1.6s linear infinite;
608
+ }
609
+
610
+ @keyframes shimmer {
611
+ from { background-position: 150% 0; }
612
+ to { background-position: -50% 0; }
613
+ }
614
+
615
+ .generating::before {
616
+ content: "\\25CF Generating";
617
+ position: fixed;
618
+ top: 16px;
619
+ right: 20px;
620
+ z-index: 1000;
621
+ padding: 6px 14px;
622
+ border: 1px solid var(--sc-accent-ring);
623
+ border-radius: 999px;
624
+ background: rgba(255, 255, 255, 0.92);
625
+ color: var(--sc-accent-strong);
626
+ font-size: 12px;
627
+ font-weight: 600;
628
+ letter-spacing: 0.04em;
629
+ box-shadow: var(--sc-shadow-lg);
630
+ animation: dotPulse 1.4s var(--sc-ease) infinite;
631
+ pointer-events: none;
632
+ }
633
+
634
+ /* ---------- Toasts ---------- */
635
+ .toast-body {
636
+ border: 1px solid var(--sc-border) !important;
637
+ border-radius: 12px !important;
638
+ background: rgba(255, 255, 255, 0.94) !important;
639
+ backdrop-filter: blur(10px);
640
+ box-shadow: var(--sc-shadow-lg) !important;
641
+ }
642
+
643
+ .visually-hidden {
644
+ position: absolute !important;
645
+ left: -9999px !important;
646
+ width: 1px !important;
647
+ height: 1px !important;
648
+ overflow: hidden !important;
649
+ }
650
+
651
+ /* ---------- Stage rail = navigation (tab pills hidden) ---------- */
652
+ #shotcraft-tabs [role="tablist"] {
653
+ position: absolute !important;
654
+ left: -9999px !important;
655
+ top: 0 !important;
656
+ width: 600px !important; /* real width: zero-width tablists render no tab buttons */
657
+ pointer-events: none !important;
658
+ }
659
+
660
+ .stage-card {
661
+ cursor: pointer;
662
+ user-select: none;
663
+ }
664
+
665
+ .stage-card:hover {
666
+ transform: translateY(-2px);
667
+ border-color: var(--sc-accent-ring);
668
+ box-shadow: var(--sc-shadow-md);
669
+ }
670
+
671
+ .stage-card:focus-visible {
672
+ outline: 2px solid var(--sc-accent);
673
+ outline-offset: 2px;
674
+ }
675
+
676
+ /* ---------- Sticky continue CTA ---------- */
677
+ /* Gradio sets overflow:hidden on the container, which captures position:sticky
678
+ (a non-scrolling scroll container). overflow:clip clips identically but
679
+ does NOT create a scroll container, so the CTA can stick to the real
680
+ scrollport (Gradio's inner overflow-y:auto wrapper). */
681
+ .gradio-container {
682
+ overflow: clip !important;
683
+ }
684
+
685
+ .continue-cta,
686
+ button.continue-cta {
687
+ position: sticky !important;
688
+ bottom: 14px;
689
+ z-index: 60;
690
+ }
691
+
692
+ /* ---------- Locked product look ---------- */
693
+ .locked-box {
694
+ border-left: 4px solid var(--sc-accent) !important;
695
+ border-radius: 12px !important;
696
+ background: rgba(8, 145, 178, 0.05) !important;
697
+ padding: 4px 10px !important;
698
+ }
699
+
700
+ .locked-box textarea {
701
+ font-weight: 600 !important;
702
+ }
703
+
704
+ /* ---------- Palette chips ---------- */
705
+ .palette-chips {
706
+ display: flex;
707
+ align-items: center;
708
+ gap: 6px;
709
+ padding: 6px 0 2px;
710
+ min-height: 24px;
711
+ }
712
+
713
+ .palette-chip {
714
+ width: 22px;
715
+ height: 22px;
716
+ border-radius: 7px;
717
+ border: 1px solid rgba(15, 23, 42, 0.18);
718
+ box-shadow: inset 0 0 0 2px rgba(255, 255, 255, 0.35);
719
+ display: inline-block;
720
+ }
721
+
722
+ /* Hide the analysis strip until Stage 1 fills it (it's an empty pill before) */
723
+ .analysis-card.hide-container,
724
+ .analysis-card:not(:has(.prose)) {
725
+ display: none !important;
726
+ }
727
+
728
+ .boards-toolbar button {
729
+ padding: 6px 14px !important;
730
+ font-size: 12.5px !important;
731
+ white-space: nowrap !important;
732
+ }
733
+
734
+ /* ---------- Mobile ---------- */
735
+ @media (max-width: 760px) {
736
+ .gradio-container {
737
+ padding: 14px 10px 30px !important;
738
+ }
739
+
740
+ .stage-rail {
741
+ grid-template-columns: 1fr;
742
+ }
743
+
744
+ .shotcraft-hero {
745
+ padding: 22px 18px;
746
+ }
747
+
748
+ .shotcraft-hero h1 {
749
+ font-size: clamp(24px, 8.5vw, 34px);
750
+ overflow-wrap: break-word;
751
+ }
752
+
753
+ .shotcraft-hero p {
754
+ font-size: 14px;
755
+ }
756
+
757
+ .shotcraft-status {
758
+ white-space: normal;
759
+ word-break: break-word;
760
+ }
761
+
762
+ .shotcraft-gallery .grid-container {
763
+ grid-template-columns: repeat(2, minmax(0, 1fr)) !important;
764
+ }
765
+
766
+ .table-wrap {
767
+ overflow-x: auto !important;
768
+ max-width: 100% !important;
769
+ }
770
+
771
+ .prose h3, .prose h2 {
772
+ white-space: normal !important;
773
+ overflow-wrap: break-word;
774
+ }
775
+ }
776
+ """
777
+
778
+
779
+ def _hero_html() -> str:
780
+ return """
781
+ <section class="shotcraft-hero">
782
+ <div class="shotcraft-kicker">ShotCraft / AI campaign console</div>
783
+ <h1>Direct the shot. Render the campaign.</h1>
784
+ <p>
785
+ Upload one product photo, let MiniCPM draft five grounded art-direction
786
+ boards, then move into FLUX rendering with editable prompts and
787
+ deterministic hero-frame control.
788
+ </p>
789
+ <div class="hero-chips">
790
+ <span class="hero-chip"><b>01</b> Vision director</span>
791
+ <span class="hero-chip"><b>02</b> Five shot concepts</span>
792
+ <span class="hero-chip"><b>03</b> FLUX render reel</span>
793
+ <span class="hero-chip"><b>04</b> Export package</span>
794
+ </div>
795
+ </section>
796
+ """
797
+
798
+
799
+ def _backend_status_html(status: dict) -> str:
800
+ if status.get("status") == "ok":
801
+ return (
802
+ '<div class="shotcraft-status ok"><span class="status-dot"></span>'
803
+ f'BACKEND ONLINE / {status.get("stage1", "?")} / '
804
+ f'{status.get("stage2", "?")} / MODAL</div>'
805
+ )
806
+ return (
807
+ '<div class="shotcraft-status fail"><span class="status-dot"></span>'
808
+ f'BACKEND UNREACHABLE / {API_URL} / deploy Modal before generating</div>'
809
+ )
810
+
811
+
812
+ def _status_checking_html() -> str:
813
+ return (
814
+ '<div class="shotcraft-status checking"><span class="status-dot"></span>'
815
+ 'CHECKING BACKEND…</div>'
816
+ )
817
+
818
+
819
+ def _stage_rail(active: str = "direct") -> str:
820
+ steps = [
821
+ ("direct", "01", "Direct", "Analyze product and draft concepts"),
822
+ ("render", "02", "Render", "Generate the five-frame reel"),
823
+ ("export", "03", "Package", "Pick heroes and download assets"),
824
+ ]
825
+ active_index = next((i for i, step in enumerate(steps) if step[0] == active), 0)
826
+ cards = []
827
+ for index, (key, num, title, desc) in enumerate(steps):
828
+ state = "active" if key == active else "done" if index < active_index else "pending"
829
+ cards.append(
830
+ f'<div class="stage-card {state}" data-target="{key}" role="button" '
831
+ f'tabindex="0" title="Go to {title}">'
832
+ f'<span class="stage-label">{num} / {state}</span>'
833
+ f'<strong>{title}</strong>'
834
+ f'<span>{desc}</span>'
835
+ '</div>'
836
+ )
837
+ return '<div class="stage-rail">' + "".join(cards) + "</div>"
838
+
839
+
840
+ def _example_brand() -> str:
841
+ try:
842
+ with open(os.path.join(EXAMPLES_DIR, "brand_brief.txt"), encoding="utf-8") as f:
843
+ return f.read().strip()
844
+ except OSError:
845
+ return ""
846
+
847
+ def run_stage1(image, brand, category, preset):
848
+ if image is None:
849
+ raise gr.Error("Please upload a product photo first.")
850
+ if not brand or not brand.strip():
851
+ raise gr.Error("Please add a short brand description.")
852
+ if getattr(image, "width", 0) * getattr(image, "height", 0) > 36_000_000:
853
+ raise gr.Error("Image too large — please use a photo under ~36 megapixels.")
854
+ try:
855
+ pkg = generate_concepts(image, brand.strip()[:300], category, preset)
856
+ except (RuntimeError, BackendError) as e:
857
+ raise gr.Error(f"Concept generation failed: {e}") from e
858
+ pa = pkg.product_analysis
859
+ analysis = (f"**Detected:** {pa.product_type} | **Materials:** {pa.materials} | "
860
+ f"**Colors:** {', '.join(pa.colors)} | {pa.distinguishing_features}")
861
+ fields = []
862
+ for s in pkg.shots:
863
+ fields.extend([s.concept_name, s.scene, s.camera_angle, s.lighting,
864
+ ", ".join(s.color_palette), s.props, s.marketing_angle,
865
+ s.image_prompt])
866
+ return (pkg, analysis, *fields, gr.update(interactive=True))
867
+
868
+ FIELD_KEYS = ["concept_name", "scene", "camera_angle", "lighting",
869
+ "color_palette", "props", "marketing_angle", "image_prompt"]
870
+
871
+ def apply_edits(pkg, values):
872
+ """FR-1.3 + Q-3: write every edited concept field back into the package."""
873
+ n = len(FIELD_KEYS)
874
+ for i, s in enumerate(pkg.shots):
875
+ chunk = values[i * n:(i + 1) * n]
876
+ if len(chunk) < n:
877
+ break
878
+ for key, val in zip(FIELD_KEYS, chunk):
879
+ if val is None or not str(val).strip():
880
+ continue
881
+ v = str(val).strip()
882
+ if key == "color_palette":
883
+ setattr(s, key, [c.strip() for c in v.split(",") if c.strip()])
884
+ else:
885
+ setattr(s, key, v)
886
+
887
+ def run_stage2(pkg, preset, aspect, *field_values, progress=gr.Progress()):
888
+ if pkg is None:
889
+ raise gr.Error("Generate shot concepts first (Step 1).")
890
+ apply_edits(pkg, field_values) # FR-1.3/FR-2.1 + Q-3: all edits persist
891
+ progress((0, 5), desc="Rendering 5 shots in one GPU lease…")
892
+ frames = generate_frames(pkg.shots, preset, aspect,
893
+ product_desc=getattr(pkg.product_analysis,
894
+ "canonical_description", "")) # ONE backend call
895
+ images = [(img, f"Shot {s.id}: {s.concept_name}")
896
+ for s, img in zip(pkg.shots, frames)]
897
+ regen_state = {s.id: 0 for s in pkg.shots}
898
+ progress((5, 5), desc="Done")
899
+ return images, pkg, regen_state
900
+
901
+
902
+ def _palette_chips(palette_str) -> str:
903
+ """Small color swatches next to the palette field (hex or CSS color names)."""
904
+ tokens = [t.strip() for t in str(palette_str or "").split(",") if t.strip()][:6]
905
+ if not tokens:
906
+ return ""
907
+ spans = "".join(
908
+ f'<span class="palette-chip" title="{t}" style="background:{t}"></span>'
909
+ for t in tokens)
910
+ return f'<div class="palette-chips">{spans}</div>'
911
+
912
+
913
+ def _shot_summary(s) -> str:
914
+ """One-line accordion header so collapsed boards stay scannable."""
915
+ angle = (s.marketing_angle or s.scene or "").strip()
916
+ if len(angle) > 70:
917
+ angle = angle[:67].rstrip() + "…"
918
+ return f"Shot {s.id} — {s.concept_name}" + (f" · {angle}" if angle else "")
919
+
920
+
921
+ def _apply_locked(pkg, locked):
922
+ """Editable locked-look box is the source of truth for the render prefix."""
923
+ if pkg is not None and locked and str(locked).strip():
924
+ pkg.product_analysis.canonical_description = str(locked).strip()
925
+
926
+
927
+ def run_stage1_ui(image, brand, category, preset, progress=gr.Progress()):
928
+ """UI wrapper: preserve run_stage1() test contract while revealing the next-step CTA.
929
+
930
+ Stays on the Direct tab so concepts can be reviewed/edited, then the
931
+ 'Continue to Render →' CTA advances the flow (no manual tab hunting).
932
+ Also mirrors the uploaded product photo into the Render tab, fills the
933
+ editable locked-look box, retitles the concept accordions with one-line
934
+ summaries, and paints the palette swatches.
935
+ """
936
+ progress(0.1, desc="MiniCPM is reading your product…")
937
+ out = run_stage1(image, brand, category, preset)
938
+ progress(1.0, desc="Concepts ready")
939
+ gr.Info("Five shot concepts drafted — review and edit them below.")
940
+ pkg = out[0]
941
+ locked = getattr(pkg.product_analysis, "canonical_description", "")
942
+ acc_updates = [gr.update(label=_shot_summary(s), open=False) for s in pkg.shots]
943
+ chip_updates = [_palette_chips(", ".join(s.color_palette)) for s in pkg.shots]
944
+ return (*out, gr.update(visible=True), image,
945
+ gr.update(value=locked, visible=True), *acc_updates, *chip_updates)
946
+
947
+
948
+ def advance_to_render(reached):
949
+ """CTA handler: smooth-advance the workflow chrome from Direct to Render."""
950
+ new = "export" if reached == "export" else "render"
951
+ return gr.update(selected="render"), _stage_rail(new), new
952
+
953
+
954
+ def run_stage2_ui(pkg, preset, aspect, locked, *field_values, progress=gr.Progress()):
955
+ """UI wrapper: preserve run_stage2() test contract while advancing the stage chrome."""
956
+ _apply_locked(pkg, locked)
957
+ images, updated_pkg, regen_state = run_stage2(
958
+ pkg, preset, aspect, *field_values, progress=progress
959
+ )
960
+ gr.Info("Reel rendered — click any frame to inspect its concept.")
961
+ return images, updated_pkg, regen_state, _stage_rail("export"), "export"
962
+
963
+
964
+ def sync_stage_rail(reached, evt: gr.SelectData):
965
+ """Keep the stage rail honest when the user clicks tabs directly."""
966
+ active = "direct" if evt.index == 0 else ("export" if reached == "export" else "render")
967
+ return _stage_rail(active)
968
+
969
+
970
+ def regen_one(pkg, preset, aspect, shot_id, gallery, regen_state):
971
+ if pkg is None or not gallery:
972
+ raise gr.Error("Nothing to regenerate yet.")
973
+ sid = int(shot_id)
974
+ shot = next((s for s in pkg.shots if s.id == sid), None)
975
+ if shot is None:
976
+ raise gr.Error(f"Shot {sid} not found.")
977
+ regen_state = dict(regen_state or {})
978
+ regen_state[sid] = regen_state.get(sid, 0) + 1
979
+ img = generate_frame(shot, preset, aspect, regen_counter=regen_state[sid],
980
+ product_desc=getattr(pkg.product_analysis,
981
+ "canonical_description", ""))
982
+ gallery = list(gallery)
983
+ gallery[sid - 1] = (img, f"Shot {sid}: {shot.concept_name}")
984
+ return gallery, regen_state
985
+
986
+
987
+ def regen_one_ui(pkg, preset, aspect, locked, shot_id, gallery, regen_state):
988
+ """UI wrapper: preserve regen_one() test contract, add toast feedback."""
989
+ _apply_locked(pkg, locked)
990
+ gallery, regen_state = regen_one(pkg, preset, aspect, shot_id, gallery, regen_state)
991
+ gr.Info(f"Shot {int(shot_id)} rerolled with a fresh seed.")
992
+ return gallery, regen_state
993
+
994
+
995
+ def show_detail(pkg, evt: gr.SelectData):
996
+ """FR-2.5: click a gallery frame -> full concept card + prompt.
997
+
998
+ Also points the regenerate picker at the clicked shot, so
999
+ 'click frame -> Regenerate' needs no manual dropdown hunting.
1000
+ """
1001
+ if pkg is None:
1002
+ return "", gr.update()
1003
+ idx = evt.index if isinstance(evt.index, int) else evt.index[0]
1004
+ if idx is None or idx >= len(pkg.shots):
1005
+ return "", gr.update()
1006
+ s = pkg.shots[idx]
1007
+ detail = (f"### Shot {s.id}: {s.concept_name}\n"
1008
+ f"**Scene:** {s.scene}\n\n**Camera:** {s.camera_angle} · "
1009
+ f"**Lighting:** {s.lighting} · **Palette:** {', '.join(s.color_palette)}\n\n"
1010
+ f"**Props:** {s.props}\n\n**Marketing angle:** {s.marketing_angle}\n\n"
1011
+ f"**Prompt:** `{s.image_prompt}`")
1012
+ return detail, gr.update(value=s.id)
1013
+
1014
+ def export_zip(pkg, gallery, heroes, regen_state):
1015
+ """FR-3.1: concepts.md + concepts.json + 5 PNGs + selection_manifest.json."""
1016
+ if pkg is None or not gallery:
1017
+ raise gr.Error("Generate frames before exporting.")
1018
+ regen_state = regen_state or {}
1019
+ buf = io.BytesIO()
1020
+ with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
1021
+ z.writestr("concepts.json", pkg.to_json())
1022
+ md = ["# ShotCraft package\n"]
1023
+ for s in pkg.shots:
1024
+ md.append(f"## Shot {s.id}: {s.concept_name}\n{s.scene}\n\nPrompt: {s.image_prompt}\n")
1025
+ z.writestr("concepts.md", "\n".join(md))
1026
+ def _rc(sid):
1027
+ return int(regen_state.get(sid, regen_state.get(str(sid), 0)))
1028
+ manifest = {
1029
+ "hero_frames": heroes or [],
1030
+ "shots": [
1031
+ {
1032
+ "id": s.id,
1033
+ "concept_name": s.concept_name,
1034
+ "file": f"shot_{s.id}.png",
1035
+ "image_prompt": s.image_prompt, # post-edit (Q-3: edits persist)
1036
+ "regen_count": _rc(s.id),
1037
+ "seed": seed_for(s.id, _rc(s.id)),
1038
+ "is_hero": s.id in (heroes or []),
1039
+ }
1040
+ for s in pkg.shots
1041
+ ],
1042
+ "backend": API_URL,
1043
+ }
1044
+ z.writestr("selection_manifest.json", json.dumps(manifest, indent=2))
1045
+ for i, item in enumerate(gallery, start=1):
1046
+ img = item[0] if isinstance(item, (tuple, list)) else item
1047
+ ib = io.BytesIO(); img.save(ib, format="PNG")
1048
+ z.writestr(f"shot_{i}.png", ib.getvalue())
1049
+ fd, path = tempfile.mkstemp(prefix="shotcraft_pkg_", suffix=".zip")
1050
+ os.write(fd, buf.getvalue())
1051
+ os.close(fd)
1052
+ return path
1053
+
1054
+
1055
+ def export_zip_ui(pkg, gallery, heroes, regen_state):
1056
+ """UI wrapper: preserve export_zip() test contract, reveal the download card."""
1057
+ path = export_zip(pkg, gallery, heroes, regen_state)
1058
+ gr.Info("Package ready — your ZIP is below.")
1059
+ return gr.update(value=path, visible=True)
1060
+
1061
+
1062
+ with gr.Blocks(title="ShotCraft — AI Shot Director") as demo:
1063
+ gr.HTML(_hero_html())
1064
+ # Live status per page load (demo.load) — a build-time health() call would
1065
+ # freeze the banner at whatever was true when the Space booted.
1066
+ backend_status = gr.HTML(_status_checking_html())
1067
+ stage_rail = gr.HTML(_stage_rail("direct"), elem_id="stage-rail")
1068
+ pkg_state = gr.State(None)
1069
+ regen_state = gr.State({})
1070
+ reached_stage = gr.State("direct")
1071
+
1072
+ with gr.Tabs(selected="direct", elem_id="shotcraft-tabs",
1073
+ elem_classes=["workflow-tabs"]) as workflow_tabs:
1074
+ with gr.Tab("01 Direct", id="direct", elem_classes=["stage-tab"]):
1075
+ with gr.Group(elem_classes=["glass-panel"]):
1076
+ gr.Markdown("### Product briefing room")
1077
+ with gr.Row():
1078
+ photo = gr.Image(label="Product photo", type="pil")
1079
+ with gr.Column():
1080
+ brand = gr.Textbox(
1081
+ label="Brand description",
1082
+ lines=2,
1083
+ max_lines=4,
1084
+ max_length=300,
1085
+ placeholder="Materials, audience, vibe — e.g. Handmade ceramic mugs, "
1086
+ "Scandinavian style, eco-friendly, for slow-morning coffee people",
1087
+ info="Up to 300 characters — used to ground every concept",
1088
+ )
1089
+ category = gr.Dropdown(
1090
+ CATEGORIES, label="Category", value=CATEGORIES[0],
1091
+ info="Guides scene and prop choices",
1092
+ )
1093
+ preset = gr.Dropdown(
1094
+ STYLE_PRESETS, label="Style preset", value=STYLE_PRESETS[0],
1095
+ info="Sets the overall art direction for every render",
1096
+ )
1097
+ go1 = gr.Button(
1098
+ "Direct my shots",
1099
+ variant="primary",
1100
+ elem_classes=["shotcraft-primary"],
1101
+ )
1102
+ demo_photo = os.path.join(EXAMPLES_DIR, "demo_product.png")
1103
+ if os.path.exists(demo_photo): # FR-3.2 pre-loaded example
1104
+ demo_btn = gr.Button(
1105
+ "✨ Try the demo product",
1106
+ size="sm",
1107
+ elem_classes=["shotcraft-secondary"],
1108
+ )
1109
+ demo_btn.click(
1110
+ lambda: (Image.open(demo_photo), _example_brand(), "Home", "Minimal"),
1111
+ None, [photo, brand, category, preset],
1112
+ )
1113
+ analysis_md = gr.Markdown(elem_classes=["analysis-card"])
1114
+ locked_box = gr.Textbox(
1115
+ label="🔒 Locked product look — prefixed to every render",
1116
+ visible=False,
1117
+ lines=2,
1118
+ max_lines=3,
1119
+ info="Correct it if the analysis got a color or detail wrong — "
1120
+ "this exact description pins the product in all 5 shots",
1121
+ elem_classes=["locked-box"],
1122
+ )
1123
+ with gr.Row(elem_classes=["boards-toolbar"]):
1124
+ gr.Markdown("### Editable concept boards\nEvery field is live. "
1125
+ "Edits carry into rendering and export.", scale=5)
1126
+ expand_all = gr.Button("Expand all", size="sm", scale=0, min_width=120,
1127
+ elem_classes=["shotcraft-secondary"])
1128
+ collapse_all = gr.Button("Collapse all", size="sm", scale=0, min_width=120,
1129
+ elem_classes=["shotcraft-secondary"])
1130
+ field_boxes = []
1131
+ accordions = []
1132
+ chip_htmls = []
1133
+ for i in range(5):
1134
+ with gr.Accordion(f"Shot {i + 1}", open=False,
1135
+ elem_classes=["concept-card"]) as acc:
1136
+ with gr.Row():
1137
+ nb = gr.Textbox(label="Concept name")
1138
+ cb = gr.Textbox(label="Camera angle")
1139
+ lb = gr.Textbox(label="Lighting")
1140
+ sb = gr.Textbox(label="Scene", max_lines=2)
1141
+ with gr.Row():
1142
+ with gr.Column(scale=1):
1143
+ pb = gr.Textbox(label="Palette (comma-separated hex)")
1144
+ chips = gr.HTML()
1145
+ rb = gr.Textbox(label="Props")
1146
+ mb = gr.Textbox(label="Marketing angle")
1147
+ ib = gr.Textbox(label="Image prompt (FLUX)", max_lines=3)
1148
+ field_boxes.extend([nb, sb, cb, lb, pb, rb, mb, ib])
1149
+ accordions.append(acc)
1150
+ chip_htmls.append(chips)
1151
+ pb.change(_palette_chips, pb, chips, show_progress="hidden")
1152
+ expand_all.click(lambda: [gr.update(open=True)] * 5, None, accordions)
1153
+ collapse_all.click(lambda: [gr.update(open=False)] * 5, None, accordions)
1154
+ continue_btn = gr.Button(
1155
+ "Continue to Render — generate 5 shots →",
1156
+ variant="primary",
1157
+ visible=False,
1158
+ elem_classes=["shotcraft-primary", "continue-cta"],
1159
+ )
1160
+
1161
+ with gr.Tab("02 Render", id="render", elem_classes=["stage-tab"]):
1162
+ with gr.Group(elem_classes=["render-panel"]):
1163
+ gr.Markdown("### Render reel")
1164
+ with gr.Row():
1165
+ aspect = gr.Radio(
1166
+ ["1:1", "16:9"], value="1:1", label="Aspect ratio",
1167
+ info="1:1 for product tiles · 16:9 for banners",
1168
+ )
1169
+ go2 = gr.Button(
1170
+ "Re-render 5 shots",
1171
+ variant="primary",
1172
+ interactive=False,
1173
+ elem_classes=["shotcraft-primary"],
1174
+ )
1175
+ with gr.Row(equal_height=False):
1176
+ with gr.Column(scale=1, min_width=190):
1177
+ src_preview = gr.Image(
1178
+ label="Source product",
1179
+ type="pil",
1180
+ interactive=False,
1181
+ height=360,
1182
+ buttons=[],
1183
+ elem_classes=["source-preview"],
1184
+ )
1185
+ with gr.Column(scale=4):
1186
+ gallery = gr.Gallery(
1187
+ label="Storyboard reel",
1188
+ columns=4,
1189
+ height=360,
1190
+ format="png",
1191
+ buttons=["download", "fullscreen"],
1192
+ elem_classes=["shotcraft-gallery"],
1193
+ )
1194
+ detail_md = gr.Markdown()
1195
+ with gr.Row():
1196
+ regen_id = gr.Dropdown(
1197
+ [1, 2, 3, 4, 5], value=1, label="Shot #",
1198
+ info="Click a frame above to pick it here",
1199
+ )
1200
+ regen_btn = gr.Button("Regenerate this shot", elem_classes=["shotcraft-secondary"])
1201
+ heroes = gr.CheckboxGroup(
1202
+ [1, 2, 3, 4, 5], label="Hero frames",
1203
+ info="Lead frames for the campaign — recorded in the export manifest",
1204
+ )
1205
+ export_btn = gr.Button("Download package ZIP", elem_classes=["shotcraft-secondary"])
1206
+ zip_out = gr.File(label="Package", visible=False)
1207
+
1208
+ demo.load(lambda: _backend_status_html(health()), None, backend_status)
1209
+ go1.click(run_stage1_ui, [photo, brand, category, preset],
1210
+ [pkg_state, analysis_md, *field_boxes, go2, continue_btn, src_preview,
1211
+ locked_box, *accordions, *chip_htmls])
1212
+ # One-click flow: the CTA switches to the Render tab, then immediately
1213
+ # kicks off the 5-shot render. go2 remains as a manual re-render after
1214
+ # concept edits (or if the auto-render failed on a cold backend).
1215
+ continue_btn.click(advance_to_render, [reached_stage],
1216
+ [workflow_tabs, stage_rail, reached_stage],
1217
+ js="() => window.scrollTo({top: 0, behavior: 'smooth'})"
1218
+ ).then(run_stage2_ui, [pkg_state, preset, aspect, locked_box, *field_boxes],
1219
+ [gallery, pkg_state, regen_state, stage_rail, reached_stage])
1220
+ workflow_tabs.select(sync_stage_rail, [reached_stage], [stage_rail])
1221
+ go2.click(run_stage2_ui, [pkg_state, preset, aspect, locked_box, *field_boxes],
1222
+ [gallery, pkg_state, regen_state, stage_rail, reached_stage])
1223
+ gallery.select(show_detail, [pkg_state], [detail_md, regen_id])
1224
+ regen_btn.click(regen_one_ui, [pkg_state, preset, aspect, locked_box, regen_id, gallery, regen_state],
1225
+ [gallery, regen_state])
1226
+ export_btn.click(export_zip_ui, [pkg_state, gallery, heroes, regen_state], [zip_out])
1227
+
1228
+ if __name__ == "__main__":
1229
+ # ssr_mode=False: Gradio SSR (Node proxy) restores stale sessions across
1230
+ # Space rebuilds -> dead /tmp/gradio file refs crash Image preprocess.
1231
+ demo.launch(max_file_size="10mb", ssr_mode=False, theme=THEME,
1232
+ css=SHOTCRAFT_CSS, head=FORCE_THEME_HEAD) # FR-1.1 upload cap
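
Reviewer note on `apply_edits`: the render handlers receive every concept box as one flat `*field_values` tuple, eight values per shot in `FIELD_KEYS` order, and write them back onto the shots. The sketch below replays that chunking in isolation. The `Shot` dataclass and the name `apply_flat_edits` are hypothetical stand-ins for illustration; the real shot model lives in `schemas.py` and may differ.

```python
from dataclasses import dataclass, field

FIELD_KEYS = ["concept_name", "scene", "camera_angle", "lighting",
              "color_palette", "props", "marketing_angle", "image_prompt"]


@dataclass
class Shot:
    """Hypothetical stand-in for the shot model defined in schemas.py."""
    id: int
    concept_name: str = ""
    scene: str = ""
    camera_angle: str = ""
    lighting: str = ""
    color_palette: list = field(default_factory=list)
    props: str = ""
    marketing_angle: str = ""
    image_prompt: str = ""


def apply_flat_edits(shots, values):
    """Write a flat tuple of textbox values back onto the shots, 8 per shot."""
    n = len(FIELD_KEYS)
    for i, shot in enumerate(shots):
        chunk = values[i * n:(i + 1) * n]
        if len(chunk) < n:
            break  # incomplete trailing chunk: leave the remaining shots alone
        for key, val in zip(FIELD_KEYS, chunk):
            if val is None or not str(val).strip():
                continue  # a blank box keeps the previous value
            text = str(val).strip()
            if key == "color_palette":
                # the palette box is comma-separated text; store it as a list
                setattr(shot, key, [c.strip() for c in text.split(",") if c.strip()])
            else:
                setattr(shot, key, text)
    return shots


shots = [Shot(id=1, concept_name="Old name", scene="Old scene"), Shot(id=2)]
values = ("Morning ritual", "", "45 degree", "soft window light",
          "#AABBCC, #112233", "linen", "calm", "the product on oak") + ("",) * 8
apply_flat_edits(shots, values)
```

Because blank values are skipped, clearing a textbox in the UI does not clear the stored field; whether that is the intended behaviour is a design question for the authors.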
director.py ADDED
@@ -0,0 +1,86 @@
1
+ """ShotCraft Stage 1 — Shot Director (MiniCPM-V-2_6, 8B). Owner: Pawel.
2
+
3
+ Inference runs on the Modal backend (model_runtime.minicpm_chat). This
4
+ module owns the prompt, JSON validation, and the one auto-retry (FR-1.4).
5
+ """
6
+ from __future__ import annotations
7
+ import json
8
+ import re
9
+ from model_runtime import minicpm_chat
10
+ from schemas import ConceptPackage, validate_package, STYLE_SUFFIXES
11
+
12
+ MODEL_ID = "openbmb/MiniCPM-V-2_6"
13
+ TEMPERATURE = 0.6 # FR-1.4
14
+
15
+ SYSTEM_PROMPT = """You are ShotCraft, an expert e-commerce art director.
16
+ Analyze the uploaded PRODUCT PHOTO carefully: identify the product type,
17
+ materials, exact colors, and distinguishing features you can SEE.
18
+ Then design exactly 5 distinct marketing shot concepts grounded in those
19
+ real attributes.
20
+
21
+ CRITICAL - PRODUCT CONSISTENCY:
22
+ The SAME physical product must appear, unchanged, in all 5 shots. Only the
23
+ scene around it changes. To guarantee this:
24
+ 1. Write "canonical_description": ONE sentence (max 40 words) describing the
25
+ product EXACTLY as photographed: product type, shape, every visible color
26
+ and which part it is on (e.g. body, sole, laces, lid, label), materials,
27
+ and any logo or branding placement. Use plain English color names
28
+ ("off-white", "charcoal black", "gum brown") - NEVER hex codes here.
29
+ 2. Each "image_prompt" must describe ONLY the scene: setting, surfaces,
30
+ props, camera angle, lighting, mood. Refer to the product simply as
31
+ "the product". NEVER re-describe, recolor, restyle or redesign the
32
+ product itself - no color adjectives for the product, no "colorway",
33
+ no "variant", no "matching the palette". The pipeline automatically
34
+ prefixes your canonical_description to every render prompt.
35
+ 3. "color_palette" is for the BACKDROP and props only, never the product.
36
+ Even in bold/colorful styles, the colors go into the background and
37
+ props while the product keeps its original colors.
38
+
39
+ Return ONLY valid JSON, no markdown fences, matching this schema:
40
+ {
41
+ "product_analysis": {
42
+ "product_type": str, "materials": str,
43
+ "colors": [str hex], "distinguishing_features": str,
44
+ "canonical_description": str // the locked product identity, rule 1
45
+ },
46
+ "shots": [ // exactly 5
47
+ {
48
+ "id": int, "concept_name": str, "scene": str, "camera_angle": str,
49
+ "lighting": str, "color_palette": [str hex], "props": str,
50
+ "marketing_angle": str,
51
+ "image_prompt": str // English, FLUX-optimized, scene only (rule 2)
52
+ }
53
+ ]
54
+ }"""
55
+
56
+ def build_user_prompt(brand_desc: str, category: str, style_preset: str) -> str:
57
+ return (f"Brand description: {brand_desc}\n"
58
+ f"Product category: {category}\n"
59
+ f"Style preset: {style_preset} "
60
+ f"(style keywords: {STYLE_SUFFIXES.get(style_preset, '')})\n"
61
+ f"Design 5 shot concepts for this product.")
62
+
63
+ def _strip_fences(text: str) -> str:
64
+ t = text.strip()
65
+ if t.startswith("```"):
66
+ t = t.split("\n", 1)[1] if "\n" in t else t
67
+ t = t.rsplit("```", 1)[0]
68
+ # Repair: MiniCPM sometimes emits single-quoted hex colors, e.g. ['#AABBCC'].
69
+ # Only rewrite single-quoted hex tokens so apostrophes in prose stay intact.
70
+ t = re.sub(r"'(#[0-9A-Fa-f]{3,8})'", r'"\1"', t)
71
+ return t.strip()
72
+
73
+ def generate_concepts(image, brand_desc: str, category: str, style_preset: str) -> ConceptPackage:
74
+ """Run MiniCPM-V-2_6 on the product photo. One auto-retry on invalid JSON (FR-1.4)."""
75
+ user_prompt = build_user_prompt(brand_desc, category, style_preset)
76
+ last_err = None
77
+ for attempt in range(2): # initial + 1 retry
78
+ raw = minicpm_chat(image=image, system=SYSTEM_PROMPT, user=user_prompt,
79
+ temperature=TEMPERATURE)
80
+ try:
81
+ return validate_package(_strip_fences(raw))
82
+ except (ValueError, json.JSONDecodeError) as e:
83
+ last_err = e
84
+ user_prompt += ("\n\nYour previous reply was invalid "
85
+ f"({e}). Return ONLY the corrected JSON.")
86
+ raise RuntimeError(f"Stage 1 failed after retry: {last_err}")
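
Reviewer note on `_strip_fences`: the repair step only rewrites single-quoted hex tokens, so apostrophes inside prose values survive. The sketch below copies that logic into a standalone function (renamed `strip_fences` here) and runs it on an invented model reply; the sample payload is an assumption for illustration, not real MiniCPM output.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence


def strip_fences(text):
    """Drop a surrounding markdown fence and repair single-quoted hex colors."""
    t = text.strip()
    if t.startswith(FENCE):
        # remove the opening fence line (e.g. the fence plus "json")
        t = t.split("\n", 1)[1] if "\n" in t else t
        # remove the closing fence and anything after it
        t = t.rsplit(FENCE, 1)[0]
    # only single-quoted hex tokens become double-quoted; prose is untouched
    t = re.sub(r"'(#[0-9A-Fa-f]{3,8})'", r'"\1"', t)
    return t.strip()


payload = """{"colors": ['#AABBCC', '#fff'], "note": "it's matte"}"""
raw = FENCE + "json\n" + payload + "\n" + FENCE
parsed = json.loads(strip_fences(raw))
```

A reply that is still invalid after this cleanup raises in `validate_package`, which is what triggers the single retry in `generate_concepts`.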
examples/brand_brief.txt ADDED
@@ -0,0 +1 @@
1
+ Handmade ceramic mugs from a small Kraków pottery studio. Scandinavian-inspired, eco-friendly stoneware with speckled sage glazes. Each piece is wheel-thrown and slightly imperfect on purpose.
examples/demo_product.png ADDED
frames.py ADDED
@@ -0,0 +1,59 @@
1
+ """ShotCraft Stage 2 — Frame Generator (FLUX.1-schnell, 12B). Owner: Rafal.
2
+
3
+ Rendering runs on the Modal backend (model_runtime.flux_generate_batch):
4
+ the whole reel is one backend call; a single-frame regen is a 1-element batch.
5
+ """
6
+ from __future__ import annotations
7
+
8
+ from model_runtime import flux_generate_batch
9
+ from schemas import ConceptPackage, STYLE_SUFFIXES
10
+
11
+ MODEL_ID = "black-forest-labs/FLUX.1-schnell"
12
+ NUM_STEPS = 4 # schnell sweet spot, fixed server-side (NFR-1)
13
+ SEED_BASE = 42 # FR-2.3: deterministic seed base
14
+ SIZES = {"1:1": (1024, 1024), "16:9": (1024, 576)} # FR-2.6
15
+
16
+ def build_prompt(shot, style_preset: str, product_desc: str = "") -> str:
17
+ """FR-2.3: prompt + palette keywords + shared style suffix.
18
+
19
+ product_desc: locked canonical product description (Stage 1). FLUX weighs
20
+ the START and END of a prompt most, so the locked product identity is
21
+ stated FIRST and repeated LAST (after the style keywords), and the scene
22
+ palette is explicitly confined to backdrop/props. This pins the product's
23
+ own colors even in bold/colorful styles.
24
+ """
25
+ palette = ", ".join(shot.color_palette[:5])
26
+ suffix = STYLE_SUFFIXES.get(style_preset, "")
27
+ if not product_desc:
28
+ return (f"{shot.image_prompt}. "
29
+ f"Backdrop and props color palette (do not recolor the product): {palette}. {suffix}")
30
+ desc = product_desc.strip().rstrip(".")
31
+ return (f"Commercial product photo of {desc}. "
32
+ f"{shot.image_prompt}. "
33
+ f"Colorful palette for the backdrop, surfaces and props ONLY: {palette} - "
34
+ f"the background may be colorful, the product may not change. {suffix}. "
35
+ f"The product itself stays in its exact original colors: {desc}, "
36
+ f"identical shape, materials and branding, unaltered")
37
+
38
+ def seed_for(shot_id: int, regen_counter: int = 0) -> int:
39
+ """FR-2.3/FR-2.4: stable per-shot seed; bump regen_counter to reroll one frame."""
40
+ return SEED_BASE + shot_id * 1000 + regen_counter
41
+
42
+ def generate_frames(shots, style_preset: str, aspect: str = "1:1",
43
+ regen_counters: dict | None = None,
44
+ product_desc: str = "") -> list:
45
+ """Render the given shots in ONE backend call. Returns PIL.Images in order.
46
+ regen_counters: {shot_id: int} — bumps the seed of rerolled frames (FR-2.4).
47
+ product_desc: locked product identity prefixed to every prompt."""
48
+ w, h = SIZES.get(aspect, SIZES["1:1"])
49
+ regen_counters = regen_counters or {}
50
+ seeds = [seed_for(s.id, regen_counters.get(s.id, 0)) for s in shots]
51
+ prompts = [build_prompt(s, style_preset, product_desc) for s in shots]
52
+ return flux_generate_batch(prompts, w, h, seeds)
53
+
54
+ def generate_frame(shot, style_preset: str, aspect: str = "1:1", regen_counter: int = 0,
55
+ product_desc: str = ""):
56
+ """Render one shot (used by per-frame regen). Returns PIL.Image."""
57
+ return generate_frames([shot], style_preset, aspect,
58
+ regen_counters={shot.id: regen_counter},
59
+ product_desc=product_desc)[0]
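
Reviewer note on `seed_for`: each shot gets a block of 1000 consecutive seeds, and a reroll moves one step inside that block. The sketch below reproduces the arithmetic with the constants from this file to show which seeds a five-shot reel and one rerolled frame would use.

```python
SEED_BASE = 42  # same constant as frames.py


def seed_for(shot_id, regen_counter=0):
    """Stable per-shot seed; regen_counter selects the n-th reroll."""
    return SEED_BASE + shot_id * 1000 + regen_counter


# seeds for the initial reel, shots 1..5 with no rerolls
reel_seeds = [seed_for(shot_id) for shot_id in range(1, 6)]

# shot 3 after two rerolls
rerolled = seed_for(3, regen_counter=2)
```

With this scheme, a shot's seed only collides with the next shot's initial seed after 1000 rerolls (`seed_for(1, 1000) == seed_for(2)`), which seems unlikely to matter in interactive use.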
model_runtime.py ADDED
@@ -0,0 +1,138 @@
1
+ """ShotCraft — model runtime client (Modal backend).
2
+
3
+ ARCHITECTURE (2026-06-11, per team decision + hackathon rules check):
4
+ all model inference runs on Modal (modal_backend/shotcraft_inference.py):
5
+ Stage 1 MiniCPM-V-2_6 (8B) — A10G
6
+ Stage 2 FLUX.1-schnell (12B) — L40S
7
+ The Gradio Space is the interface (hackathon REQ-02); Modal is the runtime
8
+ (explicitly allowed per field-guide FAQ, qualifies for "Best Use of Modal").
9
+ No ZeroGPU, no @spaces.GPU, no mock fallbacks — the app runs end to end
10
+ against the real backend or fails loudly with an actionable error.
11
+
12
+ Space configuration (Settings -> Variables and secrets):
13
+ SHOTCRAFT_API_URL Explicit Modal endpoint, e.g.
14
+ https://<workspace>--shotcraft-inference-api.modal.run
15
+ SHOTCRAFT_MODAL_WORKSPACE Modal workspace slug; used when SHOTCRAFT_API_URL is unset.
16
+ SHOTCRAFT_MODAL_APP Modal app name; defaults to shotcraft-inference.
17
+ SHOTCRAFT_MODAL_FUNCTION Modal ASGI function name; defaults to api.
18
+ """
19
+ from __future__ import annotations
20
+
21
+ import base64
22
+ import io
23
+ import os
24
+
25
+ import httpx
26
+
27
+ DEFAULT_MODAL_WORKSPACE = "rafalbogusdxc"
28
+ DEFAULT_MODAL_APP = "shotcraft-inference"
29
+ DEFAULT_MODAL_FUNCTION = "api"
30
+
31
+
32
+ def _modal_api_url() -> str:
33
+ explicit_url = os.environ.get("SHOTCRAFT_API_URL", "").strip()
34
+ if explicit_url:
35
+ return explicit_url.rstrip("/")
36
+
37
+ workspace = os.environ.get("SHOTCRAFT_MODAL_WORKSPACE", DEFAULT_MODAL_WORKSPACE).strip()
38
+ app_name = os.environ.get("SHOTCRAFT_MODAL_APP", DEFAULT_MODAL_APP).strip()
39
+ function_name = os.environ.get("SHOTCRAFT_MODAL_FUNCTION", DEFAULT_MODAL_FUNCTION).strip()
40
+ return f"https://{workspace}--{app_name}-{function_name}.modal.run"
41
+
42
+
43
+ API_URL = _modal_api_url()
44
+
45
+ MINICPM_ID = "openbmb/MiniCPM-V-2_6"
46
+ FLUX_ID = "black-forest-labs/FLUX.1-schnell"
47
+
48
+ # Cold start can pull weights onto the GPU container; keep timeouts generous.
49
+ STAGE1_TIMEOUT_S = 900
50
+ STAGE2_TIMEOUT_S = 900
51
+
52
+
53
+ class BackendError(RuntimeError):
54
+ """Inference backend unreachable or returned an error."""
55
+
56
+
57
+ def _pil_to_b64(img) -> str:
58
+ buf = io.BytesIO()
59
+ img.save(buf, format="PNG")
60
+ return base64.b64encode(buf.getvalue()).decode()
61
+
62
+
63
+ def _b64_to_pil(data: str):
64
+ from PIL import Image
65
+
66
+ return Image.open(io.BytesIO(base64.b64decode(data))).convert("RGB")
67
+
68
+
69
+ def _post(path: str, payload: dict, timeout: float) -> dict:
70
+ url = f"{API_URL}{path}"
71
+ try:
72
+ # follow_redirects: Modal answers long-running calls with a 303
73
+ # redirect to a poll URL (?__modal_function_call_id=...) when the
74
+ # request exceeds ~150 s (e.g. cold start pulling model weights).
75
+ resp = httpx.post(url, json=payload, timeout=timeout,
76
+ follow_redirects=True)
77
+ resp.raise_for_status()
78
+ return resp.json()
79
+ except httpx.ConnectError as e:
80
+ raise BackendError(
81
+ f"Cannot reach inference backend at {API_URL} — is the Modal app "
82
+ f"deployed? ({e})"
83
+ ) from e
84
+ except httpx.ReadTimeout as e:
85
+ raise BackendError(
86
+ "Inference backend timed out — likely a cold start pulling model "
87
+ "weights. Try again in ~1 minute."
88
+ ) from e
89
+ except httpx.HTTPStatusError as e:
90
+ raise BackendError(
91
+ f"Backend error {e.response.status_code}: {e.response.text[:300]}"
92
+ ) from e
93
+
94
+
95
+ def health() -> dict:
96
+ """GET /health β€” used by the app banner at startup."""
97
+ try:
98
+ resp = httpx.get(f"{API_URL}/health", timeout=10, follow_redirects=True)
99
+ resp.raise_for_status()
100
+ return resp.json()
101
+ except Exception as e: # noqa: BLE001 β€” banner only, never crash the UI
102
+ return {"status": "unreachable", "error": str(e), "url": API_URL}
103
+
104
+
105
+ def minicpm_chat(image, system: str, user: str, temperature: float = 0.6) -> str:
106
+ """Stage 1: vision analysis + concept generation on Modal (MiniCPM-V-2_6)."""
107
+ data = _post(
108
+ "/minicpm",
109
+ {
110
+ "image_b64": _pil_to_b64(image),
111
+ "system": system,
112
+ "user": user,
113
+ "temperature": temperature,
114
+ },
115
+ STAGE1_TIMEOUT_S,
116
+ )
117
+ return data["text"]
118
+
119
+
120
+ def flux_generate_batch(prompts: list, width: int, height: int, seeds: list) -> list:
121
+ """Stage 2: render N frames in one backend call (N=5 reel, N=1 regen).
122
+ Returns PIL.Images in input order. Seeded per FR-2.3."""
123
+ data = _post(
124
+ "/flux",
125
+ {
126
+ "prompts": list(prompts),
127
+ "width": int(width),
128
+ "height": int(height),
129
+ "seeds": [int(s) for s in seeds],
130
+ },
131
+ STAGE2_TIMEOUT_S,
132
+ )
133
+ return [_b64_to_pil(b) for b in data["images_b64"]]
134
+
135
+
136
+ def flux_generate(prompt: str, width: int, height: int, steps: int, seed: int):
137
+ """Back-compat single-frame API; steps is fixed at 4 server-side."""
138
+ return flux_generate_batch([prompt], width, height, [seed])[0]
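The endpoint URL used by `model_runtime.py` is assembled from three parts that can each be overridden through an environment variable. The standalone sketch below illustrates that pattern only. The default values (`example-workspace`, `shotcraft`, `api`) are placeholders assumed for illustration, not the repository's real `DEFAULT_MODAL_*` constants, and `build_modal_url` is a hypothetical name.

```python
import os

# Placeholder defaults, assumed for illustration only; the real
# DEFAULT_MODAL_* values are not shown in this part of the diff.
DEFAULTS = {
    "SHOTCRAFT_MODAL_WORKSPACE": "example-workspace",
    "SHOTCRAFT_MODAL_APP": "shotcraft",
    "SHOTCRAFT_MODAL_FUNCTION": "api",
}


def build_modal_url(env=None) -> str:
    """Assemble https://{workspace}--{app}-{function}.modal.run from env vars."""
    env = os.environ if env is None else env
    # Dicts keep insertion order, so the three values unpack in a fixed order.
    workspace, app_name, function_name = (
        env.get(key, default).strip() for key, default in DEFAULTS.items()
    )
    return f"https://{workspace}--{app_name}-{function_name}.modal.run"


if __name__ == "__main__":
    # With no overrides, the placeholder defaults are used.
    print(build_modal_url({}))
    # Overriding a single part leaves the other two at their defaults.
    print(build_modal_url({"SHOTCRAFT_MODAL_WORKSPACE": " my-team "}))
```

Passing a plain dict instead of `os.environ` keeps the sketch easy to test without touching the real process environment.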
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ gradio>=6.0  # buttons= API on Gallery/Image and theme= on launch() are Gradio 6+
+ httpx>=0.27
+ Pillow
schemas.py ADDED
@@ -0,0 +1,120 @@
+ """ShotCraft - JSON schemas & validation for Stage 1 output (FR-1.2, FR-1.4)."""
+ from __future__ import annotations
+ import json
+ from dataclasses import dataclass, field, asdict
+
+ CATEGORIES = ["Fashion", "Beauty", "Food & Beverage", "Electronics", "Home", "Jewelry", "Other"]
+ STYLE_PRESETS = ["Minimal", "Luxury", "Lifestyle", "Bold & Colorful", "Editorial"]
+
+ # Shared style suffixes injected into every FLUX prompt (FR-2.3)
+ STYLE_SUFFIXES = {
+     "Minimal": "clean minimal composition, soft studio light, negative space, commercial product photography",
+     "Luxury": "premium editorial look, dramatic rim light, rich shadows, high-end advertising photography",
+     "Lifestyle": "natural candid setting, warm daylight, shallow depth of field, lifestyle product photography",
+     "Bold & Colorful": "vivid saturated backdrop colors, hard light, playful geometric backdrop, pop-art product photography",
+     "Editorial": "magazine editorial styling, cinematic lighting, artful composition, fashion campaign photography",
+ }
+
+ @dataclass
+ class Shot:
+     id: int
+     concept_name: str
+     scene: str
+     camera_angle: str
+     lighting: str
+     color_palette: list[str]  # hex strings
+     props: str
+     marketing_angle: str
+     image_prompt: str  # FLUX-optimized, English
+
+ @dataclass
+ class ProductAnalysis:
+     product_type: str
+     materials: str
+     colors: list[str]
+     distinguishing_features: str
+     # Locked one-sentence product identity, prefixed to every FLUX prompt so
+     # the SAME product appears in all shots. Backfilled when the model omits it.
+     canonical_description: str = ""
+
+ @dataclass
+ class ConceptPackage:
+     product_analysis: ProductAnalysis
+     shots: list[Shot] = field(default_factory=list)
+
+     def to_json(self) -> str:
+         return json.dumps(asdict(self), indent=2, ensure_ascii=False)
+
+ REQUIRED_SHOT_KEYS = {"id", "concept_name", "scene", "camera_angle", "lighting",
+                       "color_palette", "props", "marketing_angle", "image_prompt"}
+
+ def validate_package(raw: str | dict) -> ConceptPackage:
+     """Parse + validate Stage 1 model output. Raises ValueError with a readable message."""
+     data = json.loads(raw) if isinstance(raw, str) else raw
+     pa = data.get("product_analysis")
+     if not isinstance(pa, dict):
+         raise ValueError("missing product_analysis object")
+     shots = data.get("shots")
+     if not isinstance(shots, list) or len(shots) != 5:
+         raise ValueError(f"expected exactly 5 shots, got {len(shots) if isinstance(shots, list) else 'none'}")
+     parsed = []
+     for i, s in enumerate(shots):
+         # Tolerate a missing color_palette (observed MiniCPM flake): backfill
+         # from the detected product colors instead of failing the package.
+         if "color_palette" not in s and isinstance(pa.get("colors"), list):
+             s["color_palette"] = list(pa["colors"])
+         missing = REQUIRED_SHOT_KEYS - set(s)
+         if missing:
+             raise ValueError(f"shot {i+1} missing keys: {sorted(missing)}")
+         parsed.append(Shot(**{k: s[k] for k in REQUIRED_SHOT_KEYS}))
+     return ConceptPackage(
+         product_analysis=ProductAnalysis(
+             product_type=pa.get("product_type", ""),
+             materials=pa.get("materials", ""),
+             colors=pa.get("colors", []),
+             distinguishing_features=pa.get("distinguishing_features", ""),
+             canonical_description=_canonical_description(pa),
+         ),
+         shots=parsed,
+     )
+
+ def _canonical_description(pa: dict) -> str:
+     """Locked product identity. Backfill from the analysis fields when the
+     model omits it so Stage 2 can always pin the product look."""
+     desc = str(pa.get("canonical_description") or "").strip()
+     if desc:
+         return desc
+     bits = [str(pa.get("product_type") or "").strip()]
+     colors = pa.get("colors")
+     if isinstance(colors, list) and colors:
+         names = [_color_name(str(c)) for c in colors[:5]]
+         bits.append("in " + ", ".join(dict.fromkeys(names)))  # dedupe, keep order
+     if pa.get("materials"):
+         bits.append(f"made of {pa['materials']}")
+     if pa.get("distinguishing_features"):
+         bits.append(str(pa["distinguishing_features"]).strip())
+     return ", ".join(b for b in bits if b)
+
+ # FLUX barely understands hex codes - name them for the backfilled description.
+ _NAMED_COLORS = [
+     ("white", (255, 255, 255)), ("off-white", (240, 238, 230)),
+     ("light grey", (200, 200, 200)), ("grey", (128, 128, 128)),
+     ("charcoal", (60, 60, 60)), ("black", (10, 10, 10)),
+     ("red", (220, 40, 40)), ("orange", (240, 140, 30)),
+     ("yellow", (245, 200, 40)), ("gum brown", (170, 120, 70)),
+     ("brown", (110, 70, 40)), ("green", (60, 160, 70)),
+     ("teal", (20, 160, 170)), ("blue", (50, 90, 200)),
+     ("navy blue", (25, 35, 80)), ("purple", (130, 70, 190)),
+     ("pink", (235, 120, 170)), ("beige", (220, 200, 170)),
+ ]
+
+ def _color_name(hex_str: str) -> str:
+     """Nearest plain-English name for a hex color; passes through non-hex."""
+     h = hex_str.strip().lstrip("#")
+     if len(h) == 3:
+         h = "".join(ch * 2 for ch in h)
+     if len(h) < 6 or any(ch not in "0123456789aAbBcCdDeEfF" for ch in h[:6]):
+         return hex_str  # already a name or unparseable - keep as-is
+     r, g, b = int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16)
+     return min(_NAMED_COLORS,
+                key=lambda nc: (nc[1][0]-r)**2 + (nc[1][1]-g)**2 + (nc[1][2]-b)**2)[0]
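The color naming in `schemas.py` picks the palette entry with the smallest squared distance in RGB space. The standalone sketch below works through that arithmetic on a four-entry subset of the palette. The helper names `hex_to_rgb`, `squared_distance` and `nearest_name` are hypothetical and exist only for this illustration; the RGB values are copied from the table in the file.

```python
# Four entries copied from the palette in schemas.py, for a standalone demo.
PALETTE = [
    ("off-white", (240, 238, 230)),
    ("charcoal", (60, 60, 60)),
    ("green", (60, 160, 70)),
    ("beige", (220, 200, 170)),
]


def hex_to_rgb(hex_str: str) -> tuple:
    """Convert '#RRGGBB' (or 'RRGGBB') to an (r, g, b) tuple of ints."""
    h = hex_str.strip().lstrip("#")
    return int(h[0:2], 16), int(h[2:4], 16), int(h[4:6], 16)


def squared_distance(a, b) -> int:
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_name(hex_str: str) -> str:
    """Name of the palette entry closest to the given hex color."""
    rgb = hex_to_rgb(hex_str)
    return min(PALETTE, key=lambda entry: squared_distance(entry[1], rgb))[0]


if __name__ == "__main__":
    sage = hex_to_rgb("#4A6B5D")                    # (74, 107, 93)
    print(squared_distance(sage, (60, 60, 60)))     # distance to charcoal: 3494
    print(squared_distance(sage, (60, 160, 70)))    # distance to green: 3534
    print(nearest_name("#4A6B5D"))                  # charcoal
    print(nearest_name("#E8E2D5"))                  # off-white
```

Note that the muted green `#4A6B5D` lands on "charcoal" rather than "green" by a narrow margin (3494 against 3534). Plain RGB distance is a crude proxy for perceived color, so a desaturated hue can resolve to a neutral name. If that matters for prompt quality, comparing in a perceptual color space might be worth trying, though that would be a change to the approach shown in the file.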
test_app_e2e.py ADDED
@@ -0,0 +1,69 @@
+ """End-to-end test through the real app handlers against the REAL Modal backend.
+
+ Requires the backend to be deployed:
+     modal deploy modal_backend/shotcraft_inference.py
+ Optionally point elsewhere:
+     SHOTCRAFT_API_URL=https://... python test_app_e2e.py
+
+ This spends real GPU time (1 MiniCPM call + 6 FLUX frames) - run it for
+ Slice 4/5 validation, not in a tight loop.
+ """
+ import json
+ import zipfile
+
+ from PIL import Image
+
+ import app
+ from model_runtime import health
+
+ h = health()
+ assert h.get("status") == "ok", f"Backend not reachable: {h}"
+ print("backend OK -", h)
+
+ photo = Image.open("examples/demo_product.png")
+
+ # Stage 1 handler - real MiniCPM
+ out = app.run_stage1(photo, "Handmade ceramic mugs from Krakow", "Home", "Minimal")
+ pkg, analysis = out[0], out[1]
+ fields = list(out[2:-1])
+ assert len(pkg.shots) == 5 and len(fields) == 40
+ print("run_stage1 OK -", analysis[:60], "...")
+
+ # Edit prompt 3 (AC-2) + non-prompt field edit (FR-1.3)
+ fields[2 * 8 + 7] = ("EDITED: lifestyle close-up of the speckled ceramic mug "
+                      "on a linen tablecloth, morning light")
+ fields[1 * 8 + 0] = "Renamed Concept"
+
+ # Stage 2 handler - real FLUX, 5 frames in one backend call
+ class P:  # stub gr.Progress
+     def __call__(self, *a, **k): pass
+
+ gallery, pkg2, regen_state = app.run_stage2(pkg, "Minimal", "1:1", *fields, progress=P())
+ assert len(gallery) == 5
+ assert all(img.size == (1024, 1024) for img, _ in gallery)
+ assert pkg2.shots[1].concept_name == "Renamed Concept"
+ assert pkg2.shots[2].image_prompt.startswith("EDITED:")
+ print("run_stage2 OK - 5 real frames, edits persisted")
+
+ # Regen one (AC-4) - only shot 2 changes
+ import io
+ before = io.BytesIO(); gallery[1][0].save(before, "PNG")
+ gallery2, regen_state2 = app.regen_one(pkg2, "Minimal", "1:1", 2, gallery, regen_state)
+ after = io.BytesIO(); gallery2[1][0].save(after, "PNG")
+ assert regen_state2[2] == 1
+ assert before.getvalue() != after.getvalue(), "regen frame should differ"
+ print("regen_one OK - shot 2 rerolled, counter:", regen_state2[2])
+
+ # Export (AC-5/AC-6)
+ path = app.export_zip(pkg2, gallery2, [1, 4], regen_state2)
+ zf = zipfile.ZipFile(path)
+ names = sorted(zf.namelist())
+ m = json.loads(zf.read("selection_manifest.json"))
+ assert "selection_manifest.json" in names
+ assert len([n for n in names if n.endswith(".png")]) == 5
+ assert m["hero_frames"] == [1, 4]
+ assert m["shots"][2]["image_prompt"].startswith("EDITED:")
+ assert m["shots"][1]["regen_count"] == 1
+ print("export_zip OK -", names)
+ print()
+ print("REAL E2E PASSED")
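The end-to-end test addresses the flat list of 40 editable fields with `shot_index * 8 + field_offset`, for example `fields[2 * 8 + 7]` for the third shot's prompt. The sketch below spells out that index arithmetic. The helper `field_index` is hypothetical, and the field order is an assumption taken from the `Shot` dataclass without `id`; the test only confirms that offset 0 is `concept_name` and offset 7 is `image_prompt`.

```python
# Assumed per-shot field order (Shot dataclass minus `id`). Only offsets 0
# and 7 are confirmed by the test; the middle entries are a guess.
FIELD_ORDER = [
    "concept_name", "scene", "camera_angle", "lighting",
    "color_palette", "props", "marketing_angle", "image_prompt",
]
FIELDS_PER_SHOT = len(FIELD_ORDER)  # 8
SHOT_COUNT = 5                      # 5 shots * 8 fields = 40 flat entries


def field_index(shot_number: int, field_name: str) -> int:
    """Flat list index for a 1-based shot number and a field name."""
    if not 1 <= shot_number <= SHOT_COUNT:
        raise ValueError(f"shot_number must be 1..{SHOT_COUNT}, got {shot_number}")
    return (shot_number - 1) * FIELDS_PER_SHOT + FIELD_ORDER.index(field_name)


if __name__ == "__main__":
    print(field_index(3, "image_prompt"))   # same slot as fields[2 * 8 + 7]
    print(field_index(2, "concept_name"))   # same slot as fields[1 * 8 + 0]
```

A named lookup like this could make test edits less error-prone than hand-computed offsets, at the cost of keeping the order list in sync with the UI.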
test_smoke.py ADDED
@@ -0,0 +1,47 @@
+ """Unit tests for pure logic - no backend required.
+ Run: python test_smoke.py
+ """
+ import json
+
+ from frames import build_prompt, seed_for, SIZES
+ from schemas import validate_package
+
+ # --- seeding (FR-2.3 / FR-2.4) ---
+ assert seed_for(1, 0) == 1042
+ assert seed_for(2, 0) == 2042
+ assert seed_for(2, 1) == 2043  # regen bumps seed
+ assert seed_for(2, 0) != seed_for(2, 1)
+ assert len({seed_for(i, 0) for i in range(1, 6)}) == 5  # all distinct
+ print("seed_for OK")
+
+ # --- sizes (FR-2.6) ---
+ assert SIZES["1:1"] == (1024, 1024)
+ assert SIZES["16:9"] == (1024, 576)
+ print("SIZES OK")
+
+ # --- schema validation round-trip ---
+ RAW = {
+     "product_analysis": {
+         "product_type": "ceramic mug", "materials": "stoneware",
+         "colors": ["#4A6B5D"], "distinguishing_features": "speckled glaze",
+     },
+     "shots": [
+         {"id": i, "concept_name": f"Concept {i}", "scene": "studio",
+          "camera_angle": "front", "lighting": "soft",
+          "color_palette": ["#4A6B5D", "#E8E2D5"], "props": "none",
+          "marketing_angle": "quality", "image_prompt": f"prompt {i}"}
+         for i in range(1, 6)
+     ],
+ }
+ pkg = validate_package(RAW)
+ assert len(pkg.shots) == 5
+ assert json.loads(pkg.to_json())["shots"][0]["id"] == 1
+ print("validate_package OK")
+
+ # --- prompt building (FR-2.3) ---
+ p = build_prompt(pkg.shots[0], "Minimal")
+ assert "prompt 1" in p and "#4A6B5D" in p
+ print("build_prompt OK")
+
+ print()
+ print("ALL SMOKE TESTS PASSED")
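The seeding assertions in the smoke test (1042, 2042, 2043) are consistent with a simple "shot offset plus base plus regen counter" scheme. The sketch below shows one formula that satisfies them. It is an assumption inferred from the test values only: the real `frames.seed_for` is not shown in this part of the diff and may differ, and `seed_for_sketch`, `BASE_SEED` and `SHOT_STRIDE` are hypothetical names.

```python
BASE_SEED = 42      # assumed constant, inferred from seed_for(1, 0) == 1042
SHOT_STRIDE = 1000  # assumed spacing between shots, inferred from 1042 vs 2042


def seed_for_sketch(shot_id: int, regen_count: int = 0) -> int:
    """Deterministic seed per shot; each regeneration bumps the seed by one."""
    return shot_id * SHOT_STRIDE + BASE_SEED + regen_count


if __name__ == "__main__":
    # Initial render of a five-shot reel: one distinct, reproducible seed each.
    print([seed_for_sketch(i) for i in range(1, 6)])
    # Regenerating shot 2 once changes only that shot's seed.
    print(seed_for_sketch(2, 1))
```

Under this scheme, re-running the same package reproduces the same frames, while a regeneration of one shot changes only that shot's seed. Seeds stay unique across shots as long as the regen counter stays below the stride.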