---
title: Celebrity Deathmatch
emoji: 🥊
colorFrom: red
colorTo: yellow
sdk: gradio
sdk_version: "6.0.0"
app_file: app.py
pinned: true
short_description: Two photos in, a claymation celebrity death match out
tags:
  - track:wood
  - sponsor:modal
  - sponsor:openbmb
  - achievement:offbrand
  - thousand-token-wood
  - off-brand
  - best-demo
  - text-to-video
  - image-generation
  - text-to-speech
  - gradio
  - modal
models:
  - openbmb/MiniCPM-V-2_6
  - black-forest-labs/FLUX.1-schnell
  - Lightricks/LTX-Video
  - Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign
---

![Track: Thousand Token Wood](https://img.shields.io/badge/track-Thousand%20Token%20Wood-7c3aed)
![Badge: Off Brand](https://img.shields.io/badge/badge-Off%20Brand-e0303c)
![Badge: Best Demo](https://img.shields.io/badge/badge-Best%20Demo-f2b030)
![Models < 32B](https://img.shields.io/badge/all%20models-%3C32B-2ea44f)
![Gradio 6](https://img.shields.io/badge/Gradio-6.x-orange)
![Backend: Modal](https://img.shields.io/badge/backend-Modal%20GPU-0b6bcb)

# 🥊 Celebrity Deathmatch

**Upload two photos. Our AI ring director books the brawl** (a claymation fight script, a rendered keyframe reel, a declared winner), then turns it into one continuous fight video with **two ring announcers screaming over the action** and the crowd going wild.

It's MTV's *Celebrity Deathmatch* as an AI-native toy: pure spectacle, zero practical value, maximum fun. (That's the [Thousand Token Wood](#) track in one sentence.)

> ⚠️ **Parody.** Every visual is an AI-generated claymation **caricature** of a
> public figure, for comedic effect. Not real. No real people were harmed.

## ▶️ See it in action

[![Watch the Celebrity Deathmatch demo](https://img.youtube.com/vi/JNl-N7NN8oI/hqdefault.jpg)](https://youtu.be/JNl-N7NN8oI)

🎬 **[Watch the 60-second demo](https://youtu.be/JNl-N7NN8oI)** · 🥊 **[Try it live](https://huggingface.co/spaces/build-small-hackathon/deathmatch)** · 🔗 **[Launch post](https://www.linkedin.com/posts/pawel-pisarski_buildsmall-huggingace-modal-activity-7472333779378921473-6r2w)**

## Why it's worth a look

- **🎙️ Two-announcer voiceover, not a silent clip.** Every beat is called by **Nick** (dry, sarcastic) and **Johnny** (loud, over-excited), two *designed* voices from Qwen3-TTS VoiceDesign, mixed over a bell, a crowd murmur bed, and a winner roar. The fight has a soundtrack, like the real show.
- **🎨 Off-brand UI.** No default Gradio look: a custom claymation-fight art direction with Anton display type, a fire-and-clay palette, tale-of-the-tape stat bars, and an animated winner banner.
- **🧱 Real caricatures, real stakes.** MiniCPM *reads both photos*, invents a fighting persona, signature move, and stat line for each, then choreographs a 5-beat arc and picks a winner.

## How it works

A four-model pipeline, with **every model under the hackathon's 32B cap**:

| Stage | Role | Model | Params |
|------|------|-------|--------|
| 1 | Fight director: reads **both** photos → fight card JSON | **MiniCPM-V-2_6** (OpenBMB) | 8B |
| 2 | Claymation keyframe reel | **FLUX.1-schnell** (BFL) | 12B |
| 3 | Keyframes → continuous fight video (opt-in) | **LTX-Video** (Lightricks) | 2B |
| 3 | Two-announcer voiceover | **Qwen3-TTS VoiceDesign** (Qwen) | 1.7B |

**The entire fight pipeline runs on small models, all ≤12B**, well under the 32B cap. No giant foundation model anywhere: a clever chain of small specialists (read → draw → animate → voice) does the whole show. That's the Build Small ethos.

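For a rough picture of what Stage 1 might hand to the later stages, here is a minimal sketch of a fight card parser. The README only says the card carries fighters (name, persona, signature move, stat line), a 5-beat script with two-announcer commentary, and a winner; every field name below (`signature_move`, `stats`, `action`, `nick`, `johnny`) and the `parse_fight_card` helper are assumptions made for illustration, not the app's actual schema.

```python
import json
from dataclasses import dataclass


@dataclass
class Fighter:
    name: str
    persona: str
    signature_move: str      # hypothetical field name
    stats: dict              # hypothetical: stat label -> score


@dataclass
class Beat:
    action: str              # hypothetical: what happens on screen
    nick: str                # hypothetical: Nick's line for this beat
    johnny: str              # hypothetical: Johnny's line for this beat


@dataclass
class FightCard:
    fighters: list
    beats: list
    winner: str


def parse_fight_card(raw: str) -> FightCard:
    """Parse model output and enforce the shape the README describes."""
    data = json.loads(raw)
    card = FightCard(
        fighters=[Fighter(**f) for f in data["fighters"]],
        beats=[Beat(**b) for b in data["beats"]],
        winner=data["winner"],
    )
    if len(card.fighters) != 2:
        raise ValueError("expected exactly two fighters")
    if len(card.beats) != 5:
        raise ValueError("expected a 5-beat arc")
    if card.winner not in {f.name for f in card.fighters}:
        raise ValueError("winner must be one of the two fighters")
    return card
```

Validating the card up front, before any keyframes are rendered, keeps a malformed model response from costing GPU time in the later stages.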
```
photo A + photo B
   └─▶ Stage 1  MiniCPM-V-2_6  → fight card (fighters, 5 beats, 2-announcer commentary, winner)
        └─▶ Stage 2  FLUX.1-schnell  → 5 claymation keyframes
             └─▶ Stage 3 (Animate)  LTX-Video → chained clips
                                    + Qwen3-TTS → Nick/Johnny voiceover
                                    + crowd SFX + burned-in captions → one MP4
```

The reel (Stages 1–2) is the fast default; **Animate** (Stage 3) is opt-in because it's the GPU-heavy step.

## Tech

- **Frontend:** this Gradio 6 Space (custom CSS, no template look).
- **Backend:** two Modal GPU apps: `deathmatch` (MiniCPM on A10G + FLUX on L40S + ASGI gateway) and `deathmatch-video` (ComfyUI + LTX on H100, Qwen3-TTS on A10G).
- **Audio/video post:** pure `ffmpeg`: xfade stitch, fit-to-beat TTS mixing, synthesized crowd SFX, burned-in captions.
- Wired to the backend via `DEATHMATCH_API_URL` (set automatically on deploy).

## Sponsors we built on

- **🤗 OpenBMB: MiniCPM-V-2_6.** The whole show starts here: one 8B vision-language model reads *both* fighter photos in a single call and returns the entire fight card as JSON (names, personas, signature moves, stat lines, a 5-beat script, two-announcer commentary, and a winner). It *is* the ring director.
- **▲ Modal.** Every GPU stage runs on Modal: MiniCPM (A10G), FLUX (L40S), and the ComfyUI + LTX-Video + Qwen3-TTS video app (H100 / A10G), all behind warm-pooled `@app.cls` containers and an ASGI gateway. The HF Space stays CPU-only and calls Modal over HTTP; one `modal deploy` ships each app.

## Run it locally (no GPU)

```sh
DEATHMATCH_MOCK=1 python app.py
```

Mock mode exercises the **entire UX** on CPU (canned fight card, placeholder keyframe reel, and a stand-in fight video), so you can see the whole flow without burning a single GPU minute.
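As an illustration of how the two environment variables mentioned in this README could interact, here is a minimal sketch of a backend selector. Only the names `DEATHMATCH_MOCK` and `DEATHMATCH_API_URL` come from this document; the `resolve_backend` helper, its return shape, and the rule that mock mode takes precedence are assumptions, and `app.py` may well do this differently.

```python
import os


def resolve_backend(env=None):
    """Return ("mock", None) or ("remote", base_url).

    Hypothetical helper: assumes DEATHMATCH_MOCK=1 wins over any
    configured API URL, which the README does not state.
    """
    env = os.environ if env is None else env
    if env.get("DEATHMATCH_MOCK") == "1":
        return ("mock", None)
    url = env.get("DEATHMATCH_API_URL", "").rstrip("/")
    if not url:
        raise RuntimeError(
            "Set DEATHMATCH_API_URL, or run with DEATHMATCH_MOCK=1 for the CPU-only mock"
        )
    return ("remote", url)
```

Passing a plain dict instead of `os.environ` makes the decision easy to exercise in tests, for example `resolve_backend({"DEATHMATCH_MOCK": "1"})`.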