Modal Inference Results: Fire Boy MiniCPM-V Router
Date: 2026-06-15
Live Endpoint
Modal app: fireboy-vla-router
URL: https://sanjuhs123--fireboy-vla-router.modal.run
GPU: L40S
idle scaledown window: 60 seconds
checkpoint: Fireboy-training-policy-vla/runpod-artifacts/checkpoints/fireboy_minicpm_vla_skill_param_head/minicpm_vla_skill_param_head.pt
model: openbmb/MiniCPM-V-4.6
policy_kind: minicpm_vla_frozen_encoder_skill_param_head_v1
This endpoint serves the promoted frozen MiniCPM-V skill/parameter router with a custom PyTorch action head. It is not served through vLLM because the router needs MiniCPM hidden states plus a custom continuous head, not token generation.
Local Website Wiring
TOYBOX_VLA_ROUTER_URL=https://sanjuhs123--fireboy-vla-router.modal.run
TOYBOX_VLA_ROUTER_ACTION=1
local app: http://127.0.0.1:65373
policy gallery: http://127.0.0.1:65373/fireboy-policy-gallery
The Toy Room path is:
browser command -> /api/pet-action
-> Modal /route
-> MiniCPM-V frozen encoder + skill/parameter head
-> MuJoCo policy registry dispatch
-> Toy Room animation/result JSON
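The second hop of this path (local API to Modal /route) can be sketched as below. This is a minimal illustration, not the project's actual client: the POST method and the "command" payload key are assumptions, since this note does not document the request schema. The base URL is the TOYBOX_VLA_ROUTER_URL value given above.

```python
import json
import urllib.request

# Base URL copied from the TOYBOX_VLA_ROUTER_URL setting in this note.
ROUTER_URL = "https://sanjuhs123--fireboy-vla-router.modal.run"


def build_route_request(command: str, base_url: str = ROUTER_URL) -> urllib.request.Request:
    """Build (but do not send) a request to the router's /route path.

    Assumptions: the endpoint accepts a JSON POST, and the text command
    travels under a key named "command". Both are hypothetical.
    """
    payload = json.dumps({"command": command}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/route",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_route_request("walk to the yellow marker")
    print(req.get_method(), req.full_url)
    # Sending requires network access and a warm (or cold-starting) container:
    # with urllib.request.urlopen(req, timeout=120) as resp:
    #     print(json.load(resp))
```

Building the request separately from sending it makes the payload easy to inspect before any GPU container is woken up.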
Verification Matrix
All commands below were tested through the local website API on 2026-06-15.
The VLA router ran on Modal with device: cuda.
Command: "walk to the yellow marker"
served skill: walk_to
dispatch: registry:walk_to
/api/pet-action: success true
animation: walk

Command: "run around"
served skill: run_around
dispatch: registry:run_around
/api/pet-action: success true
animation: run

Command: "pick up the berry"
served skill: pick_up
dispatch: registry:pick_up
/api/pet-action: success true
animation: hold

Command: "go find berry and eat it"
served skill: find_and_eat_berry
dispatch: registry:find_and_eat_berry
/api/pet-action: success true
animation: hold
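The matrix above can also be written down as data so that each row is checked the same way. The sketch below copies the expected skill and animation from the four entries. The response keys "skill", "dispatch", "success", and "animation" are assumed names for the fields reported here, not a confirmed /api/pet-action schema.

```python
# Expected (served skill, animation) per command, copied from the matrix.
EXPECTED = {
    "walk to the yellow marker": ("walk_to", "walk"),
    "run around": ("run_around", "run"),
    "pick up the berry": ("pick_up", "hold"),
    "go find berry and eat it": ("find_and_eat_berry", "hold"),
}


def check_pet_action(command: str, result: dict) -> list:
    """Return a list of mismatch messages; an empty list means the row passed.

    The keys read from `result` are hypothetical field names.
    """
    skill, animation = EXPECTED[command]
    problems = []
    if result.get("skill") != skill:
        problems.append(f"skill: expected {skill}, got {result.get('skill')}")
    if result.get("dispatch") != f"registry:{skill}":
        problems.append(f"dispatch: expected registry:{skill}, got {result.get('dispatch')}")
    if result.get("success") is not True:
        problems.append("success was not true")
    if result.get("animation") != animation:
        problems.append(f"animation: expected {animation}, got {result.get('animation')}")
    return problems


if __name__ == "__main__":
    # Hand-written sample shaped like the "pick up the berry" row.
    sample = {
        "skill": "pick_up",
        "dispatch": "registry:pick_up",
        "success": True,
        "animation": "hold",
    }
    print(check_pet_action("pick up the berry", sample))
```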
Important Runtime Guard
When the camera frame is blank or generated, the raw neural skill head can become
overconfident toward find_and_eat_berry. The live endpoint therefore returns both
the raw and the served values:
neural_skill: raw MiniCPM-V head prediction
skill: command/scene-stabilized served skill
raw_params: raw continuous head output
params: scene-grounded served parameters
Serving the stabilized values keeps the demo reliable, and returning the raw
values alongside them keeps the override visible. When the browser sends a real
camera frame and full robot state, set force_neural_skill: true on the same
endpoint to inspect the pure neural decision.
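One way to see when the guard stepped in is to compare the raw and served fields of a /route response. The sketch below uses the four field names listed above. The sample response values, and the assumption that parameters arrive as a flat list, are invented for illustration only and are not recorded outputs.

```python
def guard_report(response: dict) -> dict:
    """Compare the raw head output with the served, stabilized values."""
    neural = response.get("neural_skill")
    served = response.get("skill")
    return {
        "neural_skill": neural,
        "served_skill": served,
        # True when the command/scene stabilizer replaced the raw prediction.
        "skill_overridden": neural != served,
        # True when scene grounding changed the continuous parameters.
        "params_adjusted": response.get("raw_params") != response.get("params"),
    }


if __name__ == "__main__":
    # Hypothetical response: the raw head drifts to find_and_eat_berry
    # while the served skill follows the command.
    sample = {
        "neural_skill": "find_and_eat_berry",
        "skill": "walk_to",
        "raw_params": [0.12, -0.40],
        "params": [0.50, 0.25],
    }
    print(guard_report(sample))
```

Logging this report per request would show how often the served skill differs from the neural one, which is useful before relying on force_neural_skill: true.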
Proof Screenshot
Fireboy-training-policy-vla/proofs/modal-vla-router-policy-gallery.png
Repeatable Final Smoke Gate
Run this before submission:
PYTHONPATH=fireboy-vla-physics/src \
fireboy-vla-physics/.venv/bin/python \
fireboy-vla-physics/src/final_vla_demo_smoke.py \
--out Fireboy-training-policy-vla/proofs/final-vla-demo-smoke.json
Latest result:
ok: true
route checks: walk_to, run_around, pick_up, find_and_eat_berry all passed on cuda
pet-action checks: all four commands dispatched through Modal VLA + MuJoCo successfully
registry validation: checked_paths 49, missing_count 0
RunPod pods in proof: []
Proof JSON:
Fireboy-training-policy-vla/proofs/final-vla-demo-smoke.json
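The proof file can be gated in a script rather than read by eye. The sketch below assumes a top-level "ok" key and a "missing_count" key somewhere in the JSON, matching the names in the result summary above. The actual nesting is not documented here, so the helper searches at any depth instead of guessing a path.

```python
import json
from pathlib import Path

PROOF_PATH = Path("Fireboy-training-policy-vla/proofs/final-vla-demo-smoke.json")


def find_values(node, key):
    """Yield every value stored under `key` at any depth of parsed JSON."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield value
            yield from find_values(value, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_values(item, key)


def proof_passed(proof: dict) -> bool:
    """True when "ok" is true and every "missing_count" found is zero.

    Key names are assumptions taken from the summary in this note.
    """
    missing = list(find_values(proof, "missing_count"))
    return proof.get("ok") is True and all(count == 0 for count in missing)


if __name__ == "__main__":
    if PROOF_PATH.exists():
        proof = json.loads(PROOF_PATH.read_text(encoding="utf-8"))
        print("smoke gate passed:", proof_passed(proof))
    else:
        print("proof file not found:", PROOF_PATH)
```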