metadata
title: Asset Harvester
emoji: π
colorFrom: green
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
short_description: Image-to-3D for autonomous-vehicle simulation assets
Asset Harvester
Paper | Project Page | Code | Model | Data
Upload one image of a single object (vehicle, pedestrian, cyclist, or other road object) and get back a complete 3D Gaussian splat asset ready for simulation.
Pipeline
upload ──▶ image guard (optional) ──▶ object segmentation ──▶ recenter + pad
                                                                     │
                                                                     ▼
3D Gaussian splat ◀── TokenGS lifting ◀── multiview diffusion ◀── camera estimation
- Object segmentation (`AH_object_seg_jit.pt`): Mask2Former JIT produces a binary mask of the foreground object at the uploaded image's native resolution.
- Camera estimation (`AH_camera_estimator.safetensors`): predicts camera pose, distance, FOV, and object dimensions (LWH). Shares the C-RADIO backbone with multiview diffusion to avoid loading it twice.
- Multiview diffusion (`AH_multiview_diffusion.safetensors`): SparseViewDiT generates 16 novel orbit views conditioned on the input image.
- TokenGS lifting (`AH_tokengs_lifting.safetensors`): feed-forward 3D Gaussian reconstructor lifts the 16 views to a full 3DGS asset.
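The "recenter + pad" step in the diagram is not described further here. One plausible reading is: crop to the mask's bounding box, then place the crop in the middle of a square canvas. The sketch below illustrates that idea on nested lists. The function name and the `margin` and `fill` parameters are hypothetical and are not taken from the Space's code.

```python
def recenter_and_pad(image, mask, margin=0.1, fill=0):
    """Crop to the mask's bounding box and center the crop on a square canvas.

    image: list of rows of pixel values.
    mask:  same shape as image; truthy entries mark the foreground object.
    margin and fill are illustrative parameters (assumptions).
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    height, width = bottom - top + 1, right - left + 1

    # Square side: the longest bounding-box edge plus a margin on each side.
    pad = int(round(max(height, width) * margin))
    side = max(height, width) + 2 * pad

    # Offsets that place the cropped box in the middle of the canvas.
    off_y = (side - height) // 2
    off_x = (side - width) // 2

    canvas = [[fill] * side for _ in range(side)]
    for y in range(height):
        for x in range(width):
            # Copy foreground pixels only; background stays at `fill`.
            if mask[top + y][left + x]:
                canvas[off_y + y][off_x + x] = image[top + y][left + x]
    return canvas
```

A real implementation would more likely operate on NumPy or PyTorch tensors; plain lists keep the sketch free of dependencies.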
Outputs
- Multiview MP4 (16-frame orbit at 5 fps).
- 3D Gaussian orbit render (MP4).
- Gaussian splat (PLY) ready for simulation engines.
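To sanity-check a downloaded splat before loading it into a simulator, it can help to read the PLY header. The sketch below parses only the standard PLY header keywords (`format`, `element`, `property`, `end_header`) and reports the vertex count and per-vertex property names. It assumes each vertex corresponds to one Gaussian, as is common in 3DGS exports; the exact property layout this Space writes is not documented here.

```python
def read_ply_header(path):
    """Return (format, vertex_count, vertex_property_names) from a PLY header."""
    fmt, vertex_count, props = None, 0, []
    current_element = None
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            parts = raw.decode("ascii").split()
            if not parts:
                continue
            if parts[0] == "end_header":
                break  # stop before the (possibly binary) body
            if parts[0] == "format":
                fmt = parts[1]
            elif parts[0] == "element":
                current_element = parts[1]
                if current_element == "vertex":
                    vertex_count = int(parts[2])
            elif parts[0] == "property" and current_element == "vertex":
                props.append(parts[-1])  # the property name is the last token
    return fmt, vertex_count, props
```

For example, `read_ply_header("asset.ply")` (a hypothetical file name) returns the storage format, the number of Gaussians under the one-vertex-per-Gaussian assumption, and the attribute names stored for each.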
Hardware
Single NVIDIA GPU with compute capability ≥ 8.0 and ≥ 30 GB VRAM. Typical end-to-end runtime: 1-2 minutes per image on A100/H100.
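A quick pre-flight check against these requirements might look like the sketch below. The thresholds come from the text above; reading "30 GB" as decimal gigabytes is an assumption, and both function names are hypothetical. The PyTorch import is deferred so the pure comparison can be used without a GPU.

```python
MIN_CAPABILITY = (8, 0)       # compute capability stated above
MIN_VRAM_BYTES = 30 * 10**9   # assumes "30 GB" means decimal gigabytes


def meets_requirements(capability, total_memory_bytes):
    """Compare a (major, minor) capability and a VRAM size to the stated minimums."""
    return (tuple(capability) >= MIN_CAPABILITY
            and total_memory_bytes >= MIN_VRAM_BYTES)


def check_local_gpu(index=0):
    """Query the local GPU through PyTorch (imported lazily) and apply the check."""
    import torch

    if not torch.cuda.is_available():
        return False
    props = torch.cuda.get_device_properties(index)
    return meets_requirements((props.major, props.minor), props.total_memory)
```

Tuple comparison handles the capability ordering, so `(8, 6)` and `(9, 0)` both pass while `(7, 5)` fails.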
Limitations
- Single-object only: when an image contains multiple distinct subjects, the largest mask is used and the rest are discarded.
- Heavily occluded objects or out-of-distribution subjects (e.g., objects not seen in driving logs) may produce hallucinated geometry.
- Image guard uses `meta-llama/Llama-Guard-3-11B-Vision`; enabling it adds ~20-30 s per run.
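To illustrate the single-object limitation, the sketch below keeps only the largest 4-connected foreground region of a binary mask and zeroes everything else. This is an assumption about what "largest mask" could mean. The Space may instead choose among Mask2Former's per-instance masks, and `largest_component` is a hypothetical helper rather than part of its code.

```python
from collections import deque


def largest_component(mask):
    """Return a mask that keeps only the largest 4-connected foreground region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[sy][sx] = True
            queue, region = deque([(sy, sx)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    out = [[0] * w for _ in range(h)]
    for y, x in best:
        out[y][x] = 1
    return out
```

Under this reading, a smaller second object in the frame is dropped from the mask before any 3D reconstruction happens, which is why cropping the input to one subject tends to be the safer choice.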
Local deployment
docker build --build-arg HF_TOKEN=$HF_TOKEN -t asset-harvester .
docker run --gpus all -e HF_TOKEN=$HF_TOKEN -p 7860:7860 asset-harvester
Checkpoints are downloaded from nvidia/asset-harvester on first run. HF_TOKEN must have access to that repo.
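To pre-fetch the checkpoints outside the container, something like the following sketch should work, assuming the `huggingface_hub` package is installed and the token can read the repo. The helper names are hypothetical, and how the Space itself downloads checkpoints is not shown here. `snapshot_download` is the library's call for fetching a whole repository.

```python
import os

REPO_ID = "nvidia/asset-harvester"  # repo named in the text above


def download_kwargs(env=os.environ):
    """Build snapshot_download arguments, failing early if HF_TOKEN is missing."""
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; it must grant access to " + REPO_ID)
    return {"repo_id": REPO_ID, "token": token}


def fetch_checkpoints(env=os.environ):
    """Download the repo snapshot and return its local path."""
    from huggingface_hub import snapshot_download  # lazy import

    return snapshot_download(**download_kwargs(env))
```

Failing early on a missing token avoids a less obvious authorization error partway through the first run.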
Governing terms
Use of this system is governed by the NVIDIA Open Model License Agreement.