Collections
Collections including paper arxiv:2604.08626
-
WildDet3D: Scaling Promptable 3D Detection in the Wild
Paper • 2604.08626 • Published • 247 -
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details
Paper • 2604.06870 • Published • 43 -
ClawBench: Can AI Agents Complete Everyday Online Tasks?
Paper • 2604.08523 • Published • 264
-
Towards Scalable and Consistent 3D Editing
Paper • 2510.02994 • Published • 6 -
UP2You: Fast Reconstruction of Yourself from Unconstrained Photo Collections
Paper • 2509.24817 • Published • 9 -
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks
Paper • 2510.15019 • Published • 65 -
Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery
Paper • 2510.15869 • Published • 50
-
VOID: Video Object and Interaction Deletion
Paper • 2604.02296 • Published • 56 -
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation
Paper • 2604.18486 • Published • 95 -
WildDet3D: Scaling Promptable 3D Detection in the Wild
Paper • 2604.08626 • Published • 247 -
UniT: Toward a Unified Physical Language for Human-to-Humanoid Policy Learning and World Modeling
Paper • 2604.19734 • Published • 33
-
UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG
Paper • 2510.03663 • Published • 17 -
LLM-guided Hierarchical Retrieval
Paper • 2510.13217 • Published • 21 -
AnyUp: Universal Feature Upsampling
Paper • 2510.12764 • Published • 13 -
katanemo/Arch-Router-1.5B
Text Generation • 2B • Updated • 1.95k • 267