- Astrolabe: Steering Forward-Process Reinforcement Learning for Distilled Autoregressive Video Models
  Paper • 2603.17051 • Published • 109
- Versatile Editing of Video Content, Actions, and Dynamics without Training
  Paper • 2603.17989 • Published • 18
- Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding
  Paper • 2603.19235 • Published • 95
- 3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model
  Paper • 2603.18524 • Published • 58
Collections including paper arxiv:2603.19235
- GigaBrain-0.5M*: a VLA That Learns From World Model-Based Reinforcement Learning
  Paper • 2602.12099 • Published • 62
- When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning
  Paper • 2602.10560 • Published • 31
- G-LNS: Generative Large Neighborhood Search for LLM-Based Automatic Heuristic Design
  Paper • 2602.08253 • Published • 27
- ROCKET: Rapid Optimization via Calibration-guided Knapsack Enhanced Truncation for Efficient Model Compression
  Paper • 2602.11008 • Published • 18

- Apollo: An Exploration of Video Understanding in Large Multimodal Models
  Paper • 2412.10360 • Published • 148
- SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization
  Paper • 2501.01245 • Published • 5
- VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
  Paper • 2501.00599 • Published • 46
- Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks
  Paper • 2501.08326 • Published • 34

- Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling
  Paper • 2507.07982 • Published • 34
- MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh
  Paper • 2508.01242 • Published • 11
- Loc3R-VLM: Language-based Localization and 3D Reasoning with Vision-Language Models
  Paper • 2603.18002 • Published • 14
- Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding
  Paper • 2603.19235 • Published • 95