Yuhao Dong PRO

THUdyh

AI & ML interests

None yet

Recent Activity

authored a paper 24 days ago

From Pixels to Words -- Towards Native One-Vision Models at Scale

upvoted a paper 2 days ago

S-Agent: Spatial Tool-Use Elicits Reasoning for Spatial Intelligence

Paper • 2606.20515 • Published 3 days ago • 34

upvoted a paper 16 days ago

VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding

Paper • 2606.05259 • Published 18 days ago • 39

upvoted 2 papers 24 days ago

GEM: Generative Supervision Helps Embodied Intelligence

Paper • 2605.28548 • Published 25 days ago • 41

From Pixels to Words -- Towards Native One-Vision Models at Scale

Paper • 2605.28820 • Published 25 days ago • 74

upvoted a paper 25 days ago

LongAV-Compass: Towards Unified Evaluation of Minute-Scale Audio-Visual Generation Across T2AV, I2AV, and V2AV

Paper • 2605.26244 • Published 27 days ago • 38

upvoted 3 papers about 1 month ago

Artifact-Bench: Evaluating MLLMs on Detecting and Assessing the Artifacts of AI-Generated Videos

Paper • 2605.18984 • Published May 18 • 22

Self-Distilled Agentic Reinforcement Learning

Paper • 2605.15155 • Published May 14 • 114

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published May 12 • 194

upvoted a paper about 2 months ago

Co-Evolving Policy Distillation

Paper • 2604.27083 • Published Apr 29 • 68

upvoted a collection about 2 months ago

SenseNova-U1

Collection

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 10 items • Updated 9 days ago • 73

upvoted a paper about 2 months ago

SWE-chat: Coding Agent Interactions From Real Users in the Wild

Paper • 2604.20779 • Published Apr 22 • 16

upvoted 7 papers 2 months ago

Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence

Paper • 2604.18292 • Published Apr 20 • 87

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 112

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Paper • 2604.05015 • Published Apr 6 • 237

upvoted 2 papers 3 months ago

Vero: An Open RL Recipe for General Visual Reasoning

Paper • 2604.04917 • Published Apr 6 • 34

FileGram: Grounding Agent Personalization in File-System Behavioral Traces

Paper • 2604.04901 • Published Apr 6 • 40
