VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models Paper • 2606.16140 • Published 3 days ago • 86
EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery Paper • 2606.13662 • Published 7 days ago • 27
Benchmarking AI Agents for Addressing Scientific Challenges Across Scales Paper • 2606.12736 • Published 8 days ago • 3
VideoKR: Towards Knowledge- and Reasoning-Intensive Video Understanding Paper • 2606.05259 • Published 15 days ago • 39
MobileGym: A Verifiable and Highly Parallel Simulation Platform for Mobile GUI Agent Research Paper • 2605.26114 • Published 24 days ago • 64
CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents Paper • 2605.25624 • Published 24 days ago • 34
OpenComputer: Verifiable Software Worlds for Computer-Use Agents Paper • 2605.19769 • Published about 1 month ago • 82
Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems Paper • 2605.04018 • Published May 5 • 41
Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis Paper • 2604.24198 • Published Apr 27 • 23
TexOCR: Advancing Document OCR Models for Compilable Page-to-LaTeX Reconstruction Paper • 2604.22880 • Published Apr 24 • 10
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published Apr 20 • 85
Watch Before You Answer: Learning from Visually Grounded Post-Training Paper • 2604.05117 • Published Apr 6 • 36
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning Paper • 2604.04746 • Published Apr 8 • 73
LLM2Vec-Gen: Generative Embeddings from Large Language Models Paper • 2603.10913 • Published Mar 11 • 44