- HLE-Verified: A Systematic Verification and Structured Revision of Humanity's Last Exam
  Paper • 2602.13964 • Published • 11
- SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via Continuous Integration
  Paper • 2603.03823 • Published • 7
- DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning
  Paper • 2602.16742 • Published • 12
- Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem
  Paper • 2512.24873 • Published • 109
Collections including paper arxiv:2602.16742
- MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods
  Paper • 2601.21821 • Published • 62
- Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text
  Paper • 2601.22975 • Published • 113
- Reinforced Attention Learning
  Paper • 2602.04884 • Published • 30
- LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
  Paper • 2510.19363 • Published • 63

- Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
  Paper • 2506.19290 • Published • 53
- Data Efficacy for Language Model Training
  Paper • 2506.21545 • Published • 11
- Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
  Paper • 2507.04009 • Published • 55
- RefineX: Learning to Refine Pre-training Data at Scale from Expert-Guided Programs
  Paper • 2507.03253 • Published • 19

- GLM-5: from Vibe Coding to Agentic Engineering
  Paper • 2602.15763 • Published • 152
- DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning
  Paper • 2602.16742 • Published • 12
- From Perception to Action: An Interactive Benchmark for Vision Reasoning
  Paper • 2602.21015 • Published • 24
- Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
  Paper • 2603.09906 • Published • 76

- OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation
  Paper • 2601.15369 • Published • 22
- Stable-DiffCoder: Pushing the Frontier of Code Diffusion Large Language Model
  Paper • 2601.15892 • Published • 55
- Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders
  Paper • 2601.16208 • Published • 55
- NAACL: Noise-AwAre Verbal Confidence Calibration for LLMs in RAG Systems
  Paper • 2601.11004 • Published • 30

- DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training
  Paper • 2504.17565 • Published • 2
- AI-MO/NuminaMath-1.5
  Viewer • Updated • 896k • 4.47k • 187
- PrimeIntellect/synthetic-code-understanding
  Viewer • Updated • 60.6k • 32 • 20
- Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data
  Paper • 2507.07095 • Published • 56