Collections
Collections including paper arxiv:2602.19672
-
- Exploring Reasoning Reward Model for Agents
  Paper • 2601.22154 • Published • 24
- Group-Evolving Agents: Open-Ended Self-Improvement via Experience Sharing
  Paper • 2602.04837 • Published • 9
- Agent Skills: A Data-Driven Analysis of Claude Skills for Extending Large Language Model Functionality
  Paper • 2602.08004 • Published • 5
- SEAD: Self-Evolving Agent for Multi-Turn Service Dialogue
  Paper • 2602.03548 • Published • 4
-
- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
  Paper • 2506.01939 • Published • 190
- ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
  Paper • 2511.21689 • Published • 128
- PretrainZero: Reinforcement Active Pretraining
  Paper • 2512.03442 • Published • 50
- DSGym: A Holistic Framework for Evaluating and Training Data Science Agents
  Paper • 2601.16344 • Published • 12
-
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 107
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
  Paper • 2310.11511 • Published • 79
- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 46
-
- Agentic Reasoning for Large Language Models
  Paper • 2601.12538 • Published • 204
- From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence
  Paper • 2511.18538 • Published • 304
- Agent Learning via Early Experience
  Paper • 2510.08558 • Published • 276
- Weak-Driven Learning: How Weak Agents make Strong Agents Stronger
  Paper • 2602.08222 • Published • 290
-
- LLM Agent Operating System
  Paper • 2403.16971 • Published • 73
- On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification
  Paper • 2508.05629 • Published • 189
- Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
  Paper • 2508.01191 • Published • 240
- A Survey of Context Engineering for Large Language Models
  Paper • 2507.13334 • Published • 264