Redesign Mixture-of-Experts Routers with Manifold Power Iteration • Paper • 2606.12397 • Published 11 days ago • 87
Rethinking Continual Experience Internalization for Self-Evolving LLM Agents • Paper • 2606.04703 • Published 18 days ago • 24
AgentProcessBench: Diagnosing Step-Level Process Quality in Tool-Using Agents • Paper • 2603.14465 • Published Mar 15 • 23
Learning beyond Teacher: Generalized On-Policy Distillation with Reward Extrapolation • Paper • 2602.12125 • Published Feb 12 • 67
AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research • Paper • 2602.06540 • Published Feb 6 • 22
DARC: Decoupled Asymmetric Reasoning Curriculum for LLM Evolution • Paper • 2601.13761 • Published Jan 20 • 16
Less Noise, More Voice: Reinforcement Learning for Reasoning via Instruction Purification • Paper • 2601.21244 • Published Jan 29 • 12