Filter, Then Reweight: Rethinking Optimization Granularity in On-Policy Distillation • Paper • 2606.02684 • Published 8 days ago • 16
Domain-Specific Data Synthesis for LLMs via Minimal Sufficient Representation Learning • Paper • 2605.30039 • Published 11 days ago • 18
MLEvolve: A Self-Evolving Framework for Automated Machine Learning Algorithm Discovery • Paper • 2606.06473 • Published 5 days ago • 18
Decentralized Instruction Tuning: Conflict-Aware Splitting and Weight Merging • Paper • 2606.01717 • Published 8 days ago • 21
MIRA: Mid-training Rubric Anchoring for Source-Aware Data Selection • Paper • 2605.30288 • Published 11 days ago • 22
SAAS: Self-Aware Reinforcement Learning for Over-Search Mitigation in Agentic Search • Paper • 2605.29796 • Published 12 days ago • 25
Not All Disagreement Is Learnable: Token Teachability in On-Policy Distillation • Paper • 2605.26844 • Published 14 days ago • 26
Language Models Need Sleep: Learning to Self-Modify and Consolidate Memories • Paper • 2606.03979 • Published 7 days ago • 28
TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration • Paper • 2606.04743 • Published 6 days ago • 40
Trust-Region Behavior Blending for On-Policy Distillation • Paper • 2605.31159 • Published 11 days ago • 65
Mem-π: Adaptive Memory through Learning When and What to Generate • Paper • 2605.21463 • Published 20 days ago • 8
How LoRA Remembers? A Parametric Memory Law for LLM Finetuning • Paper • 2605.30260 • Published 12 days ago • 42
Parameter Efficiency Is Not Memory Efficiency: Rethinking Fine-Tuning for On-Device LLM Adaptation • Paper • 2604.22783 • Published Apr 3 • 1
Self-Pruned Key-Value Attention: Learning When to Write by Predicting Future Utility • Paper • 2605.14037 • Published 27 days ago • 1
NITP: Next Implicit Token Prediction for LLM Pre-training • Paper • 2605.24956 • Published 16 days ago • 35
Rethinking Memory as Continuously Evolving Connectivity • Paper • 2605.28773 • Published 13 days ago • 34