C3I-Pretrain-HF

university

https://huggingface.co/

Recent Activity

bambisheng authored a paper 3 days ago

Qwen-AgentWorld: Language World Models for General Agents

bingyang-lei submitted a paper 26 days ago

Draft-OPD: On-Policy Distillation for Speculative Draft Models

bingyang-lei authored a paper 27 days ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

authored a paper 3 days ago

Qwen-AgentWorld: Language World Models for General Agents

Paper • 2606.24597 • Published 5 days ago • 135

submitted a paper to Daily Papers 26 days ago

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Paper • 2605.29343 • Published May 28 • 36

authored 3 papers 27 days ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published May 13 • 165

$π$-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows

Paper • 2605.14678 • Published May 19 • 108

Draft-OPD: On-Policy Distillation for Speculative Draft Models

Paper • 2605.29343 • Published May 28 • 36

authored a paper about 1 month ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published May 18 • 30

authored 2 papers 10 months ago

A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10, 2025 • 193

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14, 2025 • 97

authored a paper about 1 year ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28, 2025 • 132

authored a paper about 1 year ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22, 2025 • 123

authored a paper over 1 year ago

Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models

Paper • 2503.11224 • Published Mar 14, 2025 • 28

authored a paper over 1 year ago

UltraIF: Advancing Instruction Following from the Wild

Paper • 2502.04153 • Published Feb 6, 2025 • 24

authored a paper over 1 year ago

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3, 2025 • 62