Article Transformers v5: Simple model definitions powering the AI ecosystem +2 lysandre, ArthurZ, cyrilvallez, reach-vb • Dec 1, 2025 • 311
Article 20x Faster TRL Fine-tuning with RapidFire AI +1 kbigdelysh, arunkk09, qgallouedec • Nov 21, 2025 • 27
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29, 2025 • 99
Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL Paper • 2504.15077 • Published Apr 21, 2025 • 16
DAPO: An Open-Source LLM Reinforcement Learning System at Scale Paper • 2503.14476 • Published Mar 18, 2025 • 146
CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era Paper • 2412.18702 • Published Dec 24, 2024 • 8
Article Illustrating Reinforcement Learning from Human Feedback (RLHF) +2 natolambert, LouisCastricato, lvwerra, Dahoas • Dec 9, 2022 • 416