Multi-module GRPO: Composing Policy Gradients and Prompt Optimization for Language Model Programs Paper • 2508.04660 • Published Aug 6, 2025
Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving Paper • 2507.23726 • Published Jul 31, 2025
Pixels, Patterns, but No Poetry: To See The World like Humans Paper • 2507.16863 • Published Jul 21, 2025
ConSens: Assessing context grounding in open-book question answering Paper • 2505.00065 • Published Apr 30, 2025