Reasoning Arena: Trace Tournaments When Verifiable Rewards Fall Short • Paper • 2606.09380 • Published 10 days ago • 8
ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning • Paper • 2406.19741 • Published Jun 28, 2024 • 60
adamxyang/1.4b-policy_preference_data_gold_labelled_with_ref • Viewer • Updated Apr 6, 2024 • 51.4k • 202