OpenCompass

community

https://opencompass.org.cn/

Recent Activity

dongsheng submitted a paper 10 days ago

When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents

dongsheng authored a paper 12 days ago

When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents

ZwwWayne authored a paper 13 days ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Papers

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLM

submitted a paper to Daily Papers 10 days ago

When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents

Paper • 2606.05806 • Published 14 days ago • 23

authored a paper 12 days ago

When Tools Fail: Benchmarking Dynamic Replanning and Anomaly Recovery in LLM Agents

Paper • 2606.05806 • Published 14 days ago • 23

authored 6 papers 13 days ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25, 2025 • 224

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published Oct 9, 2025 • 110

Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

Paper • 2405.05258 • Published May 8, 2024

An Empirical Study of Training State-of-the-Art LiDAR Segmentation Models

Paper • 2405.14870 • Published May 23, 2024

Achieving Olympia-Level Geometry Large Language Model Agent via Complexity Boosting Reinforcement Learning

Paper • 2512.10534 • Published Dec 11, 2025 • 33

OPV: Outcome-based Process Verifier for Efficient Long Chain-of-Thought Verification

Paper • 2512.10756 • Published Dec 11, 2025 • 35

authored a paper 13 days ago

OVO-S-Bench: A Hierarchical Benchmark for Streaming Spatial Intelligence in Multimodal LLMs

Paper • 2606.03890 • Published 16 days ago • 31

authored a paper 13 days ago

Exploring Data Augmentation for Multi-Modality 3D Object Detection

Paper • 2012.12741 • Published Dec 23, 2020

authored a paper 13 days ago

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published Mar 26 • 133

authored 2 papers 13 days ago

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published Mar 26 • 133

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

Paper • 2606.03503 • Published 15 days ago • 25

authored a paper 13 days ago

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

Paper • 2606.03503 • Published 15 days ago • 25

submitted a paper to Daily Papers 14 days ago

ThoughtFold: Folding Reasoning Chains via Introspective Preference Learning

Paper • 2606.03503 • Published 15 days ago • 25

in opencompass/NeedleBench 15 days ago

Fix templating bug in English type: 0 needles

#3 opened 29 days ago by

authored 2 papers 23 days ago

SetCon: Towards Open-Ended Referring Segmentation via Set-Level Concept Prediction

Paper • 2605.20110 • Published 30 days ago • 4

ETCHR: Editing To Clarify and Harness Reasoning

Paper • 2605.23897 • Published 27 days ago • 13

submitted a paper to Daily Papers 24 days ago

ETCHR: Editing To Clarify and Harness Reasoning

Paper • 2605.23897 • Published 27 days ago • 13

authored a paper about 1 month ago

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

Paper • 2605.10912 • Published May 11 • 46