Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill Paper • 2606.03980 • Published 9 days ago • 13
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 15 days ago • 90
Cosmos-Reason1 Collection ⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/cosmos3 • 5 items • Updated 3 days ago • 42
GUI-AIMA: Aligning Intrinsic Multimodal Attention with a Context Anchor for GUI Grounding Paper • 2511.00810 • Published Nov 2, 2025 • 5
Attention-driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models without Fine-Tuning Paper • 2412.10840 • Published Dec 14, 2024 • 1
GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents Paper • 2506.03143 • Published Jun 3, 2025 • 54
LlavaGuard Collection This collection contains the original repos of the LlavaGuard releases • 17 items • Updated Mar 2 • 7