Collections

Discover the best community collections!

Collections including paper arxiv:2605.26102

CircleRadon/InstructSAM-2B

3B • Updated 20 days ago • 60
InstructSAM: Segment Any Instance with Any Instructions

Paper • 2605.26102 • Published 21 days ago • 17
CircleRadon/Inst2Seg

Updated 21 days ago • 881
CircleRadon/Inst2Seg-Bench

Viewer • Updated 21 days ago • 986 • 995

VQ-Seg: Vector-Quantized Token Perturbation for Semi-Supervised Medical Image Segmentation

Paper • 2601.10124 • Published Jan 15 • 4
Urban Socio-Semantic Segmentation with Vision-Language Reasoning

Paper • 2601.10477 • Published Jan 15 • 155
Medical SAM3: A Foundation Model for Universal Prompt-Driven Medical Image Segmentation

Paper • 2601.10880 • Published Jan 15 • 15
SAMTok: Representing Any Mask with Two Words

Paper • 2601.16093 • Published Jan 22 • 44

What matters when building vision-language models?

Paper • 2405.02246 • Published May 3, 2024 • 104
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 91
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark

Paper • 2405.19707 • Published May 30, 2024 • 9
Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations

Paper • 2410.08049 • Published Oct 10, 2024 • 8

InstructSAM: Segment Any Instance with Any Instructions

Paper • 2605.26102 • Published 21 days ago • 17

MiCo: Multi-image Contrast for Reinforcement Visual Reasoning

Paper • 2506.22434 • Published Jun 27, 2025 • 10
VisionThink: Smart and Efficient Vision Language Model via Reinforcement Learning

Paper • 2507.13348 • Published Jul 17, 2025 • 80
RewardDance: Reward Scaling in Visual Generation

Paper • 2509.08826 • Published Sep 10, 2025 • 73
Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs

Paper • 2510.18876 • Published Oct 21, 2025 • 37
