Collections

Collections including paper arxiv:2605.27358

MobileMoE: Scaling On-Device Mixture of Experts

Paper • 2605.27358 • Published 6 days ago • 12

LLM Architectures

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 133
GLM-5: from Vibe Coding to Agentic Engineering

Paper • 2602.15763 • Published Feb 17 • 150
Believe Your Model: Distribution-Guided Confidence Calibration

Paper • 2603.03872 • Published Mar 4 • 40
OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Paper • 2604.04707 • Published Apr 6 • 203

Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free

Paper • 2410.10814 • Published Oct 14, 2024 • 51
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published Feb 24, 2025 • 33
Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs

Paper • 2506.14731 • Published Jun 17, 2025 • 8
SlimMoE: Structured Compression of Large MoE Models via Expert Slimming and Distillation

Paper • 2506.18349 • Published Jun 23, 2025 • 13

NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation

Paper • 2605.10813 • Published 21 days ago • 16
FashionChameleon: Towards Real-Time and Interactive Human-Garment Video Customization

Paper • 2605.15824 • Published 17 days ago • 64
Memory-Efficient Looped Transformer: Decoupling Compute from Memory in Looped Language Models

Paper • 2605.07721 • Published 24 days ago • 29
MAP: A Map-then-Act Paradigm for Long-Horizon Interactive Agent Reasoning

Paper • 2605.13037 • Published 19 days ago • 8

Star Attention: Efficient LLM Inference over Long Sequences

Paper • 2411.17116 • Published Nov 26, 2024 • 53
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens

Paper • 2603.23516 • Published Mar 6 • 50
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published Apr 6 • 114
Let ViT Speak: Generative Language-Image Pre-training

Paper • 2605.00809 • Published May 1 • 33
