-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 156 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 14 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 59 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
Collections
Discover the best community collections!
Collections including paper arxiv:2507.20984
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 212 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 256 -
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Paper • 2507.06261 • Published • 67 -
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58
-
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58 -
tencent/Hunyuan-0.5B-Instruct
Text Generation • 0.5B • Updated • 687 • 58 -
IndexTeam/Index-1.9B-Chat-GGUF
2B • Updated • 109 • 26 -
YannQi/R-4B
Image-Text-to-Text • 5B • Updated • 197k • 182
-
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 73 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 213
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 156 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 14 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 59 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
-
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
Paper • 2508.07785 • Published • 30 -
MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs
Paper • 2508.05257 • Published • 13 -
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58 -
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper • 2506.07900 • Published • 99
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 156 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 14 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 59 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
-
Self-Rewarding Language Models
Paper • 2401.10020 • Published • 156 -
Orion-14B: Open-source Multilingual Large Language Models
Paper • 2401.12246 • Published • 14 -
MambaByte: Token-free Selective State Space Model
Paper • 2401.13660 • Published • 59 -
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper • 2401.13601 • Published • 47
-
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper • 2508.06471 • Published • 212 -
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
Paper • 2507.01006 • Published • 256 -
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities
Paper • 2507.06261 • Published • 67 -
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58
-
Grove MoE: Towards Efficient and Superior MoE LLMs with Adjugate Experts
Paper • 2508.07785 • Published • 30 -
MoBE: Mixture-of-Basis-Experts for Compressing MoE-based LLMs
Paper • 2508.05257 • Published • 13 -
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58 -
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper • 2506.07900 • Published • 99
-
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58 -
tencent/Hunyuan-0.5B-Instruct
Text Generation • 0.5B • Updated • 687 • 58 -
IndexTeam/Index-1.9B-Chat-GGUF
2B • Updated • 109 • 26 -
YannQi/R-4B
Image-Text-to-Text • 5B • Updated • 197k • 182
-
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
Paper • 2507.20984 • Published • 58 -
Open Data Synthesis For Deep Research
Paper • 2509.00375 • Published • 73 -
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Paper • 2509.03867 • Published • 213