-
Continuous Latent Diffusion Language Model
Paper • 2605.06548 • Published • 85 -
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 233 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 158 -
Pretraining Language Models to Ponder in Continuous Space
Paper • 2505.20674 • Published • 3
Collections
Discover the best community collections!
Collections including paper arxiv:2606.30626
-
Towards Scalable Pre-training of Visual Tokenizers for Generation
Paper • 2512.13687 • Published • 108 -
MMGR: Multi-Modal Generative Reasoning
Paper • 2512.14691 • Published • 121 -
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Paper • 2512.23447 • Published • 100 -
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
Paper • 2512.23576 • Published • 66
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 770 • 99 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 41 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 98 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 89
-
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
Paper • 2506.19697 • Published • 45 -
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning
Paper • 2509.23873 • Published • 68 -
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
Paper • 2510.00515 • Published • 41 -
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights
Paper • 2509.22944 • Published • 80
-
Continuous Latent Diffusion Language Model
Paper • 2605.06548 • Published • 85 -
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 233 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 158 -
Pretraining Language Models to Ponder in Continuous Space
Paper • 2505.20674 • Published • 3
-
Towards Scalable Pre-training of Visual Tokenizers for Generation
Paper • 2512.13687 • Published • 108 -
MMGR: Multi-Modal Generative Reasoning
Paper • 2512.14691 • Published • 121 -
Coupling Experts and Routers in Mixture-of-Experts via an Auxiliary Loss
Paper • 2512.23447 • Published • 100 -
LiveTalk: Real-Time Multimodal Interactive Video Diffusion via Improved On-Policy Distillation
Paper • 2512.23576 • Published • 66
-
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models
Paper • 2506.19697 • Published • 45 -
Winning the Pruning Gamble: A Unified Approach to Joint Sample and Token Pruning for Efficient Supervised Fine-Tuning
Paper • 2509.23873 • Published • 68 -
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
Paper • 2510.00515 • Published • 41 -
SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights
Paper • 2509.22944 • Published • 80
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 770 • 99 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 41 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 98 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 89