- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 53
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 58
- VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
  Paper • 2403.00522 • Published • 47
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 24
Collections
Discover the best community collections!
Collections including paper arxiv:2412.19437
- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 21
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 11
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 12
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 48
- Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
  Paper • 2402.14848 • Published • 19
- Teaching Large Language Models to Reason with Reinforcement Learning
  Paper • 2403.04642 • Published • 48
- How Far Are We from Intelligent Visual Deductive Reasoning?
  Paper • 2403.04732 • Published • 21
- Learning to Reason and Memorize with Self-Notes
  Paper • 2305.00833 • Published • 5