TrackCraft3R: Repurposing Video Diffusion Transformers for Dense 3D Tracking Paper • 2605.12587 • Published 23 days ago • 37
From Skill Text to Skill Structure: The Scheduling-Structural-Logical Representation for Agent Skills Paper • 2604.24026 • Published Apr 27 • 21
MOSS-Audio Collection An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 7 items • Updated May 2 • 62
💧 LFM2.5 Collection Collection of post-trained and base LFM2.5 models. • 33 items • Updated 6 days ago • 145
MolmoWeb: Open Visual Web Agent and Open Data for the Open Web Paper • 2604.08516 • Published Apr 9 • 44
Gemma 4 Collection Gemma 4 is Google's new model family including E2B, E4B, 26B-A4B, and 31B. • 31 items • Updated about 3 hours ago • 200
Article Welcome Gemma 4: Frontier multimodal intelligence on device merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 903
Trinity Collection Collection of Arcee AI models in the Trinity family • 14 items • Updated Mar 25 • 30
Flagship model quants Collection Quantized versions of Arcee's latest flagship models • 11 items • Updated Dec 27, 2024 • 8