SANA-WM: Efficient Minute-Scale World Modeling with Hybrid Linear Diffusion Transformer Paper • 2605.15178 • Published May 14 • 87
CausalCine: Real-Time Autoregressive Generation for Multi-Shot Video Narratives Paper • 2605.12496 • Published May 12 • 29
TextLDM: Language Modeling with Continuous Latent Diffusion Paper • 2605.07748 • Published May 8 • 26
DialogueSidon Demo: Separate two speakers from an audio or video recording