Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
- Website
- Community
- Solutions
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2503.10568

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 19
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 64
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 78

Image Generation

Image Generation

Causal Diffusion Transformers for Generative Modeling

Paper • 2412.12095 • Published Dec 16, 2024 • 23
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Paper • 2412.09619 • Published Dec 12, 2024 • 32
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 48
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Paper • 2412.15213 • Published Dec 19, 2024 • 28

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 19
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 64
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 78

OCTO+: A Suite for Automatic Open-Vocabulary Object Placement in Mixed Reality

Paper • 2401.08973 • Published Jan 17, 2024 • 1
Autoregressive Image Generation with Randomized Parallel Decoding

Paper • 2503.10568 • Published Mar 13, 2025 • 9
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Paper • 2503.10582 • Published Mar 13, 2025 • 25
meta-llama/Llama-2-7b

Text Generation • Updated Apr 17, 2024 • 109 • 4.5k

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 30
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 47
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 54
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 11

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 19
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 64
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 78

OCTO+: A Suite for Automatic Open-Vocabulary Object Placement in Mixed Reality

Paper • 2401.08973 • Published Jan 17, 2024 • 1
Autoregressive Image Generation with Randomized Parallel Decoding

Paper • 2503.10568 • Published Mar 13, 2025 • 9
VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Paper • 2503.10582 • Published Mar 13, 2025 • 25
meta-llama/Llama-2-7b

Text Generation • Updated Apr 17, 2024 • 109 • 4.5k

Image Generation

Image Generation

Causal Diffusion Transformers for Generative Modeling

Paper • 2412.12095 • Published Dec 16, 2024 • 23
SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Paper • 2412.09619 • Published Dec 12, 2024 • 32
DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 48
Flowing from Words to Pixels: A Framework for Cross-Modality Evolution

Paper • 2412.15213 • Published Dec 19, 2024 • 28

FLAME: Factuality-Aware Alignment for Large Language Models

Paper • 2405.01525 • Published May 2, 2024 • 30
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 47
Transformers Can Do Arithmetic with the Right Embeddings

Paper • 2405.17399 • Published May 27, 2024 • 54
EasyAnimate: A High-Performance Long Video Generation Method based on Transformer Architecture

Paper • 2405.18991 • Published May 29, 2024 • 11

Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis

Paper • 2401.09048 • Published Jan 17, 2024 • 10
Improving fine-grained understanding in image-text pre-training

Paper • 2401.09865 • Published Jan 18, 2024 • 19
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Paper • 2401.10891 • Published Jan 19, 2024 • 64
Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Paper • 2401.13627 • Published Jan 24, 2024 • 78

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs