Cosmos3-Nano: NF4 4-bit Pre-Quantized Transformer

Pre-quantized NF4 (4-bit, double-quantized) version of NVIDIA's nvidia/Cosmos3-Nano omnimodal world model, created with bitsandbytes. Only the large Cosmos3OmniTransformer is quantized; the VAE and the text/sound tokenizers are bundled unchanged at bf16, so the repo is self-contained and fully drop-in.

This loads in seconds with no runtime quantization pass. Quantizing the bf16 original to NF4 on the fly takes minutes on every load; this repo does that work once, ahead of time.

Key Details

| Property | Value |
|---|---|
| Repo size | 11 GB (vs ~34 GB bf16) |
| Quantized component | transformer: 8.3 GB NF4 (vs ~32 GB bf16) |
| Quantization | NF4 (bitsandbytes), double quantization, bnb_4bit_compute_dtype=bfloat16 |
| Modes | text-to-image, text-to-video, image-to-video (+ optional sound) |
| Base params | 16B (omnimodal) |
| VRAM (loaded) | ~11 GB |
| Source weights | nvidia/Cosmos3-Nano (bf16) |
| Tested on | NVIDIA GB10 (DGX Spark) |
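
The transformer sizes in the table can be sanity-checked with a little arithmetic. The sketch below is an estimate, not a measurement of this repo: it assumes all 16B parameters sit in quantized linear layers, and it assumes the usual bitsandbytes block sizes (64 weights per absmax block, 256 absmax values per second-level block for double quantization). Layers left unquantized, such as norms and biases, would shift the real number slightly.

```python
def nf4_bits_per_param(block_size=64, double_quant=True, dq_block_size=256):
    """Approximate storage cost of one NF4 weight, in bits.

    Each weight takes 4 bits, plus a share of the per-block scaling constants.
    Block sizes are assumed defaults, not values read from this repo.
    """
    if double_quant:
        # 8-bit quantized absmax per block, plus an fp32 constant per dq block
        overhead = 8 / block_size + 32 / (block_size * dq_block_size)
    else:
        # one fp32 absmax per block
        overhead = 32 / block_size
    return 4 + overhead


def weight_gb(n_params, bits_per_param):
    """Weight storage in decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 16e9  # "Base params 16B" from the table above

bf16_gb = weight_gb(N_PARAMS, 16)
nf4_gb = weight_gb(N_PARAMS, nf4_bits_per_param())

print(f"bf16: ~{bf16_gb:.1f} GB")
print(f"NF4 : ~{nf4_gb:.1f} GB")
```

Under these assumptions the estimate comes out near 32 GB for bf16 and a little over 8 GB for double-quantized NF4, which is in line with the figures listed for the transformer.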

Usage

Requires a diffusers build with Cosmos 3 support (currently from source) plus bitsandbytes. The NF4 config is embedded: do not pass a quantization_config, and do not call .to(dtype) on a 4-bit model.

pip install "git+https://github.com/huggingface/diffusers.git" bitsandbytes accelerate

import torch
from diffusers import Cosmos3OmniPipeline

pipe = Cosmos3OmniPipeline.from_pretrained(
    "SanDiegoDude/Cosmos3-Nano-nf4",
    torch_dtype=torch.bfloat16,
    enable_safety_checker=False,  # skips the optional cosmos_guardrail dependency
).to("cuda")

result = pipe("A small warehouse robot beside a blue box, clean studio lighting.")
frames = result.video[0]          # text-to-image returns a single frame
frames[0].save("out.png")

ComfyUI

A turnkey loader and T2I / T2V / I2V nodes are available in scg-Cosmos3. The loader auto-detects this pre-quantized layout and skips the re-quantization pass.
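
One way a loader could detect a pre-quantized layout is to look for an embedded quantization block in the component's config.json. The sketch below is illustrative only and is not the scg-Cosmos3 implementation. It assumes the config uses the key names bitsandbytes-quantized checkpoints commonly carry (`quantization_config`, `load_in_4bit`, `bnb_4bit_quant_type`), and the helper names and the example path are hypothetical.

```python
import json
from pathlib import Path


def extract_nf4_config(config):
    """Return the embedded NF4 quantization dict, or None if absent.

    `config` is a parsed config.json. Key names are assumed, not confirmed
    against this repo's files.
    """
    qcfg = config.get("quantization_config")
    if not isinstance(qcfg, dict):
        return None
    is_nf4 = (
        qcfg.get("load_in_4bit") is True
        and qcfg.get("bnb_4bit_quant_type") == "nf4"
    )
    return qcfg if is_nf4 else None


def read_component_config(component_dir):
    """Load config.json from a component folder (hypothetical layout)."""
    path = Path(component_dir) / "config.json"
    return json.loads(path.read_text(encoding="utf-8"))


# Example with an in-memory config shaped like the settings in Key Details.
example = {
    "quantization_config": {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_use_double_quant": True,
        "bnb_4bit_compute_dtype": "bfloat16",
    }
}

if extract_nf4_config(example) is not None:
    print("pre-quantized NF4 detected: skip runtime quantization")
else:
    print("no embedded NF4 config: quantize on load")
```

With a local snapshot, the same check would be `extract_nf4_config(read_component_config("path/to/snapshot/transformer"))`, where the folder name is an assumption about the repo layout.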

License

Released under NVIDIA's OpenMDW 1.1 License, inherited from the base model. Quantization only changes the weight encoding.
