Cosmos3-Nano — NF4 4-bit Pre-Quantized Transformer

Pre-quantized NF4 (4-bit, double-quantized) version of NVIDIA's nvidia/Cosmos3-Nano omnimodal world model, created with bitsandbytes. Only the large Cosmos3OmniTransformer is quantized; the VAE and the text/sound tokenizers are bundled unchanged at bf16, so the repo is self-contained and fully drop-in.

The model loads in seconds because no quantization pass runs at load time. Quantizing the bf16 original to NF4 on the fly takes minutes on every load; this repo does that work once and ships the result.
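For readers unfamiliar with NF4, the idea can be sketched in a few lines of pure Python: each block of weights is scaled by its absolute maximum and every value is snapped to the nearest of 16 fixed levels. This is a simplified illustration, not the bitsandbytes kernels that produced this repo. The level values below are the NF4 codebook as commonly reproduced from bitsandbytes and should be treated as illustrative; the function names are hypothetical.

```python
# Illustrative NF4 codebook (16 levels in [-1, 1]); values assumed from
# the commonly published bitsandbytes table, not read from this repo.
NF4_LEVELS = [
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
]


def quantize_block(block):
    """Return (4-bit codes, absmax scale) for one block of floats."""
    # Fall back to 1.0 so an all-zero block does not divide by zero.
    absmax = max(abs(x) for x in block) or 1.0
    codes = [
        min(range(16), key=lambda i: abs(NF4_LEVELS[i] - x / absmax))
        for x in block
    ]
    return codes, absmax


def dequantize_block(codes, absmax):
    """Map 4-bit codes back to approximate float values."""
    return [NF4_LEVELS[c] * absmax for c in codes]


block = [0.5, -1.0, 0.0, 0.25]
codes, absmax = quantize_block(block)
restored = dequantize_block(codes, absmax)
```

Only the codes (4 bits each) and one scale per block need to be stored. Double quantization additionally compresses those per-block scales.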

Key Details

| Property | Value |
| --- | --- |
| Repo size | 11 GB (vs ~34 GB bf16) |
| Quantized component | `transformer`: 8.3 GB NF4 (vs ~32 GB bf16) |
| Quantization | NF4 (bitsandbytes), double quantization, `bnb_4bit_compute_dtype=bfloat16` |
| Modes | text-to-image, text-to-video, image-to-video (+ optional sound) |
| Base params | 16B (omnimodal) |
| VRAM (loaded) | ~11 GB |
| Source weights | nvidia/Cosmos3-Nano (bf16) |
| Tested on | NVIDIA GB10 (DGX Spark) |
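The transformer sizes in the table can be sanity-checked with rough arithmetic. The sketch below assumes a 64-weight quantization block and a 256-entry block for the second-level (double) quantization of the scales, which are believed to be the bitsandbytes defaults but are not confirmed for this repo. It also assumes all 16B nominal parameters are quantized, which overstates things slightly, since some layers typically stay in bf16.

```python
def nf4_bits_per_param(block_size=64, double_quant=True, dq_block_size=256):
    """Approximate storage cost per weight for blockwise 4-bit quantization."""
    if not double_quant:
        # 4-bit code plus one fp32 scale shared by each block.
        return 4 + 32 / block_size
    # 4-bit code, an 8-bit scale per block, and an fp32 scale-of-scales
    # shared by dq_block_size blocks.
    return 4 + 8 / block_size + 32 / (block_size * dq_block_size)


def size_gb(n_params, bits_per_param):
    """Convert a parameter count and bit width to decimal gigabytes."""
    return n_params * bits_per_param / 8 / 1e9


n_params = 16e9  # nominal parameter count from the table
bf16_gb = size_gb(n_params, 16)                   # 32.0
nf4_gb = size_gb(n_params, nf4_bits_per_param())  # about 8.25

print(f"bf16: {bf16_gb:.1f} GB, NF4 + double quant: {nf4_gb:.2f} GB")
```

Under these assumptions the estimate lands close to the 8.3 GB and ~32 GB figures listed for the `transformer` component.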

Usage

Requires a diffusers build with Cosmos 3 support (currently only available from source), plus bitsandbytes. The NF4 config is embedded in the checkpoint, so do not pass a quantization_config, and do not call .to(dtype) on the 4-bit model.

pip install "git+https://github.com/huggingface/diffusers.git" bitsandbytes accelerate

import torch
from diffusers import Cosmos3OmniPipeline

pipe = Cosmos3OmniPipeline.from_pretrained(
    "SanDiegoDude/Cosmos3-Nano-nf4",
    torch_dtype=torch.bfloat16,
    enable_safety_checker=False,  # skips the optional cosmos_guardrail dependency
).to("cuda")

result = pipe("A small warehouse robot beside a blue box, clean studio lighting.")
frames = result.video[0]          # text-to-image returns a single frame
frames[0].save("out.png")
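To confirm that a local copy really carries the embedded NF4 config mentioned above, one option is to inspect the component's config.json. This is a minimal sketch: the key names follow the way transformers and diffusers usually serialize a BitsAndBytesConfig, the `transformer/config.json` location follows the standard diffusers layout, and neither has been verified against this repo. The helper names and the example dict are hypothetical.

```python
import json
from pathlib import Path


def is_prequantized_nf4(config: dict) -> bool:
    """True if a component config declares bitsandbytes 4-bit NF4 weights."""
    quant = config.get("quantization_config") or {}
    return bool(quant.get("load_in_4bit")) and quant.get("bnb_4bit_quant_type") == "nf4"


def check_component(repo_dir: str, component: str = "transformer") -> bool:
    """Read <repo_dir>/<component>/config.json (assumed layout) and check it."""
    path = Path(repo_dir) / component / "config.json"
    return is_prequantized_nf4(json.loads(path.read_text()))


# Hypothetical config fragment, shaped like a serialized BitsAndBytesConfig.
example_config = {
    "quantization_config": {
        "quant_method": "bitsandbytes",
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_use_double_quant": True,
        "bnb_4bit_compute_dtype": "bfloat16",
    }
}

print(is_prequantized_nf4(example_config))
```

If the check passes, the loader picks the quantization settings up from the checkpoint, which is why passing a separate quantization_config is unnecessary.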

ComfyUI

A turnkey loader and T2I / T2V / I2V nodes are available in scg-Cosmos3. The loader auto-detects this pre-quantized layout and skips the re-quantization pass.

Related Repos

Original model (bf16, source): nvidia/Cosmos3-Nano
64B image variant (NF4): SanDiegoDude/Cosmos3-Super-Text2Image-nf4

License

Released under NVIDIA's OpenMDW 1.1 License, inherited from the base model. Quantization only changes the weight encoding.
