How to use from Docker Model Runner:
docker model run hf.co/ddh0/gemma-4-it-GGUF:

These are miscellaneous GGUF quantizations of Google's instruction-tuned Gemma 4 series of models.

For more information about Gemma, refer to the original model cards.

The chat template baked into these GGUFs is technically outdated. However, inference in llama.cpp should still work as expected, thanks to these fixes:

For the latest official chat template, refer to the original model repo.
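
If you would rather run with the latest official template than rely on the one embedded in the GGUF, one option is to save that template to a local file and pass it to llama.cpp explicitly. The sketch below only assembles the `llama-server` command line; it does not download anything or start a server. The file names `gemma-4-it.gguf` and `chat_template.jinja` are hypothetical placeholders, and the helper `build_llama_server_command` is illustrative, not part of llama.cpp. It assumes a llama.cpp build recent enough to support the `--jinja` and `--chat-template-file` options.

```python
import shlex


def build_llama_server_command(model_path, template_path, port=8080):
    """Build the argv list for llama-server with an overridden chat template.

    model_path and template_path are local file paths chosen by the caller
    (hypothetical names in the example below).
    """
    return [
        "llama-server",
        "-m", str(model_path),                        # GGUF file to load
        "--jinja",                                    # use the Jinja template engine
        "--chat-template-file", str(template_path),   # override the embedded template
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Placeholder file names; substitute the quantization and template you downloaded.
    cmd = build_llama_server_command("gemma-4-it.gguf", "chat_template.jinja")
    print(shlex.join(cmd))
```

Printing the command rather than executing it lets you inspect it first, or pass the list to `subprocess.run` once the paths are correct.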

Format: GGUF
Model size: 12B params
Architecture: gemma4