How to use from the Transformers library
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("ggml-org/granite-4.0-h-small-Q8_0-GGUF", dtype="auto")

ggml-org/granite-4.0-h-small-Q8_0-GGUF

This model was converted to GGUF format from ibm-granite/granite-4.0-h-small using llama.cpp via ggml.ai's GGUF-my-repo space. Refer to the original model card for more details on the model.

Downloads last month: 44
Format: GGUF
Model size: 32B params
Architecture: granitehybrid
Quantization: 8-bit
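As a rough guide to what the 8-bit quantization means for storage, the sketch below estimates the weight size of a Q8_0 file. It assumes the standard ggml Q8_0 layout (blocks of 32 weights, each stored as 32 int8 values plus one fp16 scale, so 34 bytes per block) and that every tensor is stored as Q8_0. The parameter count is the rounded "32B" figure listed above, and the function names are hypothetical, chosen only for illustration. Real files typically keep some tensors at higher precision and add metadata, so the actual size will differ somewhat, and runtime memory (KV cache, buffers) is not included.

```python
# Rough size estimate for a Q8_0 GGUF file (a sketch, not an exact figure).
# Assumption: every tensor uses the Q8_0 block layout described below.

Q8_0_BLOCK_WEIGHTS = 32       # weights per Q8_0 block
Q8_0_BLOCK_BYTES = 2 + 32     # one fp16 scale (2 bytes) + 32 int8 values


def q8_0_bits_per_weight() -> float:
    """Effective bits stored per weight, including the per-block scale."""
    return Q8_0_BLOCK_BYTES * 8 / Q8_0_BLOCK_WEIGHTS


def estimate_q8_0_size_gb(n_params: float) -> float:
    """Approximate weight storage in GB (10^9 bytes) for n_params weights."""
    return n_params * q8_0_bits_per_weight() / 8 / 1e9


if __name__ == "__main__":
    n_params = 32e9  # rounded parameter count from the model page
    print(f"bits per weight: {q8_0_bits_per_weight()}")
    print(f"estimated weight size: {estimate_q8_0_size_gb(n_params):.1f} GB")
```

Under these assumptions the estimate comes to 8.5 bits per weight, which is useful mainly for judging whether the file is likely to fit in available RAM or VRAM before downloading it.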


Model tree for ggml-org/granite-4.0-h-small-Q8_0-GGUF

Quantized (36): this model