
Gemma 4 31B Larkspur v1 - Q4_K_M GGUF

Q4_K_M quantization of trashpanda-org/gemma-4-31b-larkspur-v1.

Quant Details

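| Property           | Value                                  |
|--------------------|----------------------------------------|
| Source             | trashpanda-org/gemma-4-31b-larkspur-v1 |
| Quant              | Q4_K_M (4.83 BPW)                      |
| Size               | ~18 GB                                 |
| Format             | GGUF (llama.cpp)                       |
| Original Precision | bf16                                   |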
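
The size listed under Quant Details can be sanity-checked from the bits-per-weight figure. The sketch below is a rough estimate only: it assumes the nominal 31B parameter count from the model name, counts GB as 10^9 bytes, and ignores GGUF metadata. `estimate_gb` is a hypothetical helper name, not part of llama.cpp.

```shell
# Estimate weight size in GB (10^9 bytes).
#   $1 = parameter count in billions (assumed nominal value)
#   $2 = average bits per weight
estimate_gb() {
  awk -v p="$1" -v bpw="$2" 'BEGIN { printf "%.1f\n", p * bpw / 8 }'
}

estimate_gb 31 4.83   # Q4_K_M at 4.83 BPW -> prints 18.7
estimate_gb 31 16     # bf16 at 16 BPW     -> prints 62.0
```

Because 4.83 BPW is an average across tensors quantized at different precisions, the real file size can differ slightly from this estimate.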

Usage

Load with any llama.cpp-compatible runtime (llama.cpp, KoboldCpp, Ollama, LM Studio, etc.):

llama-cli -m gemma-4-31b-larkspur-v1-Q4_K_M.gguf -p "Your prompt here"
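
A slightly fuller invocation might check the download before loading it. The sketch below assumes `llama-cli` is on your PATH; `is_gguf` and `run_prompt` are hypothetical helper names, and the values passed to `-n`, `-c`, and `-ngl` are illustrative and should be tuned to your hardware.

```shell
# Return 0 if the file starts with the 4-byte GGUF magic ("GGUF").
is_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Check the file, then run a single prompt through llama-cli.
#   $1 = path to the .gguf file, $2 = prompt text
run_prompt() {
  model="$1"
  prompt="$2"
  if ! is_gguf "$model"; then
    echo "not a GGUF file: $model" >&2
    return 1
  fi
  # -n caps generated tokens, -c sets the context size,
  # -ngl sets how many layers to offload to the GPU.
  llama-cli -m "$model" -p "$prompt" -n 256 -c 4096 -ngl 99
}
```

For example, `run_prompt gemma-4-31b-larkspur-v1-Q4_K_M.gguf "Your prompt here"`. A truncated or HTML error-page download fails the magic check instead of producing a confusing loader error.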

Notes

  • Quantized from the bf16 source weights using llama.cpp's convert_hf_to_gguf.py → llama-quantize
  • Q4_K_M offers a good balance of quality and size at ~31% of the bf16 size
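
The two-step pipeline in the notes can be sketched roughly as follows. This is not the exact command line used for this upload: it assumes you run it from a llama.cpp checkout (where `convert_hf_to_gguf.py` lives), that `llama-quantize` is on your PATH, and that your llama.cpp build supports this architecture. `run` and `quantize_q4_k_m` are hypothetical helper names, and the paths are placeholders.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Convert a Hugging Face checkpoint to a bf16 GGUF, then quantize it to Q4_K_M.
#   $1 = directory with the source weights, $2 = output file name prefix
quantize_q4_k_m() {
  src_dir="$1"
  name="$2"
  run python convert_hf_to_gguf.py "$src_dir" --outtype bf16 --outfile "${name}-bf16.gguf"
  run llama-quantize "${name}-bf16.gguf" "${name}-Q4_K_M.gguf" Q4_K_M
}
```

Running `DRY_RUN=1 quantize_q4_k_m ./gemma-4-31b-larkspur-v1 gemma-4-31b-larkspur-v1` prints both commands without executing them, which is a cheap way to review paths before starting a long conversion.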

GGUF

Model size: 31B params
Architecture: gemma4