---
base_model: OBLITERATUS/gemma-4-E4B-it-OBLITERATED
pipeline_tag: text-generation
tags:
- gguf
- llama-cpp
- quantized
- gemma
---

# gemma-4-E4B-it-OBLITERATED - GGUF

These are quantized GGUF files for [OBLITERATUS/gemma-4-E4B-it-OBLITERATED](https://huggingface.co/OBLITERATUS/gemma-4-E4B-it-OBLITERATED).

## Available Quantizations

The following quantizations are provided to suit different memory and performance requirements:

| Filename | Quant Type | Description |
| :--- | :--- | :--- |
| `gemma-4-E4B-it-obliterated-Q3_K_M.gguf` | **Q3_K_M** | Very small, with substantial quality loss. Good for extreme low-VRAM scenarios. |
| `gemma-4-E4B-it-obliterated-Q4_0.gguf` | **Q4_0** | Legacy format. Fast, but generally superseded by K-quants. |
| `gemma-4-E4B-it-obliterated-Q4_K_M.gguf` | **Q4_K_M** | **Recommended.** Excellent balance of size, speed, and minimal quality loss. |
| `gemma-4-E4B-it-obliterated-Q5_0.gguf` | **Q5_0** | Legacy format. Slightly higher quality and larger than Q4_0. |
| `gemma-4-E4B-it-obliterated-Q5_K_M.gguf` | **Q5_K_M** | High quality. Recommended if you have enough RAM/VRAM to spare over Q4_K_M. |
| `gemma-4-E4B-it-obliterated-Q6_K.gguf` | **Q6_K** | Very high quality. Near-perfect recreation of the original unquantized model. |
| `gemma-4-E4B-it-obliterated-Q8_0.gguf` | **Q8_0** | Extremely high quality. Almost indistinguishable from fp16, but requires significant memory. |

## How to Use with llama.cpp

Once you have downloaded and compiled `llama.cpp`, you can run this model from the command line.

**Basic CLI usage:**

```bash
./llama-cli -m gemma-4-E4B-it-obliterated-Q4_K_M.gguf -p "Your prompt goes here" -n 512