--- license: apache-2.0 library_name: gguf pipeline_tag: image-text-to-text base_model: OBLITERATUS/gemma-4-E4B-it-OBLITERATED base_model_relation: quantized quantized_by: rdhorner tags: - gguf - gemma4 - abliterated - vision - audio - multimodal - tool-use - tools - function-calling - llama.cpp - conversational lm_studio: param_count: 8b use_case: tools release_date: "16-04-2026" model_creator: OBLITERATUS prompt_template: Gemma 4 base_model: gemma4 original_repo: OBLITERATUS/gemma-4-E4B-it-OBLITERATED --- # gemma-4-E4B-it-OBLITERATED - GGUF GGUF quantizations of [OBLITERATUS/gemma-4-E4B-it-OBLITERATED](https://huggingface.co/OBLITERATUS/gemma-4-E4B-it-OBLITERATED), which is an abliterated version of Google's `gemma-4-E4B-it` produced with the OBLITERATUS method. Converted and quantized with [llama.cpp](https://github.com/ggml-org/llama.cpp) build `b1-f772f6e`. These GGUFs support **vision**, **audio** input, and **tool calling** out of the box. ## Files | File | Size | BPW | Notes | |---|---|---|---| | `gemma-4-E4B-OBLITERATED-F16.gguf` | 14 GB | 16.00 | Full F16 text model (source for requantization) | | `gemma-4-E4B-OBLITERATED-Q8_0.gguf` | 7.5 GB | 8.53 | Near-lossless, largest usable quant | | `gemma-4-E4B-OBLITERATED-Q5_K_M.gguf` | 5.4 GB | 6.12 | Balanced quality/size | | `gemma-4-E4B-OBLITERATED-Q4_K_M.gguf` | 5.0 GB | 5.67 | **Recommended** for local use | | `mmproj-gemma-4-E4B-OBLITERATED-F16.gguf` | 945 MB | - | **Required** for vision/audio. Contains both encoders. | Pair any text GGUF with the mmproj to enable multimodal input. ## Usage with llama.cpp ### CLI (image + text) ```bash llama-mtmd-cli \ -m gemma-4-E4B-OBLITERATED-Q4_K_M.gguf \ --mmproj mmproj-gemma-4-E4B-OBLITERATED-F16.gguf \ --image your_image.png \ --jinja -ngl 99 \ -p "Describe this image in detail." ``` ### Server (OpenAI-compatible API with tool use + vision) ```bash llama-server \ -m gemma-4-E4B-OBLITERATED-Q4_K_M.gguf \ --mmproj mmproj-gemma-4-E4B-OBLITERATED-F16.gguf \ --jinja -ngl 99 -c 8192 --port 8080 ``` Then send OpenAI-style requests to `http://localhost:8080/v1/chat/completions` with `tools`, `tool_choice`, and/or `image_url` content parts. ## Notes - `--jinja` is **required** - Gemma 4's chat template is custom and will not load without it. - The mmproj contains both vision and audio encoders (1411 tensors). Audio input works the same way as images via the multimodal CLI/server. - This is an *abliterated* model: refusal directions in 21/42 layers were surgically modified. This can occasionally affect tool-call reliability on refusal-adjacent topics. - Reasoning is emitted through Gemma 4's native thinking channel and surfaced as `reasoning_content` in OpenAI-compatible responses. ## Verified Smoke-tested on the Q4_K_M build: - **Vision**: correctly described shapes and colors in a synthetic test image - **Tool use**: produced a well-formed `tool_calls` response to a `get_weather` tool prompt, `finish_reason: tool_calls` ## License Apache 2.0, matching the base model.