rdhorner's picture
Add function-calling tag and lm_studio block for tool-use detection
4062902 verified
|
raw
history blame contribute delete
3.08 kB
metadata
license: apache-2.0
library_name: gguf
pipeline_tag: image-text-to-text
base_model: OBLITERATUS/gemma-4-E4B-it-OBLITERATED
base_model_relation: quantized
quantized_by: rdhorner
tags:
  - gguf
  - gemma4
  - abliterated
  - vision
  - audio
  - multimodal
  - tool-use
  - tools
  - function-calling
  - llama.cpp
  - conversational
lm_studio:
  param_count: 8b
  use_case: tools
  release_date: 16-04-2026
  model_creator: OBLITERATUS
  prompt_template: Gemma 4
  base_model: gemma4
  original_repo: OBLITERATUS/gemma-4-E4B-it-OBLITERATED

gemma-4-E4B-it-OBLITERATED - GGUF

GGUF quantizations of OBLITERATUS/gemma-4-E4B-it-OBLITERATED, which is an abliterated version of Google's gemma-4-E4B-it produced with the OBLITERATUS method.

Converted and quantized with llama.cpp build b1-f772f6e. These GGUFs support vision, audio input, and tool calling out of the box.

Files

File Size BPW Notes
gemma-4-E4B-OBLITERATED-F16.gguf 14 GB 16.00 Full F16 text model (source for requantization)
gemma-4-E4B-OBLITERATED-Q8_0.gguf 7.5 GB 8.53 Near-lossless, largest usable quant
gemma-4-E4B-OBLITERATED-Q5_K_M.gguf 5.4 GB 6.12 Balanced quality/size
gemma-4-E4B-OBLITERATED-Q4_K_M.gguf 5.0 GB 5.67 Recommended for local use
mmproj-gemma-4-E4B-OBLITERATED-F16.gguf 945 MB - Required for vision/audio. Contains both encoders.

Pair any text GGUF with the mmproj to enable multimodal input.

Usage with llama.cpp

CLI (image + text)

llama-mtmd-cli \
  -m gemma-4-E4B-OBLITERATED-Q4_K_M.gguf \
  --mmproj mmproj-gemma-4-E4B-OBLITERATED-F16.gguf \
  --image your_image.png \
  --jinja -ngl 99 \
  -p "Describe this image in detail."

Server (OpenAI-compatible API with tool use + vision)

llama-server \
  -m gemma-4-E4B-OBLITERATED-Q4_K_M.gguf \
  --mmproj mmproj-gemma-4-E4B-OBLITERATED-F16.gguf \
  --jinja -ngl 99 -c 8192 --port 8080

Then send OpenAI-style requests to http://localhost:8080/v1/chat/completions with tools, tool_choice, and/or image_url content parts.

Notes

  • --jinja is required - Gemma 4's chat template is custom and will not load without it.
  • The mmproj contains both vision and audio encoders (1411 tensors). Audio input works the same way as images via the multimodal CLI/server.
  • This is an abliterated model: refusal directions in 21/42 layers were surgically modified. This can occasionally affect tool-call reliability on refusal-adjacent topics.
  • Reasoning is emitted through Gemma 4's native thinking channel and surfaced as reasoning_content in OpenAI-compatible responses.

Verified

Smoke-tested on the Q4_K_M build:

  • Vision: correctly described shapes and colors in a synthetic test image
  • Tool use: produced a well-formed tool_calls response to a get_weather tool prompt, finish_reason: tool_calls

License

Apache 2.0, matching the base model.