Near-lossless GGUF quants of:

Uses the chat template from google/gemma-4-26B-A4B-it (2026-05-18), replacing the outdated chat template that was originally bundled with the base QAT model.


Original QAT Specifications

The base QAT model was trained with the following specifications:

  • Text & Draft Models: 4-bit
  • Multimodal Projector (mmproj): BF16

Quantization Error Evaluation

  • Dequantization: Quantized weights were dequantized to F32 for evaluation.
  • Error Calculation: Quantization error metrics (compared to the BF16 baseline) were computed and accumulated in F64 precision to prevent numerical underflow and precision loss.

Metrics:

  • MAE (Mean Absolute Error)
  • RMSE (Root Mean Squared Error)
  • Max Error
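
The evaluation described above can be sketched roughly as follows. This is a minimal illustration, not the script used to produce the tables below: the `ErrorAccumulator` class and its method names are hypothetical, and it assumes the baseline and dequantized tensors are already available as NumPy arrays of matching shape. Inputs are promoted to F64 before differencing, and running sums are kept across tensors so the final metrics cover the whole model.

```python
import math

import numpy as np


class ErrorAccumulator:
    """Hypothetical helper: accumulates MAE / RMSE / max error in float64."""

    def __init__(self):
        self.abs_sum = 0.0   # running sum of |error|
        self.sq_sum = 0.0    # running sum of error^2
        self.max_err = 0.0   # largest |error| seen so far
        self.count = 0       # number of elements compared

    def update(self, reference, dequantized):
        # Promote both tensors to float64 before taking the difference.
        ref = np.asarray(reference, dtype=np.float64).ravel()
        deq = np.asarray(dequantized, dtype=np.float64).ravel()
        if ref.shape != deq.shape:
            raise ValueError("tensor shapes do not match")
        if ref.size == 0:
            return
        diff = np.abs(ref - deq)
        self.abs_sum += float(diff.sum())
        self.sq_sum += float(np.dot(diff, diff))
        self.max_err = max(self.max_err, float(diff.max()))
        self.count += diff.size

    def result(self):
        if self.count == 0:
            raise ValueError("no elements accumulated")
        return {
            "mae": self.abs_sum / self.count,
            "rmse": math.sqrt(self.sq_sum / self.count),
            "max_error": self.max_err,
        }


# Usage with toy data standing in for one baseline / dequantized tensor pair.
acc = ErrorAccumulator()
acc.update(
    np.zeros(4, dtype=np.float32),
    np.array([0.5, -0.5, 1.0, 0.0], dtype=np.float32),
)
metrics = acc.result()
```

Calling `update` once per tensor and `result` at the end yields model-wide figures rather than per-tensor averages, which is one plausible reading of "accumulated" in the description above.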

Interpretation of Metrics: These error metrics do not directly reflect actual model performance. imatrix optimization weights quantization error by parameter importance, so it deliberately accepts a larger measured error on less important parameters in exchange for better overall performance. The metrics are still useful for estimating how close a quantized model is to the original, unquantized weights.
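
A toy sketch of why importance weighting can raise the unweighted error: the code below picks a quantization scale for one block of weights twice, once minimizing plain squared error and once minimizing an importance-weighted squared error. It is an illustration only and not the imatrix algorithm itself; the function names, the symmetric 4-bit grid, the candidate scales, and the random `importance` values are all assumptions made for the example.

```python
import numpy as np


def quantize_with_scale(w, scale, qmax=7):
    # Symmetric round-to-nearest onto a signed 4-bit grid, then dequantize.
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale


def weighted_sse(w, importance, scale):
    # Importance-weighted sum of squared quantization errors.
    err = w - quantize_with_scale(w, scale)
    return float(np.sum(importance * err * err))


def best_scale(w, importance, candidates):
    # Brute-force search over candidate scales.
    errors = [weighted_sse(w, importance, s) for s in candidates]
    return float(candidates[int(np.argmin(errors))])


rng = np.random.default_rng(0)
w = rng.normal(size=32)                      # one block of weights
importance = rng.uniform(0.01, 10.0, 32)     # hypothetical importance values
uniform = np.ones_like(w)
candidates = np.linspace(0.05, 0.5, 200)

s_plain = best_scale(w, uniform, candidates)
s_weighted = best_scale(w, importance, candidates)

# Unweighted error, which is what MAE / RMSE style metrics observe.
plain_err_of_plain = weighted_sse(w, uniform, s_plain)
plain_err_of_weighted = weighted_sse(w, uniform, s_weighted)

# Importance-weighted error, which is what the weighted search minimizes.
weighted_err_of_plain = weighted_sse(w, importance, s_plain)
weighted_err_of_weighted = weighted_sse(w, importance, s_weighted)
```

By construction, the weighted choice can never beat the plain choice on unweighted error, and the plain choice can never beat the weighted choice on weighted error. A quant that looks slightly worse in the tables below may therefore still be the better one on the objective it was optimized for.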

gemma-4-26B-A4B-it-qat

| Model | MAE | RMSE | Max Error |
|---|---|---|---|
| JMingo/gemma-4-26B-A4B-it-qat-BF16.gguf (Baseline) | 0.00000000 | 0.00000000 | 0.00000000 |
| JMingo/gemma-4-26B-A4B-it-qat-Q4_0.gguf | 0.00001919 | 0.00003454 | 0.00164795 |
| unsloth/gemma-4-26B-A4B-it-qat-UD-Q4_K_XL.gguf | 0.00002271 | 0.00004124 | 0.00195312 |
| google/gemma-4-26B_q4_0-it.gguf | 0.00143565 | 0.00203419 | 0.05419922 |
| lmstudio-community/gemma-4-26B-A4B-it-QAT-Q4_0.gguf | 0.00143565 | 0.00203419 | 0.05419922 |
| mradermacher/gemma-4-26B-A4B-it-qat-q4_0-unquantized.i1-Q4_0.gguf | 0.00089195 | 0.00150608 | 0.20263672 |

mtp-gemma-4-26B-A4B-it-qat

| Model | MAE | RMSE | Max Error |
|---|---|---|---|
| JMingo/mtp-gemma-4-26B-A4B-it-qat-BF16.gguf (Baseline) | 0.00000000 | 0.00000000 | 0.00000000 |
| JMingo/mtp-gemma-4-26B-A4B-it-qat-Q4_0.gguf | 0.00002363 | 0.00003926 | 0.00158691 |
| unsloth/mtp-gemma-4-26B-A4B-it.gguf | 0.00003318 | 0.00005112 | 0.00341797 |