OsaurusAI/gemma-4-E4B-it-qat-JANG_4M

JANG_4M MLX affine bundle converted from google/gemma-4-E4B-it-qat-q4_0-unquantized.

The bundle quantizes by tier so that Gemma 4's bookend and media-sensitive tensors stay at full precision. Token embeddings and the output projection, norms, media towers and embedders, and the Gemma 4 per-layer embedding, gate, and projector tensors pass through as fp16. Self-attention and MoE router projections are 8-bit affine. The bulk of the decoder MLP and expert weights is 4-bit affine.
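To make the 4-bit versus 8-bit trade-off concrete, here is a minimal pure-Python sketch of group-wise affine quantization, assuming the usual form `w ≈ code * scale + bias` with one scale and one bias per group of 32 values. The function names are hypothetical, and the sketch is not the converter used to build this bundle; real kernels pack the codes into integer words and may pick scale and bias differently.

```python
import math


def quantize_group(values, bits):
    """Affine-quantize one group of floats to unsigned `bits`-bit codes."""
    levels = (1 << bits) - 1              # highest code, e.g. 15 for 4-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    bias = lo                             # code 0 maps back to the group minimum
    codes = [min(levels, max(0, round((v - bias) / scale))) for v in values]
    return codes, scale, bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats from codes plus the scale/bias sidecars."""
    return [c * scale + bias for c in codes]


def max_abs_error(values, bits, group_size=32):
    """Worst-case reconstruction error when quantizing group by group."""
    worst = 0.0
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        codes, scale, bias = quantize_group(group, bits)
        restored = dequantize_group(codes, scale, bias)
        worst = max(worst, max(abs(a - b) for a, b in zip(group, restored)))
    return worst


# Synthetic stand-in weights in [-0.1, 0.1]; not taken from the checkpoint.
weights = [0.1 * math.sin(0.37 * i) for i in range(64)]
err4 = max_abs_error(weights, bits=4)
err8 = max_abs_error(weights, bits=8)
print(f"4-bit max error: {err4:.6f}")
print(f"8-bit max error: {err8:.6f}")
```

The error is bounded by half a quantization step, so going from 4 to 8 bits shrinks the bound by a factor of 255/15 = 17 for the same group. That is the usual argument for spending more bits on attention and router projections than on the MLP bulk.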

Bundle

Field	Value
Source	`google/gemma-4-E4B-it-qat-q4_0-unquantized`
Architecture	`gemma4` / `Gemma4ForConditionalGeneration`
Text layers	42
Hidden size	2560
Weight format	`jang_affine`
Top-level quantization	bits=8, group_size=32, mode=affine
Tier bits	attention=8, router=8, mlp=4, embed=16, per_layer_media=16
Quantized modules	259 affine bases with `.scales` and `.biases` sidecars
Shards	6 safetensors shards
Capabilities	`multimodal`
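The tier bits in the table do not include the per-group sidecars. As a rough storage estimate, the sketch below computes effective bits per weight, assuming each group of 32 weights carries one 16-bit scale and one 16-bit bias and that the same group size applies to every quantized tier. Both are assumptions; the card does not state the sidecar dtype or per-tier group sizes.

```python
# Tier bits copied from the table above.
TIER_BITS = {
    "attention": 8,
    "router": 8,
    "mlp": 4,
    "embed": 16,
    "per_layer_media": 16,
}
GROUP_SIZE = 32
SIDECAR_BITS = 16  # assumption: 16-bit scale and 16-bit bias per group


def effective_bits(bits, group_size=GROUP_SIZE, sidecar_bits=SIDECAR_BITS):
    """Bits stored per weight once scale and bias sidecars are amortized."""
    if bits >= 16:
        return float(bits)  # fp16 passthrough tensors carry no sidecars
    return bits + 2 * sidecar_bits / group_size


for tier, bits in TIER_BITS.items():
    print(f"{tier}: {effective_bits(bits):.1f} bits/weight")
```

Under these assumptions the sidecars add one bit per weight, so the 4-bit tier costs about 5 bits per weight and the 8-bit tiers about 9.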

Modalities

Modality	Status
Text	supported
Vision	source config is present and preserved
Audio	`audio_config` is present and preserved
Video	no `video_config` is present, so this card does not claim a verified video runtime path

Runtime support depends on a Gemma 4-compatible MLX/vMLX loader that understands the quantization overrides in `config.json`, reads `jang_config.json`, and handles the Gemma 4 processor and chat-template files.

Runtime Metadata

`config.json` has source-derived `has_vision`, `has_audio`, `has_video`, `modalities`, and `capabilities`.
`tokenizer_config.json` includes `bos_token_id`, `eos_token_id`, `pad_token_id`, and the patched Gemma 4 chat template.
`processor_config.json` is preserved for Gemma 4 multimodal processing.
MTP/speculative drafter weights are not present in the source checkpoint; the metadata records `mtp: none` and `mtp_policy: none`.
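Before handing a downloaded copy to a loader, it can help to confirm that the metadata described above is actually there. The following is a minimal pre-flight sketch that uses only the file and field names mentioned in this card. It assumes the fields are top-level JSON keys, which the card does not state, and `check_bundle` and the local directory name are hypothetical.

```python
import json
from pathlib import Path

# File names taken from this card.
REQUIRED_FILES = [
    "config.json",
    "jang_config.json",
    "tokenizer_config.json",
    "processor_config.json",
]

# Assumption: these fields are top-level keys in their JSON files.
EXPECTED_KEYS = {
    "config.json": ["has_vision", "has_audio", "has_video", "modalities", "capabilities"],
    "tokenizer_config.json": ["bos_token_id", "eos_token_id", "pad_token_id"],
}


def check_bundle(bundle_dir):
    """Return a list of problems found; an empty list means the check passed."""
    root = Path(bundle_dir)
    problems = []
    for name in REQUIRED_FILES:
        if not (root / name).is_file():
            problems.append(f"missing file: {name}")
    for name, keys in EXPECTED_KEYS.items():
        path = root / name
        if not path.is_file():
            continue  # already reported as a missing file
        data = json.loads(path.read_text(encoding="utf-8"))
        for key in keys:
            if key not in data:
                problems.append(f"{name}: missing key {key}")
    return problems


if __name__ == "__main__":
    # Hypothetical local download directory.
    for problem in check_bundle("./gemma-4-E4B-it-qat-JANG_4M"):
        print(problem)
```

A passing check only shows that the metadata is present. It says nothing about whether a given loader interprets the quantization overrides correctly.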


Safetensors

Model size: 5B params
Tensor types: BF16, F16, U32


Model tree for OsaurusAI/gemma-4-E4B-it-qat-JANG_4M

Base model: google/gemma-4-E4B
Finetuned: google/gemma-4-E4B-it
Finetuned: google/gemma-4-E4B-it-qat-q4_0-unquantized
Finetuned (12): this model