Instructions to use OsaurusAI/gemma-4-E4B-it-qat-JANG_4M with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use OsaurusAI/gemma-4-E4B-it-qat-JANG_4M with MLX:
# Make sure mlx-vlm is installed # pip install --upgrade mlx-vlm from mlx_vlm import load, generate from mlx_vlm.prompt_utils import apply_chat_template from mlx_vlm.utils import load_config # Load the model model, processor = load("OsaurusAI/gemma-4-E4B-it-qat-JANG_4M") config = load_config("OsaurusAI/gemma-4-E4B-it-qat-JANG_4M") # Prepare input image = ["http://images.cocodataset.org/val2017/000000039769.jpg"] prompt = "Describe this image." # Apply chat template formatted_prompt = apply_chat_template( processor, config, prompt, num_images=1 ) # Generate output output = generate(model, processor, formatted_prompt, image) print(output) - Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
- Pi
How to use OsaurusAI/gemma-4-E4B-it-qat-JANG_4M with Pi:
Start the MLX server
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "OsaurusAI/gemma-4-E4B-it-qat-JANG_4M"
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "mlx-lm": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "OsaurusAI/gemma-4-E4B-it-qat-JANG_4M" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use OsaurusAI/gemma-4-E4B-it-qat-JANG_4M with Hermes Agent:
Start the MLX server
# Install MLX LM: uv tool install mlx-lm # Start a local OpenAI-compatible server: mlx_lm.server --model "OsaurusAI/gemma-4-E4B-it-qat-JANG_4M"
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default OsaurusAI/gemma-4-E4B-it-qat-JANG_4M
Run Hermes
hermes
OsaurusAI/gemma-4-E4B-it-qat-JANG_4M
JANG_4M MLX affine bundle converted from google/gemma-4-E4B-it-qat-q4_0-unquantized.
This bundle keeps Gemma 4 bookends and media-sensitive components coherent: token embeddings/output projection, norms, media towers/embedders, and Gemma 4 per-layer embedding/gate/projector tensors are fp16 passthrough; self-attention and MoE router projections are 8-bit affine; decoder MLP/expert bulk is 4-bit affine.
Bundle
| Field | Value |
|---|---|
| Source | google/gemma-4-E4B-it-qat-q4_0-unquantized |
| Architecture | gemma4 / Gemma4ForConditionalGeneration |
| Text layers | 42 |
| Hidden size | 2560 |
| Weight format | jang_affine |
| Top-level quantization | bits=8, group_size=32, mode=affine |
| Tier bits | attention=8, router=8, mlp=4, embed=16, per_layer_media=16 |
| Quantized modules | 259 affine bases with .scales and .biases sidecars |
| Shards | 6 safetensors shards |
| Capabilities | multimodal |
Modalities
| Modality | Status |
|---|---|
| Text | supported |
| Vision | source config present and preserved |
| Audio | audio_config is present and preserved. |
| Video | No video_config is present, so this card does not claim a verified video runtime path. |
Runtime support depends on a Gemma 4 compatible MLX/vMLX loader that understands config.json quantization overrides, jang_config.json, and Gemma 4 processor/chat-template files.
Runtime Metadata
config.jsonhas source-derivedhas_vision,has_audio,has_video,modalities, andcapabilities.tokenizer_config.jsonincludesbos_token_id,eos_token_id,pad_token_id, and the patched Gemma 4 chat template.processor_config.jsonis preserved for Gemma 4 multimodal processing.- MTP/speculative drafter weights are not present in the source checkpoint; metadata is
mtp: none/mtp_policy: none.
- Downloads last month
- 78
Quantized
Model tree for OsaurusAI/gemma-4-E4B-it-qat-JANG_4M
Base model
google/gemma-4-E4B