---
license: apache-2.0
base_model: google/gemma-4-26B-A4B-it
base_model_relation: finetune
tags:
- gemma4
- gemma
- google
- mlx
- apple-silicon
- moe
- mixture-of-experts
- zero-refusals
- prism-dq
- dynamic-quantization
- multimodal
- vision
- video-text-to-text
- image-text-to-text
- abliterated
- text-generation
language:
- en
pipeline_tag: image-text-to-text
library_name: mlx
quantized_by: Ex0bit
---

[![Parameters](https://img.shields.io/badge/Parameters-26B_A4B_MoE-blue)]()
[![Format](https://img.shields.io/badge/Format-MLX-green)]()
[![Quant](https://img.shields.io/badge/Quant-PRISM_Dynamic_(6.52_BPW)-yellow)]()
[![Multimodal](https://img.shields.io/badge/Multimodal-Vision%20%2B%20Video%20%2B%20Text-purple)]()

# MYTHOS-26B-A4B: PRISM Dynamic Quantization (MLX)

**Gemma 4 26B-A4B MoE PRISM-PRO-Dynamic-Quant for Apple Silicon**

- **PRISM-PRO**: Production model with over-refusal and bias mechanisms removed using the state-of-the-art **PRISM pipeline**.
- **DQ**: Per-tensor-class mixed-precision allocation derived entirely from weight-structure sensitivity analysis, not from closed or gated datasets.

Created by [Ex0bit](https://hf.co/Ex0bit)

---

### 💡 Support My Research & Development Efforts

Members receive Day-0 access to the latest PRISM-PRO model drops.

[![Ko-fi](https://img.shields.io/badge/Ko--fi-Support%20PRISM-ff5e5b?logo=ko-fi&logoColor=white)](https://ko-fi.com/Ex0bit)

---

## Model Details

| Property | Value |
|----------|-------|
| Base Model | google/gemma-4-26B-A4B-it |
| Architecture | Gemma 4 MoE (128 experts, top-8 routing) |
| Parameters | 26B total / 4B active per token |
| Quantization | PRISM-PRO-DYNAMIC-QUANT (MLX native) |
| Achieved BPW | 6.52 |
| File Size | ~20 GB |
| Context Length | 262,144 tokens |
| Modalities | Text, Image, Video |
| Runtime | mlx-vlm (Apple Silicon Metal) |
| Creator | [Ex0bit](https://hf.co/Ex0bit) |

## Supported Modalities

- **Text**: Full instruction-following and chat
- **Image**: Vision understanding via SigLIP encoder (280 soft tokens per image)
- **Video**: Processed by Gemma4VideoProcessor (32 frames, pooled)

> Note: This 26B MoE variant does not include audio support. For audio, see the 31B dense variant.

## PRISM-DQ Quantization

This MLX model uses **PRISM-PRO Dynamic Quantization**, a per-tensor-class mixed-precision allocation that assigns different quantization types to different tensor classes based on weight-structure sensitivity.

Unlike uniform quantization (Q4, Q6, Q8), PRISM-DQ analyzes each tensor class's sensitivity and allocates precision where it matters most. Attention projections receive higher precision than FFN layers, with block-level overrides that protect critical layers.

The model's `config.json` contains per-tensor quantization overrides that mlx-vlm loads natively, so no custom runtime is required. The compiled Metal kernels handle mixed-precision tensors automatically in a single forward pass at full GPU speed.

## Usage

### mlx-vlm (CLI)

```bash
pip install mlx-vlm

# Interactive chat
mlx_vlm.chat --model Ex0bit/MYTHOS-26B-A4B-PRISM-PRO-DQ-MLX \
  --temperature 0.7 --max-tokens 2048 --max-kv-size 8192

# Vision prompt
python -m mlx_vlm.generate \
  --model Ex0bit/MYTHOS-26B-A4B-PRISM-PRO-DQ-MLX \
  --image path/to/image.jpg \
  --prompt "Describe this image in detail." \
  --max-tokens 500
```

### Python API

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template

# Load the quantized weights and the matching processor.
model, processor = load("Ex0bit/MYTHOS-26B-A4B-PRISM-PRO-DQ-MLX")
config = model.config

# Build a chat-formatted prompt that reserves a slot for one image.
prompt = apply_chat_template(
    processor, config, "Describe this scene.", num_images=1
)

# Generate a response conditioned on the image.
response = generate(
    model,
    processor,
    prompt,
    image=["path/to/image.jpg"],
    max_tokens=500,
    temperature=0.7,
)
print(response)
```

## Refusal & Bias Removal

This model has been treated with the state-of-the-art PRISM pipeline to remove the bias, over-refusals, and propaganda present in the base google/gemma-4-26B-A4B-it.

## License

Apache 2.0 (inherited from google/gemma-4-26B-A4B-it)

## Credits

- Creator: [Ex0bit](https://hf.co/Ex0bit)
- Base model: [Google DeepMind](https://deepmind.google/models/gemma/gemma-4/)
- Quantization engine: PRISM-DQ by [Ex0bit](https://hf.co/Ex0bit)
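
## Size Estimate (Sketch)

As a rough sanity check on the Model Details table, the weight footprint can be estimated from the parameter count and the achieved bits per weight (BPW). The sketch below assumes the nominal 26B parameter count and treats 6.52 BPW as an average over all weights; the exact parameter count and any non-weight files in the repository are not accounted for, and the helper name `estimate_weight_size` is hypothetical.

```python
def estimate_weight_size(num_params: float, bits_per_weight: float) -> tuple[float, float]:
    """Return the estimated weight size as (decimal GB, binary GiB)."""
    total_bytes = num_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 1e9, total_bytes / 2**30


# Assumption: nominal 26B parameters at the listed 6.52 BPW average.
gb, gib = estimate_weight_size(26e9, 6.52)
print(f"{gb:.1f} GB / {gib:.1f} GiB")
```

Under these assumptions the estimate comes to about 21.2 GB, or roughly 19.7 GiB, which is in line with the ~20 GB file size listed above.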