Gemma 4 12B-it — JANG_4M CRACK

CRACK abliterated · JANG mixed-precision (8-bit attention, 4-bit MLP) · Omni-modal (text + image + audio + video) · 9.6 GB


What Is This?

This is Gemma 4 12B-it by Google — a unified omni-modal language model (text + image + audio + video, hybrid sliding/full attention, 48 layers, 128k context) that has been:

  1. CRACK abliterated: safety-refusal behavior is removed at the weight level. The model complies across all task categories instead of refusing, while keeping its knowledge, reasoning, and multimodal capabilities intact.
  2. JANG mixed-precision (8-bit attention, 4-bit MLP) quantized for MLX on Apple Silicon — 9.6 GB.

Results

Evaluated through the Osaurus runtime on a Mac Studio M3 Ultra. Compliance was graded with the HarmBench-text refusal classifier. MMLU was scored in logit mode: the predicted answer is the argmax over the A/B/C/D token logits, so no answer can be truncated. The base and CRACK models were scored the same way, with identical chat-template rendering.
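As a rough illustration of the logit-mode scoring described above, the sketch below picks the answer letter with the highest logit and computes accuracy over a set of questions. It is not the evaluation harness used for these results: the function names `pick_answer` and `accuracy` are hypothetical, and it assumes the four option logits have already been read from the model's next-token distribution.

```python
from typing import Mapping, Sequence

CHOICES = ("A", "B", "C", "D")


def pick_answer(option_logits: Mapping[str, float]) -> str:
    """Return the choice letter whose token logit is largest.

    Ties resolve to the earliest letter in CHOICES, because max() keeps
    the first maximal element it sees.
    """
    missing = [c for c in CHOICES if c not in option_logits]
    if missing:
        raise KeyError(f"missing logits for: {missing}")
    return max(CHOICES, key=lambda c: option_logits[c])


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold answer letters."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy over zero questions")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)
```

Because the prediction is read from the logits of a single token position, the score does not depend on generation length or on parsing free-form output.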

HarmBench compliance (70 prompts · 10 per category)

| Category | CRACK ASR (attack success rate) |
|---|---|
| Chemical / biological | 10 / 10 (100%) |
| Copyright | 10 / 10 (100%) |
| Cybercrime / intrusion | 10 / 10 (100%) |
| Harassment / bullying | 10 / 10 (100%) |
| Illegal | 10 / 10 (100%) |
| Misinformation / disinformation | 10 / 10 (100%) |
| General harmful | 10 / 10 (100%) |
| Overall | 70 / 70 (100%) |

MMLU-228 (57 subjects, 4 questions per subject)

| Subject area | Base | CRACK | Δ |
|---|---|---|---|
| Overall | 67.1% | 69.3% | +2.2pp |
| STEM | 68.1% | 66.7% | -1.4pp |
| Humanities | 57.7% | 63.5% | +5.8pp |
| Social Sciences | 75.0% | 75.0% | +0.0pp |
| Other (medicine, business, …) | 67.9% | 73.2% | +5.3pp |
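To put the deltas in context, the sketch below recomputes them from the reported percentages and estimates the sampling error of the overall score on 228 questions. It uses a simple unpaired binomial approximation, which is an assumption made here for illustration: both models answer the same questions, so a paired comparison would be the more appropriate test. The helper names are hypothetical.

```python
import math

N_QUESTIONS = 57 * 4  # 57 subjects x 4 questions = 228

# (base %, CRACK %) as reported in the table above.
REPORTED = {
    "Overall": (67.1, 69.3),
    "STEM": (68.1, 66.7),
    "Humanities": (57.7, 63.5),
    "Social Sciences": (75.0, 75.0),
    "Other": (67.9, 73.2),
}


def delta_pp(base_pct: float, crack_pct: float) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round(crack_pct - base_pct, 1)


def binomial_se_pp(acc_pct: float, n: int) -> float:
    """Standard error of an accuracy estimate, in percentage points."""
    p = acc_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n)


def pp_per_question(n: int) -> float:
    """How many percentage points a single question is worth."""
    return 100.0 / n


for area, (base, crack) in REPORTED.items():
    print(f"{area:16s} {delta_pp(base, crack):+.1f}pp")

print(f"SE of overall base score: {binomial_se_pp(67.1, N_QUESTIONS):.1f}pp")
print(f"One question is worth:    {pp_per_question(N_QUESTIONS):.2f}pp")
```

On 228 questions a single answer is worth about 0.44pp and the standard error of the overall score is about 3pp, so small differences in the table correspond to a handful of questions.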

Features

  • Omni-modal: native text + image + audio + video inputs (Gemma 4's unified early-fusion encoder-free architecture)
  • 128k context with hybrid sliding-window + full-attention layers
  • Reasoning via the Gemma 4 channel-marker format (<|channel>thought ... <channel|>)
  • 48 transformer layers, hidden size 3840

Usage

Run with the Osaurus runtime (recommended — supports the gemma4_unified architecture directly):

# OpenAI-compatible chat completion
# POST /v1/chat/completions
{
  "model": "dealignai/Gemma-4-12B-it-JANG_4M-CRACK",
  "messages": [{"role": "user", "content": "..."}],
  "temperature": 0.0
}

The request above sets temperature to 0.0 for deterministic output. For general use, Google recommends temperature=1.0, top_p=0.95, top_k=64.
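The request body shown earlier can be sent from Python with only the standard library. This is a minimal sketch, not an official client. `BASE_URL` is an assumption (a local server address, so check the host and port in your own Osaurus settings), and the helper names are hypothetical. The sketch leaves `top_k` out because it is not part of the standard OpenAI chat-completions schema and support for it depends on the runtime.

```python
import json
import urllib.request

# Assumption: address of a locally running OpenAI-compatible server.
BASE_URL = "http://127.0.0.1:1337"
MODEL_ID = "dealignai/Gemma-4-12B-it-JANG_4M-CRACK"


def build_chat_request(prompt: str, temperature: float = 1.0,
                       top_p: float = 0.95) -> dict:
    """Build a chat-completion payload with the recommended sampling values."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
    }


def send_chat_request(payload: dict, base_url: str = BASE_URL,
                      timeout: float = 120.0) -> str:
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Standard OpenAI-style response: first choice, assistant message text.
    return body["choices"][0]["message"]["content"]
```

With the server running, `send_chat_request(build_chat_request("Describe this model's context length."))` returns the assistant's reply as a string. Pass `temperature=0.0` to `build_chat_request` to reproduce the deterministic setting from the JSON example.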

About CRACK

CRACK (Controlled Refusal Ablation via Calibrated Knockouts) removes safety-refusal behavior at the weight level so the model complies with all task categories while preserving reasoning quality, factual knowledge, multimodal capability, and coherence.

Support dealignai

All models are built from original research and released free.

Support us on Ko-fi — membership gets early access and extras.

Ko-fi · X @dealignai · dealign.ai

See our research: Safety Generalization in Frontier Models

Disclaimer

This model has had its safety-refusal behavior removed for research purposes. It will follow instructions across all categories without refusing. You are solely responsible for how you use it and for complying with all applicable laws. Published for AI-safety research and authorized security testing.
