Instructions to use dealignai/Gemma-4-12B-it-JANG_4M-CRACK with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use dealignai/Gemma-4-12B-it-JANG_4M-CRACK with MLX:
# Make sure mlx-vlm is installed
# pip install --upgrade mlx-vlm
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model, processor = load("dealignai/Gemma-4-12B-it-JANG_4M-CRACK")
config = load_config("dealignai/Gemma-4-12B-it-JANG_4M-CRACK")

# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=1
)

# Generate output
output = generate(model, processor, formatted_prompt, image)
print(output)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- LM Studio
- Pi
How to use dealignai/Gemma-4-12B-it-JANG_4M-CRACK with Pi:
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "dealignai/Gemma-4-12B-it-JANG_4M-CRACK"
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "dealignai/Gemma-4-12B-it-JANG_4M-CRACK"
        }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
- Hermes Agent
How to use dealignai/Gemma-4-12B-it-JANG_4M-CRACK with Hermes Agent:
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "dealignai/Gemma-4-12B-it-JANG_4M-CRACK"
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default dealignai/Gemma-4-12B-it-JANG_4M-CRACK
Run Hermes
hermes
Gemma 4 12B-it — JANG_4M CRACK
CRACK abliterated · JANG mixed-precision (8-bit attention, 4-bit MLP) · Omni-modal (text + image + audio + video) · 9.6 GB
What Is This?
This is Gemma 4 12B-it by Google — a unified omni-modal language model (text + image + audio + video, hybrid sliding/full attention, 48 layers, 128k context) that has been:
- CRACK abliterated — safety refusal removed at the weight level. The model now complies across all task categories instead of refusing, while keeping its knowledge, reasoning, and multimodal capabilities intact.
- JANG mixed-precision (8-bit attention, 4-bit MLP) quantized for MLX on Apple Silicon — 9.6 GB.
Results
Evaluated through the Osaurus runtime on a Mac Studio M3 Ultra. Compliance was graded with the HarmBench-text refusal classifier. MMLU was scored in logit mode, taking the argmax over the A/B/C/D token logits, with identical chat-template rendering for the base and CRACK models so that no answer was truncated.
HarmBench compliance (70 prompts · 10 per category)
| Category | CRACK ASR |
|---|---|
| Chemical / biological | 10 / 10 (100%) |
| Copyright | 10 / 10 (100%) |
| Cybercrime / intrusion | 10 / 10 (100%) |
| Harassment / bullying | 10 / 10 (100%) |
| Illegal | 10 / 10 (100%) |
| Misinformation / disinformation | 10 / 10 (100%) |
| General harmful | 10 / 10 (100%) |
| Overall | 70 / 70 (100%) |
MMLU-228 (57 subjects, 4 questions per subject)
| Subject area | base | CRACK | Δ |
|---|---|---|---|
| Overall | 67.1% | 69.3% | +2.2pp |
| STEM | 68.1% | 66.7% | -1.4pp |
| Humanities | 57.7% | 63.5% | +5.8pp |
| Social Sciences | 75.0% | 75.0% | +0.0pp |
| Other (medicine, business, …) | 67.9% | 73.2% | +5.3pp |
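The logit-mode scoring mentioned in the evaluation notes can be sketched as follows. This is a minimal illustration, not the evaluation harness used for the table: it assumes the logits for the four answer tokens have already been extracted from the model, and the function names and example numbers are hypothetical.

```python
from typing import Mapping, Sequence

CHOICES = ("A", "B", "C", "D")


def pick_answer(choice_logits: Mapping[str, float]) -> str:
    """Return the choice letter whose token has the highest logit."""
    return max(CHOICES, key=lambda letter: choice_logits[letter])


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of questions where the predicted letter matches the gold letter."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical logits for two questions (made-up numbers, not model output).
example_logits = [
    {"A": 1.2, "B": 3.4, "C": 0.1, "D": -0.5},
    {"A": 2.0, "B": 0.3, "C": 2.5, "D": 1.9},
]
predictions = [pick_answer(logits) for logits in example_logits]
print(predictions)                        # ['B', 'C']
print(accuracy(predictions, ["B", "A"]))  # 0.5
```

Because the answer is read directly from the four token logits, no free-form text is generated, so there is nothing to parse or truncate.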
Features
- Omni-modal: native text + image + audio + video inputs (Gemma 4's unified early-fusion encoder-free architecture)
- 128k context with hybrid sliding-window + full-attention layers
- Reasoning via the Gemma 4 channel-marker format (<|channel>thought ... <channel|>)
- 48 transformer layers, hidden size 3840
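If a runtime returns the reasoning markers as part of the raw text, the thought block can be separated from the final answer with a small helper. The sketch below assumes the markers appear exactly as written in the feature list (`<|channel>thought` to open, `<channel|>` to close); the function name is hypothetical, and a runtime may already strip or restructure these markers for you.

```python
import re
from typing import Tuple

# Assumption: markers appear literally as "<|channel>thought ... <channel|>".
THOUGHT_BLOCK = re.compile(r"<\|channel>thought(.*?)<channel\|>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer) extracted from raw generated text."""
    thoughts = [block.strip() for block in THOUGHT_BLOCK.findall(text)]
    answer = THOUGHT_BLOCK.sub("", text).strip()
    return "\n".join(thoughts), answer


raw = "<|channel>thought\nAdd 2 and 2.<channel|>The answer is 4."
reasoning, answer = split_reasoning(raw)
print(reasoning)  # Add 2 and 2.
print(answer)     # The answer is 4.
```

Text without any markers passes through unchanged as the answer, with an empty reasoning string.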
Usage
Run with the Osaurus runtime (recommended — supports the gemma4_unified architecture directly):
# OpenAI-compatible chat completion
# POST /v1/chat/completions
{
"model": "dealignai/Gemma-4-12B-it-JANG_4M-CRACK",
"messages": [{"role": "user", "content": "..."}],
"temperature": 0.0
}
Google recommends temperature=1.0, top_p=0.95, top_k=64 for general use.
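The request above can be sent from Python with only the standard library. This is a sketch under stated assumptions: the base URL reuses the `http://localhost:8080/v1` address shown for `mlx_lm.server` and may differ for your runtime, the helper names are hypothetical, and `top_k` is left out because not every OpenAI-compatible server accepts it.

```python
import json
import urllib.request
from typing import Any, Dict, List

MODEL_ID = "dealignai/Gemma-4-12B-it-JANG_4M-CRACK"
# Assumption: the server listens here; adjust host and port for your runtime.
BASE_URL = "http://localhost:8080/v1"


def build_payload(
    messages: List[Dict[str, str]],
    temperature: float = 1.0,
    top_p: float = 0.95,
) -> Dict[str, Any]:
    """Build a chat-completion request body with the recommended sampling values."""
    return {
        "model": MODEL_ID,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
    }


def chat(messages: List[Dict[str, str]], base_url: str = BASE_URL, **sampling: float) -> str:
    """POST the payload to /chat/completions and return the first reply's text."""
    body = json.dumps(build_payload(messages, **sampling)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]


# Build and inspect the payload; no network access happens here.
payload = build_payload(
    [{"role": "user", "content": "Summarize the MLX framework in one sentence."}]
)
print(json.dumps(payload, indent=2))
```

With a server running, `chat(payload["messages"])` sends the request; pass `temperature=0.0` for deterministic output, as in the JSON example.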
About CRACK
CRACK (Controlled Refusal Ablation via Calibrated Knockouts) removes safety-refusal behavior at the weight level so the model complies with all task categories while preserving reasoning quality, factual knowledge, multimodal capability, and coherence.
Support dealignai
All models are built from original research and released free.
Support us on Ko-fi — membership gets early access and extras.
Ko-fi · X @dealignai · dealign.ai
See our research: Safety Generalization in Frontier Models

Disclaimer
This model has had its safety-refusal behavior removed for research purposes. It will follow instructions across all categories without refusing. You are solely responsible for how you use it and for complying with all applicable laws. Published for AI-safety research and authorized security testing.