---
library_name: mlx
license: apache-2.0
license_link: https://ai.google.dev/gemma/docs/gemma_4_license
pipeline_tag: text-generation
tags:
- mlx
- lora
- adapters
- gemma4
- reasoning
- sft
- opus
- claude-code
- chain-of-thought
- tool-use
- ravenx
- apple-silicon
- turboquant
- kv-cache-compression
- long-context
base_model: deadbydawn101/gemma-4-E4B-mlx-4bit
base_model_relation: adapter
language:
- en
---

# gemma-4-E4B: Opus Reasoning + Claude Code LoRA

## 🧠 Opus Reasoning + Claude Code LoRA

LoRA adapters trained on **Claude Opus 4.6 reasoning traces** and **Claude Code tool-use patterns**, applied on top of `deadbydawn101/gemma-4-E4B-mlx-4bit` to give Gemma 4 a reasoning-heavy, structured assistant style.

> **What this means:** these adapters teach the model to think before answering, using dedicated reasoning tags for chain-of-thought, multi-step reasoning, and tool-invocation patterns extracted from real Claude Code sessions.

## What's in this LoRA

| Source | Examples | Description |
|--------|--------:|-------------|
| **Crownelius/Opus-4.6-Reasoning-2100x-formatted** | 2,054 | Claude Opus 4.6 reasoning traces formatted with reasoning tags |
| **Claude Code tool-use patterns** | 140 files | Real Claude Code agentic patterns: file read/write, bash, search loops |
| **Total** | **2,163** | SFT dataset: assistant completions only (`--train-on-completions`) |

Training on **completions only** means the loss is computed on assistant tokens and not on the prompt, so the adapters learn the *response style* rather than learning to reproduce prompts. The intent is for that style to carry over to new prompts.

## Adapter Details

| Property | Value |
|----------|-------|
| **Base model** | `deadbydawn101/gemma-4-E4B-mlx-4bit` |
| **Adapter type** | LoRA (MLX SFT) |
| **File size** | **658.8 MB** |
| **Rank** | 8 |
| **Alpha** | 16.0 |
| **Dropout** | 0.0 |
| **Trainable params** | 325M / 7,993M total (4.07%) |

## Training Config

| Setting | Value |
|---------|------:|
| Iterations | 1,000 |
| Batch size | 2 + grad accum ×4 (eff. batch 8) |
| Learning rate | 1e-5 |
| Max seq length | 2,048 |
| Peak GPU memory | 7.876 GB |
| Hardware | Apple M4 Max 128GB |

## Training Curve

Training loss collapsed fast, which suggests the reasoning patterns were absorbed quickly:

```
Iter   10 → 2.277
Iter   20 → 0.097      ← rapid style acquisition
Iter   50 → 0.00063
Iter  100 → 0.0000398
Iter  200 → 0.0000067  (checkpoint saved)
Iter 1000 → ~3.5e-7    (final)
```

## Quickstart (MLX)

### Install base model + adapters

```bash
pip install mlx-lm
```

```python
from huggingface_hub import snapshot_download
from mlx_lm import load, generate

# Download the adapter repo to a local directory first. This assumes the
# installed mlx-lm expects adapter_path to be a local path, not a Hub repo ID.
adapter_dir = snapshot_download(
    "deadbydawn101/gemma-4-E4B-opus-reasoning-claude-code-lora"
)

# Load the base model with the LoRA adapters applied
model, tokenizer = load(
    "deadbydawn101/gemma-4-E4B-mlx-4bit",
    adapter_path=adapter_dir,
)

# Build a chat-formatted prompt and generate a response
messages = [{"role": "user", "content": "Solve this step by step: A train leaves Chicago at 60mph..."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
response = generate(model, tokenizer, prompt=prompt, max_tokens=1024, verbose=True)
```

### CLI

```bash
# Fetch the adapters locally first (assumes --adapter-path needs a local directory)
huggingface-cli download deadbydawn101/gemma-4-E4B-opus-reasoning-claude-code-lora --local-dir adapters

mlx_lm.generate \
  --model deadbydawn101/gemma-4-E4B-mlx-4bit \
  --adapter-path adapters \
  --prompt "Write a Python function to find prime numbers and explain your reasoning." \
  --max-tokens 1024
```

## Intended Use

Best for prompts where you want the model to:

- **Think step by step** before responding
- Handle **multi-step problems** (math, logic, code debugging)
- Follow **agentic tool-use patterns** (read → reason → act → verify)
- Produce well-structured, deliberate completions

Not ideal for:

- Short creative tasks (adds reasoning overhead)
- Casual chitchat

## Files

| File | Description |
|------|-------------|
| `adapters.safetensors` | LoRA weights (658.8 MB) |
| `adapter_config.json` | Config: `rank=8, alpha=16, dropout=0.0` |

## ⚡ TurboQuant-MLX Compatibility

Works alongside **[TurboQuant-MLX](https://github.com/DeadByDawn101/turboquant-mlx)**: combine LoRA fine-tuning with 4.6x KV cache compression for long-context reasoning with Claude-style behavior.

→ [TurboQuant-MLX on GitHub](https://github.com/DeadByDawn101/turboquant-mlx)

## Related Models

| Model | Size | Description |
|-------|------|-------------|
| [deadbydawn101/gemma-4-E4B-mlx-4bit](https://huggingface.co/deadbydawn101/gemma-4-E4B-mlx-4bit) | 4.86 GB | Base model (load this first) |
| [deadbydawn101/gemma-4-E2B-Heretic-Uncensored-mlx-4bit](https://huggingface.co/deadbydawn101/gemma-4-E2B-Heretic-Uncensored-mlx-4bit) | 3.34 GB | 2B uncensored abliterated variant |

---

*Trained and released by [deadbydawn101](https://huggingface.co/deadbydawn101) · RavenX AI*
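
## Cross-checking the reported numbers

The figures in the Adapter Details and Training Config tables can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the input values are copied from this card, while the `alpha / rank` scaling convention and the 16-bit weight storage are assumptions not stated here, and all variable names are hypothetical.

```python
# Values copied from the Adapter Details and Training Config tables.
rank = 8
alpha = 16.0
trainable_params_m = 325    # trainable parameters, in millions
total_params_m = 7_993      # total parameters, in millions
batch_size = 2
grad_accum_steps = 4

# Assumption: the common LoRA convention of scaling the update by alpha / rank.
lora_scale = alpha / rank

# Effective batch size = per-step batch size x gradient accumulation steps.
effective_batch = batch_size * grad_accum_steps

# Share of parameters that are trainable, as a percentage.
trainable_pct = 100 * trainable_params_m / total_params_m

# Assumption: adapter weights stored at 2 bytes per parameter (16-bit),
# giving a rough size estimate in MB (1 MB = 1e6 bytes).
approx_size_mb = trainable_params_m * 1e6 * 2 / 1e6

print(f"LoRA scale (alpha / rank): {lora_scale}")
print(f"Effective batch size:      {effective_batch}")
print(f"Trainable share:           {trainable_pct:.2f}%")
print(f"Approx. adapter size:      {approx_size_mb:.1f} MB")
```

Under these assumptions the effective batch size and trainable share match the tables (8 and 4.07%), and the rough size estimate lands close to the 658.8 MB reported for `adapters.safetensors`; the remaining gap would be consistent with file metadata or rounding in the parameter count, though the card does not confirm this.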