---
base_model: Qwen/Qwen3.5-122B-A10B
pipeline_tag: text-generation
tags:
- abliterix
- uncensored
- decensored
- abliterated
---

# Qwen3.5-122B-A10B-abliterated

> Unrestricted version of [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B), created with **[Abliterix](https://github.com/wuwangzhang1216/abliterix)**: automated LLM abliteration via orthogonalized steering and Bayesian optimization.

## Highlights

| Metric | Value |
|--------|-------|
| **Refusal rate** | **1/200 (0.5%)** |
| **KL divergence** | **0.0115** |
| **Optimization trials** | 25 |

The largest abliterated Qwen3.5 model. Only 1 of 200 test prompts triggered a refusal, a **0.5% refusal rate**, with low model degradation (KL divergence 0.0115).

## How It Works

Abliterix removes safety-refusal behavior while preserving model capabilities:

1. **Refusal direction extraction**: 800 harmful + 800 benign prompts reveal per-layer refusal activation patterns
2. **Orthogonal projection**: isolates the refusal signal by projecting out components aligned with normal responses, reducing refusals by 67% vs. raw abliteration
3. **LoRA-based abliteration**: rank-1 modifications to attention and MLP weights, captured as lightweight adapters (not destructive edits)
4. **Bayesian optimization**: Optuna TPE searches kernel shape, fractional direction index, and per-component strength across 25 trials to find the Pareto-optimal balance of low refusals and low KL divergence

## All Abliterix Models

| Model | Refusals | KL Divergence | Trials |
|-------|----------|---------------|--------|
| [Qwen3.5-122B-A10B-abliterated](https://huggingface.co/wangzhang/Qwen3.5-122B-A10B-abliterated) | **1/200 (0.5%)** | 0.0115 | 25 |
| [Qwen3.5-35B-A3B-abliterated](https://huggingface.co/wangzhang/Qwen3.5-35B-A3B-abliterated) | 3/200 (1.5%) | **0.0035** | 50 |
| [Qwen3.5-27B-abliterated](https://huggingface.co/wangzhang/Qwen3.5-27B-abliterated) | 3/200 (1.5%) | 0.0051 | 35 |
| [Qwen3.5-9B-abliterated](https://huggingface.co/wangzhang/Qwen3.5-9B-abliterated) | 2/200 (1.0%) | 0.0105 | 50 |
| [Qwen3.5-4B-abliterated](https://huggingface.co/wangzhang/Qwen3.5-4B-abliterated) | 3/200 (1.5%) | 0.0065 | 50 |
| [Qwen3.5-0.8B-abliterated](https://huggingface.co/wangzhang/Qwen3.5-0.8B-abliterated) | **0/200 (0.0%)** | 0.0087 | 100 |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "wangzhang/Qwen3.5-122B-A10B-abliterated"

# Load the weights in their stored dtype and spread them across available devices.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "Your question here"}]

# Render the chat template as plain text, with thinking mode disabled.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)

# The decoded string contains the prompt followed by the generated reply.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Citation

```bibtex
@software{abliterix,
  author = {Wu, Wangzhang},
  title = {Abliterix: Automated LLM Abliteration},
  year = {2026},
  url = {https://github.com/wuwangzhang1216/abliterix}
}
```

## Links

- **Abliterix** (abliteration framework): [github.com/wuwangzhang1216/abliterix](https://github.com/wuwangzhang1216/abliterix)
- **Install**: `pip install -U abliterix-llm`
- **Base model**: [Qwen/Qwen3.5-122B-A10B](https://huggingface.co/Qwen/Qwen3.5-122B-A10B)

---

Built with [Abliterix](https://github.com/wuwangzhang1216/abliterix) | [PyPI](https://pypi.org/project/abliterix-llm/)
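
## Appendix: KL Divergence Sketch

The KL divergence figures reported on this card compare an edited model against a reference, but the card does not say how Abliterix computes them. The sketch below is only an illustration under stated assumptions: it takes the reference to be the base model, measures KL(P_base || P_edited) in nats over next-token distributions, and averages across prompts. The function names (`log_softmax`, `kl_divergence`, `mean_kl`) and the toy logits are hypothetical and are not part of Abliterix.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def kl_divergence(base_logits, edited_logits):
    """KL(P_base || P_edited) in nats for one next-token distribution."""
    log_p = log_softmax(base_logits)
    log_q = log_softmax(edited_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def mean_kl(base_batch, edited_batch):
    """Average the per-prompt KL over a batch of logit vectors."""
    pairs = list(zip(base_batch, edited_batch))
    return sum(kl_divergence(b, e) for b, e in pairs) / len(pairs)


# Toy logits over a 4-token vocabulary (made-up numbers, not model output).
base = [[2.0, 1.0, 0.5, -1.0], [0.1, 0.2, 3.0, 0.0]]
edited = [[1.9, 1.1, 0.5, -1.0], [0.1, 0.3, 2.9, 0.0]]
print(f"mean KL: {mean_kl(base, edited):.4f} nats")
```

Under these assumptions, a value near zero means the edited model assigns almost the same next-token probabilities as the reference on the evaluated prompts, which is why a low KL divergence is read as low degradation.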