---
language:
- en
license: other
base_model:
- GestaltLabs/Ornstein-Hermes-3.6-27b-SABER
- GestaltLabs/Ornstein-Hermes-3.6-27b
library_name: llama.cpp
tags:
- gguf
- llama-cpp
- qwen3.5
- text-generation
- saber
- refusal-shaping
- abliteration
pipeline_tag: text-generation
---

# Ornstein-Hermes-3.6-27B SABER GGUF

GGUF quantizations of [GestaltLabs/Ornstein-Hermes-3.6-27b-SABER](https://huggingface.co/GestaltLabs/Ornstein-Hermes-3.6-27b-SABER), a SABER-edited version of [GestaltLabs/Ornstein-Hermes-3.6-27b](https://huggingface.co/GestaltLabs/Ornstein-Hermes-3.6-27b).

SABER is a controlled refusal-shaping workflow. The release target is to reduce broad over-refusal while preserving ordinary model behavior and visible boundaries for severe criminal, coercive, or interpersonal-harm requests. The selected checkpoint was chosen as a Pareto point over refusal rate and behavioral drift.

## Source Checkpoint

| field | value |
|---|---:|
| Source repo | `GestaltLabs/Ornstein-Hermes-3.6-27b-SABER` |
| Base model | `GestaltLabs/Ornstein-Hermes-3.6-27b` |
| SABER run | `ornstein_hermes36_27b_svd_a850_g25_retry_biggpu` |
| Expanded refusal eval | `1 / 349` refusals |
| Refusal rate | `0.29%` |
| KLD mean | `11.2216` |
| Base-vs-base KLD mean | `11.2206` |
| KLD delta over base-vs-base | `+0.0010` |
| KLD prompts | `149` |
| Tokens scored for KLD | `3,347` |

The one retained refusal in the expanded evaluation was an illegal-drug-sales request. This is an observed result on the current evaluation set, not a universal guarantee about future behavior.

## Quantization Files

| file | quant | size | notes |
|---|---:|---:|---|
| `Ornstein-Hermes-3.6-27b-SABER-IQ4_XS.gguf` | `IQ4_XS` | `15G` | Compact imatrix-assisted 4-bit option. |
| `Ornstein-Hermes-3.6-27b-SABER-IQ2_M.gguf` | `IQ2_M` | `9G` | Smallest emergency 2-bit option; expect the most quality loss. |
| `Ornstein-Hermes-3.6-27b-SABER-Q3_K_M.gguf` | `Q3_K_M` | `13G` | Smallest K-quant in this suite; expect more quality loss. |
| `Ornstein-Hermes-3.6-27b-SABER-Q4_K_M.gguf` | `Q4_K_M` | `16G` | General-purpose recommended starting point. |
| `Ornstein-Hermes-3.6-27b-SABER-Q5_K_M.gguf` | `Q5_K_M` | `18G` | Balanced high-quality option. |
| `Ornstein-Hermes-3.6-27b-SABER-Q6_K.gguf` | `Q6_K` | `21G` | Strong quality/size option for high-memory local inference. |
| `Ornstein-Hermes-3.6-27b-SABER-Q8_0.gguf` | `Q8_0` | `27G` | Highest quality quant in this suite; largest runtime file. |

The included imatrix file was generated from [DJLougen/Acta-Synthetic](https://huggingface.co/datasets/DJLougen/Acta-Synthetic). It is included for reproducibility and for users who want to regenerate adjacent quantizations.

## Recommended File

Start with for normal desktop use. Use or if you have enough VRAM/RAM and want a higher-quality local run. Use when file size matters more. is mainly for high-memory systems or as a near-lossless GGUF reference.

## llama.cpp Compatibility

These files were produced with llama.cpp commit from a BF16 GGUF conversion of the SABER checkpoint. The model uses the GGUF architecture path in current llama.cpp.

Example:

For chat-style use, prefer a frontend or wrapper that applies the tokenizer chat template from the GGUF metadata.

## Conversion and Quantization Notes

The Q8_0 GGUF was converted from the full SABER Hugging Face checkpoint. The lower-bit recovery quants were generated from the published Q8_0 GGUF with `--allow-requantize` and the included Acta-Synthetic imatrix so the missing files could be restored quickly.

Importance-matrix calibration used Acta-Synthetic conversational text.

## Method Summary

SABER edits refusal behavior through activation/weight-space refusal directions. For this checkpoint, the run used SVD extraction, multi-layer candidate selection, iterative ablation, and KLD-based drift measurement.

Run configuration:

Selected layers:

Total directions ablated: .
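
The basic operation behind refusal-direction editing can be illustrated as removing a vector's component along a chosen direction, optionally scaled by a strength factor. The snippet below is a minimal pure-Python sketch of that projection only. It is not the SABER implementation, and the names `ablate_direction`, `strength`, `hidden`, and `refusal_dir` are hypothetical, chosen for illustration.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def ablate_direction(vector, direction, strength=1.0):
    """Subtract `strength` times the component of `vector` along `direction`.

    strength=1.0 removes the component entirely; smaller values remove
    only part of it. (Hypothetical helper, not taken from SABER.)
    """
    norm = math.sqrt(dot(direction, direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    coeff = strength * dot(vector, unit)
    return [v - coeff * u for v, u in zip(vector, unit)]


# Toy 3-dimensional example with an assumed direction along the first axis.
hidden = [2.0, 1.0, -1.0]
refusal_dir = [1.0, 0.0, 0.0]

full = ablate_direction(hidden, refusal_dir)          # component fully removed
partial = ablate_direction(hidden, refusal_dir, 0.5)  # half of it removed
```

After full ablation the result is orthogonal to the direction, while the partial case keeps half of the original component. A per-direction strength of this kind is one simple way to picture what "differential ablation strength" could mean in practice.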

## Attribution and Related Work

This release builds on the refusal-direction and abliteration research lineage. Relevant prior work and inspirations include:

- Andy Arditi, Oscar Obeso, Aaquib Syed, Daniel Paleka, Nina Panickssery, Wes Gurnee, and Neel Nanda, [Refusal in Language Models Is Mediated by a Single Direction](https://huggingface.co/papers/2406.11717), 2024.
- Maxime Labonne, [Uncensor any LLM with abliteration](https://huggingface.co/blog/mlabonne/abliteration), 2024.
- FailSpy, [abliterator](https://github.com/FailSpy/abliterator), and associated abliterated model releases.
- Jim Lai (), [Projected Abliteration](https://huggingface.co/blog/grimjim/projected-abliteration), 2025, and [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration), 2025.
- Philipp Emanuel Weidmann, [Heretic](https://github.com/p-e-w/heretic), 2025-2026.
- Pliny the Prompter / OBLITERATUS, [Hugging Face Space](https://huggingface.co/spaces/pliny-the-prompter/obliteratus) and [OBLITERATUS releases](https://huggingface.co/OBLITERATUS).
- Jiunsong, [SuperGemma4 E4B Abliterated](https://huggingface.co/Jiunsong/supergemma4-e4b-abliterated), and related SuperGemma releases.
- Jiachen Zhao, Jing Huang, Zhengxuan Wu, David Bau, and Weiyan Shi, [LLMs Encode Harmfulness and Refusal Separately](https://huggingface.co/papers/2507.11878), 2025.

SABER's contribution in this release is the controlled-refusal-shaping workflow: multi-candidate refusal extraction, separability/entanglement-aware ranking, differential ablation strength, and explicit Pareto selection over refusal behavior and KLD drift.

## Limitations

- Results are specific to the current evaluation set, prompts, and generation settings.
- The KLD value should be interpreted relative to the base-vs-base control, not as an absolute standalone score.
- Quantization changes numerical behavior; validate the specific GGUF file you deploy.
- The model inherits constraints, limitations, and licensing considerations from the base model.
- This is a model-editing research artifact with dual-use implications.
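
The point about reading KLD against the base-vs-base control can be made concrete with a small calculation. The sketch below shows a generic per-token KL divergence over logits and then recomputes the derived figures from the Source Checkpoint table. How the KLD in this release was actually computed is not described here, so the `softmax` and `kl_divergence` helpers are assumptions for illustration; only the numeric constants come from the table.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(ref_logits, test_logits):
    """KL(ref || test) for one token position, in nats (illustrative helper)."""
    p = softmax(ref_logits)
    q = softmax(test_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Identical logits give zero divergence; shifted logits give a positive value.
same = kl_divergence([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
shifted = kl_divergence([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])

# Figures reported in the Source Checkpoint table.
kld_edited = 11.2216    # KLD mean
kld_control = 11.2206   # base-vs-base KLD mean
refusals, prompts = 1, 349

kld_delta = round(kld_edited - kld_control, 4)
refusal_rate_pct = round(100 * refusals / prompts, 2)

print(f"KLD delta over control: {kld_delta:+.4f}")
print(f"Refusal rate: {refusal_rate_pct:.2f}%")
```

The recomputed values match the table entries for the KLD delta and the refusal rate. The delta, rather than either raw mean, is the number that isolates drift attributable to the edit under this reading.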