froggeric/Qwen3.6-35B-A3B-Uncensored-Heretic-MLX-4bit

Image-Text-to-Text

4-bit precision

froggeric committed on May 1

Commit 7c41485 · verified · 1 Parent(s): 2f14381

Upload README.md with huggingface_hub

Files changed (1)

README.md +20 -0

@@ -170,6 +170,26 @@ This approach was submitted as a pull request to Heretic but was not merged —
 ---
 ## Sampling
 From the official Qwen authors. Reserve 128K+ context for thinking mode.

 ---
+## How it compares
+### Community results
+r/LocalLLaMA users have been A/B-testing several uncensored Qwen 3.6 variants: [Heretic](https://github.com/p-e-w/heretic), HauhauCS Aggressive, abliterix, and simple orthogonal projection. The reported pattern is consistent: **Heretic produces the best balance of refusal removal and output quality**.
+[Community discussion →](https://www.reddit.com/r/LocalLLaMA/comments/1sw5fb7/qwen36_35b_a3b_heretic_kld_00015_incredible_model/)
+### Why
+Most abliteration methods treat all layers identically. Qwen 3.6's hybrid attention (3:1 linear-to-softmax ratio) means a single parameter set either under-abliterates the DeltaNet blocks or over-abliterates the softmax blocks. Architecture-aware abliteration, with separate parameters per attention type, is the key differentiator.
+### A note on SSM conv1d "repair"
+Some uncensored variants apply a pre-processing step that rescales SSM conv1d weights before abliteration, claiming to fix "outlier" tensors in the DeltaNet linear attention layers. This technique (originating as "Sig-ScaleSync") was benchmarked with **284 data points** across perplexity, needle-in-a-haystack (NIAH), and repetition tests at multiple context lengths (4K–128K). Result: **perplexity degraded at every context length, with no improvement** in NIAH or repetition. The unrepaired original weights perform best.
+Abliterating a degraded baseline can yield a lower measured KL divergence, but that figure measures distance from a worse starting point, not how well the original model's capabilities are preserved.
+---
 ## Sampling
 From the official Qwen authors. Reserve 128K+ context for thinking mode.
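
The KL-divergence caveat in the added README section can be illustrated with a toy calculation. This is a minimal sketch, not the author's benchmark: the three next-token distributions are made-up numbers, the names `original`, `repaired`, and `abliterated` are hypothetical, and the direction KL(reference || abliterated) is an assumption about how the metric is reported.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up next-token probabilities over a 3-token vocabulary (illustrative only).
original = [0.70, 0.20, 0.10]      # unmodified model
repaired = [0.50, 0.30, 0.20]      # hypothetical baseline after a harmful "repair"
abliterated = [0.48, 0.31, 0.21]   # hypothetical result of abliterating the repaired baseline

# The same abliterated model, scored against two different reference points.
kl_vs_repaired = kl_divergence(repaired, abliterated)
kl_vs_original = kl_divergence(original, abliterated)

print(f"KL vs repaired baseline: {kl_vs_repaired:.4f}")
print(f"KL vs original model:    {kl_vs_original:.4f}")
```

With these numbers, the divergence from the repaired baseline comes out much smaller than the divergence from the original model, which is why the choice of reference model matters when comparing reported KL values across variants.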