---
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3.5-9B/blob/main/LICENSE
base_model:
- darkc0de/Qwen3.5-9B-heretic
library_name: mlx
tags:
- heretic
- uncensored
- unrestricted
- decensored
- abliterated
- finetuned
- mxfp8
pipeline_tag: image-text-to-text
new_version: TheCluster/Qwen3.5-9B-Ultra-Heretic-MLX-mxfp8
---
# Qwen3.5-9B Heretic

**Quality**: quantized (*mxfp8, naive, group size: 32*)

This is an **uncensored** version of [Qwen/Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B), made using [Heretic](https://github.com/p-e-w/heretic) v1.2.0 with Magnitude-Preserving Orthogonal Ablation (MPOA).

Alternative fine-tuned version: [TheCluster/Qwen3.5-9B-Claude-4.6-HighIQ-INSTRUCT-HERETIC-UNCENSORED-MLX-mxfp8](https://huggingface.co/TheCluster/Qwen3.5-9B-Claude-4.6-HighIQ-INSTRUCT-HERETIC-UNCENSORED-MLX-mxfp8)

#### Sampling Parameters:

- I suggest using the following sets of sampling parameters depending on the mode and task type:
  - **Thinking mode for general tasks**: `temperature=1.0`, `top_p=0.95`, `top_k=20`, `min_p=0.0`, `presence_penalty=1.5`, `repetition_penalty=1.0`
  - **Instruct (or non-thinking) mode for general tasks**: `temperature=0.7`, `top_p=0.8`, `top_k=20`, `min_p=0.0`, `presence_penalty=1.5`, `repetition_penalty=1.0`
  - **Instruct (or non-thinking) mode for reasoning tasks**: `temperature=1.0`, `top_p=1.0`, `top_k=40`, `min_p=0.0`, `presence_penalty=2.0`, `repetition_penalty=1.0`
- For supported frameworks, you can adjust the `presence_penalty` parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance.

-----

### Source

This model was converted to MLX format from [`darkc0de/Qwen3.5-9B-heretic`](https://huggingface.co/darkc0de/Qwen3.5-9B-heretic) using mlx-vlm version **0.3.12**.
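
### Example: sampling presets as code

The sampling presets listed under Sampling Parameters can be kept in one place and looked up by mode and task. The following is a minimal sketch, not part of any library: the names `SamplingParams`, `PRESETS`, and `sampling_params` are hypothetical, and the mode/task keys (`"thinking"`, `"instruct"`, `"general"`, `"reasoning"`) are labels chosen for illustration. It only builds a plain dictionary; which of these keys your inference framework accepts, and under what argument names, is something to check against that framework's documentation.

```python
from dataclasses import dataclass, asdict, replace
from typing import Optional


@dataclass(frozen=True)
class SamplingParams:
    """One set of sampling parameters from the model card (hypothetical container)."""
    temperature: float
    top_p: float
    top_k: int
    min_p: float
    presence_penalty: float
    repetition_penalty: float


# Values copied from the "Sampling Parameters" list; the (mode, task) keys are
# illustrative labels, not identifiers defined by the model or by mlx-vlm.
PRESETS = {
    ("thinking", "general"): SamplingParams(1.0, 0.95, 20, 0.0, 1.5, 1.0),
    ("instruct", "general"): SamplingParams(0.7, 0.8, 20, 0.0, 1.5, 1.0),
    ("instruct", "reasoning"): SamplingParams(1.0, 1.0, 40, 0.0, 2.0, 1.0),
}


def sampling_params(mode: str, task: str = "general",
                    presence_penalty: Optional[float] = None) -> dict:
    """Return the preset for (mode, task) as a dict, optionally overriding presence_penalty."""
    try:
        preset = PRESETS[(mode, task)]
    except KeyError:
        raise ValueError(f"no preset listed for mode={mode!r}, task={task!r}") from None

    if presence_penalty is not None:
        # The card suggests keeping presence_penalty between 0 and 2.
        if not 0.0 <= presence_penalty <= 2.0:
            raise ValueError("presence_penalty should be between 0 and 2")
        preset = replace(preset, presence_penalty=presence_penalty)

    return asdict(preset)


if __name__ == "__main__":
    # Default thinking-mode preset, then an instruct preset with a lower penalty.
    print(sampling_params("thinking"))
    print(sampling_params("instruct", "general", presence_penalty=0.5))
```

Keeping the presets in a frozen dataclass makes an accidental in-place change impossible, and the range check mirrors the 0 to 2 guidance for `presence_penalty`. No preset is listed for thinking mode on reasoning tasks, so the helper raises rather than guessing one.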