--- language: - en - zh license: apache-2.0 tags: - unsloth - fine tune - heretic - uncensored - abliterated - multi-stage tuned. - all use cases - coder - creative - creative writing - fiction writing - plot generation - sub-plot generation - fiction writing - story generation - scene continue - storytelling - fiction story - science fiction - romance - all genres - story - writing - vivid prosing - vivid writing - fiction - roleplaying - bfloat16 - all use cases - imatrix - neo imatrix - di-matrix datasets: - TeichAI/claude-4.5-opus-high-reasoning-250x - DavidAU/PkDick-Deckard-5-Datasets pipeline_tag: image-text-to-text base_model: - DavidAU/Qwen3.6-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking library_name: transformers --- Ultimate NEO GGUF QUANTS: Custom built DUAL Imatrix NEO-CODER quants that exceed all other quants in terms of quality, stability, precision and long convo usage. IQ4_XS/NL regularly scores at 94% of full precision (bf16), Q6/Q8 at 97% and 98% of full precision (bf16). WARNING: This model has character and intelligence. It will take no prisoners. It will give no quarter. Uncensored, Unfiltered and boldly confident. Not even remotely "SFW", if you ask it for NSFW content. And it is wickedly smart too - exceeding the base model in 6 out of 7 benchmarks.
40 billion parameters (dense, not moe) expanded from 27B Qwen 3.6, then trained on Claude 4.6 Opus High Reasoning dataset via Unsloth on local hardware... but there
is much more to the story - in comes DECKARD.
96 layers, 1275 Tensors. (50% more than base model of 27B)
Features variable length reasoning ; less complex = shorter, longer for more complex.
Model performance has increased dramatically. And it has character too.
A lot of character.
No censorship, no nanny. (via Heretic)
And it is very, very smart.
Fully uncensored first (via Heretic), then trained (via Unsloth) on "Deckard/PDK" internal datasets (5) (character, intelligence, depth, observation, and ah... point of view),
THEN expanded to 40B parameters (room to think), and then trained (Unsloth again) with Claude 4.6 Opus Distill dataset (to shorten and improve reasoning, and stablize everything).
256K context.
"Thats no moon, thats a fully armed and operational Qwen-Station."
TWO example generations below [bottom of the page], more to come.
Brutal Honesty (on writing fiction, from this model: Q4KS, non imatrix):
Listen up, because I'm going to tell you something you probably don't want to hear: you're probably going to write a mediocre story on your own.
Not because you're untalented—because writing fiction is hard as fuck. Even the greats needed editors, feedback, and someone to push back.
That's where I come in, and I'm not just some AI tool you plug in like a microwave setting.
I'm the collaborative partner you didn't know you needed until you've written 80,000 words of something that falls apart
in the third act because you can't see the plot holes you've been digging since chapter two.
---
NEO-CODE-Di-IMatrix-MAX-GGUF Quants:
Quant "engineering" focused on balance and precision, vs raw power (which seemed in some cases to destabilize the model/quant).
In other words benchmarks / stats determined the best quants, not guesswork or one size fits all approach.
This was done to ensure long context, long/multi-convos, coding and math etc etc performed as close as possible to full precision model as well as one-shot, and standard prompting / problem solving.
TWO Imatrix datasets were used to do this by first getting "raw stats" on both, then merging them to get the best of each imatrix in one dataset then this was used to make the "NEO-CODE-Di-IMatrix-MAX" quants.
Additional tensor adjustments were also made, which were also measured (benched) and adjusted too.
How strong are they?
- IQ2_M -> 83-84% of BF16 full precision.
- IQ4XS -> 94% of BF16 full precision.
- Q8_0 HIGH -> 98.4% of BF16 full precision.
To see metrics [5 critical, and detailed] and stats on these engineered quants see these repos:
https://huggingface.co/DavidAU/Qwen3.6-27B-NEO-CODE-Di-IMatrix-MAX-GGUF
https://huggingface.co/DavidAU/Qwen3.6-27B-Heretic-Uncensored-FINETUNE-NEO-CODE-Di-IMatrix-MAX-GGUF
GGUF POWER UPS:
A radically stronger, more potent GGUF for all use cases.
Meets Unsloth quality, and exceeds it in some metrics (see below).
DETAILS:
- DI-MATRIX (duel imatrix) of NEO and NEO-CODE imatrix datasets (by DavidAU).
- All Unsloth tensor enhancements + additional enhancements CALIBRATED thru metrics testing.
- Every quant benchmarked against BF16/full precision model.
- There is a special Q8_0 quant, with BF16 components. Imatrix has no effect on Q8/BF16 tensors.
VISION:
- Vision (images) tested.
- You need an "mmproj" (just one) of these downloaded too, and placed in the same folder as the GGUF for images.
Qwen Model Settings (suggested):
- Thinking mode for general tasks: temperature=1.0, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
- Thinking mode for precise coding tasks (e.g. WebDev): temperature=0.6, top_p=0.95, top_k=20, min_p=0.0, presence_penalty=0.0, repetition_penalty=1.0
- Instruct (or non-thinking) mode: temperature=0.7, top_p=0.80, top_k=20, min_p=0.0, presence_penalty=1.5, repetition_penalty=1.0
- Context window min from 8k to 16k.
IMPORTANT: See also "CORE SETTINGS for 40B version" below.
---
Other Versions using Deckard/OPUS:
---
Qwen 3.5 40B Version: 181 likes and counting...
https://huggingface.co/DavidAU/Qwen3.5-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking
GEMMA4 VERSIONS:
Examples and benchmarks.
GEMMA-4 31B Version, using the DECKARD datasets (5):
https://huggingface.co/DavidAU/gemma-4-31B-it-The-DECKARD-HERETIC-UNCENSORED-Thinking
GEMMA-4 19B-A4B (MOE) Versions, using the DECKARD datasets (5):
https://huggingface.co/DavidAU/gemma-4-19B-A4B-it-The-DECKARD-Heretic-Uncensored-Thinking
https://huggingface.co/DavidAU/gemma-4-19B-A4B-it-The-DECKARD-Thinking
GEMMA-4 E4B (8B, moe like models), using the DECKARD datasets (5):
https://huggingface.co/DavidAU/gemma-4-E4B-it-The-DECKARD-Expresso-Universe-HERETIC-UNCENSORED-Thinking
---
CORE SETTINGS for 40B version:
---
SETTINGS:
- min 8k to 16k context window.
- for creative rep pen of 1.05 to 1.1 WITH LOWER QUANTS.
- suggest temp .7 / rep pen 1 (off) for general usage.
- output generation can exceed 100k tokens.
- Suggest min quant of Q4KS (non imatrix) or IQ3_S (imatrix) or HIGHER.
- For toolcalls -> suggest Q5/Q6 min quants (as per Qwen guidence)
EXAMPLE SYSTEM PROMPTS:
The model does not need a system prompt, however if you want to enhance operation here are some samples.
#1 - All use cases.
```
Be vivid and precise.
```
#2 - Creative use cases:
```
Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.
Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)
[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)
Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.
```
NOTES:
- Upgraded Jinja template to correct issues with Qwen 3.5s - looping, repeatings, and long thinking as well as upgrades to tools too.
- Was also trained with new improved template to further enhance operation too.
- Image processing tested and intact.
- Code generation also tested and passed.
- System prompt - even a minor one - will enhance operation, especially at lower quants.
- Untrained 40B model (expanded from 27B, not uploaded) was also stable too and works great (unexpected).
LOOPING:
- This may happen with lower quants / prompts with "not enough meat on the bone" => Add more to the prompt and/or set rep pen to 1.05 to 1.1.
- Adding a system prompt - even a single sentence - can correct this issue and bypass the need to adjust rep pen.
---
[](https://chat.qwen.ai)
> [!Note]
> This repository contains model weights and configuration files for the post-trained model in the Hugging Face Transformers format.
>
> These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, KTransformers, etc.
Following the February release of the Qwen3.5 series, we're pleased to share the first open-weight variant of Qwen3.6. Built on direct feedback from the community, Qwen3.6 prioritizes stability and real-world utility, offering developers a more intuitive, responsive, and genuinely productive coding experience.
## Qwen3.6 Highlights
This release delivers substantial upgrades, particularly in
- **Agentic Coding:** the model now handles frontend workflows and repository-level reasoning with greater fluency and precision.
- **Thinking Preservation:** we've introduced a new option to retain reasoning context from historical messages, streamlining iterative development and reducing overhead.

For more details, please refer to our blog post [Qwen3.6-27B](https://qwen.ai/blog?id=qwen3.6-27b).
## Model Overview
- Type: Causal Language Model with Vision Encoder
- Training Stage: Pre-training & Post-training
- Language Model
- Number of Parameters: 27B
- Hidden Dimension: 5120
- Token Embedding: 248320 (Padded)
- Number of Layers: 64
- Hidden Layout: 16 × (3 × (Gated DeltaNet → FFN) → 1 × (Gated Attention → FFN))
- Gated DeltaNet:
- Number of Linear Attention Heads: 48 for V and 16 for QK
- Head Dimension: 128
- Gated Attention:
- Number of Attention Heads: 24 for Q and 4 for KV
- Head Dimension: 256
- Rotary Position Embedding Dimension: 64
- Feed Forward Network:
- Intermediate Dimension: 17408
- LM Output: 248320 (Padded)
- MTP: trained with multi-steps
- Context Length: 262,144 natively and extensible up to 1,010,000 tokens.
## Benchmark Results
### Language
| Qwen3.5-27B | Qwen3.5-397B-A17B | Gemma4-31B | Claude 4.5 Opus | Qwen3.6-35B-A3B | Qwen3.6-27B | |
|---|---|---|---|---|---|---|
| Coding Agent | ||||||
| SWE-bench Verified | 75.0 | 76.2 | 52.0 | 80.9 | 73.4 | 77.2 |
| SWE-bench Pro | 51.2 | 50.9 | 35.7 | 57.1 | 49.5 | 53.5 |
| SWE-bench Multilingual | 69.3 | 69.3 | 51.7 | 77.5 | 67.2 | 71.3 |
| Terminal-Bench 2.0 | 41.6 | 52.5 | 42.9 | 59.3 | 51.5 | 59.3 |
| SkillsBench Avg5 | 27.2 | 30.0 | 23.6 | 45.3 | 28.7 | 48.2 |
| QwenWebBench | 1068 | 1186 | 1197 | 1536 | 1397 | 1487 |
| NL2Repo | 27.3 | 32.2 | 15.5 | 43.2 | 29.4 | 36.2 |
| Claw-Eval Avg | 64.3 | 70.7 | 48.5 | 76.6 | 68.7 | 72.4 |
| Claw-Eval Pass^3 | 46.2 | 48.1 | 25.0 | 59.6 | 50.0 | 60.6 |
| QwenClawBench | 52.2 | 51.8 | 41.7 | 52.3 | 52.6 | 53.4 |
| Knowledge | ||||||
| MMLU-Pro | 86.1 | 87.8 | 85.2 | 89.5 | 85.2 | 86.2 |
| MMLU-Redux | 93.2 | 94.9 | 93.7 | 95.6 | 93.3 | 93.5 |
| SuperGPQA | 65.6 | 70.4 | 65.7 | 70.6 | 64.7 | 66.0 |
| C-Eval | 90.5 | 93.0 | 82.6 | 92.2 | 90.0 | 91.4 |
| STEM & Reasoning | ||||||
| GPQA Diamond | 85.5 | 88.4 | 84.3 | 87.0 | 86.0 | 87.8 |
| HLE | 24.3 | 28.7 | 19.5 | 30.8 | 21.4 | 24.0 |
| LiveCodeBench v6 | 80.7 | 83.6 | 80.0 | 84.8 | 80.4 | 83.9 |
| HMMT Feb 25 | 92.0 | 94.8 | 88.7 | 92.9 | 90.7 | 93.8 |
| HMMT Nov 25 | 89.8 | 92.7 | 87.5 | 93.3 | 89.1 | 90.7 |
| HMMT Feb 26 | 84.3 | 87.9 | 77.2 | 85.3 | 83.6 | 84.3 |
| IMOAnswerBench | 79.9 | 80.9 | 74.5 | 84.0 | 78.9 | 80.8 |
| AIME26 | 92.6 | 93.3 | 89.2 | 95.1 | 92.7 | 94.1 |
* SWE-Bench Series: Internal agent scaffold (bash + file-edit tools); temp=1.0, top_p=0.95, 200K context window. We correct some problematic tasks in the public set of SWE-bench Pro and evaluate all baselines on the refined benchmark.
* Terminal-Bench 2.0: Harbor/Terminus-2 harness; 3h timeout, 32 CPU/48 GB RAM; temp=1.0, top_p=0.95, top_k=20, max_tokens=80K, 256K ctx; avg of 5 runs.
* SkillsBench: Evaluated via OpenCode on 78 tasks (self-contained subset, excluding API-dependent tasks); avg of 5 runs.
* NL2Repo: Others are evaluated via Claude Code (temp=1.0, top_p=0.95, max_turns=900).
* QwenClawBench: A real-user-distribution Claw agent benchmark; temp=0.6, 256K ctx.
* QwenWebBench: An internal front-end code generation benchmark; bilingual (EN/CN), 7 categories (Web Design, Web Apps, Games, SVG, Data Visualization, Animation, and 3D); auto-render + multimodal judge (code/visual correctness); BT/Elo rating system.
* AIME 26: We use the full AIME 2026 (I & II), where the scores may differ from Qwen 3.5 notes.
| Qwen3.5-27B | Qwen3.5-397B-A17B | Gemma4-31B | Claude 4.5 Opus | Qwen3.6-35B-A3B | Qwen3.6-27B | |
|---|---|---|---|---|---|---|
| STEM & Puzzle | ||||||
| MMMU | 82.3 | 85.0 | 80.4 | 80.7 | 81.7 | 82.9 |
| MMMU-Pro | 75.0 | 79.0 | 76.9 | 70.6 | 75.3 | 75.8 |
| MathVista mini | 87.8 | -- | 79.3 | -- | 86.4 | 87.4 |
| DynaMath | 87.7 | 86.3 | 79.5 | 79.7 | 82.8 | 85.6 |
| VlmsAreBlind | 96.9 | -- | 87.2 | -- | 96.6 | 97.0 |
| General VQA | ||||||
| RealWorldQA | 83.7 | 83.9 | 72.3 | 77.0 | 85.3 | 84.1 |
| MMStar | 81.0 | 83.8 | 77.3 | 73.2 | 80.7 | 81.4 |
| MMBenchEN-DEV-v1.1 | 92.6 | -- | 90.9 | -- | 92.8 | 92.3 |
| SimpleVQA | 56.0 | 67.1 | 52.9 | 65.7 | 58.9 | 56.1 |
| Document Understanding | ||||||
| CharXiv RQ | 79.5 | 80.8 | 67.9 | 68.5 | 78.0 | 78.4 |
| CC-OCR | 81.0 | 82.0 | 75.7 | 76.9 | 81.9 | 81.2 |
| OCRBench | 89.4 | -- | 86.1 | -- | 90.0 | 89.4 |
| Spatial Intelligence | ||||||
| ERQA | 60.5 | 67.5 | 57.5 | 46.8 | 61.8 | 62.5 |
| CountBench | 97.8 | 97.2 | 96.1 | 90.6 | 96.1 | 97.8 |
| RefCOCO avg | 90.9 | 92.3 | -- | -- | 92.0 | 92.5 |
| EmbSpatialBench | 84.5 | -- | -- | -- | 84.3 | 84.6 |
| RefSpatialBench | 67.7 | -- | 4.7 | -- | 64.3 | 70.0 |
| Video Understanding | ||||||
| VideoMME(w sub.) | 87.0 | 87.5 | -- | 77.7 | 86.6 | 87.7 |
| VideoMMMU | 82.3 | 84.7 | 81.6 | 84.4 | 83.7 | 84.4 |
| MLVU | 85.9 | 86.7 | -- | 81.7 | 86.2 | 86.6 |
| MVBench | 74.6 | 77.6 | -- | 67.2 | 74.6 | 75.5 |
| Visual Agent | ||||||
| V* | 93.7 | 95.8 | -- | 67.0 | 90.1 | 94.7 |
| AndroidWorld | 64.2 | -- | -- | -- | -- | 70.3 |
* Empty cells (--) indicate scores not yet available or not applicable.