Qwen3.5-0.8B Uncensored - MNN Format (mix-mask export)

MNN conversion of huihui-ai/Huihui-Qwen3.5-0.8B-abliterated (vision included) for on-device inference. All credit for the abliteration goes to huihui-ai.

  • Hybrid attention: 18 LinearAttention + 6 full-attention layers of 24; self-describing metadata (attention_type:"mix", layer_nums:24, sliding_window:4), stacked mask graph [2,1,1,seq,kv]
  • Quantization: 4-bit, block 64, HQQ (matches the taobao-mnn 0.8B bundle parameters)
  • Tokenizer: tokenizer.mtok (new format)
  • Exporter: MNN-TokForge tokforge/rebase-3.5.0 @ 21c339c2fd (upstream #4336 attention-type detection)
  • Requires an MNN runtime with the mix-mask gen_attention_mask path (MNN-TokForge ≥ tokforge/rebase-3.5.0); legacy single-mask runtimes are NOT compatible
  • Validated: multi-turn coherence (3 cycles/session, CPU), LinearAttention snapshot/restore (Y=Z), self-describing hybrid metadata verified against the exported graph
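
The 18 + 6 split in the hybrid-attention bullet is consistent with one full-attention layer closing every group of four layers. The sketch below only illustrates that arithmetic. Reading sliding_window:4 as this interval, and placing the full-attention layer last in each group, are assumptions made for illustration and not something this card states; the function name is hypothetical and the exported graph remains the source of truth.

```python
def layer_types(layer_nums=24, interval=4):
    """Return a per-layer list of "linear" or "full" labels.

    ASSUMPTION: every `interval`-th layer (1-based) uses full attention
    and all others use LinearAttention. This is a guess at the layout,
    not read from the model files.
    """
    return [
        "full" if (index + 1) % interval == 0 else "linear"
        for index in range(layer_nums)
    ]


layout = layer_types()
linear_count = layout.count("linear")
full_count = layout.count("full")
print(f"{linear_count} linear + {full_count} full of {len(layout)}")
```

Under this assumed layout the counts match the 18 and 6 quoted in the bullet list, which is the only thing the sketch is meant to show.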

File checksums (md5)

3af7de552fb9cea05e67f615d96efb53  llm.mnn
10fe7c624159f62c69d93a0e836fc80f  llm.mnn.weight
88bbc9851fc9e2aae364eaf59698326a  llm_config.json
3c183b4e10bef80909cb6998fe5e4fa6  tokenizer.mtok
06b782392bdb4ca19a1045c3709fdbb6  visual.mnn
e48e427a48dc363278ecaacb7810db6a  visual.mnn.weight
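
A minimal sketch for checking a downloaded copy against the md5 values listed above, using only the Python standard library. The helper names md5_of and verify and the directory layout (all six files in one folder) are assumptions for illustration, not part of the MNN tooling.

```python
import hashlib
from pathlib import Path

# Expected digests, copied from the checksum list above.
EXPECTED_MD5 = {
    "llm.mnn": "3af7de552fb9cea05e67f615d96efb53",
    "llm.mnn.weight": "10fe7c624159f62c69d93a0e836fc80f",
    "llm_config.json": "88bbc9851fc9e2aae364eaf59698326a",
    "tokenizer.mtok": "3c183b4e10bef80909cb6998fe5e4fa6",
    "visual.mnn": "06b782392bdb4ca19a1045c3709fdbb6",
    "visual.mnn.weight": "e48e427a48dc363278ecaacb7810db6a",
}


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large weight files are not loaded whole."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(model_dir, expected=EXPECTED_MD5):
    """Map each expected file name to True if present with a matching md5."""
    results = {}
    for name, want in expected.items():
        path = Path(model_dir) / name
        results[name] = path.is_file() and md5_of(path) == want
    return results
```

Calling verify with the folder that holds the downloaded files returns a dictionary of file names to booleans; any False entry means the file is missing or its contents differ from this export.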
