Qwen3.5-4B-Claude-4.6-Opus-abliterated (MNN)

Pre-converted Qwen3.5-4B abliterated model in MNN format for on-device inference.

Model Details

Device	SoC	Backend	tok/s
S26 Ultra	SM8850	CPU	20.5
Lenovo TB520FU	SM8650	CPU	14.0
Xiaomi Pad 7 Pro	SM8635	CPU	11.8

CPU is recommended for Qwen3.5 models (LinearAttention runs natively on CPU).
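
To give a rough sense of what these figures mean in practice, the sketch below converts the tok/s numbers into an approximate generation time for a reply of a given length. It assumes the tok/s column reports decode (generation) speed, which the table does not state, and it ignores prompt prefill, thermal throttling, and context length. The function name `estimate_decode_seconds` is illustrative and is not part of TokForge or MNN.

```python
# Throughput figures (tok/s) copied from the table above.
# Assumption: these are decode speeds, not prefill speeds.
DECODE_TOK_PER_S = {
    "S26 Ultra": 20.5,
    "Lenovo TB520FU": 14.0,
    "Xiaomi Pad 7 Pro": 11.8,
}


def estimate_decode_seconds(device: str, new_tokens: int) -> float:
    """Rough time to generate `new_tokens` tokens on `device`.

    Ignores prefill time and any slowdown over long generations.
    """
    if new_tokens < 0:
        raise ValueError("new_tokens must be non-negative")
    return new_tokens / DECODE_TOK_PER_S[device]


if __name__ == "__main__":
    # Print an estimate for a 512-token reply on each listed device.
    for device in DECODE_TOK_PER_S:
        seconds = estimate_decode_seconds(device, 512)
        print(f"{device}: ~{seconds:.0f} s for 512 tokens")
```

Real-world times will likely be longer than this estimate, since it leaves out prefill and sustained-load effects.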

Optimized for TokForge, an Android app for on-device LLM inference.

Safety filters removed for unrestricted conversation. Use responsibly.

Intended for on-device inference with TokForge / MNN, especially for creative writing and roleplay.
Qwen3.5 hybrid LinearAttention models generally perform best on the CPU backend in current TokForge builds.
Performance varies by device class and generation length.
This repo is a packaged runtime artifact, not a standard Transformers training checkpoint.
