---
license: apache-2.0
tags:
  - mnn
  - qwen3
  - mobile
  - on-device
  - tokforge
  - abliterated
base_model: Qwen/Qwen3-8B
---

# Qwen3-8B-abliterated-v2 (MNN)

An abliterated [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) model, pre-converted to MNN format for on-device inference.

## Model Details
- **Architecture:** Qwen3 (standard attention, 36 layers)
- **Parameters:** 8B (4-bit quantized)
- **Format:** MNN (Alibaba Mobile Neural Network)
- **Vocab:** 151,936 tokens
- **Quantization:** W4A16 (4-bit weights, 16-bit activations)
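
As a sanity check on the numbers above, the size of the BF16 embedding table can be estimated from the vocabulary size. This is a back-of-the-envelope sketch: the hidden size of 4096 is an assumption about Qwen3-8B that this card does not state, and the function name is only illustrative.

```python
VOCAB_SIZE = 151_936   # from the model details above
HIDDEN_SIZE = 4096     # assumption: Qwen3-8B hidden size, not stated in this card
BYTES_PER_BF16 = 2     # bfloat16 stores each value in 16 bits


def embedding_table_bytes(vocab: int = VOCAB_SIZE,
                          hidden: int = HIDDEN_SIZE,
                          bytes_per_value: int = BYTES_PER_BF16) -> int:
    """Size in bytes of a dense vocab x hidden embedding table."""
    return vocab * hidden * bytes_per_value


if __name__ == "__main__":
    size = embedding_table_bytes()
    print(f"{size / 1e9:.2f} GB")  # prints "1.24 GB"
```

Under that assumption the estimate is consistent with the roughly 1.2GB `embeddings_bf16.bin` file listed under Files.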

## Files
| File | Size | Description |
|------|------|-------------|
| `llm.mnn` | 631KB | Model graph |
| `llm.mnn.weight` | 4.4GB | Quantized weights |
| `embeddings_bf16.bin` | 1.2GB | BF16 embedding table (required) |
| `llm_config.json` | 4.5KB | Model config with jinja chat template |
| `tokenizer.txt` | 3.0MB | Tokenizer |
| `config.json` | 210B | MNN runtime config |
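
Before pointing a runtime at a downloaded copy, it can help to confirm the bundle is complete. The sketch below only checks that the files from the table exist in a local directory; the helper name and the example path are hypothetical, and treating every file in the table as mandatory is an assumption (the card only marks the embedding table as required).

```python
from pathlib import Path
from typing import List

# File names taken from the table above.
EXPECTED_FILES = [
    "llm.mnn",
    "llm.mnn.weight",
    "embeddings_bf16.bin",
    "llm_config.json",
    "tokenizer.txt",
    "config.json",
]


def missing_files(model_dir: str) -> List[str]:
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Hypothetical local path; replace with wherever the bundle was downloaded.
    missing = missing_files("./Qwen3-8B-abliterated-v2-MNN")
    print("bundle complete" if not missing else f"missing: {missing}")
```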

## Usage with TokForge
This model is optimized for [TokForge](https://tokforge.ai), an Android app for on-device LLM inference.

### Performance (Speculative Decoding)
| Device | SoC | Backend | Autoregressive (tok/s) | Speculative decoding (tok/s) | Uplift |
|--------|-----|---------|----------|-------------------|--------|
| S26 Ultra | SM8850 | OpenCL | ~14 | 17.8 | +27% |
| RedMagic 11 Pro | SM8850 | OpenCL | ~14 | 17.8 | +27% |
| Lenovo TB520FU | SM8650 | OpenCL | 9.9 | 12.2 | +23% |

Draft model: [Qwen3-0.6B](https://huggingface.co/darkmaniac7/TokForge-AccelerationPack-Draft)
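
The Uplift column is the relative gain of speculative decoding over plain autoregressive decoding. A minimal sketch of that arithmetic, using the throughput figures from the table (the `~14` baselines are approximate, so the first two results are approximate too):

```python
def uplift_percent(baseline_tok_s: float, spec_tok_s: float) -> float:
    """Relative throughput gain of speculative decoding, in percent."""
    return (spec_tok_s / baseline_tok_s - 1.0) * 100.0


# (device, autoregressive tok/s, speculative decoding tok/s) from the table above
ROWS = [
    ("S26 Ultra", 14.0, 17.8),        # baseline listed as ~14
    ("RedMagic 11 Pro", 14.0, 17.8),  # baseline listed as ~14
    ("Lenovo TB520FU", 9.9, 12.2),
]

if __name__ == "__main__":
    for device, ar, spec in ROWS:
        print(f"{device}: +{uplift_percent(ar, spec):.0f}%")
```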

## Abliteration
This model has been abliterated: the refusal behavior of the base model has been suppressed to allow unrestricted conversation. Use responsibly.

## Limitations and Intended Use

- Intended for TokForge / MNN on-device inference, especially Android phones and tablets.
- The best uplift measured so far for this model comes from pairing it with a small CPU draft model for speculative decoding.
- Real throughput varies by SoC, thermal state, backend, and generation length.
- This repo is a runtime bundle, not a standard Transformers training checkpoint.
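
To illustrate why a small draft model can raise throughput, here is a toy sketch of greedy speculative decoding. It is not TokForge's or MNN's implementation: the two "models" are hypothetical stand-in functions that map a token context to the next token, and the verification loop is written sequentially, whereas a real runtime would score all drafted tokens in one batched pass of the large model.

```python
from typing import Callable, List, Sequence, Tuple

NextToken = Callable[[Sequence[int]], int]


def toy_target(ctx: Sequence[int]) -> int:
    """Hypothetical stand-in for the large model's greedy next token."""
    return (ctx[-1] * 3 + 1) % 11


def toy_draft(ctx: Sequence[int]) -> int:
    """Hypothetical stand-in for the draft model: usually agrees with the target."""
    guess = toy_target(ctx)
    return (guess + 1) % 11 if ctx[-1] % 5 == 0 else guess


def greedy_decode(target: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Plain autoregressive decoding: one target call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken, prompt: List[int],
                       n_new: int, k: int = 4) -> Tuple[List[int], int]:
    """Greedy speculative decoding; returns (tokens, number of verification passes)."""
    out = list(prompt)
    goal = len(prompt) + n_new
    passes = 0
    while len(out) < goal:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The target checks the proposal (one batched pass in a real runtime).
        passes += 1
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # replace the first wrong token and stop
                break
            accepted.append(tok)
        else:
            accepted.append(target(out + accepted))  # all accepted: one bonus token
        out.extend(accepted)
    return out[:goal], passes


if __name__ == "__main__":
    tokens, passes = speculative_decode(toy_target, toy_draft, [4], n_new=12)
    print(tokens == greedy_decode(toy_target, [4], 12), passes)
```

Because every accepted token is checked against the target's own greedy choice, the output matches plain decoding; the gain comes from needing fewer passes of the large model whenever the draft guesses correctly.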

## Community

- Website: [tokforge.ai](https://tokforge.ai)
- Discord: [Join the Discord](https://discord.gg/Acv3CBtfVm)

## Export
Converted with MNN's `llmexport` pipeline using `--quant_bit 4 --quant_block 128` (4-bit weights, quantized in blocks of 128 values).
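
To give a feel for what 4-bit, block-of-128 quantization means, here is a generic sketch of block-wise min/max quantization. It is an illustration under stated assumptions, not MNN's actual scheme: the asymmetric scale-plus-minimum encoding and all function names are chosen for clarity only.

```python
import random
from typing import List, Sequence, Tuple

Block = Tuple[List[int], float, float]  # (codes, scale, minimum)


def quantize_block(values: Sequence[float], bits: int = 4) -> Block:
    """Map one block of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1  # 15 for 4-bit
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_block(codes: Sequence[int], scale: float, lo: float) -> List[float]:
    """Recover approximate floats from codes and the block's scale and minimum."""
    return [lo + c * scale for c in codes]


def quantize_row(row: Sequence[float], block: int = 128, bits: int = 4) -> List[Block]:
    """Quantize a weight row block by block, each block with its own scale."""
    return [quantize_block(row[i:i + block], bits) for i in range(0, len(row), block)]


if __name__ == "__main__":
    random.seed(0)
    row = [random.uniform(-1.0, 1.0) for _ in range(256)]
    blocks = quantize_row(row)
    codes, scale, lo = blocks[0]
    restored = dequantize_block(codes, scale, lo)
    worst = max(abs(a - b) for a, b in zip(row[:128], restored))
    print(f"{len(blocks)} blocks, worst error in first block: {worst:.4f}")
```

Smaller blocks give each scale fewer values to cover, which generally lowers the rounding error at the cost of storing more per-block metadata.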