---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
base_model: Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2
tags:
- mnn
- qwen3
- mobile
- on-device
- tokforge
- uncensored
- abliterated
---

# Josiefied-Qwen3-4B-abliterated-v2-MNN

Pre-converted [Josiefied-Qwen3-4B-abliterated-v2](https://huggingface.co/Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2) in MNN format for on-device inference with [TokForge](https://tokforge.ai).

> **Original model by [Goekdeniz-Guelmez](https://huggingface.co/Goekdeniz-Guelmez)**, converted to MNN Q4 for mobile deployment.

## Model Details

| | |
|---|---|
| **Architecture** | Qwen3 (grouped-query attention, 36 layers) |
| **Parameters** | 4B (4-bit quantized) |
| **Format** | MNN (Alibaba Mobile Neural Network) |
| **Quantization** | W4A16 (4-bit weights, 16-bit activations, block size 128) |
| **Vocab** | 151,936 tokens |
| **Source** | [Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2](https://huggingface.co/Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2) |

## Description

Josiefied abliterated v2 by Goekdeniz Guelmez is a refined 4B Qwen3 model with its safety filters abliterated. The v2 iteration improves on the original with better uncensoring and instruction following. It offers a good balance of speed and quality for everyday mobile use.

## Files

| File | Description |
|------|-------------|
| `llm.mnn` | Model computation graph |
| `llm.mnn.weight` | Quantized weight data (Q4, block=128) |
| `llm_config.json` | Model config with Jinja chat template |
| `tokenizer.txt` | Tokenizer vocabulary |
| `config.json` | MNN runtime config |

## Usage with TokForge

This model is optimized for **[TokForge](https://tokforge.ai)**, a free Android app for private, on-device LLM inference.

1. Download [TokForge from the Play Store](https://tokforge.ai)
2. Open the app → Models → Download this model
3. Start chatting. Everything runs 100% locally, no internet required

### Recommended Settings

| Setting | Value |
|---------|-------|
| Backend | OpenCL (Qualcomm) / Vulkan (MediaTek) / CPU (fallback) |
| Precision | Low |
| Threads | 4 |
| Thinking | Off (or On for thinking-capable models) |

### Speculative Decoding

Pair with the [TokForge Acceleration Pack](https://huggingface.co/darkmaniac7/TokForge-AccelerationPack-Draft) for **20-38% faster generation** on supported devices.

| Device | SoC | Backend | tok/s |
|---|---|---|---|
| RedMagic 11 Pro | SM8850 (Snapdragon 8 Elite 2) | OpenCL | **22.4 tok/s** |
| Lenovo TB520FU | SM8650 (Snapdragon 8 Gen 3) | OpenCL | **16.9 tok/s** |
| OnePlus Ace 5 Ultra | D9400+ (Dimensity 9400) | OpenCL | **15.9 tok/s** |
| Xiaomi Pad 7 Pro | SM8635 (Snapdragon 7+ Gen 3) | OpenCL | **9.3 tok/s** |

## Performance

Actual speed varies by device, thermal state, and generation length. Typical ranges for this model size:

| SoC model (device) | SoC name | Backend | Approx. tok/s |
|---|---|---|---|
| SM8850 (RedMagic) | Snapdragon 8 Elite 2 | OpenCL | ~17-24 tok/s |
| SM8650 (Lenovo) | Snapdragon 8 Gen 3 | OpenCL | ~15-17 tok/s |
| SM8635 (Xiaomi) | Snapdragon 7+ Gen 3 | OpenCL | ~9-12 tok/s |
| D9400+ (OnePlus) | Dimensity 9400 | OpenCL | ~9-15 tok/s |

## Attribution

This is an MNN conversion of **[Josiefied-Qwen3-4B-abliterated-v2](https://huggingface.co/Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2)** by **[Goekdeniz-Guelmez](https://huggingface.co/Goekdeniz-Guelmez)**. All credit for the model architecture, training, and fine-tuning goes to the original author(s). This conversion only changes the runtime format for mobile deployment.

## Limitations

- Intended for TokForge / MNN on-device inference on Android
- This is a runtime bundle, not a standard Transformers training checkpoint
- Quantization (Q4) may slightly reduce quality compared to the full-precision original
- Abliterated/uncensored models have had safety filters removed; **use responsibly**

## Community

- **Website:** [tokforge.ai](https://tokforge.ai)
- **Discord:** [Join our Discord](https://discord.gg/Acv3CBtfVm)
- **GitHub:** [TokForge on GitHub](https://github.com/darkmaniac7/Elysium)

## Export Details

Converted using MNN's `llmexport` pipeline:

```bash
python llmexport.py \
  --path Goekdeniz-Guelmez/Josiefied-Qwen3-4B-abliterated-v2 \
  --export mnn \
  --quant_bit 4 \
  --quant_block 128
```
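
To give a rough sense of what `--quant_bit 4 --quant_block 128` means, here is a minimal Python sketch of block-wise 4-bit weight quantization. It is an illustration under stated assumptions, not MNN's implementation: the function names (`quantize_block`, `dequantize_block`, `quantize`, `dequantize`) are hypothetical, and the asymmetric min/max scheme with one scale and one offset per block is a common textbook approach chosen for clarity. MNN's actual rounding, storage layout, and kernels may differ.

```python
import random


def quantize_block(weights, bits=4):
    """Quantize one block of floats to unsigned `bits`-bit codes (assumed min/max scheme)."""
    levels = (1 << bits) - 1                 # 15 distinct steps for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0        # fall back to 1.0 for a constant block
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_block(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [c * scale + lo for c in codes]


def quantize(weights, block=128, bits=4):
    """Split a flat weight list into blocks and quantize each block independently."""
    return [quantize_block(weights[i:i + block], bits)
            for i in range(0, len(weights), block)]


def dequantize(blocks):
    """Reassemble the approximate weights from all quantized blocks."""
    out = []
    for codes, scale, lo in blocks:
        out.extend(dequantize_block(codes, scale, lo))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    w = [rng.gauss(0.0, 0.02) for _ in range(512)]   # synthetic weights, not model data
    blocks = quantize(w)
    restored = dequantize(blocks)
    max_err = max(abs(a - b) for a, b in zip(w, restored))
    print(f"blocks={len(blocks)} max_abs_error={max_err:.6f}")
```

Because each block of 128 weights gets its own scale and offset, the rounding error for any weight is bounded by half of that block's step size, which is why a smaller block size generally preserves more quality at the cost of extra metadata. Assuming a 16-bit scale and a 16-bit offset per block, that metadata adds 32 / 128 = 0.25 bits per weight on top of the 4-bit codes.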