---
license: apache-2.0
language:
- en
tags:
- tokforge
- mnn
- android
- mobile
- speculative-decoding
- qwen3.5
- draft-model
- experimental
- text-generation
pipeline_tag: text-generation
inference: false
---

# Qwen3.5-0.8B-lk-alpha-ep4-MNN

Experimental `Qwen3.5-0.8B` draft bundle for **TokForge + MNN** speculative decoding research.

## Why this repo exists

This repo captures an acceptance-oriented `Qwen3.5-0.8B` draft experiment, exported as a ready-to-run `MNN` bundle.

It is published because people asked for the actual artifacts behind the work, not because it is the current default recommendation.

## Training snapshot

For the associated `LK Alpha` training lane:

- final reported acceptance (`alpha`) was `0.6972` on the small Qwen3.5 dataset

## Status

This bundle is currently best treated as **experimental**:

- useful if you want to inspect the `Qwen3.5-0.8B` draft path
- useful for reproducing training/export experiments
- not currently the top practical mobile recommendation; the `Qwen3-0.6B` draft lane is stronger

## Included files

- `llm.mnn`
- `llm.mnn.weight`
- `llm_config.json`
- `config_cpu.json`
- tokenizer files
- ONNX export artifact for reference

## Notes

- This is an `MNN` runtime bundle for TokForge-style use.
- It is not a standard HF Transformers checkpoint.

## TokForge

- Website: [tokforge.ai](https://tokforge.ai)
- Discord: [Join the Discord](https://discord.gg/Acv3CBtfVm)

If you benchmark this on your own device, feel free to share results in Discord.
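
Before benchmarking, it can help to confirm that a downloaded copy of the bundle contains the files listed under "Included files". The following is a minimal sketch, not part of TokForge or MNN: the function name `missing_bundle_files` and the local directory path are hypothetical, and the tokenizer and ONNX files are not checked because this README does not give their exact names.

```python
from pathlib import Path

# File names copied from the "Included files" list in this README.
# Tokenizer and ONNX artifacts are omitted: their exact names are not stated here.
REQUIRED_FILES = (
    "llm.mnn",
    "llm.mnn.weight",
    "llm_config.json",
    "config_cpu.json",
)


def missing_bundle_files(bundle_dir):
    """Return the required file names that are absent from bundle_dir.

    bundle_dir is assumed to be a local directory holding a downloaded
    copy of this repo; an empty list means all four files were found.
    """
    root = Path(bundle_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

Calling `missing_bundle_files("path/to/bundle")` with the directory you downloaded into returns an empty list when all four files are present, and otherwise the names that still need to be fetched.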