---
license: apache-2.0
language:
- en
tags:
- tokforge
- mnn
- android
- mobile
- speculative-decoding
- qwen3.5
- draft-model
- experimental
- text-generation
pipeline_tag: text-generation
inference: false
---

# Qwen3.5-0.8B-lk-alpha-ep4-MNN

Experimental `Qwen3.5-0.8B` draft bundle for **TokForge + MNN** speculative decoding research.

## Why this repo exists

This repo captures an acceptance-oriented `Qwen3.5-0.8B` draft experiment exported into a ready-to-run `MNN` bundle.

It is here because people asked for the actual artifacts behind the work, not because it is the current default recommendation.

## Training snapshot

For the associated `LK Alpha` training lane:

- final reported acceptance (`alpha`) was `0.6972` on the small Qwen3.5 dataset

## Usage

This bundle is meant for **TokForge / MNN**, not standard HF Inference.

Typical TokForge recipe:

```json
{
  "backend_type": "cpu",
  "thread_num": 2,
  "precision": "low",
  "memory": "low",
  "sampler_type": "greedy",
  "speculative_type": "draftmodel",
  "draft_predict_length": 2,
  "draft_config_path": "/path/to/config_cpu.json"
}
```

## Status

This is currently best treated as **experimental**:

- useful if you want to inspect the `Qwen3.5-0.8B` draft path
- useful for reproducing training/export experiments
- not currently the top practical mobile recommendation versus the stronger `Qwen3-0.6B` draft lane

## Limitations and Intended Use

- This is an experimental `Qwen3.5` draft lane, not the strongest practical mobile draft we have.
- Cross-family drafting was generally weaker than the same-architecture `Qwen3-0.6B -> Qwen3` draft lane.
- Use this for reproducibility and research rather than as the default recommended draft.

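
If you want to try several draft lengths or thread counts, the recipe from the Usage section can be generated from a script rather than edited by hand. The following is a minimal sketch, not part of TokForge: the helper name `build_recipe` and the range checks are hypothetical conveniences, and only the keys and default values are taken from the recipe above. Where TokForge expects the resulting JSON to live is not covered here.

```python
import json

# Keys and values copied from the recipe in the Usage section.
BASE_RECIPE = {
    "backend_type": "cpu",
    "thread_num": 2,
    "precision": "low",
    "memory": "low",
    "sampler_type": "greedy",
    "speculative_type": "draftmodel",
    "draft_predict_length": 2,
    "draft_config_path": "/path/to/config_cpu.json",
}


def build_recipe(draft_config_path, draft_predict_length=2, thread_num=2):
    """Return a copy of the base recipe with a few fields overridden.

    The range checks are plain sanity checks added for this sketch;
    they are assumptions, not limits documented by TokForge or MNN.
    """
    if draft_predict_length < 1:
        raise ValueError("draft_predict_length must be at least 1")
    if thread_num < 1:
        raise ValueError("thread_num must be at least 1")
    recipe = dict(BASE_RECIPE)  # shallow copy so the base stays untouched
    recipe["draft_config_path"] = draft_config_path
    recipe["draft_predict_length"] = draft_predict_length
    recipe["thread_num"] = thread_num
    return recipe


if __name__ == "__main__":
    # Placeholder path, as in the original recipe; replace with your own.
    recipe = build_recipe("/path/to/config_cpu.json", draft_predict_length=3)
    print(json.dumps(recipe, indent=2))
```

Keeping the base recipe as a constant and copying it per variant makes it easy to record exactly which settings were used for each benchmark run.
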
## Collection

- [TokForge Mobile Draft Models](https://huggingface.co/collections/darkmaniac7/tokforge-mobile-draft-models-69c36153ea7084ce78329665)

## Included files

- `llm.mnn`
- `llm.mnn.weight`
- `llm_config.json`
- `config_cpu.json`
- tokenizer files
- ONNX export artifact for reference

## Notes

- This is an `MNN` runtime bundle for TokForge-style use.
- It is not a standard HF Transformers checkpoint.

## TokForge

- Website: [tokforge.ai](https://tokforge.ai)
- Discord: [Join the Discord](https://discord.gg/Acv3CBtfVm)

If you benchmark this on your own device, feel free to share results in Discord.
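
When comparing on-device measurements against the reported training acceptance, it can help to have a rough reference number. The sketch below uses the standard speculative decoding estimate: if each draft token is accepted independently with probability `alpha`, then one target verification pass over `draft_len` drafted tokens yields `1 + alpha + ... + alpha^draft_len` tokens on average. This card does not state how `alpha` was measured, so treating `0.6972` as a per-token acceptance probability is an assumption, as is the independence between positions. The function name is hypothetical, and the result ignores the cost of running the draft model, so it is not a speedup prediction.

```python
def expected_tokens_per_step(alpha, draft_len):
    """Expected tokens emitted per target verification pass.

    Assumes every drafted token is accepted independently with
    probability `alpha`. The k-th term is the chance that the first k
    draft tokens were all accepted; the k = 0 term counts the token the
    target model always contributes.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    return sum(alpha ** k for k in range(draft_len + 1))


if __name__ == "__main__":
    # alpha from the Training snapshot, draft length from the Usage recipe.
    estimate = expected_tokens_per_step(0.6972, 2)
    print(f"{estimate:.2f} tokens per verification pass")  # about 2.18
```

A measured tokens-per-pass figure well below this estimate on your device would suggest that acceptance on your prompts is lower than the training-time number.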