---
license: apache-2.0
language:
- en
tags:
- tokforge
- mnn
- android
- mobile
- speculative-decoding
- qwen3.5
- draft-model
- experimental
- text-generation
pipeline_tag: text-generation
inference: false
---
# Qwen3.5-0.8B-lk-alpha-ep4-MNN
Experimental `Qwen3.5-0.8B` draft bundle for **TokForge + MNN** speculative decoding research.
## Why this repo exists
This repo captures an acceptance-oriented `Qwen3.5-0.8B` draft experiment exported into a ready-to-run `MNN` bundle.
It is here because people asked for the actual artifacts behind the work, not because it is the current default recommendation.
## Training snapshot
For the associated `LK Alpha` training lane:
- final reported acceptance (`alpha`) was `0.6972` on the small Qwen3.5 dataset
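To put the reported `alpha` in context, here is a rough sketch of how an acceptance rate translates into tokens per verification pass. It assumes the common idealization that each drafted token is accepted independently with probability `alpha`, and it uses the `draft_predict_length` of `2` from the Usage recipe. The function name is hypothetical, and the `alpha` reported above may be measured differently than this model assumes, so treat the result as an estimate rather than a benchmark.

```python
def expected_tokens_per_step(alpha: float, draft_len: int) -> float:
    """Expected tokens emitted per target-model verification pass.

    Assumption: every drafted token is accepted independently with
    probability `alpha`. Real acceptance varies by position and prompt.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    # Geometric sum 1 + alpha + ... + alpha**draft_len. The leading 1 is the
    # token the target model always contributes on each pass.
    return sum(alpha ** k for k in range(draft_len + 1))


if __name__ == "__main__":
    # alpha from the training snapshot, draft length from the Usage recipe
    print(round(expected_tokens_per_step(0.6972, 2), 2))
```

Under these assumptions the estimate is a little over two tokens per pass. Actual speedup also depends on the draft model's own cost on the device, which this sketch ignores.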
## Usage
This bundle is meant for **TokForge / MNN**, not standard HF Inference.
Typical TokForge recipe:
```json
{
  "backend_type": "cpu",
  "thread_num": 2,
  "precision": "low",
  "memory": "low",
  "sampler_type": "greedy",
  "speculative_type": "draftmodel",
  "draft_predict_length": 2,
  "draft_config_path": "/path/to/config_cpu.json"
}
```
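If you would rather generate the recipe than edit it by hand, a minimal sketch follows. The helper names `build_recipe` and `write_recipe` and the output file name are hypothetical. How TokForge actually loads the file is not covered here, so adapt the path to your setup.

```python
import json
from pathlib import Path


def build_recipe(draft_config_path: str, draft_predict_length: int = 2) -> dict:
    """Return the recipe shown above with the draft path filled in."""
    if draft_predict_length < 1:
        raise ValueError("draft_predict_length must be at least 1")
    return {
        "backend_type": "cpu",
        "thread_num": 2,
        "precision": "low",
        "memory": "low",
        "sampler_type": "greedy",
        "speculative_type": "draftmodel",
        "draft_predict_length": draft_predict_length,
        "draft_config_path": draft_config_path,
    }


def write_recipe(recipe: dict, out_path: str) -> None:
    """Write the recipe as indented JSON (output location is your choice)."""
    Path(out_path).write_text(json.dumps(recipe, indent=2) + "\n", encoding="utf-8")


if __name__ == "__main__":
    # "/path/to/config_cpu.json" is the placeholder from the recipe above;
    # "recipe.json" is a hypothetical output name.
    write_recipe(build_recipe("/path/to/config_cpu.json"), "recipe.json")
```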
## Status
This is currently best treated as **experimental**:
- useful if you want to inspect the `Qwen3.5-0.8B` draft path
- useful for reproducing training/export experiments
- not currently the top practical mobile recommendation versus the stronger `Qwen3-0.6B` draft lane
## Limitations and Intended Use
- This is an experimental `Qwen3.5` draft lane, not the strongest practical mobile draft we have.
- Cross-family drafting was generally weaker than the same-architecture `Qwen3-0.6B -> Qwen3` draft lane.
- Use this for reproducibility and research rather than as the default recommended draft.
## Collection
- [TokForge Mobile Draft Models](https://huggingface.co/collections/darkmaniac7/tokforge-mobile-draft-models-69c36153ea7084ce78329665)
## Included files
- `llm.mnn`
- `llm.mnn.weight`
- `llm_config.json`
- `config_cpu.json`
- tokenizer files
- ONNX export artifact for reference
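After downloading, a quick sanity check can confirm that the named bundle files are present. This is a sketch with a hypothetical helper name. It only checks the four files named explicitly above, since the exact tokenizer and ONNX file names are not listed here.

```python
from pathlib import Path

# Only the files named explicitly in the list above.
REQUIRED_FILES = ("llm.mnn", "llm.mnn.weight", "llm_config.json", "config_cpu.json")


def missing_bundle_files(bundle_dir: str) -> list[str]:
    """Return the required file names that are not present in bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "." is a placeholder; point this at your local copy of the bundle.
    missing = missing_bundle_files(".")
    print("bundle looks complete" if not missing else f"missing: {missing}")
```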
## Notes
- This is an `MNN` runtime bundle for TokForge-style use.
- It is not a standard HF Transformers checkpoint.
## TokForge
- Website: [tokforge.ai](https://tokforge.ai)
- Discord: [Join the Discord](https://discord.gg/Acv3CBtfVm)
If you benchmark this on your own device, feel free to share results in Discord.