---
license: apache-2.0
language:
- en
tags:
- tokforge
- mnn
- android
- mobile
- speculative-decoding
- qwen3.5
- draft-model
- experimental
- text-generation
pipeline_tag: text-generation
inference: false
---

# Qwen3.5-0.8B-lk-alpha-ep4-MNN

Experimental `Qwen3.5-0.8B` draft bundle for **TokForge + MNN** speculative decoding research.

## Why this repo exists

This repo captures an acceptance-oriented `Qwen3.5-0.8B` draft experiment exported into a ready-to-run `MNN` bundle.

It is here because people asked for the actual artifacts behind the work, not because it is the current default recommendation.

## Training snapshot

For the associated `LK Alpha` training lane:

- final reported acceptance (`alpha`) was `0.6972` on the small Qwen3.5 dataset
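
For a rough sense of what an acceptance figure like this could mean in practice, the sketch below applies the commonly used speculative-decoding estimate of tokens produced per target-model verification pass, `(1 - alpha^(gamma+1)) / (1 - alpha)`. It assumes `alpha` is a per-token acceptance probability that is independent across draft positions, which this repo does not confirm. The draft lengths (`gamma`) are illustrative values, not TokForge settings, and the function name is hypothetical.

```python
def expected_tokens_per_step(alpha: float, gamma: int) -> float:
    """Estimate tokens emitted per target verification pass.

    alpha: assumed per-token acceptance probability (0 <= alpha <= 1).
    gamma: number of tokens the draft model proposes per pass (assumption).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if gamma < 0:
        raise ValueError("gamma must be non-negative")
    if alpha == 1.0:
        # Every draft token is accepted, plus the one token from the target.
        return float(gamma + 1)
    # Geometric series: 1 + alpha + alpha^2 + ... + alpha^gamma
    return (1.0 - alpha ** (gamma + 1)) / (1.0 - alpha)


REPORTED_ALPHA = 0.6972  # value reported above for the LK Alpha lane

for gamma in (2, 4, 8):  # illustrative draft lengths only
    estimate = expected_tokens_per_step(REPORTED_ALPHA, gamma)
    print(f"gamma={gamma}: ~{estimate:.2f} tokens per verification pass")
```

This is an idealized upper-level estimate. It ignores the cost of running the draft model itself, so real on-device speedups will be lower and should be measured directly.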

## Status

This is currently best treated as **experimental**:

- useful if you want to inspect the `Qwen3.5-0.8B` draft path
- useful for reproducing training/export experiments
- not currently the recommended draft for practical mobile use; the `Qwen3-0.6B` draft lane is stronger

## Included files

- `llm.mnn`
- `llm.mnn.weight`
- `llm_config.json`
- `config_cpu.json`
- tokenizer files
- ONNX export artifact for reference
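
After downloading, a quick sanity check can confirm the named files are present before pointing a runtime at the bundle. This is a minimal sketch: the helper name and the local directory are hypothetical, and it only checks the four files named explicitly above, since the tokenizer and ONNX file names are not listed here.

```python
from pathlib import Path

# File names taken from the list above; tokenizer and ONNX names are
# not specified in this README, so they are deliberately not checked.
EXPECTED_FILES = (
    "llm.mnn",
    "llm.mnn.weight",
    "llm_config.json",
    "config_cpu.json",
)


def missing_bundle_files(bundle_dir: str) -> list[str]:
    """Return the expected file names that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_bundle_files("path/to/your/download")` returns an empty list when all four files are present, and otherwise the names that still need to be fetched.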

## Notes

- This is an `MNN` runtime bundle for TokForge-style use.
- It is not a standard HF Transformers checkpoint.

## TokForge

- Website: [tokforge.ai](https://tokforge.ai)
- Discord: [Join the Discord](https://discord.gg/Acv3CBtfVm)

If you benchmark this on your own device, feel free to share results in Discord.