darkmaniac7/Qwen3.5-0.8B-lk-alpha-ep4-MNN

Qwen3.5-0.8B-lk-alpha-ep4-MNN

Experimental Qwen3.5-0.8B draft bundle for TokForge + MNN speculative decoding research.

Why this repo exists

This repo captures an acceptance-oriented Qwen3.5-0.8B draft experiment exported into a ready-to-run MNN bundle.

It is here because people asked for the actual artifacts behind the work, not because it is the current default recommendation.

Training snapshot

For the associated LK Alpha training lane, the final reported acceptance (alpha) was 0.6972 on the small Qwen3.5 dataset.
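
To put the acceptance figure in context, here is a rough back-of-the-envelope sketch. It assumes alpha can be read as a per-token acceptance probability and that drafted tokens are accepted independently; neither assumption is confirmed by this card, and acceptance on real devices will differ. The function name `expected_tokens_per_step` is made up for illustration. With the draft length of 2 used in the TokForge recipe under Usage, the geometric-series estimate comes out to about 2.18 tokens per target verification step.

```python
def expected_tokens_per_step(alpha: float, draft_len: int) -> float:
    """Estimate tokens emitted per target-model verification step.

    Assumption (not from the model card): every drafted token is accepted
    independently with probability `alpha`, and each step always yields
    one token from the target model in addition to the accepted drafts.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if draft_len < 0:
        raise ValueError("draft_len must be non-negative")
    # 1 + alpha + alpha^2 + ... + alpha^draft_len
    return sum(alpha ** k for k in range(draft_len + 1))


# alpha = 0.6972 (reported above), draft length 2 (from the Usage recipe)
print(round(expected_tokens_per_step(0.6972, 2), 4))  # 2.1833
```

This is only an upper-level intuition for why a short draft length can still pay off; it says nothing about wall-clock speedup, which also depends on how expensive the draft model is relative to the target.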

Usage

This bundle is meant for TokForge / MNN, not for standard Hugging Face (HF) inference.

Typical TokForge recipe:

{
  "backend_type": "cpu",
  "thread_num": 2,
  "precision": "low",
  "memory": "low",
  "sampler_type": "greedy",
  "speculative_type": "draftmodel",
  "draft_predict_length": 2,
  "draft_config_path": "/path/to/config_cpu.json"
}
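
If you generate this recipe from a script, for example to sweep `draft_predict_length` across benchmark runs, a minimal sketch might look like the following. The keys and default values are copied from the recipe above; the helper name `build_recipe` is hypothetical, and how TokForge actually loads the resulting JSON is not specified in this card.

```python
import json


def build_recipe(draft_config_path: str,
                 draft_predict_length: int = 2,
                 thread_num: int = 2) -> dict:
    """Build the TokForge recipe shown above as a plain dict.

    `build_recipe` is an illustrative helper, not part of TokForge or MNN.
    """
    if draft_predict_length < 1:
        raise ValueError("draft_predict_length must be at least 1")
    if thread_num < 1:
        raise ValueError("thread_num must be at least 1")
    return {
        "backend_type": "cpu",
        "thread_num": thread_num,
        "precision": "low",
        "memory": "low",
        "sampler_type": "greedy",
        "speculative_type": "draftmodel",
        "draft_predict_length": draft_predict_length,
        # Placeholder path, as in the card; point it at your own config_cpu.json.
        "draft_config_path": draft_config_path,
    }


# Print the default recipe as JSON, matching the block above.
print(json.dumps(build_recipe("/path/to/config_cpu.json"), indent=2))
```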

Status

This is currently best treated as experimental:

useful if you want to inspect the Qwen3.5-0.8B draft path
useful for reproducing training/export experiments
not currently the recommended draft for practical mobile use; the Qwen3-0.6B draft lane is stronger

Limitations and Intended Use

This is an experimental Qwen3.5 draft lane, not the strongest practical mobile draft we have.
Cross-family drafting was generally weaker than the same-architecture Qwen3-0.6B -> Qwen3 draft lane.
Use this for reproducibility and research rather than as the default recommended draft.

Collection

TokForge Mobile Draft Models

Included files

llm.mnn
llm.mnn.weight
llm_config.json
config_cpu.json
tokenizer files
ONNX export artifact for reference
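
Before pointing TokForge at a downloaded copy, it can help to confirm the bundle is complete. The sketch below checks only the four files named explicitly above; the tokenizer and ONNX file names are not listed in this card, so they are deliberately left out. `missing_bundle_files` is a hypothetical helper, not part of TokForge or MNN.

```python
from pathlib import Path
from typing import List

# Files named explicitly in the list above. Tokenizer and ONNX artifact
# names are not given in the card, so they are not checked here.
REQUIRED_FILES = (
    "llm.mnn",
    "llm.mnn.weight",
    "llm_config.json",
    "config_cpu.json",
)


def missing_bundle_files(bundle_dir: str) -> List[str]:
    """Return the required bundle files that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means all four named files are present; anything else lists what still needs to be downloaded.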

Notes

This is an MNN runtime bundle for TokForge-style use.
It is not a standard HF Transformers checkpoint.

TokForge

Website: tokforge.ai
Discord: Join the Discord

If you benchmark this on your own device, feel free to share results in Discord.