darkmaniac7/TokForge-AccelerationPack-Draft

Text Generation

speculative-decoding

TokForge-AccelerationPack-Draft

339 MB

1 contributor

History: 14 commits

Upload llm.mnn.json with huggingface_hub

e20d0d7 verified 2 months ago

| File | Size | Last commit | Updated |
| --- | --- | --- | --- |
| .gitattributes | 1.61 kB | Upload folder using huggingface_hub | 3 months ago |
| README.md | 5 kB | Upload README.md with huggingface_hub | 3 months ago |
| config.json | 211 Bytes | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 3 months ago |
| config_cpu.json | 211 Bytes | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 3 months ago |
| config_opencl.json | 172 Bytes | Add config_opencl.json for OpenCL draft backend support | 3 months ago |
| draft_config_cpu.json | 211 Bytes | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 3 months ago |
| llm.mnn | 504 kB (xet) | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 3 months ago |
| llm.mnn.json | 1.01 MB | Upload llm.mnn.json with huggingface_hub | 2 months ago |
| llm.mnn.weight | 336 MB (xet) | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 3 months ago |
| llm_config.json | 4.66 kB | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 3 months ago |
| runtime_config.json | 1.36 kB | Add runtime_config.json with optimal spec decode settings | 3 months ago |
| tokenizer.txt | 1.61 MB | v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) | 3 months ago |