darkmaniac7/TokForge-AccelerationPack-Draft

Text Generation
English
mnn
speculative-decoding
draft-model
qwen3
tokforge
TokForge-AccelerationPack-Draft
339 MB
  • 1 contributor
History: 14 commits
darkmaniac7
Upload llm.mnn.json with huggingface_hub
e20d0d7 verified 2 months ago
  • .gitattributes
    1.61 kB
    Upload folder using huggingface_hub 3 months ago
  • README.md
    5 kB
    Upload README.md with huggingface_hub 3 months ago
  • config.json
    211 Bytes
    v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) 3 months ago
  • config_cpu.json
    211 Bytes
    v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) 3 months ago
  • config_opencl.json
    172 Bytes
    Add config_opencl.json for OpenCL draft backend support 3 months ago
  • draft_config_cpu.json
    211 Bytes
    v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) 3 months ago
  • llm.mnn
    504 kB
    xet
    v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) 3 months ago
  • llm.mnn.json
    1.01 MB
    Upload llm.mnn.json with huggingface_hub 2 months ago
  • llm.mnn.weight
    336 MB
    xet
    v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) 3 months ago
  • llm_config.json
    4.66 kB
    v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) 3 months ago
  • runtime_config.json
    1.36 kB
    Add runtime_config.json with optimal spec decode settings 3 months ago
  • tokenizer.txt
    1.61 MB
    v3: KL-distilled 0.6B from Qwen3-8B (10K samples, KL=0.339, +41% uplift on SM8850) 3 months ago
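
As a quick sanity check, the per-file sizes listed above can be summed and compared with the 339 MB repository total shown at the top of the listing. The sketch below is only illustrative: it assumes the page uses decimal units (1 kB = 1000 bytes), it works from the rounded sizes as displayed, and the names `UNITS`, `LISTED_SIZES`, and `to_bytes` are hypothetical helpers, not part of any library.

```python
# Assumption: the listing uses decimal units (1 kB = 1000 B, 1 MB = 1000 kB).
UNITS = {"Bytes": 1, "kB": 1_000, "MB": 1_000_000}

# Sizes copied from the file listing as displayed (already rounded by the page).
LISTED_SIZES = {
    ".gitattributes": "1.61 kB",
    "README.md": "5 kB",
    "config.json": "211 Bytes",
    "config_cpu.json": "211 Bytes",
    "config_opencl.json": "172 Bytes",
    "draft_config_cpu.json": "211 Bytes",
    "llm.mnn": "504 kB",
    "llm.mnn.json": "1.01 MB",
    "llm.mnn.weight": "336 MB",
    "llm_config.json": "4.66 kB",
    "runtime_config.json": "1.36 kB",
    "tokenizer.txt": "1.61 MB",
}


def to_bytes(size: str) -> float:
    """Convert a displayed size such as '1.61 kB' into bytes."""
    value, unit = size.split()
    return float(value) * UNITS[unit]


# Sum every listed file and express the result in megabytes.
total_bytes = sum(to_bytes(size) for size in LISTED_SIZES.values())
total_mb = total_bytes / UNITS["MB"]

if __name__ == "__main__":
    print(f"Sum of listed files: {total_mb:.2f} MB")
```

Under these assumptions the sum comes to roughly 339 MB, in line with the displayed total, and almost all of it is the 336 MB `llm.mnn.weight` file.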