FoxlightAI/deepseek-v4-flash-mtp

This repository contains an MTP (multi-token prediction) speculative-decoding sidecar derived from deepseek-ai/DeepSeek-V4-Flash. It is not a standalone model: it provides the multi-token-prediction heads that Skulk uses to speculatively decode for the target model deepseek-ai/DeepSeek-V4-Flash. The heads ship at full precision (bf16, unquantized) because they act as the speculative drafter, where precision drives draft acceptance. As a result, there is one sidecar per base model, and it serves every quantization of the target.

Provenance

| Field | Value |
| --- | --- |
| Artifact type | mtp-sidecar |
| Source model | `deepseek-ai/DeepSeek-V4-Flash` |
| Source revision | `6976c7ff1b30a1b2cb7805021b8ba4684041f136` |
| Target model | `deepseek-ai/DeepSeek-V4-Flash` |
| Extracted with | skulk-weights-publisher `0.1.0` |
| Generated | 2026-06-03T16:45:17Z |

Usage

Skulk loads this sidecar (mtp.safetensors) alongside the target model to enable MTP speculative decoding. It is referenced from the Skulk Weights Publisher catalog and fetched automatically by the Skulk shard downloader; it is not intended to be loaded standalone.
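Although Skulk fetches the sidecar automatically, it can be useful to sanity-check a local copy of mtp.safetensors, for example to confirm that the heads are stored as bf16. The sketch below is not part of Skulk; it relies only on the published safetensors file layout (an 8-byte little-endian header length followed by a JSON header) and the Python standard library. The helper names `read_safetensors_header`, `dtype_counts`, and `non_bf16_tensors` are hypothetical, and the tensor names inside the file are not assumed.

```python
import json
import struct
from collections import Counter


def read_safetensors_header(path):
    """Return the tensor entries from a safetensors file header as a dict."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the JSON header length as a little-endian u64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional string-to-string map, not a tensor entry.
    header.pop("__metadata__", None)
    return header


def dtype_counts(header):
    """Count tensors per dtype string, e.g. Counter({'BF16': n})."""
    return Counter(entry["dtype"] for entry in header.values())


def non_bf16_tensors(header):
    """List the names of tensors whose dtype is not BF16."""
    return sorted(name for name, entry in header.items() if entry["dtype"] != "BF16")
```

Assuming the file has been downloaded to the working directory, `dtype_counts(read_safetensors_header("mtp.safetensors"))` summarizes the stored dtypes, and an empty list from `non_bf16_tensors(...)` is consistent with the unquantized bf16 heads described above. Only the header is read, so no tensor data is loaded into memory.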

License

This artifact is derived from deepseek-ai/DeepSeek-V4-Flash and is published under that model's original license (MIT), preserved unchanged. Refer to the source model's card for the full terms.
