Sehyo/Qwen3.5-397B-A17B-NVFP4

@@ -20,6 +20,7 @@ You might need to build from source as it is not included in the nightly build y
 Alternatively, patch the latest nightly image yourself to include that PR.
 ## Changelog
+- **02/03/2026**: Added MTP (multi-token prediction) weights from source checkpoint, enabling speculative decoding with vLLM.
 - **25/02/2026**: Added missing processor configs (`preprocessor_config.json`, `video_preprocessor_config.json`, `processor_config.json`), `vocab.json`, and restored full `tokenizer_config.json` from the base model. Fixes vision/video input support and tokenizer loading issues.
 - **22/02/2026**: Re-quantized with improved calibration data and parameters. Fixed 14 Inf `input_global_scale` values caused by rarely-activated experts receiving all-zero activations during calibration. All 92,400 scale tensors now valid. Fixed `tokenizer_class` and added `mlp.gate` to quantization ignore list in `config.json`.
 - **20/02/2026**: Reuploaded weights with some issues fixed.
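
The 22/02/2026 entry attributes the Inf `input_global_scale` values to experts that received all-zero activations during calibration. The following is a minimal sketch of that failure mode and of a validity check. It assumes the global scale is computed as a constant divided by the observed absolute maximum. The constants, tensor names, and helper functions are illustrative assumptions, not taken from the tooling used to quantize this checkpoint.

```python
import math

# Assumed constants: largest representable FP8 (E4M3) and FP4 (E2M1) magnitudes.
FP8_E4M3_MAX = 448.0
FP4_E2M1_MAX = 6.0


def global_scale(activation_amax: float) -> float:
    """Hypothetical scale rule: a constant divided by the calibration abs-max."""
    if activation_amax == 0.0:
        # Tensor libraries return Inf for float division by zero;
        # plain Python raises instead, so the result is modelled explicitly.
        return math.inf
    return FP8_E4M3_MAX * FP4_E2M1_MAX / activation_amax


def find_invalid_scales(scales: dict[str, float]) -> list[str]:
    """Return the names of input_global_scale entries that are Inf or NaN."""
    return sorted(
        name
        for name, value in scales.items()
        if name.endswith("input_global_scale") and not math.isfinite(value)
    )


# Toy calibration statistics (hypothetical names): expert 1 was never routed
# to during calibration, so its observed abs-max stayed at zero.
amax_by_tensor = {
    "experts.0.input_global_scale": 3.5,
    "experts.1.input_global_scale": 0.0,
    "experts.2.input_global_scale": 0.25,
}

scales = {name: global_scale(amax) for name, amax in amax_by_tensor.items()}
bad = find_invalid_scales(scales)
print(bad)
```

Running a check of this kind over every scale tensor after quantization is one way to confirm a claim such as "all scale tensors now valid". How the invalid entries were actually repaired in this checkpoint is not stated in the changelog.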