inferencerlabs committed on
Commit 4e96025 · verified · 1 Parent(s): 3f971b8

Upload model file

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -8,13 +8,13 @@ tags:
  - speculative-decoding
  - draft-model
  base_model:
- - Qwen/Qwen3.6-27B
+ - Qwen/Qwen3.6-35B-A3B
  pipeline_tag: image-text-to-text
  ---
- # Qwen3.6-27B MTP
- See Qwen3.6-27B with MTP in action: [demonstration videos](https://youtube.com/xcreate)
+ # Qwen3.6-35B-A3B MTP
+ See Qwen3.6-35B-A3B with MTP in action: [demonstration videos](https://youtube.com/xcreate)
 
- This draft model contains the extracted **Multi-Token Prediction (MTP)** layers from **[Qwen/Qwen3.6-27B](https://huggingface.co/Qwen/Qwen3.6-27B)** for use alongside the [Qwen3.6-27B-MLX](https://huggingface.co/models?search=inferencerlabs/qwen3.6-27b-mlx) model as a speculative decoder for improved performance.
+ This draft model contains the extracted **Multi-Token Prediction (MTP)** layers from **[Qwen/Qwen3.6-35B-A3B](https://huggingface.co/Qwen/Qwen3.6-35B-A3B)** for use alongside the [Qwen3.6-35B-A3B-MLX](https://huggingface.co/models?search=inferencerlabs/qwen3.6-35B-A3B-mlx) model as a speculative decoder for improved performance.
 
  #### Tested on a M3 Ultra 512GB RAM using [Inferencer app v1.11.5](https://inferencer.com)
  <table style="border-collapse: collapse; border: none; text-align:left; margin-top:10px; margin-bottom:0px;">