Qwen3.6-35B-A3B REAP Pruned Ratio 0.5

This repository contains a REAP-pruned version of Qwen/Qwen3.6-35B-A3B. The routed experts were pruned with REAP (Router-weighted Expert Activation Pruning), which scores each routed expert from its router weights and expert activation norms.
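
As a rough illustration of the scoring idea, the sketch below computes a per-expert score as the mean of (router gate weight × expert output norm) over the tokens routed to that expert, then keeps the highest-scoring experts. This is a toy under stated assumptions, not the code used to build this checkpoint: the function names, the dict-per-token input layout, and the exact form of the score are all illustrative.

```python
def reap_scores(gate_weights, expert_out_norms, num_experts):
    """Toy router-weighted activation score per expert.

    gate_weights[t] and expert_out_norms[t] are dicts {expert_id: value}
    covering only the experts selected for token t (hypothetical layout).
    """
    totals = [0.0] * num_experts
    counts = [0] * num_experts
    for gates, norms in zip(gate_weights, expert_out_norms):
        for expert_id, gate in gates.items():
            totals[expert_id] += gate * norms[expert_id]
            counts[expert_id] += 1
    # Experts that never received a token get a score of 0.0.
    return [totals[e] / counts[e] if counts[e] else 0.0 for e in range(num_experts)]


def experts_to_keep(scores, ratio):
    """Return the sorted ids of the experts that survive pruning at `ratio`."""
    n_keep = len(scores) - int(len(scores) * ratio)
    ranked = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)
    return sorted(ranked[:n_keep])
```

With 256 routed experts and a ratio of 0.5, `experts_to_keep` returns 128 expert ids per layer.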

Pruning Settings

| Setting | Value |
| --- | --- |
| Base model | Qwen/Qwen3.6-35B-A3B |
| Compression / pruning ratio | 0.50 |
| Pruning method | reap |
| Calibration samples | 1024 |
| Calibration sequence length | 2048 |
| Seed | 42 |
| Router weight renormalization | true |
| Routed experts per MoE layer | 256 -> 128 |
| Routed experts selected per token | 8 |
| Shared experts | Preserved |
| Precision | BF16 |
| Quantization | None |
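
To make the routing-related settings concrete, here is a minimal sketch of top-k routing over the remaining experts. It assumes that "router weight renormalization" means the selected gate probabilities are rescaled to sum to 1, and that gates come from a softmax over all remaining experts; both are assumptions for illustration, and the `route` function is hypothetical.

```python
import math


def route(logits, top_k, renormalize=True):
    """Pick the top_k experts for one token and return (ids, weights)."""
    # Numerically stable softmax over the experts that remain after pruning.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    ids = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    weights = [probs[i] for i in ids]
    if renormalize:
        # Assumed meaning of "router weight renormalization": rescale to sum to 1.
        kept = sum(weights)
        weights = [w / kept for w in weights]
    return ids, weights
```

Under these assumptions, each token still selects 8 experts after pruning, but it chooses them from 128 candidates per layer instead of 256.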

Calibration Data

The calibration set follows the mixture used in the REAP paper and code, with 1024 samples in total:

  • theblackcat102/evol-codealpaca-v1: 171 samples
  • Salesforce/xlam-function-calling-60k: 171 samples
  • open-r1/Mixture-of-Thoughts[code]: 171 samples
  • open-r1/Mixture-of-Thoughts[math]: 171 samples
  • open-r1/Mixture-of-Thoughts[science]: 170 samples
  • SWE-bench/SWE-smith-trajectories(tool): 170 samples
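
The per-source counts above are consistent with splitting 1024 samples as evenly as possible across six sources, with the remainder going to the first sources in the list. The sketch below shows that arithmetic; whether the counts were actually computed this way is an assumption, and `split_evenly` is a hypothetical helper.

```python
def split_evenly(total, sources):
    """Distribute `total` samples across `sources`, giving any remainder to the first ones."""
    base, extra = divmod(total, len(sources))
    return {name: base + (1 if i < extra else 0) for i, name in enumerate(sources)}


SOURCES = [
    "theblackcat102/evol-codealpaca-v1",
    "Salesforce/xlam-function-calling-60k",
    "open-r1/Mixture-of-Thoughts[code]",
    "open-r1/Mixture-of-Thoughts[math]",
    "open-r1/Mixture-of-Thoughts[science]",
    "SWE-bench/SWE-smith-trajectories(tool)",
]

counts = split_evenly(1024, SOURCES)
```

Here `counts` assigns 171 samples to each of the first four sources and 170 to each of the last two.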

Integration Notes

This checkpoint was generated with REAP support for the packed expert layout used by Qwen3.5/Qwen3.6. The packed routed-expert tensors and the corresponding router rows were sliced down to the retained experts, while the shared expert and the vision-language configuration were preserved. The saved model uses the Transformers qwen3_5_moe architecture and includes tokenizer and processor files.
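
The slicing step can be pictured with the toy sketch below, which uses nested Python lists in place of tensors. It assumes the expert index is the first axis of each packed tensor and that the router weight has one row per expert; the actual tensor names and layouts in the checkpoint may differ, and both helper functions are hypothetical.

```python
def slice_moe_layer(packed_experts, router_weight, keep):
    """Keep only the experts in `keep`, in order, for one MoE layer.

    packed_experts: list indexed by expert id (first axis of a packed tensor).
    router_weight:  list of rows, one row of hidden-size weights per expert.
    With real tensors this would be an index selection along axis 0.
    """
    new_experts = [packed_experts[e] for e in keep]
    new_router = [router_weight[e] for e in keep]
    return new_experts, new_router


def router_logits(router_weight, hidden):
    """Dot product of each router row with one token's hidden state."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in router_weight]
```

Slicing experts and router rows with the same index list keeps them aligned, so a retained expert's router logit is unchanged and only its position in the layer shifts.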

Citation

@inproceedings{
    lasby2026reap,
    title={{REAP} the Experts: Why Pruning Prevails for One-Shot MoE compression},
    author={Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
    booktitle={The Fourteenth International Conference on Learning Representations},
    year={2026},
    url={https://openreview.net/forum?id=ukGxWd2aDG}
}