---
library_name: mlx
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen3.5-35B-A3B/blob/main/LICENSE
pipeline_tag: image-text-to-text
base_model: Qwen/Qwen3.5-35B-A3B
tags:
- mlx
- qwen3.5
- moe
- 2bit
- quantized
---

# Qwen3.5-35B-A3B-MLX-2bit

This is an **MLX Q2** quantized version *(2.504 average bits per parameter)* of [Qwen/Qwen3.5-35B-A3B](https://huggingface.co/Qwen/Qwen3.5-35B-A3B), intended to fit in 16 GB of unified memory.

*Quantized with [mlx-lm](https://github.com/ml-explore/mlx-lm).*

## Usage

```python
from mlx_lm import load, generate

# Download (on first use) and load the quantized weights and tokenizer.
model, tokenizer = load("MercuriusDream/Qwen3.5-35B-A3B-MLX-2bit")

# Wrap the user message in the model's chat template.
messages = [{"role": "user", "content": "Hello!"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# verbose=True prints the response as it is generated.
text = generate(model, tokenizer, prompt=prompt, verbose=True)
```
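
As a rough sanity check on the 16 GB target, the weight footprint can be estimated from the average bits per parameter. The sketch below is illustrative only: the parameter count of about 35 billion is an assumption read off the model name, not a measured value, and the helper `estimate_weight_bytes` is hypothetical. The estimate covers weights only and excludes the KV cache, activations, and runtime overhead.

```python
def estimate_weight_bytes(num_params: float, bits_per_param: float) -> float:
    """Approximate size of the quantized weights in bytes (weights only)."""
    return num_params * bits_per_param / 8


# Assumption: ~35e9 parameters, inferred from the "35B" in the model name.
ASSUMED_NUM_PARAMS = 35e9
# Average bits per parameter, as stated in this model card.
AVG_BITS_PER_PARAM = 2.504

weight_bytes = estimate_weight_bytes(ASSUMED_NUM_PARAMS, AVG_BITS_PER_PARAM)
print(f"Estimated weights: {weight_bytes / 1e9:.2f} GB ({weight_bytes / 2**30:.2f} GiB)")
```

Under these assumptions the weights come to roughly 11 GB, which leaves only a few gigabytes of a 16 GB machine for the operating system, the KV cache, and other applications, so long contexts may still run out of memory.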