---
language: 
  - "en"
  - "zh"
license: apache-2.0
base_model: "Qwen/Qwen3.5-27B"
tags:
  - qwen3.5
  - mlx
  - mlx-lm
  - mlx-vlm
  - Mac Mini M4 Pro
  - unsloth
  - chain-of-thought
  - distillation
  - reasoner
  - Dense
  - oMLX
pipeline_tag: image-text-to-text
library_name: mlx
---

# Qwen3.5-27B-Claude-Opus-4.6-High-Reasoning-MLX-4bit

Qwen3.5-27B distilled on Claude Opus 4.6 reasoning patterns and quantized to 4-bit for MLX.


## Model Details
- **Original Model:** [Qwen/Qwen3.5-27B](https://huggingface.co/Qwen/Qwen3.5-27B)
- **Quantization:** 4-bit (~4.501 bits per weight)
- **Group Size:** 64
- **Format:** MLX SafeTensors
- **Framework:** [mlx-lm](https://github.com/ml-explore/mlx-lm)
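
The effective ~4.5 bits per weight is higher than the nominal 4 bits because group-wise quantization stores extra metadata for every group. The sketch below works through that arithmetic. It assumes each group of 64 weights carries one 16-bit scale and one 16-bit bias, and it uses the nominal 27B parameter count; neither value is stated in this card, so treat the result as an estimate.

```python
def effective_bits_per_weight(bits: int, group_size: int, meta_bits: int = 16) -> float:
    """Bits per weight including one scale and one bias per group.

    meta_bits is the ASSUMED width of each scale/bias value (16-bit here).
    """
    return bits + 2 * meta_bits / group_size


def approx_weight_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# 4-bit weights with group size 64, as listed above.
bpw = effective_bits_per_weight(bits=4, group_size=64)
print(bpw)

# 27e9 is the nominal parameter count (an assumption); 4.501 is the figure reported above.
print(round(approx_weight_size_gb(27e9, 4.501), 1))
```

Under these assumptions the quantized layers come to 4.5 bits per weight and the weights occupy roughly 15.2 GB. The small remainder up to 4.501 plausibly comes from tensors kept in higher precision, though this card does not say which ones.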


## Evaluation Results

- **Evaluator**: Qwen3-Coder-Next
- **Test Samples**: 260 (drawn from a personal dataset; results are for reference only)
- **Date**: 2026-03-29

### Overall Performance

| Metric | Base Model | Distilled Model | Improvement |
|--------|-----------:|----------------:|------------:|
| **Win Rate** | 25.77% | **73.85%** | +48.08 pp |
| Avg Latency (s) | 71.70 | **68.47** | -4.5% |

### Score Breakdown (10-point scale)

| Dimension | Base Model | Distilled Model | Improvement |
|-----------|-----------:|----------------:|------------:|
| Accuracy | 6.35 | **8.59** | +35.3% |
| Logic | 6.47 | **8.69** | +34.3% |
| Completeness | 5.70 | **8.82** | +54.7% |
| Clarity | 6.51 | **8.43** | +29.5% |
| Actionability | 5.85 | **8.56** | +46.3% |
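
The Improvement column in this table appears to be the relative change of the distilled score over the base score. A minimal sketch that recomputes it from the two score columns (the function name is illustrative, not part of any evaluation tooling):

```python
def relative_improvement(base: float, distilled: float) -> float:
    """Percent change of the distilled score relative to the base score."""
    return (distilled - base) / base * 100


# (base, distilled) mean scores copied from the table above.
scores = {
    "Accuracy": (6.35, 8.59),
    "Logic": (6.47, 8.69),
    "Completeness": (5.70, 8.82),
    "Clarity": (6.51, 8.43),
    "Actionability": (5.85, 8.56),
}

for name, (base, distilled) in scores.items():
    print(f"{name}: {relative_improvement(base, distilled):+.1f}%")
```

The Win Rate row under Overall Performance is a different kind of figure: it is the difference between two percentages (73.85 - 25.77 = 48.08 percentage points), not a relative change.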

### Performance by Category

| Category | Base Win Rate | Distilled Win Rate | Samples |
|----------|--------------:|-------------------:|--------:|
| debug | 22.67% | **76.00%** | 75 |
| design | 51.32% | 48.68% | 76 |
| prompt | 5.80% | **94.20%** | 69 |
| reasoning | 17.50% | **82.50%** | 40 |
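
The per-category rows can be cross-checked against the overall win rates by weighting each category by its sample count. A short sketch of that check, using only the numbers in the table above (the helper name is illustrative):

```python
# (base win rate %, distilled win rate %, samples) copied from the table above.
categories = {
    "debug": (22.67, 76.00, 75),
    "design": (51.32, 48.68, 76),
    "prompt": (5.80, 94.20, 69),
    "reasoning": (17.50, 82.50, 40),
}


def weighted_win_rate(rates_and_counts):
    """Sample-weighted mean of per-category win rates, in percent."""
    total = sum(n for _, n in rates_and_counts)
    return sum(rate * n for rate, n in rates_and_counts) / total


base = weighted_win_rate([(b, n) for b, _, n in categories.values()])
distilled = weighted_win_rate([(d, n) for _, d, n in categories.values()])
total_samples = sum(n for _, _, n in categories.values())

print(total_samples, round(base, 2), round(distilled, 2))
```

The sample counts sum to 260, and the weighted rates agree with the overall 25.77% and 73.85% to within the rounding of the per-category percentages.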

## Quick Start with oMLX

### Installation

```bash
brew tap jundot/omlx https://github.com/jundot/omlx
brew install omlx
```

Alternatively, download the DMG build from the release page:
```text
https://github.com/jundot/omlx/releases/tag/v0.3.0
```


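### Use the API

From a separate terminal, with the oMLX server already running on `localhost:8000`, send a request to the OpenAI-style chat completions endpoint: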
```bash
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "default",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "What is the capital of France?"}
        ],
        "temperature": 0.7,
        "max_tokens": 256
    }'
```
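
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl call above; the URL, model name, and sampling parameters are copied from that example, and the response parsing assumes an OpenAI-style body with `choices[0].message.content`, which is not confirmed by this card.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "http://localhost:8000/v1/chat/completions"


def build_payload(user_message: str,
                  system_message: str = "You are a helpful assistant.",
                  temperature: float = 0.7,
                  max_tokens: int = 256) -> dict:
    """Build the same JSON body as the curl example."""
    return {
        "model": "default",
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def chat(user_message: str, url: str = API_URL, timeout: float = 120.0) -> str:
    """POST a chat request and return the assistant's reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumes an OpenAI-style response layout.
    return body["choices"][0]["message"]["content"]


# Show the request body without contacting the server.
print(json.dumps(build_payload("What is the capital of France?"), indent=2))
```

With the server running, `chat("What is the capital of France?")` sends the request and returns the reply text. Keeping payload construction separate from the network call makes the request body easy to inspect or adjust before sending.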

## License

This model is licensed under Apache-2.0.


---

*This model was distilled to capture Claude Opus 4.6's reasoning capabilities while maintaining Qwen3.5-27B's efficiency and multilingual support. These MLX-format weights are intended for inference with an MLX-based runtime such as mlx-lm or oMLX.*