---
language:
- "en"
- "zh"
license: apache-2.0
base_model: "Qwen/Qwen3.5-27B"
tags:
- qwen3.5
- mlx
- mlx-lm
- mlx-vlm
- Mac Mini M4 Pro
- unsloth
- chain-of-thought
- distillation
- reasoner
- Dense
- oMLX
pipeline_tag: image-text-to-text
library_name: mlx
---

# Qwen3.5-27B-Claude-Opus-4.6-High-Reasoning-MLX-4bit

A version of Qwen3.5-27B distilled on Claude Opus 4.6 reasoning patterns and quantized to 4-bit MLX format.

## Model Details

- **Original Model:** [Qwen/Qwen3.5-27B](https://huggingface.co/Qwen/Qwen3.5-27B)
- **Quantization:** 4-bit (~4.501 bits per weight)
- **Group Size:** 64
- **Format:** MLX SafeTensors
- **Framework:** [mlx-lm](https://github.com/ml-explore/mlx-lm)

## Evaluation Results

- **Evaluator:** Qwen3-Coder-Next
- **Test Samples:** 260 (drawn from a personal dataset; for reference only)
- **Date:** 2026-03-29

### Overall Performance

| Metric | Base Model | Distilled Model | Improvement |
|--------|-----------:|----------------:|------------:|
| **Win Rate** | 25.77% | **73.85%** | +48.08 pp |
| Avg Latency (s) | 71.70 | **68.47** | -4.5% |

The win-rate improvement is an absolute difference in percentage points; the latency improvement is relative to the base model.

### Score Breakdown (10-point scale)

Improvements are relative to the base model's score.

| Dimension | Base Model | Distilled Model | Improvement |
|-----------|-----------:|----------------:|------------:|
| Accuracy | 6.35 | **8.59** | +35.3% |
| Logic | 6.47 | **8.69** | +34.3% |
| Completeness | 5.70 | **8.82** | +54.7% |
| Clarity | 6.51 | **8.43** | +29.5% |
| Actionability | 5.85 | **8.56** | +46.3% |

### Performance by Category

| Category | Base Win Rate | Distilled Win Rate | Samples |
|----------|--------------:|-------------------:|--------:|
| debug | 22.67% | **76.00%** | 75 |
| design | 51.32% | 48.68% | 76 |
| prompt | 5.80% | **94.20%** | 69 |
| reasoning | 17.50% | **82.50%** | 40 |

## Quick Start with oMLX

### Installation

```bash
brew tap jundot/omlx https://github.com/jundot/omlx
brew install omlx
```

Alternatively, download the DMG from the v0.3.0 release:

```text
https://github.com/jundot/omlx/releases/tag/v0.3.0
```

### In another terminal, use the API

This step assumes the oMLX server is already running and listening on `localhost:8000`.

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 256
  }'
```

## License

This model is licensed under Apache-2.0.

---

*This model was distilled to capture Claude Opus 4.6's reasoning capabilities while maintaining Qwen3.5-27B's efficiency and multilingual support. Because the weights are in MLX format, inference with mlx-lm or oMLX on Apple silicon is recommended.*
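
## Python equivalent of the curl request

The curl request in the Quick Start can also be issued from Python. The sketch below uses only the standard library and mirrors the same URL, headers, and JSON body. The helper names `build_request` and `chat` are hypothetical, and the response parsing assumes the server returns an OpenAI-style body (`choices[0].message.content`), which the `/v1/chat/completions` path suggests but this card does not confirm.

```python
import json
import urllib.request

# URL copied from the curl example; adjust it if your server listens elsewhere.
BASE_URL = "http://localhost:8000/v1/chat/completions"


def build_request(prompt, model="default", temperature=0.7, max_tokens=256):
    """Build a POST request carrying the same JSON body as the curl example."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, timeout=120):
    """Send the prompt and return the reply text.

    Assumption: the response follows the OpenAI chat-completions shape,
    i.e. the text lives at choices[0].message.content.
    """
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("What is the capital of France?"))` sends the same question as the curl example. Keeping `build_request` separate from `chat` makes it possible to inspect the payload without a live server.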