How to use from
Pi
Start the MLX server
# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "AITRADER/Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated-mlx-mxfp8"
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "AITRADER/Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated-mlx-mxfp8"
        }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
Quick Links

Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated MLX MXFP8

MXFP8 (Microscaling FP8) quantized MLX version of Huihui-Qwen3.5-27B-Claude-4.6-Opus-abliterated.

Model Details

  • Architecture: Qwen 3.5 27B (hybrid linear attention + full attention)
  • Quantization: MXFP8 (E4M3 with block-level scaling), group_size=32
  • Size: ~29 GB
  • Context Length: 262,144 tokens
  • Vision: Full image and video understanding via integrated vision tower (27 ViT blocks, kept in bf16)
  • Tool Use: Native function calling support
  • Thinking: Chain-of-thought reasoning mode

Why MXFP8?

MXFP8 uses floating-point (E4M3) representation with per-block scaling instead of fixed-point integer quantization. This gives:

  • Better handling of outlier weights (exponent absorbs magnitude)
  • Lower quantization error across varying tensor ranges
  • Native hardware acceleration on modern chips

Capabilities

  • Image understanding and description
  • Video understanding
  • Tool use / function calling
  • Multi-step agent reasoning
  • Thinking/reasoning mode
  • Multilingual support
  • Long context (262K tokens)

Usage

Works with LM Studio, MLX, and other MLX-compatible frameworks.

Downloads last month
276
Safetensors
Model size
27B params
Tensor type
U8
U32
BF16
F32
MLX
Hardware compatibility
Log In to add your hardware

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 馃檵 Ask for provider support