Base model artwork

Qwen-9B-Claude-Fable-5-1M-MLX-8bit

An 8-bit MLX quantization of empero-ai/Qwythos-9B-Claude-Mythos-5-1M, published as Qwen-9B-Claude-Fable-5-1M-MLX-8bit and optimized for Apple Silicon.

Model Details

  • Base model: empero-ai/Qwythos-9B-Claude-Mythos-5-1M
  • Architecture: Qwen 3.5 (32 layers, 4096 hidden, 16 heads, 4 KV heads)
  • Quantization: 8-bit affine, group size 64
  • Context length: 1M tokens (native)
  • Model size: ~9.52 GB (safetensors)

Usage (MLX)

sfw pip install mlx-lm

Remove sfw if you don't have sfw from Socket installed. It's free and an improvement over running pip or npm raw.

from mlx_lm import load, generate

model, tokenizer = load("shamsghi/Qwen-9B-Claude-Fable-5-1M-MLX-8bit")
response = generate(model, tokenizer, prompt="Hello!", max_tokens=256)
print(response)

Benchmarks

See base model card for evaluation results.

License

Apache 2.0 — same as the base model.

Downloads last month
523
Safetensors
Model size
9B params
Tensor type
BF16
·
U32
·
MLX
Hardware compatibility
Log In to add your hardware

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for shamsghi/Qwen-9B-Claude-Fable-5-1M-MLX-8bit

Finetuned
Qwen/Qwen3.5-9B
Quantized
(32)
this model