How to use amityrobotics/qwen3-omni-arc-asr-v4-mlx-4bit with MLX:
# Download the model from the Hub (the quotes keep zsh from globbing the brackets)
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir qwen3-omni-arc-asr-v4-mlx-4bit amityrobotics/qwen3-omni-arc-asr-v4-mlx-4bit
Qwen3-Omni ARC ASR v4 — MLX 4-bit
MLX 4-bit quantized version of amityrobotics/qwen3-omni-arc-asr-v4.
Fine-tuned from v3 with a Korean LoRA adapter (rank=32, alpha=64) merged into all attention projections (q_proj, k_proj, v_proj, o_proj) across the thinker, talker, code_predictor, code2wav, and audio_tower modules.
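To make "merged" concrete, here is a minimal NumPy sketch of folding a LoRA update into a dense projection weight. It assumes the standard LoRA scaling of alpha / rank, which the card does not state explicitly. The function name, the toy matrix sizes, and the random values are all hypothetical, and this is not the authors' merge script.

```python
import numpy as np

def merge_lora(weight, lora_a, lora_b, rank=32, alpha=64):
    """Fold a LoRA update into a dense weight: W' = W + (alpha / rank) * B @ A.

    Assumes the standard LoRA scaling convention (alpha / rank).
    """
    scale = alpha / rank  # 64 / 32 = 2.0 for the values quoted on this card
    return weight + scale * (lora_b @ lora_a)

# Toy shapes for illustration only; real projection matrices are much larger.
rng = np.random.default_rng(0)
d_out, d_in, rank = 64, 128, 32
weight = rng.standard_normal((d_out, d_in))
lora_a = rng.standard_normal((rank, d_in))   # down-projection A
lora_b = rng.standard_normal((d_out, rank))  # up-projection B

merged = merge_lora(weight, lora_a, lora_b)

# After merging, one matmul reproduces the base path plus the adapter path.
x = rng.standard_normal(d_in)
adapter_path = weight @ x + 2.0 * (lora_b @ (lora_a @ x))
merged_path = merged @ x
```

Because the update is folded into the weights, the merged model needs no adapter files at inference time and has the same shape as the base checkpoint.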
Quantization
- Format: MLX safetensors
- Bits: 4 (affine, group_size=64)
- Size: ~20GB (5 shards)
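The settings above can be illustrated with a simplified NumPy sketch of 4-bit affine quantization with groups of 64 weights. Each group stores one scale and one bias, and each weight is reconstructed as `q * scale + bias`. This is an approximation for illustration and not MLX's actual kernel. The function names are hypothetical, and the 16-bit width assumed for scales and biases is an assumption, not something the card states.

```python
import numpy as np

def quantize_affine(w, bits=4, group_size=64):
    """Quantize each group of `group_size` weights to `bits`-bit integers."""
    groups = w.reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    levels = 2**bits - 1                      # 15 steps for 4 bits
    scale = (w_max - w_min) / levels
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant groups
    q = np.clip(np.round((groups - w_min) / scale), 0, levels).astype(np.uint8)
    return q, scale, w_min                    # w_min acts as the per-group bias

def dequantize_affine(q, scale, bias, shape):
    """Reconstruct approximate weights from integers, scales, and biases."""
    return (q * scale + bias).reshape(shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 128)).astype(np.float32)
q, scale, bias = quantize_affine(w)
w_hat = dequantize_affine(q, scale, bias, w.shape)

# Storage cost per weight, assuming 16-bit scale and 16-bit bias per group.
bits_per_weight = 4 + (16 + 16) / 64
```

Under that storage assumption, each quantized weight costs 4.5 bits on average rather than 4, because the per-group scale and bias are amortized over 64 weights.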
Benchmark Results (478-case balanced test set)
| Language | v3 Baseline | v4 (this model) | Delta (percentage points) |
|---|---|---|---|
| Overall | 90.6% | 93.5% | +2.9 |
| en-US | 98.5% | 95.5% | -3.0 |
| ko-KR | 83.3% | 92.3% | +9.0 |
| zh-CN | 85.5% | 92.0% | +6.5 |
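The Delta column is the simple difference between the two scores, in percentage points, not a relative change. A short sketch that recomputes it from the table values (the variable names are illustrative):

```python
# Scores copied from the table above: (v3 baseline, v4).
results = {
    "Overall": (90.6, 93.5),
    "en-US": (98.5, 95.5),
    "ko-KR": (83.3, 92.3),
    "zh-CN": (85.5, 92.0),
}

deltas = {}
for language, (v3, v4) in results.items():
    # Absolute difference in percentage points, rounded to one decimal.
    deltas[language] = round(v4 - v3, 1)
    print(f"{language}: {deltas[language]:+.1f} pp")
```

Note that the Korean and Chinese gains come with a 3.0-point drop on en-US, so the overall figure depends on how the balanced test set weights each language.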
Usage
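# Requires the mlx-vlm package: pip install mlx-vlm
from mlx_vlm import load

# Fetches the weights from the Hub on first use, then returns the model and its processor
model, processor = load("amityrobotics/qwen3-omni-arc-asr-v4-mlx-4bit")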
- Model size: 7B params
- Tensor type: BF16, U32
Model tree for amityrobotics/qwen3-omni-arc-asr-v4-mlx-4bit
- Base model: amityrobotics/qwen3-omni-arc-asr-v3
- Finetuned: amityrobotics/qwen3-omni-arc-asr-v4