mlx-community/htdemucs-ft-vocals-mlx

The vocal branch of Meta's HTDemucs v4 (htdemucs_ft) converted to MLX format for native inference on Apple Silicon, consumed by the xocialize/demucs-mlx-swift Swift port. Refer to the original Demucs repository for details on the model.

Model

  • Family: Hybrid Transformer Demucs (HTDemucs v4) — Rouard, Massa, Défossez, "Hybrid Transformers for Music Source Separation," arXiv:2211.08553
  • Checkpoint: htdemucs_ft (fine-tuned), vocal branch
  • Sample rate: 44100 Hz, stereo
  • Stems produced: vocals only (derive the instrumental by subtracting the vocals from the mixture: mixture - vocals)
  • Precision: fp16 (573 tensors)
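
Because the model produces only the vocal stem, the instrumental has to be computed by the caller. The following is a minimal sketch of that subtraction, assuming both signals are available as planar [channel][sample] Float arrays of identical shape. The function name `instrumental` and the plain-array layout are illustrative assumptions and not part of the Swift port's API. With MLX arrays the same step would normally be a single element-wise subtraction.

```swift
/// Hypothetical helper: subtracts the separated vocals from the mixture, sample by sample.
/// Both buffers are assumed to be planar [channel][sample] arrays with matching shapes.
func instrumental(mixture: [[Float]], vocals: [[Float]]) -> [[Float]] {
    precondition(mixture.count == vocals.count, "channel count mismatch")
    return zip(mixture, vocals).map { (mixChannel, vocalChannel) -> [Float] in
        precondition(mixChannel.count == vocalChannel.count, "sample count mismatch")
        return zip(mixChannel, vocalChannel).map { $0 - $1 }
    }
}

// Tiny stereo example with three samples per channel (made-up values).
let mixture: [[Float]] = [[0.5, 0.25, -0.5], [0.5, 0.0, -0.25]]
let vocals: [[Float]] = [[0.25, 0.25, -0.25], [0.0, 0.0, -0.25]]
let backing = instrumental(mixture: mixture, vocals: vocals)
print(backing)
```

The subtraction is only meaningful when the two signals are time-aligned, share the same sample rate (44100 Hz here), and have the same length, which is why the sketch checks the shapes before subtracting.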

Files

  • htdemucs_ft_vocals.safetensors — the MLX weights (fp16).

Usage (Swift / MLX)

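import SwiftDemucs
import Hub

// Download the weights from the Hugging Face Hub and get the local snapshot directory.
let dir = try await HubApi().snapshot(from: "mlx-community/htdemucs-ft-vocals-mlx")
// Load the vocal separator from that directory.
let separator = try await VocalSeparator(weightsDirectory: dir)
// `mixture` is the caller's input audio, defined elsewhere.
let vocals = try await separator.separate(samples: mixture)   // [1, 2, N] @ 44.1 kHz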

Source

License

MIT — both the Demucs model weights (Meta) and the MLX port code are MIT-licensed.
