mlx-community/htdemucs-ft-vocals-mlx

This repository contains the vocal branch of Meta's HTDemucs v4 (htdemucs_ft), converted to MLX format for native inference on Apple Silicon. The weights are consumed by the xocialize/demucs-mlx-swift Swift port. Refer to the original Demucs repository for details on the model.

Model

Family: Hybrid Transformer Demucs (HTDemucs v4) — Rouard, Massa, Défossez, "Hybrid Transformers for Music Source Separation," arXiv:2211.08553
Checkpoint: htdemucs_ft (fine-tuned), vocal branch
Sample rate: 44100 Hz, stereo
Stems produced: vocals (derive instrumental as mixture - vocals)
Precision: fp16 (573 tensors)
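
The instrumental derivation noted under "Stems produced" is a sample-wise subtraction. A minimal sketch on plain Swift arrays follows; the helper name `instrumental(mixture:vocals:)` is hypothetical and not part of the Swift port, and both buffers are assumed to share the same channel layout and length.

```swift
/// Hypothetical helper: subtracts the separated vocals from the mixture, sample by sample.
/// Both buffers are assumed to use the same channel layout and hold the same number of samples.
func instrumental(mixture: [Float], vocals: [Float]) -> [Float] {
    precondition(mixture.count == vocals.count, "mixture and vocals must have equal length")
    return zip(mixture, vocals).map { $0 - $1 }
}

// Toy values chosen to be exactly representable in Float.
let toyMixture: [Float] = [0.5, -0.25, 0.75, 0.0]
let toyVocals: [Float] = [0.25, -0.25, 0.5, 0.125]
print(instrumental(mixture: toyMixture, vocals: toyVocals)) // [0.25, 0.0, 0.25, -0.125]
```

In practice the same subtraction would more likely be done on the array type returned by the separator rather than on `[Float]`; that type is not stated in this card.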

Files

htdemucs_ft_vocals.safetensors — the MLX weights (fp16).
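
To check the tensor count and precision of a downloaded file, the safetensors header can be read directly: the file starts with an 8-byte little-endian header length, followed by a JSON header with one entry per tensor. The sketch below counts tensors per dtype. The function name, the error type, and the local path are hypothetical; only the file layout comes from the safetensors format.

```swift
import Foundation

enum SafetensorsError: Error { case truncated, badHeader }

/// Counts tensors per dtype by parsing the JSON header of a safetensors blob.
func tensorCountsByDtype(in data: Data) throws -> [String: Int] {
    guard data.count >= 8 else { throw SafetensorsError.truncated }
    // Assemble the little-endian UInt64 header length byte by byte.
    var headerLength: UInt64 = 0
    for (i, byte) in data.prefix(8).enumerated() {
        headerLength |= UInt64(byte) << UInt64(8 * i)
    }
    guard headerLength <= UInt64(data.count - 8) else { throw SafetensorsError.truncated }
    let start = data.startIndex + 8
    let headerData = data.subdata(in: start ..< start + Int(headerLength))
    guard let header = try JSONSerialization.jsonObject(with: headerData) as? [String: Any] else {
        throw SafetensorsError.badHeader
    }
    var counts: [String: Int] = [:]
    // "__metadata__" is an optional free-form entry, not a tensor.
    for (name, value) in header where name != "__metadata__" {
        if let entry = value as? [String: Any], let dtype = entry["dtype"] as? String {
            counts[dtype, default: 0] += 1
        }
    }
    return counts
}

// Hypothetical local path; point it at the downloaded weights file.
let weightsURL = URL(fileURLWithPath: "htdemucs_ft_vocals.safetensors")
if let data = try? Data(contentsOf: weightsURL, options: .mappedIfSafe),
   let counts = try? tensorCountsByDtype(in: data) {
    print(counts)
}
```

If the figures in this card are accurate, the result for this file should be a single `F16` entry with a count of 573.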

Usage (Swift / MLX)

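import SwiftDemucs
import Hub

// Download a snapshot of the weights from the Hugging Face Hub; `dir` is the local directory.
let dir = try await HubApi().snapshot(from: "mlx-community/htdemucs-ft-vocals-mlx")

// Load the MLX weights from the downloaded directory.
let separator = try await VocalSeparator(weightsDirectory: dir)

// `mixture` is the caller-supplied input audio (not defined in this snippet).
let vocals = try await separator.separate(samples: mixture)   // [1, 2, N] @ 44.1 kHz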
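
The `[1, 2, N]` shape suggests planar (channel-major) stereo, whereas decoded audio often arrives interleaved as L0, R0, L1, R1, and so on. A minimal sketch of the conversion on plain Swift arrays follows, assuming the planar reading of the shape is correct. `deinterleaveStereo` is a hypothetical helper, not part of the Swift port, and wrapping the result in whatever array type `separate(samples:)` accepts is left out because that type is not stated here.

```swift
/// Hypothetical helper: splits interleaved stereo (L0, R0, L1, R1, ...) into planar
/// channels, returned as [left, right], each holding N samples.
func deinterleaveStereo(_ interleaved: [Float]) -> [[Float]] {
    precondition(interleaved.count % 2 == 0, "stereo input needs an even sample count")
    let frames = interleaved.count / 2
    var left = [Float](repeating: 0, count: frames)
    var right = [Float](repeating: 0, count: frames)
    for frame in 0..<frames {
        left[frame] = interleaved[2 * frame]
        right[frame] = interleaved[2 * frame + 1]
    }
    return [left, right]
}

let planar = deinterleaveStereo([0.5, -0.5, 0.25, -0.25, 0.125, -0.125])
print(planar[0]) // left channel samples
print(planar[1]) // right channel samples
```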

Source

Original repository: https://github.com/facebookresearch/demucs
Swift consumer: https://github.com/xocialize/demucs-mlx-swift

License

MIT — both the Demucs model weights (Meta) and the MLX port code are MIT-licensed.


Paper

Hybrid Transformers for Music Source Separation, arXiv:2211.08553, published Nov 15, 2022