license: apache-2.0
language:
  - en
  - zh
tags:
  - singing-voice-synthesis
  - singing-voice-conversion
  - svs
  - svc
  - zero-shot
  - text-to-audio
  - music
pipeline_tag: text-to-speech

SoulX-Singer Models (Safetensors Mirror)

Safetensors conversion of Soul-AILab/SoulX-Singer weights for use in the MAESTRO AI Workstation.

Models

| Path | Size | Description |
|---|---|---|
| svs/model.safetensors | ~2.82 GB | Singing Voice Synthesis (lyrics+MIDI → singing) |
| svc/model.safetensors | ~2.79 GB | Singing Voice Conversion (audio-to-audio) |
| config.yaml | 579 B | Model architecture configuration |
| phone_set.json | ~30 KB | Phoneme mapping for SVS |
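Because each checkpoint is close to 3 GB, it can be handy to look at tensor names and shapes without loading any weights. The sketch below relies only on the published safetensors file layout (an 8-byte little-endian header length followed by a JSON header) and the Python standard library. The helper names `read_safetensors_header` and `summarize` are illustrative and not part of this repository or of MAESTRO.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without reading tensor data."""
    with Path(path).open("rb") as f:
        # The first 8 bytes hold the header length as a little-endian uint64.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional string-to-string map, not a tensor entry.
    header.pop("__metadata__", None)
    return header


def summarize(header):
    """Map each tensor name to a (dtype, shape) pair."""
    return {name: (info["dtype"], tuple(info["shape"])) for name, info in header.items()}
```

Assuming the files were downloaded to the current directory with the layout from the table, `summarize(read_safetensors_header("svs/model.safetensors"))` would list every tensor in the SVS checkpoint.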

Architecture

  • Flow-matching based (F5-TTS foundation)
  • 22-layer transformer with 1024 hidden size, 16 heads
  • 128-dim mel spectrogram, 24 kHz output
  • Trained on 42,000+ hours of aligned vocals (Mandarin, English, Cantonese)
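As a quick sanity check on the numbers above, the listed dimensions can be collected in a small structure and the per-head width derived from them. This is only a sketch: the class and field names are hypothetical and are not guaranteed to match the keys in config.yaml.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerShape:
    # Values come from the architecture list; field names are hypothetical
    # and may differ from the keys used in config.yaml.
    num_layers: int = 22
    hidden_size: int = 1024
    num_heads: int = 16
    mel_dim: int = 128
    sample_rate_hz: int = 24_000

    def __post_init__(self):
        # Multi-head attention splits the hidden size evenly across heads.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")

    @property
    def head_dim(self) -> int:
        return self.hidden_size // self.num_heads


shape = TransformerShape()
print(shape.head_dim)  # 64
```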

License

Apache 2.0