---
license: apache-2.0
language:
- en
- zh
tags:
- singing-voice-synthesis
- singing-voice-conversion
- svs
- svc
- zero-shot
- text-to-audio
- music
pipeline_tag: text-to-speech
---

# SoulX-Singer Models (Safetensors Mirror)

Safetensors conversion of the [Soul-AILab/SoulX-Singer](https://huggingface.co/Soul-AILab/SoulX-Singer) weights for use in the [MAESTRO AI Workstation](https://github.com/AEmotionStudio/Maestraea).

## Models

| Path | Size | Description |
|------|------|-------------|
| `svs/model.safetensors` | ~2.82 GB | Singing Voice Synthesis (lyrics + MIDI → singing) |
| `svc/model.safetensors` | ~2.79 GB | Singing Voice Conversion (audio-to-audio) |
| `config.yaml` | 579 B | Model architecture configuration |
| `phone_set.json` | ~30 KB | Phoneme mapping for SVS |

## Architecture

- Flow-matching based (F5-TTS foundation)
- 22-layer transformer with 1024 hidden size, 16 heads
- 128-dim mel spectrogram, 24 kHz output
- Trained on 42,000+ hours of aligned vocals (Mandarin, English, Cantonese)

## License

Apache 2.0
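## Inspecting the Weights

The checkpoints listed under Models are several gigabytes each, so it can be handy to check tensor names and shapes without loading them. The sketch below is not part of the upstream project. It relies only on the published safetensors file layout: an 8-byte little-endian header length followed by a JSON header. The function names and the local path `svs/model.safetensors` are assumptions for illustration; adjust the path to wherever the file was downloaded.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Parse only the JSON header of a .safetensors file (no tensor data is read)."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned little-endian 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize(header):
    """Return (tensor_count, total_parameter_count) for a parsed header."""
    total = 0
    for info in header.values():
        n = 1
        for dim in info["shape"]:
            n *= dim
        total += n
    return len(header), total


# Assumed local download location (hypothetical); skipped if the file is absent.
checkpoint = Path("svs/model.safetensors")
if checkpoint.exists():
    tensors, params = summarize(read_safetensors_header(checkpoint))
    print(f"{checkpoint}: {tensors} tensors, {params:,} parameters")
```

Because only the header is read, this runs in well under a second even on the multi-gigabyte files, and the same two functions work for `svc/model.safetensors`.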