Fine-tuned Spark-TTS: German Emotional Speech Synthesis

This repository contains LoRA adapters for a fine-tuned version of the Spark-TTS (0.5B) model, specialized for German speech synthesis with support for emotional style tags and non-verbal audio tokens.

The model was fine-tuned using LoRA (Low-Rank Adaptation) on a curated German dataset containing high-quality audio with diverse emotional expressions and non-verbal cues.

🚀 Key Highlights

  • 57.14% Loss Improvement: Reduced test loss from 10.0074 (Base) to 4.2891 (Fine-tuned).
  • Emotional Support: Handles stylistic tags like [happy], [angry], and [thoughtful].
  • Non-Verbal Tokens: Accurately synthesizes non-speech sounds like [sighs], [laughter], [yawn], and [growl].
  • Architecture: Spark-TTS (0.5B) with LoRA Adapters.
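The bracketed tags listed above can be checked in an input text before synthesis. The following is a minimal sketch, assuming tags are written inline as lowercase words in square brackets; `find_tags` and the two tag sets are hypothetical helpers built only from the tags named in this card, not part of Spark-TTS or this repository.

```python
import re

# Tags named in this model card; the full set the model supports is not listed here.
EMOTION_TAGS = {"happy", "angry", "thoughtful"}
NONVERBAL_TAGS = {"sighs", "laughter", "yawn", "growl"}

# Assumption: a tag is a lowercase word wrapped in square brackets.
TAG_PATTERN = re.compile(r"\[([a-z]+)\]")


def find_tags(text):
    """Return (known, unknown) lists of bracketed tags found in the text, in order."""
    known, unknown = [], []
    for name in TAG_PATTERN.findall(text):
        if name in EMOTION_TAGS or name in NONVERBAL_TAGS:
            known.append(name)
        else:
            unknown.append(name)
    return known, unknown


known, unknown = find_tags("[happy] Guten Morgen! [laughter] Wie geht es dir? [whisper]")
print(known)    # ['happy', 'laughter']
print(unknown)  # ['whisper']
```

Flagging unknown tags up front may help catch typos such as `[laugther]`, which would otherwise be passed to the model as ordinary text.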

πŸ‹οΈ Training Details

The model was trained using the following optimal parameters:

  • Learning Rate: 0.0005
  • LoRA Rank (R): 64
  • LoRA Alpha: 64
  • Precision: 4-bit (bitsandbytes)
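To make the rank and alpha values concrete, here is a small sketch of two quantities they determine in standard LoRA: the scaling factor applied to the low-rank update (alpha / rank) and the number of trainable parameters added per adapted linear layer. The layer dimensions used here (896 x 896) are an assumption for illustration only; this card does not state which modules were adapted or their sizes.

```python
def lora_scaling(rank, alpha):
    """Standard LoRA scaling factor applied to the low-rank update: alpha / rank."""
    return alpha / rank


def lora_added_params(d_in, d_out, rank):
    """Trainable parameters added to one linear layer: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


RANK, ALPHA = 64, 64   # values from the list above
D_IN = D_OUT = 896     # assumed layer size, for illustration only

print(lora_scaling(RANK, ALPHA))             # 1.0
print(lora_added_params(D_IN, D_OUT, RANK))  # 114688
```

With rank equal to alpha, the update is applied at a scale of 1.0, so the learning rate alone controls how strongly the adapters move the base weights.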

🔊 Inference Example

To use this model, you need to load the base Spark-TTS 0.5B model and apply these LoRA adapters.

from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load base model (ensure you have the Spark-TTS architecture code)
base_model = AutoModelForCausalLM.from_pretrained("SparkAudio/Spark-TTS-0.5B", trust_remote_code=True)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "Vishalshendge3198/spark-tts-german-emotional")

📊 Performance

| Metric | Base Model (0.5B) | Fine-tuned (German) | Improvement |
|---|---|---|---|
| Test Loss | 10.0074 | 4.2891 | 57.14% |
| German Prosody | Basic | Advanced | High |
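The reported improvement follows from the two test-loss values as a relative reduction. A minimal sketch of that arithmetic, where `relative_improvement` is a hypothetical helper name:

```python
def relative_improvement(base_loss, tuned_loss):
    """Percentage reduction of the fine-tuned loss relative to the base loss."""
    return (base_loss - tuned_loss) / base_loss * 100.0


improvement = relative_improvement(10.0074, 4.2891)
print(f"{improvement:.2f}%")  # 57.14%
```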

📜 Credits

Developed as part of a German TTS fine-tuning project using the Spark-TTS architecture by SparkAudio.
