Fine-tuned Orpheus-3B: German Emotional Speech Synthesis
This repository contains a fine-tuned version of the Orpheus-3B model, specialized for German speech synthesis with support for emotional cues and non-verbal audio tokens.
The model was fine-tuned using LoRA (Low-Rank Adaptation) on a curated German dataset containing high-quality audio with diverse emotional expressions and non-verbal cues.
Key Highlights
- 54.2% WER Improvement: Reduced Word Error Rate on emotional prompts from 0.7046 (Base) to 0.3226 (Fine-tuned).
- 37.1% CER Improvement: Reduced Character Error Rate from 0.5471 (Base) to 0.3440 (Fine-tuned).
- Architecture: Orpheus-3B with LoRA Adapters.
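The percentages above are relative reductions, and they can be checked directly from the reported error rates. A minimal sketch (the helper name `relative_reduction` is illustrative and not part of this repository):

```python
def relative_reduction(base: float, tuned: float) -> float:
    """Return the relative error reduction in percent."""
    return (base - tuned) / base * 100.0


# Error rates as reported in this model card.
wer_gain = relative_reduction(0.7046, 0.3226)
cer_gain = relative_reduction(0.5471, 0.3440)

print(f"WER reduction: {wer_gain:.1f}%")
print(f"CER reduction: {cer_gain:.1f}%")
```

Rounded to one decimal place, this reproduces the 54.2% and 37.1% figures quoted above.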
Supported Tags
The model has been fine-tuned on Dataset_eleven_v3 and supports a wide range of emotional and paralinguistic tags. Use square brackets [tag] for inference:
- Emotions: [happy], [angry], [sad], [thoughtful], [neutral], [sleepy], [whisper], [worried], [annoyed], [surprised], [fearful], [contemptuous], [disgusted]
- Paralinguistic Tokens: [sighs], [laughter], [cry], [growl], [sob], [cheer], [breath], [pause], [grit], [snarl], [exhales sharply], [grits teeth], [breathes heavily], [exclaims], [hush], [soft], [quiet], [softbreath], [hm], [yawn], [mumble], [slowbreath], [ugh], [ew], [scoff], [snort], [tremble], [shaky_breath], [sigh], [nervous_laugh], [chuckles], [short pause], [sniffles], [inhales deeply]
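Because the model only saw the tags listed above during fine-tuning, it can help to check a prompt for misspelled or unlisted tags before synthesis. The sketch below is one possible way to do that; `SUPPORTED_TAGS` and `find_unsupported_tags` are hypothetical names introduced here, and the tag set is copied from the list above.

```python
import re

# Tags copied from the "Supported Tags" list in this model card.
SUPPORTED_TAGS = {
    # Emotions
    "happy", "angry", "sad", "thoughtful", "neutral", "sleepy", "whisper",
    "worried", "annoyed", "surprised", "fearful", "contemptuous", "disgusted",
    # Paralinguistic tokens
    "sighs", "laughter", "cry", "growl", "sob", "cheer", "breath", "pause",
    "grit", "snarl", "exhales sharply", "grits teeth", "breathes heavily",
    "exclaims", "hush", "soft", "quiet", "softbreath", "hm", "yawn", "mumble",
    "slowbreath", "ugh", "ew", "scoff", "snort", "tremble", "shaky_breath",
    "sigh", "nervous_laugh", "chuckles", "short pause", "sniffles",
    "inhales deeply",
}

# Matches the text between one pair of square brackets, e.g. "[short pause]".
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def find_unsupported_tags(text: str) -> list[str]:
    """Return bracketed tags in `text` that are not in SUPPORTED_TAGS."""
    return [tag for tag in TAG_PATTERN.findall(text) if tag not in SUPPORTED_TAGS]


prompt = "[happy] Guten Morgen! [laughing] [short pause] Wie geht es dir?"
unknown = find_unsupported_tags(prompt)
print(unknown)
```

Here `[laughing]` is flagged because the list contains `[laughter]` instead. The check is case-sensitive and exact, on the assumption that tags must match the training spelling.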
Training Details
The model was trained using the following optimal parameters:
- Learning Rate: 0.0008
- LoRA Rank (R): 32
- LoRA Alpha: 32
- Precision: 4-bit (bitsandbytes/unsloth)
- Framework: Unsloth 2024.12
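In the standard LoRA formulation the low-rank update is scaled by alpha / r, so the rank and alpha listed above give a scaling factor of 1.0. The sketch below works through that arithmetic and the adapter size for a single projection matrix. The hidden size of 3072 is an assumption based on the Llama-3.2-3B configuration, and which layers were actually adapted is not stated in this card.

```python
LORA_RANK = 32    # from the training details above
LORA_ALPHA = 32   # from the training details above
HIDDEN_SIZE = 3072  # assumption: hidden size of Llama-3.2-3B


def lora_scaling(alpha: int, rank: int) -> float:
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_ALPHA, LORA_RANK)
adapter_params = lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
full_params = HIDDEN_SIZE * HIDDEN_SIZE

print(f"scaling = {scaling}")
print(f"adapter params per square projection = {adapter_params}")
print(f"fraction of the full matrix = {adapter_params / full_params:.4f}")
```

Under these assumptions, each adapted square projection adds about 2% of the parameters of the frozen weight matrix it modifies.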
Inference Example
To use this model, you need to load the base Orpheus-3B model and apply these LoRA adapters.
from unsloth import FastLanguageModel
import torch
# Load base model and LoRA adapters
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "Vishalshendge3198/orpheus-3b-tts-german-emotional",  # This repo
    max_seq_length = 2048,
    dtype = None,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)
# Example prompt with emotional tags in square brackets
text = "[happy][laughter] Das ist ja großartig! Ich freue mich so sehr. [cheer]"
# ... (standard Orpheus inference code follows)
Performance
| Metric | Base Model (3B) | Fine-tuned (German) | Improvement |
|---|---|---|---|
| Avg WER | 0.7046 | 0.3226 | 54.2% |
| Avg CER | 0.5471 | 0.3440 | 37.1% |
| Emotional Prosody | Basic | Advanced | High |
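This card does not state how WER and CER were measured. For reference, the sketch below shows the conventional definition of both metrics: edit distance between a reference and a hypothesis transcript, divided by the reference length. It is an illustration of the metrics, not the evaluation script used for the numbers above, and the function names are hypothetical.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions) between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


print(wer("das ist ja großartig", "das ist großartig"))  # one deleted word out of four
print(cer("haus", "maus"))                               # one substituted character out of four
```

Both metrics can exceed 1.0 when the hypothesis contains many insertions, so lower is better and 0.0 means an exact match.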
Credits
Developed by Vishal Shendge as part of a German TTS fine-tuning project using the Orpheus-3B architecture. Special thanks to the Unsloth team for providing the optimization framework.
Model tree for Vishalshendge3198/orpheus-3b-tts-german-emotional
Base model
meta-llama/Llama-3.2-3B-Instruct