| base_model: unsloth/csm-1b | |
| pipeline_tag: text-to-speech | |
| tags: | |
| - base_model:adapter:unsloth/csm-1b | |
| - lora | |
| - transformers | |
| - unsloth | |
| license: apache-2.0 | |
| language: | |
| - el | |
| new_version: moiraai2024/GreekTTS-1.5 | |
| # Description | |
| Website: https://moira-ai.com/ | |
| Email: moira.ai2024@gmail.com | |
| Report: https://moiraai2024.github.io/GreekTTS-demo/ | |
| Welcome to Moira.AI GreekTTS, a state-of-the-art text-to-speech model fine-tuned specifically for Greek speech synthesis! It is a LoRA adapter for the sesame/csm-1b model (loaded in the code below as unsloth/csm-1b), trained on Greek speech data to provide high-quality, natural-sounding speech generation. | |
| Moira.AI excels in delivering lifelike, expressive speech, making it ideal for a wide range of applications, including virtual assistants, audiobooks, accessibility tools, and more. By leveraging the power of large-scale transformer-based models, Moira.AI ensures fluid prosody and accurate pronunciation of Greek text. | |
| Key Features: | |
| - Fine-tuned specifically for Greek TTS. | |
| - Built on the robust sesame/csm-1b model, ensuring high-quality performance. | |
| - Capable of generating natural-sounding, expressive Greek speech. | |
| - Ideal for integration into applications requiring high-quality, human-like text-to-speech synthesis in Greek. | |
| **Explore the model and see how it can enhance your Greek TTS applications!** | |
| # How to use it | |
| First create a conda environment for Unsloth, following the Unsloth conda install guide: https://docs.unsloth.ai/get-started/install-and-update/conda-install | |
| ```bash | |
| conda create --name unsloth_env \ | |
| python=3.11 \ | |
| pytorch-cuda=12.1 \ | |
| pytorch cudatoolkit xformers -c pytorch -c nvidia -c xformers \ | |
| -y | |
| ``` | |
| ```bash | |
| conda activate unsloth_env | |
| pip install unsloth | |
| ``` | |
| ```python | |
| from unsloth import FastModel  # import Unsloth first so it can patch transformers | |
| from transformers import CsmForConditionalGeneration | |
| from peft import PeftModel | |
| import torch | |
| # Optional: report the GPU in use and the memory already reserved | |
| gpu_stats = torch.cuda.get_device_properties(0) | |
| start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3) | |
| max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3) | |
| print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.") | |
| print(f"{start_gpu_memory} GB of memory reserved.") | |
| # --- 1. Load the Base Unsloth Model and Processor --- | |
| # This setup must be identical to your training script. | |
| print("Loading the base model and processor...") | |
| model, processor = FastModel.from_pretrained( | |
| model_name = "unsloth/csm-1b", | |
| max_seq_length = 2048, | |
| dtype = None, | |
| auto_model = CsmForConditionalGeneration, | |
| load_in_4bit = False, | |
| ) | |
| # --- 2. Identify and Load Your Best LoRA Checkpoint --- | |
| # !!! IMPORTANT: Change this path to your best checkpoint folder !!! | |
| # (The one you found in trainer_state.json) | |
| final_int = 94_764  # training step of the checkpoint to load | |
| best_checkpoint_path = f"./training_outputs_second_run/checkpoint-{final_int}" | |
| print(f"\nLoading the LoRA adapter from: {best_checkpoint_path}") | |
| # This attaches the trained adapter weights to the base model. The weights are not | |
| # merged; call model.merge_and_unload() afterwards if you need a single merged model. | |
| model = PeftModel.from_pretrained(model, best_checkpoint_path) | |
| print("\nFine-tuned model is ready for inference!") | |
| # Unsloth automatically handles moving the model to the GPU | |
| ``` | |
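The comment above says to pick the checkpoint recorded in `trainer_state.json`. A minimal sketch of doing that lookup in code is shown here. It assumes the file was written by the Hugging Face `Trainer`, which stores a `best_model_checkpoint` field (possibly null) and a `log_history` list. The function name `best_checkpoint` and the fallback to the lowest `eval_loss` are illustrative choices, not part of the original training script.

```python
import json
from pathlib import Path


def best_checkpoint(trainer_state_path):
    """Return a checkpoint path suggested by a trainer_state.json file, or None.

    Hypothetical helper: prefers the Trainer's own `best_model_checkpoint`
    field and otherwise falls back to the step with the lowest `eval_loss`.
    """
    state_path = Path(trainer_state_path)
    state = json.loads(state_path.read_text(encoding="utf-8"))

    # The Trainer fills this in only when it tracks a "best" metric.
    recorded = state.get("best_model_checkpoint")
    if recorded:
        return recorded

    # Fallback: scan the logged evaluations for the lowest eval_loss.
    evals = [
        entry
        for entry in state.get("log_history", [])
        if "eval_loss" in entry and "step" in entry
    ]
    if not evals:
        return None
    best_step = min(evals, key=lambda entry: entry["eval_loss"])["step"]

    # Assumes checkpoint folders are siblings inside the output directory
    # and that a checkpoint was actually saved at that step.
    return str(state_path.parent.parent / f"checkpoint-{best_step}")
```

The returned path can then be passed to `PeftModel.from_pretrained` in place of the hard-coded `best_checkpoint_path`, after confirming that the folder exists.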
| ```python | |
| from transformers import AutoProcessor | |
| processor = AutoProcessor.from_pretrained("unsloth/csm-1b") | |
| ``` | |
| ```python | |
| greek_sentences = [ | |
| "Σου μιλάααανε!", | |
| "Γεια σας, είμαι η Μίρα και σήμερα θα κάνουμε μάθημα Ελληνικων.", | |
| "Ημουν εξω με φιλους και τα επινα. Μου αρεσει πολυ η μπυρα αλφα!", | |
| "Όταν ξανά άνοιξα τα μάτια διαπίστωσα ότι ήμουν ξαπλωμένος σε ένα μαλακό στρώμα από κουβέρτες", | |
| ] | |
| ``` | |
| ```python | |
| from IPython.display import Audio, display | |
| import soundfile as sf | |
| ``` | |
| ```python | |
| # --- Configure the Generation --- | |
| sentence_index = 1  # which entry of greek_sentences to synthesize | |
| text_to_synthesize = greek_sentences[sentence_index] | |
| print(f"\nSynthesizing text: '{text_to_synthesize}'") | |
| speaker_id = 0 | |
| inputs = processor(f"[{speaker_id}]{text_to_synthesize}", add_special_tokens=True).to("cuda") | |
| audio_values = model.generate( | |
| **inputs, | |
| max_new_tokens=125, # 125 tokens is 10 seconds of audio, for longer speech increase this | |
| # play with these parameters to tweak results | |
| # depth_decoder_top_k=0, | |
| # depth_decoder_top_p=0.9, | |
| # depth_decoder_do_sample=True, | |
| # depth_decoder_temperature=0.9, | |
| # top_k=0, | |
| # top_p=1.0, | |
| # temperature=0.9, | |
| # do_sample=True, | |
| ######################################################### | |
| output_audio=True | |
| ) | |
| ``` | |
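The generation call above notes that 125 tokens correspond to about 10 seconds of audio and that the prompt takes the form `[speaker_id]text`. The sketch below turns those two conventions into small helpers. The names `max_new_tokens_for` and `build_prompt` are hypothetical, and the linear scaling is an assumption extrapolated from the single ratio quoted in the comment.

```python
import math

# Ratio taken from the comment "125 tokens is 10 seconds of audio".
TOKENS_PER_SECOND = 125 / 10


def max_new_tokens_for(seconds):
    """Estimate a max_new_tokens budget for a target audio length in seconds."""
    return math.ceil(seconds * TOKENS_PER_SECOND)


def build_prompt(text, speaker_id=0):
    """Prefix the text with the bracketed speaker ID used in the example above."""
    return f"[{speaker_id}]{text}"


if __name__ == "__main__":
    print(max_new_tokens_for(20))
    print(build_prompt("Γεια σας", speaker_id=0))
```

For longer sentences, `max_new_tokens_for(20)` could be passed as `max_new_tokens` in place of the fixed 125. Treat the result as a starting point, since the right budget depends on speaking rate.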
| ```python | |
| audio = audio_values[0].to(torch.float32).cpu().numpy() | |
| sf.write("example_without_context.wav", audio, 24000) | |
| display(Audio(audio, rate=24000)) | |
| ``` | |
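After saving, it can be useful to confirm that the file has the expected sample rate and a plausible duration. The following sketch uses only the standard-library `wave` module. It assumes that `soundfile` wrote the `.wav` file as 16-bit PCM, which is its default for WAV, and the helper name `wav_info` is hypothetical.

```python
import wave


def wav_info(path):
    """Return (duration_in_seconds, sample_rate) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav_file:
        frame_count = wav_file.getnframes()
        sample_rate = wav_file.getframerate()
    return frame_count / sample_rate, sample_rate


# Example usage, once example_without_context.wav has been written:
# duration, rate = wav_info("example_without_context.wav")
# print(f"{duration:.2f} s at {rate} Hz")
```

A duration close to the `max_new_tokens` budget may indicate that generation was cut off before the sentence finished.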
| # 📖 How to Cite This Model | |
| ``` | |
| @misc{moira2025greektts15, | |
| title = {GreekTTS-1.0: A State-of-the-Art System for Greek Text-to-Speech Synthesis}, | |
| author = {Moira.AI}, | |
| year = {2025}, | |
| month = {sep}, | |
| day = {22}, | |
| url = {https://moira-ai.com/}, | |
| note = {Demo report: https://moiraai2024.github.io/GreekTTS-demo/} | |
| } | |
| ``` |