Svara TTS v1 (FP8 Quantized)

FP8-quantized version of Svara TTS for lower VRAM usage (~6-8GB).

Original Developer: Kenpath
Base Model: kenpath/svara-tts-v1

Features

  • 12 Indian Languages: Hindi, English, Telugu, Tamil, Marathi, Bengali, Kannada, Malayalam, Gujarati, Punjabi, Odia, Assamese
  • Male/Female Voices: male and female voice options, selected through the voice ID
  • Low VRAM: ~6-8GB memory usage (FP8)
  • OpenAI-Compatible: /v1/audio/speech endpoint

Supported Languages

Language     Code
Hindi        hi
English      en
Telugu       te
Tamil        ta
Marathi      mr
Bengali      bn
Kannada      kn
Malayalam    ml
Gujarati     gu
Punjabi      pa
Odia         or
Assamese     as

Voice Format

Voice IDs: {language_code}_{gender}

Examples: hi_male, ta_female, en_male, bn_female
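
The voice ID format can be sketched as a small helper that joins a language code from the table above with a gender. `make_voice_id`, `LANGUAGE_CODES`, and `GENDERS` are hypothetical names used for illustration and are not part of the model repository.

```python
# Language codes copied from the Supported Languages table.
LANGUAGE_CODES = {
    "Hindi": "hi", "English": "en", "Telugu": "te", "Tamil": "ta",
    "Marathi": "mr", "Bengali": "bn", "Kannada": "kn", "Malayalam": "ml",
    "Gujarati": "gu", "Punjabi": "pa", "Odia": "or", "Assamese": "as",
}
GENDERS = ("male", "female")


def make_voice_id(language: str, gender: str) -> str:
    """Build a voice ID such as "hi_male" (hypothetical helper)."""
    # Accept either a language name ("Hindi") or a code ("hi").
    cleaned = language.strip()
    code = LANGUAGE_CODES.get(cleaned.title(), cleaned.lower())
    if code not in LANGUAGE_CODES.values():
        raise ValueError(f"Unsupported language: {language!r}")
    gender = gender.strip().lower()
    if gender not in GENDERS:
        raise ValueError(f"Unsupported gender: {gender!r}")
    return f"{code}_{gender}"
```

For example, `make_voice_id("Tamil", "female")` and `make_voice_id("ta", "female")` both produce `ta_female`. Validating up front avoids sending a malformed voice ID to the model or server.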

Usage

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "srikantp/svara-tts-v1-fp8"

# Load the weights; device_map="auto" places them on the available GPU(s)
# (this option requires the accelerate package).
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

Requirements

  • NVIDIA GPU with 8GB+ VRAM
  • CUDA drivers
  • Python packages: transformers, torch, and accelerate (needed for device_map="auto")
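
As a rough pre-flight check for the requirements above, the sketch below reports whether the first CUDA device has at least 8 GB of memory. `meets_vram_requirement` and `check_gpu` are hypothetical helper names. Comparing in decimal gigabytes is an assumption made here so that cards marketed as 8 GB pass the check.

```python
def meets_vram_requirement(total_bytes: int, min_gb: float = 8.0) -> bool:
    """Return True if total GPU memory is at least min_gb (decimal gigabytes)."""
    return total_bytes / 1e9 >= min_gb


def check_gpu() -> str:
    """Describe whether the first CUDA device satisfies the 8GB+ requirement."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        return "no CUDA device visible (check the NVIDIA driver)"
    props = torch.cuda.get_device_properties(0)
    ok = meets_vram_requirement(props.total_memory)
    status = "OK" if ok else "below 8 GB"
    return f"{props.name}: {props.total_memory / 1e9:.1f} GB, {status}"


if __name__ == "__main__":
    print(check_gpu())
```

This only inspects total memory. Other processes already using the GPU reduce what is actually available for the model.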

Inference Code

For the inference code, server, and web UI, see svara-tts-inference-fp8.
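
A request to the OpenAI-compatible /v1/audio/speech endpoint listed under Features might look like the sketch below. It uses the OpenAI-style field names `model`, `input`, and `voice`. The base URL, port, and model name are assumptions, so check the server's own documentation for the actual values. `build_speech_request` and `synthesize` are hypothetical helpers.

```python
import json
import urllib.request

# Assumption: a locally running server; adjust host and port to your setup.
BASE_URL = "http://localhost:8000"


def build_speech_request(text: str, voice: str,
                         base_url: str = BASE_URL) -> urllib.request.Request:
    """Prepare a POST to /v1/audio/speech with OpenAI-style JSON fields."""
    payload = {
        "model": "svara-tts-v1-fp8",  # assumed name; the server may expect another value
        "input": text,
        "voice": voice,  # e.g. "hi_female", see Voice Format
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, voice: str, out_path: str) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(build_speech_request(text, voice)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
```

Calling `synthesize("Hello", "en_male", "hello_audio")` would save whatever audio bytes the server returns. The audio container format depends on the server, so the file extension is left to the caller.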

License

Apache 2.0 - See LICENSE file.

Citation

@misc{svara-tts-v1-fp8,
  title={Svara TTS v1 FP8},
  author={srikantp},
  year={2025},
  url={https://huggingface.co/srikantp/svara-tts-v1-fp8}
}