
Hinglish TTS 3B Model

This is a fine-tuned version of canopylabs/3b-hi-pretrain-research_release specialized for Hinglish (Hindi-English mixed) text-to-speech generation.

Model Details

  • Base Model: canopylabs/3b-hi-pretrain-research_release
  • Fine-tuning Method: LoRA with Unsloth (merged)
  • Languages: Hindi, English, Hinglish
  • Task: Text-to-Speech via audio token generation
  • Model Size: ~3B parameters

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "Itsharshi/tts_300"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Generate audio tokens (not plain text) from the prompt
prompt = "Hello doston, main aapka dost hun"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)  # keep inputs on the model's device
outputs = model.generate(**inputs, max_new_tokens=1200)

Fine-tuning Details

  • LoRA Rank: 64
  • LoRA Alpha: 64
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Training Framework: Unsloth
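
For reference, the hyperparameters above can be collected in a small config object. This is only a sketch: the `LoraSettings` class is hypothetical and not part of Unsloth or PEFT, although its field names (`r`, `lora_alpha`, `target_modules`) mirror the argument names those libraries commonly use. Dropout, bias, and other training settings are not stated in this card and are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LoraSettings:
    """Hypothetical container for the LoRA values listed in this card."""
    r: int = 64                 # LoRA rank
    lora_alpha: int = 64        # LoRA alpha
    target_modules: List[str] = field(default_factory=lambda: [
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ])

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by lora_alpha / r.
        return self.lora_alpha / self.r


settings = LoraSettings()
print(settings.scaling, settings.target_modules)
```

With rank and alpha both set to 64, the standard LoRA scaling factor works out to 1, so the adapter update is applied without extra up- or down-weighting.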

Audio Generation

This model generates audio tokens, which must be decoded into a waveform with a SNAC (Multi-Scale Neural Audio Codec) model:

from snac import SNAC

# Load SNAC decoder
snac_model = SNAC.from_pretrained("hubertsiuzdak/snac_24khz")

# Process generated tokens to audio codes and decode
# (See full implementation in the original training code)
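
The token-to-code step is not spelled out in this card, so the following is only a rough sketch of what it might look like. It assumes the token layout commonly used with this family of base models: audio token IDs start at 128266, each audio frame is 7 consecutive tokens, each position in a frame is shifted by a multiple of a 4096-entry codebook, and the 7 codes map onto three SNAC layers as 1 + 2 + 4. None of these constants are confirmed by this model card, and `split_into_snac_layers` is a hypothetical helper name; check them against the original training code before relying on them.

```python
from typing import List, Sequence

# All three constants are assumptions, not values confirmed by this model card.
AUDIO_TOKEN_OFFSET = 128266   # assumed ID of the first audio token
CODEBOOK_SIZE = 4096          # assumed entries per SNAC codebook
TOKENS_PER_FRAME = 7          # assumed 1 coarse + 2 mid + 4 fine codes per frame


def split_into_snac_layers(token_ids: Sequence[int]) -> List[List[int]]:
    """Map a flat stream of generated token IDs onto three SNAC code layers."""
    # Keep audio tokens only; prompt and special tokens fall below the offset.
    codes = [t - AUDIO_TOKEN_OFFSET for t in token_ids if t >= AUDIO_TOKEN_OFFSET]
    # Drop an incomplete trailing frame, if any.
    usable = len(codes) - len(codes) % TOKENS_PER_FRAME

    layer_1: List[int] = []
    layer_2: List[int] = []
    layer_3: List[int] = []
    for start in range(0, usable, TOKENS_PER_FRAME):
        # Undo the per-position shift so every code lands in [0, CODEBOOK_SIZE).
        frame = [c - pos * CODEBOOK_SIZE
                 for pos, c in enumerate(codes[start:start + TOKENS_PER_FRAME])]
        layer_1.append(frame[0])
        layer_2.extend([frame[1], frame[4]])
        layer_3.extend([frame[2], frame[3], frame[5], frame[6]])
    return [layer_1, layer_2, layer_3]


# With torch and snac installed, the layers could then be decoded roughly as:
#   layers = split_into_snac_layers(outputs[0].tolist())
#   codes = [torch.tensor(layer).unsqueeze(0) for layer in layers]
#   audio = snac_model.decode(codes)  # waveform tensor at 24 kHz
```

The helper is kept free of torch so the index arithmetic can be checked on plain lists before any model weights are loaded.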

Limitations

  • Requires SNAC model for audio generation
  • Optimized for Hinglish content
  • May not perform well on pure English or pure Hindi in some cases

Citation

If you use this model, please cite the original base model:

@misc{canopylabs-3b-hi,
  title={3B Hindi Pretrained Model},
  author={Canopy Labs},
  year={2024},
  url={https://huggingface.co/canopylabs/3b-hi-pretrain-research_release}
}