Instructions to use yapwithai/kyutai-stt-1b-en_fr with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Moshi
How to use yapwithai/kyutai-stt-1b-en_fr with Moshi:
    # pip install moshi

    # Run the interactive web server
    python -m moshi.server --hf-repo "yapwithai/kyutai-stt-1b-en_fr"
    # Then open https://localhost:8998 in your browser

    # pip install moshi
    import torch
    from moshi.models import loaders

    # Load checkpoint info from HuggingFace
    checkpoint = loaders.CheckpointInfo.from_hf_repo("yapwithai/kyutai-stt-1b-en_fr")

    # Load the Mimi audio codec
    mimi = checkpoint.get_mimi(device="cuda")
    mimi.set_num_codebooks(8)

    # Encode audio (24kHz, mono)
    wav = torch.randn(1, 1, 24000 * 10)  # [batch, channels, samples]
    with torch.no_grad():
        codes = mimi.encode(wav.cuda())
        decoded = mimi.decode(codes)
- Notebooks
- Google Colab
- Kaggle
Update config.json
config.json CHANGED (+4 -0)
@@ -69,6 +69,10 @@
     "top_k": 250,
     "top_k_text": 50
   },
+  "stt_config": {
+    "audio_delay_seconds": 0.5,
+    "audio_silence_prefix_seconds": 0.0
+  },
   "model_type": "stt",
   "mimi_name": "mimi-pytorch-e351c8d8@125.safetensors",
   "tokenizer_name": "tokenizer_en_fr_audio_8000.model"
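
The `stt_config` block added by this change holds two durations in seconds, `audio_delay_seconds` and `audio_silence_prefix_seconds`. The sketch below shows one way a caller might read them and convert them to sample and frame counts. It is illustrative only: the helper `stt_padding` is hypothetical, the 0.0 fallbacks for missing keys and the round-up to whole frames are assumptions, and the 12.5 Hz frame rate is assumed for the Mimi codec rather than taken from this diff. The 24 kHz sample rate comes from the usage snippet on this page.

```python
import json
import math

# Fragment of config.json as it looks after this change (other keys omitted).
CONFIG_FRAGMENT = """
{
  "stt_config": {
    "audio_delay_seconds": 0.5,
    "audio_silence_prefix_seconds": 0.0
  },
  "model_type": "stt"
}
"""

SAMPLE_RATE_HZ = 24_000  # from the usage snippet: "Encode audio (24kHz, mono)"
FRAME_RATE_HZ = 12.5     # assumption: Mimi codec frame rate, not stated in the diff


def stt_padding(config, sample_rate=SAMPLE_RATE_HZ, frame_rate=FRAME_RATE_HZ):
    """Hypothetical helper: turn the stt_config durations into counts.

    Missing keys fall back to 0.0, which is an assumption of this sketch
    and not necessarily the default used by any library.
    """
    stt = config.get("stt_config", {})
    delay = float(stt.get("audio_delay_seconds", 0.0))
    prefix = float(stt.get("audio_silence_prefix_seconds", 0.0))
    return {
        "delay_samples": round(delay * sample_rate),
        "prefix_samples": round(prefix * sample_rate),
        # Round up so a partial frame still counts as a whole one (assumption).
        "delay_frames": math.ceil(delay * frame_rate),
        "prefix_frames": math.ceil(prefix * frame_rate),
    }


padding = stt_padding(json.loads(CONFIG_FRAGMENT))
print(padding)
```

Reading the values with `dict.get` keeps the sketch usable against a config.json from before this commit, where `stt_config` is absent.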