Instructions to use kyutai/stt-2.6b-en with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Moshi
How to use kyutai/stt-2.6b-en with Moshi:
```shell
# pip install moshi

# Run the interactive web server
python -m moshi.server --hf-repo "kyutai/stt-2.6b-en"
# Then open https://localhost:8998 in your browser
```

```python
# pip install moshi
import torch
from moshi.models import loaders

# Load checkpoint info from HuggingFace
checkpoint = loaders.CheckpointInfo.from_hf_repo("kyutai/stt-2.6b-en")

# Load the Mimi audio codec
mimi = checkpoint.get_mimi(device="cuda")
mimi.set_num_codebooks(8)

# Encode audio (24kHz, mono)
wav = torch.randn(1, 1, 24000 * 10)  # [batch, channels, samples]
with torch.no_grad():
    codes = mimi.encode(wav.cuda())
    decoded = mimi.decode(codes)
```

- Notebooks
- Google Colab
- Kaggle
transformers usage (#4)
- transformers usage (8577c16c49093b3127d7e111738a5424d131818e)
Co-authored-by: Eustache Le Bihan <eustlb@users.noreply.huggingface.co>
README.md
CHANGED
@@ -9,6 +9,9 @@ tags:
 ---
 # Model Card for Kyutai STT
 
+**Transformers support 🤗:** Starting with `transformers >= 4.53.0` and above, you can now run Kyutai STT natively!
+👉 Check it out here: [kyutai/stt-2.6b-en-trfs](https://huggingface.co/kyutai/stt-2.6b-en-trfs).
+
 See also the [project page](https://kyutai.org/next/stt)
 and the [GitHub repository](https://github.com/kyutai-labs/delayed-streams-modeling/).
 
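The added README note ties native support to `transformers >= 4.53.0`. As a minimal sketch of how that requirement could be checked before trying the `kyutai/stt-2.6b-en-trfs` checkpoint, the snippet below compares the installed version against that minimum using only the standard library. The helper names `parse_release` and `has_native_kyutai_stt` are hypothetical and not part of `transformers` or `moshi`, and pre-release suffixes such as `.dev0` are ignored in the comparison, which is an assumption of this sketch.

```python
import re
from importlib import metadata

# Minimum release named in the README note above.
MIN_TRANSFORMERS = (4, 53, 0)


def parse_release(version: str) -> tuple:
    """Return the leading numeric components, e.g. '4.53.0.dev0' -> (4, 53, 0).

    Hypothetical helper: pre-release and local suffixes are dropped.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))
    # Pad short versions so that '4.53' compares equal to '4.53.0'.
    return parts + (0,) * (3 - len(parts))


def has_native_kyutai_stt(version: str) -> bool:
    """Hypothetical helper: True if the version meets the README's minimum."""
    return parse_release(version) >= MIN_TRANSFORMERS


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        print(installed, has_native_kyutai_stt(installed))
```

If the check fails, upgrading `transformers` or staying with the `moshi` package shown earlier on this page are the two options the README leaves open.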