Streaming?
#11
by dyqiang - opened
Thank you NVIDIA team for releasing yet another excellent ASR model!
Is there a guide on how to achieve streaming transcription using the latest parakeet-tdt-0.6b-v3 model?
We also have a dedicated cache-aware architecture for streaming use cases: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_fastconformer_hybrid_large_streaming_multi. We are also working on an upgraded, more performant successor to that model.
Has the more advanced version been released yet? Where can I find information about it?
Is the cache awareness confirmed to work with this model? @nvidia
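For readers who only need near-real-time output from an offline model, one common workaround is to transcribe overlapping fixed-size chunks as audio arrives. The sketch below is not cache-aware streaming and is not taken from NVIDIA's guidance in this thread; it only illustrates the chunking logic. The names `iter_chunks`, `pseudo_stream`, and the `transcribe_chunk` callback are hypothetical, and the callback stands in for whatever offline transcription call you already use. Text from the overlapping regions is not deduplicated here.

```python
from typing import Callable, Iterator, Sequence, Tuple


def iter_chunks(
    num_samples: int,
    sample_rate: int,
    chunk_s: float = 10.0,
    overlap_s: float = 1.0,
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) sample indices of overlapping chunks."""
    chunk = int(chunk_s * sample_rate)
    step = chunk - int(overlap_s * sample_rate)
    if chunk <= 0 or step <= 0:
        raise ValueError("chunk_s must be positive and larger than overlap_s")
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        yield start, end
        if end == num_samples:
            break  # last (possibly shorter) chunk emitted
        start += step


def pseudo_stream(
    samples: Sequence[float],
    sample_rate: int,
    transcribe_chunk: Callable[[Sequence[float]], str],  # hypothetical callback
    chunk_s: float = 10.0,
    overlap_s: float = 1.0,
) -> Iterator[Tuple[float, str]]:
    """Yield (chunk start time in seconds, text) for each chunk in order."""
    for start, end in iter_chunks(len(samples), sample_rate, chunk_s, overlap_s):
        yield start / sample_rate, transcribe_chunk(samples[start:end])
```

Shorter chunks reduce latency but give the model less context per call, so accuracy near chunk boundaries may suffer; the overlap is there to soften that trade-off.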