Diarization
Hello,
Thanks for providing this! Are there any plans to support diarization (speaker change recognition)?
I tried the latest GGML checkpoint with Whisper.cpp and it has a flag -tdrz which should diarize, but I think the model needs to support it.
I would also be very interested in this!
Hi
We will not support diarization as part of the Whisper model's inherent functionality. As far as I can see, the support in Whisper.cpp is quite experimental and limited to the English version of whisper.small.
If you want to run our models with diarization, I recommend a multi-step pipeline. WhisperX's README has an example of how to achieve this with the WhisperX library. Load our Whisper and Wav2vec2 models for the first two steps (transcription and alignment), and then run the diarization step as described in WhisperX.
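To give a rough idea of what the final step of such a pipeline does, here is a minimal sketch in plain Python of assigning speaker labels to transcribed segments by time overlap. This is only an illustration of the general idea, not WhisperX's or pyannote's actual code. The `Segment` and `Turn` classes, the `assign_speakers` function, and the sample timestamps and speaker names are all made up for the example; in a real pipeline the segments would come from the transcription and alignment steps and the turns from the diarization model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """A transcribed piece of speech with timestamps in seconds (hypothetical type)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A diarization result: one speaker talking during an interval (hypothetical type)."""
    start: float
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals, or 0.0 if disjoint."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(
    segments: list[Segment], turns: list[Turn]
) -> list[tuple[Optional[str], str]]:
    """Label each segment with the speaker whose turns overlap it the most.

    Segments that overlap no turn at all get None as their speaker.
    """
    labelled = []
    for seg in segments:
        totals: dict[str, float] = {}
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + shared
        speaker = max(totals, key=totals.get) if totals else None
        labelled.append((speaker, seg.text))
    return labelled


# Made-up sample data standing in for real transcription and diarization output.
segments = [
    Segment(0.0, 2.5, "Hej, hur mår du?"),
    Segment(2.6, 5.0, "Jag mår bra, tack."),
    Segment(9.0, 10.0, "Hej då."),
]
turns = [
    Turn(0.0, 2.4, "SPEAKER_00"),
    Turn(2.4, 6.0, "SPEAKER_01"),
]

labelled = assign_speakers(segments, turns)
for speaker, text in labelled:
    print(f"{speaker or 'UNKNOWN'}: {text}")
```

Picking the speaker with the largest total overlap is a simple heuristic. It will mislabel segments where two people talk over each other, which is one reason word-level alignment before this step tends to help.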
I've been working on a pipeline that combines kb-whisper-large with diarization using pyannote. Feel free to try it out: https://github.com/papatistos/swhisper