How to use RohanHBTU/wav2vec2-large-xls-r-300m-hindi-colab with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="RohanHBTU/wav2vec2-large-xls-r-300m-hindi-colab")

# Load model directly
from transformers import AutoProcessor, AutoModelForCTC
processor = AutoProcessor.from_pretrained("RohanHBTU/wav2vec2-large-xls-r-300m-hindi-colab")
model = AutoModelForCTC.from_pretrained("RohanHBTU/wav2vec2-large-xls-r-300m-hindi-colab")

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset. It achieves the following results on the evaluation set:
More information needed
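
The direct-loading snippet above returns a CTC model, whose raw output is one predicted token id per audio frame rather than finished text. As a rough illustration of what happens between those frame-level predictions and a transcript, here is a minimal sketch of greedy CTC decoding in plain Python. The vocabulary `toy_vocab`, the blank id `0`, and the `|` word delimiter are assumptions chosen for the example; the model's real vocabulary lives in its tokenizer files and is not reproduced here.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Turn per-frame token ids into text using greedy CTC decoding."""
    # Step 1: merge runs of identical consecutive ids, e.g. [3, 3, 0, 3] -> [3, 0, 3].
    collapsed = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: drop the blank symbol, which only separates repeated characters.
    tokens = [id_to_token[i] for i in collapsed if i != blank_id]
    # Step 3: join characters and turn word delimiters into spaces.
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Hypothetical toy vocabulary (NOT the model's actual vocabulary).
toy_vocab = {0: "[PAD]", 1: "|", 2: "न", 3: "म", 4: "स", 5: "्", 6: "त", 7: "े"}

# Hypothetical frame-level argmax output for a short utterance.
frames = [2, 2, 0, 3, 3, 4, 5, 0, 6, 7, 7, 1, 1]
print(ctc_greedy_decode(frames, toy_vocab))
```

In practice this step should not need to be written by hand: taking the argmax of the model's logits and passing the ids to `processor.batch_decode` is expected to do the equivalent work with the real vocabulary.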
The following hyperparameters were used during training:

Training results:
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 6.247 | 42.11 | 400 | 1.8828 | 0.9513 |
| 0.3283 | 84.21 | 800 | 1.7075 | 0.8399 |
| 0.0915 | 126.32 | 1200 | 1.7553 | 0.7715 |
| 0.0492 | 168.42 | 1600 | 1.8279 | 0.7645 |
| 0.0283 | 210.53 | 2000 | 1.9725 | 0.7970 |
| 0.0176 | 252.63 | 2400 | 1.9678 | 0.7483 |
| 0.013 | 294.74 | 2800 | 1.9957 | 0.7645 |
| 0.0082 | 336.84 | 3200 | 2.1776 | 0.7575 |
| 0.0051 | 378.95 | 3600 | 2.0943 | 0.7506 |
| 0.0039 | 421.05 | 4000 | 2.0855 | 0.7227 |
| 0.0026 | 463.16 | 4400 | 2.0956 | 0.7285 |
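
The Wer column reports word error rate as a fraction, so 0.7285 corresponds to roughly 73 errors per 100 reference words. The sketch below shows how such a figure is typically computed, assuming the standard definition (substitutions + deletions + insertions, divided by the number of reference words). The evaluation script used for this model is not shown in the card and may normalise text differently, so treat `word_error_rate` as an illustrative helper rather than the exact metric behind the table.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two strings, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.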