metadata
language:
  - dv
license: apache-2.0
base_model: openai/whisper-large-v3
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: Whisper Large v3 DV - Alakxender
    results: []
pipeline_tag: automatic-speech-recognition
datasets:
  - mozilla-foundation/common_voice_17_0
library_name: transformers

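This model is a fine-tuned version of openai/whisper-large-v3 for Dhivehi (dv) automatic speech recognition. The card metadata lists mozilla-foundation/common_voice_17_0 as the dataset. It achieves the following results on the evaluation set (final logged checkpoint, step 3900):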

  • Loss: 0.4610
  • WER: 71.0345
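
The card does not say how the word error rate was computed. As a minimal sketch, assuming the standard definition (substitutions + deletions + insertions, divided by the number of reference words) and assuming the figure is reported as a percentage, WER can be computed with a word-level edit distance. The function name `word_error_rate` is illustrative and not taken from the training script.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return the word error rate as a percentage (assumed convention)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


print(word_error_rate("a b c d", "a b x d"))  # one substitution in four words
print(word_error_rate("a", "x y z"))          # insertions push WER above 100
```

Because insertions count as errors, a hypothesis much longer than its reference yields a WER above 100, which is one way a value such as 203.9655 at an early checkpoint can arise.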

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 36
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 4000
  • mixed_precision_training: Native AMP
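
To make the schedule concrete, here is a minimal sketch of the learning rate at a given step. It assumes the `linear` scheduler has the usual shape: a linear ramp from 0 to the peak rate over the warmup steps, then a linear decay to 0 at the final training step. The constants come from the list above; the function name `linear_lr` is hypothetical.

```python
PEAK_LR = 1e-5       # learning_rate
WARMUP_STEPS = 500   # lr_scheduler_warmup_steps
TOTAL_STEPS = 4000   # training_steps


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at `step` under linear warmup and linear decay (assumed shape)."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return peak_lr * step / warmup_steps
    # Decay: fall linearly from the peak rate to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 2250, 4000):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 500 and is back to half the peak by step 2250, so the later checkpoints were trained with a much smaller step size than the early ones.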

Training results

| Training Loss | Epoch   | Step | Validation Loss | WER      |
|:-------------:|:-------:|:----:|:---------------:|:--------:|
| 1.4644        | 0.9772  | 300  | 1.0654          | 203.9655 |
| 0.2384        | 1.9544  | 600  | 0.3342          | 84.8276  |
| 0.1481        | 2.9316  | 900  | 0.2715          | 78.7931  |
| 0.0975        | 2.9772  | 1200 | 0.2635          | 76.0345  |
| 0.0616        | 3.9544  | 1500 | 0.2841          | 73.1034  |
| 0.0399        | 4.9772  | 1800 | 0.3215          | 72.2414  |
| 0.0218        | 5.9772  | 2100 | 0.3881          | 73.7931  |
| 0.046         | 6.9772  | 2400 | 0.2772          | 74.1379  |
| 0.018         | 7.9544  | 2700 | 0.3344          | 71.3793  |
| 0.0067        | 8.9316  | 3000 | 0.3947          | 71.7241  |
| 0.0023        | 9.9088  | 3300 | 0.4246          | 72.5862  |
| 0.0008        | 10.8860 | 3600 | 0.4503          | 71.7241  |
| 0.0003        | 11.8632 | 3900 | 0.4610          | 71.0345  |
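
The rows above can be scanned to see which checkpoint is best under each metric. This is a small sketch that copies the step, validation loss, and WER columns from the table; the variable names are illustrative.

```python
# (step, validation_loss, wer) copied from the training results above.
ROWS = [
    (300, 1.0654, 203.9655),
    (600, 0.3342, 84.8276),
    (900, 0.2715, 78.7931),
    (1200, 0.2635, 76.0345),
    (1500, 0.2841, 73.1034),
    (1800, 0.3215, 72.2414),
    (2100, 0.3881, 73.7931),
    (2400, 0.2772, 74.1379),
    (2700, 0.3344, 71.3793),
    (3000, 0.3947, 71.7241),
    (3300, 0.4246, 72.5862),
    (3600, 0.4503, 71.7241),
    (3900, 0.4610, 71.0345),
]

best_by_loss = min(ROWS, key=lambda row: row[1])  # lowest validation loss
best_by_wer = min(ROWS, key=lambda row: row[2])   # lowest WER

print("lowest validation loss:", best_by_loss)  # step 1200
print("lowest WER:", best_by_wer)               # step 3900
```

The two criteria disagree: validation loss is lowest at step 1200 and rises afterwards while training loss keeps falling, whereas WER is lowest at the last logged step, 3900. Which checkpoint to prefer depends on which metric matters for the intended use.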

Framework versions

  • Transformers 4.41.0.dev0
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1