Chillarmo
/

whisper-base-armenian

Automatic Speech Recognition

Eval Results (legacy)

Model card Files Files and versions

Metrics Training metrics Community

whisper-base-armenian / README.md

Chillarmo's picture

Upload WhisperForConditionalGeneration (#1)

cd57192 verified 10 months ago

|

history blame contribute delete

2.16 kB

	---
	license: apache-2.0
	datasets:
	- Chillarmo/common_voice_20_armenian
	language:
	- hy
	base_model:
	- openai/whisper-base
	pipeline_tag: automatic-speech-recognition
	library_name: transformers
	model-index:
	- name: whisper-base-armenian
	results:
	- task:
	type: automatic-speech-recognition
	name: Automatic Speech Recognition
	dataset:
	name: Common Voice 20 Armenian
	type: Chillarmo/common_voice_20_armenian
	metrics:
	- type: wer
	value: 33.186880780299205
	name: Word Error Rate
	- type: cer
	value: 6.983058800639766
	name: Character Error Rate
	- type: bleu
	value: 47.70616594276946
	name: BLEU Score
	- type: exact_match
	value: 16.49590163934426
	name: Exact Match
	---

	# Whisper-Base Fine-tuned for Armenian ASR

	This model is a fine-tuned version of OpenAI's Whisper-base on the Common Voice 20 Armenian dataset for automatic speech recognition.

	## Training Results

	The model was trained for 5.34 epochs with the following final results:

	\| Metric \| Value \|
	\|--------\|-------\|
	\| Training Loss \| 0.122 \|
	\| Training Runtime \| 10,924 seconds (≈3.03 hours) \|
	\| Training Samples/Second \| 7.32 \|
	\| Training Steps/Second \| 0.46 \|
	\| Total Training Steps \| 5,000 \|
	\| Epochs \| 5.34 \|

	## Evaluation Results

	\| Metric \| Value \|
	\|--------\|-------\|
	\| Evaluation Loss \| 0.201 \|
	\| Word Error Rate (WER) \| 33.19% \|
	\| Character Error Rate (CER) \| 6.98% \|
	\| BLEU Score \| 47.71 \|
	\| Exact Match \| 16.50% \|
	\| Average Prediction Length \| 7.69 tokens \|
	\| Average Label Length \| 7.77 tokens \|
	\| Length Ratio \| 0.989 \|
	\| Evaluation Runtime \| 1,590 seconds (≈26.5 minutes) \|
	\| Evaluation Samples/Second \| 3.68 \|
	\| Evaluation Steps/Second \| 0.46 \|

	## Model Details

	- Base Model: openai/whisper-base
	- Language: Armenian (hy)
	- Dataset: Chillarmo/common_voice_20_armenian
	- License: Apache 2.0

	## Notes

	During model loading, there were missing keys in the checkpoint: `['proj_out.weight']`. This is a common occurrence when fine-tuning Whisper models and typically doesn't affect performance significantly.