Fine-tuned Whisper Large V3 Turbo model for high-accuracy Arabic Quranic speech recognition.
This model is a fine-tuned version of openai/whisper-large-v3-turbo trained on the tarteel-ai/everyayah dataset.
The model was evaluated on the full test split of the EveryAyah dataset, comprising 23,473 samples.
| Metric | Value |
|---|---|
| Word Error Rate (WER) | 1.18% |
| Character Error Rate (CER) | 0.34% |
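For readers unfamiliar with the two metrics, the following is a minimal sketch of how WER and CER are conventionally defined: the Levenshtein edit distance between hypothesis and reference, divided by the reference length in words or characters. The function names are illustrative, and the sketch assumes no text normalization; the model card does not state which tooling or normalization produced the figures above, so this is not a reproduction of the author's evaluation script.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example pair: the last word differs by one character.
    reference = "بسم الله الرحمن الرحيم"
    hypothesis = "بسم الله الرحمن الرحين"
    print(f"WER: {wer(reference, hypothesis):.2%}")
    print(f"CER: {cer(reference, hypothesis):.2%}")
```

Corpus-level scores are usually computed by summing edits and reference lengths over all samples before dividing, rather than averaging per-sample rates.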
Performance on the validation split (20,976 samples) during training:
| Metric | Value |
|---|---|
| Word Error Rate (WER) | 0.86% |
| Eval Loss | 0.0040 |
LoRA Configuration:
Training:
For audio of any length, including recordings longer than 30 seconds:
```python
import torch
from transformers import pipeline

# Load the model with automatic chunking for long audio
pipe = pipeline(
    "automatic-speech-recognition",
    model="naazimsnh02/whisper-large-v3-turbo-ar-quran",
    device=0,                   # Use GPU (or -1 for CPU)
    torch_dtype=torch.float16,  # Half precision; use torch.float32 on CPU
    chunk_length_s=30,          # Process in 30-second chunks
    stride_length_s=5,          # 5 seconds of overlapping context on each side of a chunk
)

# Transcribe audio of any length
result = pipe("long_quran_recitation.wav")
print(result["text"])
```
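To make the chunking parameters more concrete, here is a rough sketch of how the windows would be laid out over a recording. It assumes the stride applies to both sides of each chunk, so consecutive windows start `chunk_length_s - 2 * stride_length_s` seconds apart. The helper `chunk_windows` is hypothetical and only approximates the layout; it is not the pipeline's actual implementation.

```python
def chunk_windows(duration_s, chunk_length_s=30.0, stride_length_s=5.0):
    """Return (start, end) times in seconds of overlapping chunks.

    Assumption: the stride is applied on the left and right of every chunk,
    so the step between chunk starts is chunk_length_s - 2 * stride_length_s.
    """
    step = chunk_length_s - 2 * stride_length_s
    if step <= 0:
        raise ValueError("chunk_length_s must exceed twice stride_length_s")

    windows = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_length_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break  # the last window already reaches the end of the audio
        start += step
    return windows


if __name__ == "__main__":
    # A 70-second recitation with the settings used above
    for start, end in chunk_windows(70):
        print(f"{start:5.1f}s to {end:5.1f}s")
```

Under these assumptions, a larger stride gives the model more shared context at chunk boundaries at the cost of more windows to decode.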
```python
from transformers import pipeline

# Simple usage for short audio
pipe = pipeline(
    "automatic-speech-recognition",
    model="naazimsnh02/whisper-large-v3-turbo-ar-quran",
    device=0,  # Use GPU (or -1 for CPU)
)

result = pipe("short_ayah.wav")
print(result["text"])
```
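When it is unclear which of the two usage patterns applies to a given file, the clip duration can be checked first. The sketch below uses only the standard-library `wave` module and therefore assumes uncompressed WAV input; the helper names and the 30-second threshold (Whisper's input window) are illustrative choices, not part of the model's API.

```python
import wave


def wav_duration_seconds(path):
    """Duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def needs_chunking(path, window_s=30.0):
    """True if the clip is longer than a single 30-second Whisper window."""
    return wav_duration_seconds(path) > window_s
```

A caller could then build the pipeline with `chunk_length_s` and `stride_length_s` only when `needs_chunking(path)` returns `True`, and use the simple form otherwise.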
```bibtex
@misc{whisper-quran-2025,
  author = {Syed Naazim Hussain},
  title = {Whisper Large V3 Turbo - Quranic Arabic},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/naazimsnh02/whisper-large-v3-turbo-ar-quran}}
}
```