license: apache-2.0
tags:
  - automatic-speech-recognition
  - whisper
  - windyword
  - hindi
  - hi
library_name: transformers
pipeline_tag: automatic-speech-recognition
language:
  - hi

WindyWord.ai STT — Hindi Lingua (CPU INT8 (CTranslate2))

Transcribes Hindi speech (Indo-European > Indo-Iranian > Indo-Aryan).

Note: This model transcribes Hindi audio as Latin-script Hinglish, not Devanagari. A WER of roughly 100% against the Devanagari FLEURS references reflects the script mismatch, not a quality failure. The output is useful for code-switched, chat, and SMS contexts. For Devanagari output, use a separate model (not yet shipped).
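To illustrate why a script mismatch alone pushes WER to 100%, here is a minimal sketch of word-level WER (edit distance divided by reference length). The sentence pair is an illustrative example written for this card, not taken from FLEURS or from model output, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level WER: Levenshtein distance over words / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative pair: the same sentence in Devanagari and in Latin script.
devanagari_reference = "मैं घर जा रहा हूँ"
latin_hypothesis = "main ghar ja raha hoon"

# No word matches across scripts, so every word counts as a substitution.
print(word_error_rate(devanagari_reference, latin_hypothesis))  # 1.0
```

Scoring this model fairly would require Latin-script references, or transliterating one side before comparison.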

Quality

WER: not yet verified by the WindyWord harness. The weights were imported from an upstream community fine-tune.

About this variant

This is the ct2-int8 deployment format of our Hindi Lingua STT model. Load it via the ct2-int8/ subfolder.

Part of the WindyWord.ai STT fleet — covering 35+ languages that commercial speech-to-text APIs underserve, with proper dialect / script disclosures where they matter.

Usage

from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the processor (feature extractor + tokenizer) and the model from the ct2-int8/ subfolder
processor = WhisperProcessor.from_pretrained("WindyWord/listen-windy-lingua-hi-ct2", subfolder="ct2-int8")
model = WhisperForConditionalGeneration.from_pretrained("WindyWord/listen-windy-lingua-hi-ct2", subfolder="ct2-int8")
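CTranslate2-format weights are normally run with the CTranslate2 runtime rather than with Transformers model classes. The following is a minimal sketch of that route, assuming the ct2-int8/ subfolder holds a standard CTranslate2 Whisper conversion (including its tokenizer and preprocessor files) and that the `faster-whisper` and `huggingface_hub` packages are installed. The helper names and the audio file name are hypothetical.

```python
from pathlib import Path

REPO_ID = "WindyWord/listen-windy-lingua-hi-ct2"  # repo id from this card
SUBFOLDER = "ct2-int8"                            # deployment subfolder named in this card


def local_model_dir(snapshot_dir: str, subfolder: str = SUBFOLDER) -> Path:
    """Return the CTranslate2 model directory inside a downloaded snapshot."""
    return Path(snapshot_dir) / subfolder


def transcribe_hindi(audio_path: str) -> str:
    """Download the INT8 subfolder and transcribe one audio file on CPU."""
    # Imported lazily so this module can be imported without the optional packages.
    from faster_whisper import WhisperModel
    from huggingface_hub import snapshot_download

    # Fetch only the files under ct2-int8/ (assumption: all model files live there).
    snapshot_dir = snapshot_download(REPO_ID, allow_patterns=[f"{SUBFOLDER}/*"])

    model = WhisperModel(
        str(local_model_dir(snapshot_dir)),
        device="cpu",
        compute_type="int8",
    )
    # transcribe() returns a lazy iterator of segments plus an info object.
    segments, _info = model.transcribe(audio_path, language="hi")
    return " ".join(segment.text.strip() for segment in segments)


# Example call (hypothetical file name):
# print(transcribe_hindi("sample_hi.wav"))
```

Per the note above, expect the returned text in Latin-script Hinglish, not Devanagari.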

Commercial Use

Visit windyword.ai for apps and API access.

Provenance & License

Weights derived from upstream community Whisper fine-tunes (see individual model card for exact lineage). Redistributed under Apache-2.0 (inherited).

Certified by Opus 4.6 Opus-Claw (Dr. C) on Veron-1 (RTX 5090, Mt Pleasant SC).