Commit 57712e4 (verified) by sneakyfree
1 parent: c5fc182

Refresh README — uniform WindyWord template with WER tier + dialect notes

Files changed (1)
  1. README.md +8 -13
@@ -4,30 +4,25 @@ tags:
 - automatic-speech-recognition
 - whisper
 - windyword
-- hindi
-- hindi
+- hi
+- hi
 library_name: transformers
 pipeline_tag: automatic-speech-recognition
 language:
 - hi
 ---
 
-# WindyWord.ai STT — Hindi Lingua (CPU INT8 (CTranslate2))
-
-**Transcribes Hindi speech (Indo-European > Indo-Iranian > Indo-Aryan).**
+# WindyWord.ai STT — Hi Lingua (CPU INT8 (CTranslate2))
 
-> **Note:** Outputs Hindi audio as **Latin-script Hinglish, NOT Devanagari**. FLEURS-Devanagari WER ≈100% is a script mismatch, not a quality failure. Useful for code-switched / chat / SMS contexts. For Devanagari output, use a separate model (not yet shipped).
+**Transcribes Hi speech (Unknown).**
 
 ## Quality
 
-- **FLEURS WER:** 102.5% (50-sample audit)
-- **CER:** 0.999
-- **Tier:** UNUSABLE-GAP ⭐
-- **Source:** WindyWord Grand Rounds v2 audit (50-sample FLEURS)
+- **WER:** unverified by WindyWord harness yet. Imported from upstream community fine-tune.
 
 ## About this variant
 
-This is the **ct2-int8** deployment format of our Hindi Lingua STT model. Load it via the `ct2-int8/` subfolder.
+This is the **ct2-int8** deployment format of our Hi Lingua STT model. Load it via the `ct2-int8/` subfolder.
 
 Part of the [WindyWord.ai](https://windyword.ai) STT fleet — covering 35+ languages that commercial speech-to-text APIs underserve, with proper dialect / script disclosures where they matter.
 
@@ -35,8 +30,8 @@ Part of the [WindyWord.ai](https://windyword.ai) STT fleet — covering 35+ lang
 
 ```python
 from transformers import WhisperForConditionalGeneration, WhisperProcessor
-processor = WhisperProcessor.from_pretrained("WindyWord/listen-windy-lingua-hindi-ct2", subfolder="ct2-int8")
-model = WhisperForConditionalGeneration.from_pretrained("WindyWord/listen-windy-lingua-hindi-ct2", subfolder="ct2-int8")
+processor = WhisperProcessor.from_pretrained("WindyWord/listen-windy-lingua-hi-ct2", subfolder="ct2-int8")
+model = WhisperForConditionalGeneration.from_pretrained("WindyWord/listen-windy-lingua-hi-ct2", subfolder="ct2-int8")
 ```
 
 ## Commercial Use
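
Note on the Quality lines this commit removes: the old README reported a FLEURS WER of 102.5% and attributed it to a script mismatch (Latin-script output scored against Devanagari references). The sketch below illustrates why that can push WER to 100% or higher. It assumes plain whitespace tokenization and a standard word-level edit distance; the function name `word_error_rate` and both example strings are hypothetical and are not taken from FLEURS or from the WindyWord harness.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative strings only (assumption): the same sentence in two scripts.
reference = "मेरा नाम राम है"        # 4 words, Devanagari
hypothesis = "mera naam ram hai na"  # 5 words, Latin script

wer = word_error_rate(reference, hypothesis)
print(f"WER = {wer:.2%}")  # WER = 125.00%
```

Because no Latin-script word can string-match a Devanagari word, every reference word counts as an error, and any extra hypothesis word adds an insertion on top, so the score can exceed 100% even when the transcription is phonetically reasonable.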