3v324v23 committed on
Commit f548290 · 1 Parent(s): 379a8f4

Update ARK-ASR-3B eval metrics

.eval_results/open_asr_leaderboard.yaml CHANGED
@@ -1,8 +1,18 @@
  - dataset:
  id: hf-audio/open-asr-leaderboard
  task_id: mean_wer
- value: 5.13
- date: '2026-06-22'
+ value: 5.04
+ date: '2026-06-23'
+ source:
+ url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
+ name: open-asr-leaderboard
+ user: hf-audio
+
+ - dataset:
+ id: hf-audio/open-asr-leaderboard
+ task_id: rtfx
+ value: 490.98
+ date: '2026-06-23'
  source:
  url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
  name: open-asr-leaderboard
@@ -11,8 +21,8 @@
  - dataset:
  id: hf-audio/open-asr-leaderboard
  task_id: ami_wer
- value: 8.91
- date: '2026-06-22'
+ value: 8.79
+ date: '2026-06-23'
  source:
  url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
  name: open-asr-leaderboard
@@ -21,8 +31,8 @@
  - dataset:
  id: hf-audio/open-asr-leaderboard
  task_id: earnings22_wer
- value: 8.25
- date: '2026-06-22'
+ value: 8.23
+ date: '2026-06-23'
  source:
  url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
  name: open-asr-leaderboard
@@ -31,8 +41,8 @@
  - dataset:
  id: hf-audio/open-asr-leaderboard
  task_id: gigaspeech_wer
- value: 7.30
- date: '2026-06-22'
+ value: 6.98
+ date: '2026-06-23'
  source:
  url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
  name: open-asr-leaderboard
@@ -41,8 +51,8 @@
  - dataset:
  id: hf-audio/open-asr-leaderboard
  task_id: librispeech_clean_wer
- value: 1.09
- date: '2026-06-22'
+ value: 1.03
+ date: '2026-06-23'
  source:
  url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
  name: open-asr-leaderboard
@@ -51,8 +61,8 @@
  - dataset:
  id: hf-audio/open-asr-leaderboard
  task_id: librispeech_other_wer
- value: 2.41
- date: '2026-06-22'
+ value: 2.35
+ date: '2026-06-23'
  source:
  url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
  name: open-asr-leaderboard
@@ -61,8 +71,8 @@
  - dataset:
  id: hf-audio/open-asr-leaderboard
  task_id: spgispeech_wer
- value: 2.49
- date: '2026-06-22'
+ value: 2.46
+ date: '2026-06-23'
  source:
  url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
  name: open-asr-leaderboard
@@ -71,8 +81,8 @@
  - dataset:
  id: hf-audio/open-asr-leaderboard
  task_id: voxpopuli_wer
- value: 5.48
- date: '2026-06-22'
+ value: 5.47
+ date: '2026-06-23'
  source:
  url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
  name: open-asr-leaderboard
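
The `mean_wer` values in this file look consistent with the per-dataset entries. The commit does not say how `mean_wer` is computed, so the sketch below assumes it is the unweighted average of the seven per-dataset WERs, rounded to two decimals. The names `wer_before`, `wer_after`, and `mean_wer()` are illustrative and are not part of the repository.

```python
from statistics import fmean

# Per-dataset WER (%) for ARK-ASR-3B, copied from the removed (-) lines of this commit.
wer_before = {
    "ami": 8.91, "earnings22": 8.25, "gigaspeech": 7.30,
    "librispeech_clean": 1.09, "librispeech_other": 2.41,
    "spgispeech": 2.49, "voxpopuli": 5.48,
}

# Per-dataset WER (%) copied from the added (+) lines of this commit.
wer_after = {
    "ami": 8.79, "earnings22": 8.23, "gigaspeech": 6.98,
    "librispeech_clean": 1.03, "librispeech_other": 2.35,
    "spgispeech": 2.46, "voxpopuli": 5.47,
}


def mean_wer(scores):
    """Unweighted mean of the per-dataset WERs, rounded to two decimals.

    Assumption: this is how the reported mean_wer is derived; the commit
    itself does not state the formula.
    """
    return round(fmean(scores.values()), 2)


print(mean_wer(wer_before))  # 5.13
print(mean_wer(wer_after))   # 5.04
```

Under that assumption, both results match the `mean_wer` values recorded before and after this commit.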
README.md CHANGED
@@ -44,7 +44,7 @@ repository: https://github.com/AutoArk/open-audio-opd
 
  </div>
 
- > **TL;DR** ARK-ASR-3B is a multilingual automatic speech recognition model. It achieves current state-of-the-art results on the Hugging Face Open ASR Leaderboard English short-form benchmark, with an average WER of **5.13%** across AMI, Earnings22, GigaSpeech, LibriSpeech, SPGISpeech, and VoxPopuli. The accompanying training, inference, and evaluation code is available at [AutoArk/open-audio-opd](https://github.com/AutoArk/open-audio-opd).
+ > **TL;DR** ARK-ASR-3B is a multilingual automatic speech recognition model. It achieves current state-of-the-art results on the Hugging Face Open ASR Leaderboard English short-form benchmark, with an average WER of **5.04%** and RTFx of **490.98** across AMI, Earnings22, GigaSpeech, LibriSpeech, SPGISpeech, and VoxPopuli. The accompanying training, inference, and evaluation code is available at [AutoArk/open-audio-opd](https://github.com/AutoArk/open-audio-opd).
 
  ## Abstract
 
@@ -84,9 +84,16 @@ The following results are from the Hugging Face [Open ASR Leaderboard](https://h
 
  | Model | AMI | Earnings22 | GigaSpeech | LS Clean | LS Other | SPGISpeech | VoxPopuli | Avg |
  | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
- | ARK-ASR-3B | **8.91%** | **8.25%** | **7.30%** | **1.09%** | **2.41%** | **2.49%** | **5.48%** | **5.13%** |
+ | ARK-ASR-3B | **8.79%** | **8.23%** | **6.98%** | **1.03%** | **2.35%** | **2.46%** | **5.47%** | **5.04%** |
  | ARK-ASR-0.6B | 10.02% | 9.77% | 8.00% | 1.53% | 3.51% | 2.63% | 6.31% | 5.97% |
 
+ ### Chinese CER
+
+ | Model | AISHELL-1 | WenetSpeech test meeting | WenetSpeech test-net |
+ | --- | ---: | ---: | ---: |
+ | ARK-ASR-3B | **1.80%** | **4.97%** | **4.58%** |
+ | ARK-ASR-0.6B | 2.02% | 5.92% | 4.96% |
+
  ## Inference
 
  Run ASR inference with Hugging Face Transformers: