AutoArk-AI/ARK-ASR-3B

@@ -1,8 +1,18 @@
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: mean_wer
-  value: 5.13
-  date: '2026-06-22'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
@@ -11,8 +21,8 @@
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: ami_wer
-  value: 8.91
-  date: '2026-06-22'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
@@ -21,8 +31,8 @@
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: earnings22_wer
-  value: 8.25
-  date: '2026-06-22'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
@@ -31,8 +41,8 @@
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: gigaspeech_wer
-  value: 7.30
-  date: '2026-06-22'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
@@ -41,8 +51,8 @@
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: librispeech_clean_wer
-  value: 1.09
-  date: '2026-06-22'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
@@ -51,8 +61,8 @@
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: librispeech_other_wer
-  value: 2.41
-  date: '2026-06-22'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
@@ -61,8 +71,8 @@
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: spgispeech_wer
-  value: 2.49
-  date: '2026-06-22'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
@@ -71,8 +81,8 @@
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: voxpopuli_wer
-  value: 5.48
-  date: '2026-06-22'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard

 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: mean_wer
+  value: 5.04
+  date: '2026-06-23'
+  source:
+    url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
+    name: open-asr-leaderboard
+    user: hf-audio
+- dataset:
+    id: hf-audio/open-asr-leaderboard
+    task_id: rtfx
+  value: 490.98
+  date: '2026-06-23'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: ami_wer
+  value: 8.79
+  date: '2026-06-23'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: earnings22_wer
+  value: 8.23
+  date: '2026-06-23'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: gigaspeech_wer
+  value: 6.98
+  date: '2026-06-23'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: librispeech_clean_wer
+  value: 1.03
+  date: '2026-06-23'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: librispeech_other_wer
+  value: 2.35
+  date: '2026-06-23'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: spgispeech_wer
+  value: 2.46
+  date: '2026-06-23'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard
 - dataset:
     id: hf-audio/open-asr-leaderboard
     task_id: voxpopuli_wer
+  value: 5.47
+  date: '2026-06-23'
   source:
     url: https://huggingface.co/datasets/hf-audio/open-asr-leaderboard
     name: open-asr-leaderboard

README.md CHANGED

@@ -44,7 +44,7 @@ repository: https://github.com/AutoArk/open-audio-opd
 </div>
-> **TL;DR** ARK-ASR-3B is a multilingual automatic speech recognition model. It achieves current state-of-the-art results on the Hugging Face Open ASR Leaderboard English short-form benchmark, with an average WER of **5.13%** across AMI, Earnings22, GigaSpeech, LibriSpeech, SPGISpeech, and VoxPopuli. The accompanying training, inference, and evaluation code is available at [AutoArk/open-audio-opd](https://github.com/AutoArk/open-audio-opd).
 ## Abstract
@@ -84,9 +84,16 @@ The following results are from the Hugging Face [Open ASR Leaderboard](https://h
 | Model | AMI | Earnings22 | GigaSpeech | LS Clean | LS Other | SPGISpeech | VoxPopuli | Avg |
 | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
-| ARK-ASR-3B | **8.91%** | **8.25%** | **7.30%** | **1.09%** | **2.41%** | **2.49%** | **5.48%** | **5.13%** |
 | ARK-ASR-0.6B | 10.02% | 9.77% | 8.00% | 1.53% | 3.51% | 2.63% | 6.31% | 5.97% |
 ## Inference
 Run ASR inference with Hugging Face Transformers:

 </div>
+> **TL;DR** ARK-ASR-3B is a multilingual automatic speech recognition model. It achieves current state-of-the-art results on the Hugging Face Open ASR Leaderboard English short-form benchmark, with an average WER of **5.04%** and RTFx of **490.98** across AMI, Earnings22, GigaSpeech, LibriSpeech, SPGISpeech, and VoxPopuli. The accompanying training, inference, and evaluation code is available at [AutoArk/open-audio-opd](https://github.com/AutoArk/open-audio-opd).
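
To make the RTFx figure in the TL;DR concrete, here is a minimal sketch that converts it into wall-clock transcription time. It assumes RTFx is the inverse real-time factor (seconds of audio processed per second of compute), which is how the Open ASR Leaderboard reports throughput. The helper name `transcription_seconds` is hypothetical, and the result only applies to hardware and batching comparable to the leaderboard run.

```python
def transcription_seconds(audio_seconds: float, rtfx: float) -> float:
    """Estimate compute time for a clip, assuming RTFx = audio time / compute time."""
    if rtfx <= 0:
        raise ValueError("rtfx must be positive")
    return audio_seconds / rtfx


# RTFx value reported for ARK-ASR-3B in this change.
ARK_ASR_3B_RTFX = 490.98

# Estimated seconds of compute needed for one hour of audio.
one_hour = transcription_seconds(3600, ARK_ASR_3B_RTFX)
print(f"{one_hour:.2f} s per hour of audio")  # prints "7.33 s per hour of audio"
```
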
 ## Abstract
 | Model | AMI | Earnings22 | GigaSpeech | LS Clean | LS Other | SPGISpeech | VoxPopuli | Avg |
 | --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
+| ARK-ASR-3B | **8.79%** | **8.23%** | **6.98%** | **1.03%** | **2.35%** | **2.46%** | **5.47%** | **5.04%** |
 | ARK-ASR-0.6B | 10.02% | 9.77% | 8.00% | 1.53% | 3.51% | 2.63% | 6.31% | 5.97% |
+### Chinese CER
+| Model | AISHELL-1 | WenetSpeech test meeting | WenetSpeech test-net |
+| --- | ---: | ---: | ---: |
+| ARK-ASR-3B | **1.80%** | **4.97%** | **4.58%** |
+| ARK-ASR-0.6B | 2.02% | 5.92% | 4.96% |
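
The `Avg` column of the English WER table can be sanity-checked against the per-dataset values. The sketch below assumes the leaderboard's `mean_wer` is an unweighted mean over the seven test sets, rounded to two decimals. That assumption is consistent with the numbers in this change, but it is not stated in the diff itself. The dictionary and function names are illustrative.

```python
# Per-dataset WER values (%) copied from the rows in this diff.
WER = {
    "ARK-ASR-3B (old)": [8.91, 8.25, 7.30, 1.09, 2.41, 2.49, 5.48],
    "ARK-ASR-3B (new)": [8.79, 8.23, 6.98, 1.03, 2.35, 2.46, 5.47],
    "ARK-ASR-0.6B": [10.02, 9.77, 8.00, 1.53, 3.51, 2.63, 6.31],
}


def mean_wer(scores: list[float]) -> float:
    """Unweighted mean over test sets, rounded to two decimals (assumed convention)."""
    return round(sum(scores) / len(scores), 2)


for model, scores in WER.items():
    print(f"{model}: {mean_wer(scores):.2f}")
# ARK-ASR-3B (old): 5.13
# ARK-ASR-3B (new): 5.04
# ARK-ASR-0.6B: 5.97
```

The recomputed means match the `mean_wer` values in the YAML (5.13 before, 5.04 after) and the `Avg` column for both models.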
 ## Inference
 Run ASR inference with Hugging Face Transformers: