AuggieActual committed on
Commit e9818df · verified · 1 Parent(s): 32f8e42

Add model card

  1. README.md +71 -0
README.md ADDED
@@ -0,0 +1,71 @@
+ ---
+ license: mit
+ language:
+ - en
+ tags:
+ - speech-recognition
+ - asr
+ - ctc
+ - conformer
+ - p25
+ - imbe
+ - vocoder
+ - fine-tuned
+ - onnx
+ datasets:
+ - librispeech_asr
+ - LIUM/tedlium
+ - speechcolab/gigaspeech
+ base_model: trunk-reporter/imbe-asr-base-512d
+ pipeline_tag: automatic-speech-recognition
+ ---
+
+ # IMBE-ASR Base P25 Fine-tuned (48.6M params)
+
+ A P25 radio-adapted variant of [imbe-asr-base-512d](https://huggingface.co/trunk-reporter/imbe-asr-base-512d) that produces readable dispatch transcriptions from real P25 radio traffic.
+
+ **Code:** [trunk-reporter/imbe-asr](https://github.com/trunk-reporter/imbe-asr) | **Base model:** [imbe-asr-base-512d](https://huggingface.co/trunk-reporter/imbe-asr-base-512d) | **Best model:** [imbe-asr-large-1024d](https://huggingface.co/trunk-reporter/imbe-asr-large-1024d)
+
+ ## Results
+
+ | Dataset | Greedy WER |
+ |---------|-----------|
+ | LibriSpeech-IMBE | 19.2% |
+ | Real P25 dispatch | Substantially better than the base model (qualitative; no WER reported) -- readable transcriptions |
+
+ Example P25 output: `BATTALION 60 ENGINE 62 MEDIC 61 RESPOND TO 1234 MAIN STREET FOR A MEDICAL EMERGENCY`
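For readers unfamiliar with the metric in the table above, the following is a minimal sketch of the standard word error rate: word-level edit distance divided by the number of reference words. The card does not say how its WER was computed or what text normalization was applied, so the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: tokenization is plain whitespace splitting, which
    may differ from whatever normalization the model authors used.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition a WER of 19.2% corresponds to roughly one word error for every five reference words.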
+
+ ## Training
+
+ Fine-tuned from `imbe-asr-base-512d` on ~20 hours of real P25 radio captures, pseudo-labeled with an ensemble of Whisper large-v3 and Qwen3-ASR. The P25 data was mixed with 30% base training data to limit catastrophic forgetting.
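The mixing step described above can be sketched as follows. This is not the authors' training code: the card does not say whether the 30% applies per batch, per epoch, or to the whole dataset, so the sketch assumes it means that base examples make up 30% of the final mixed set. The function name `mix_with_replay` and its parameters are hypothetical.

```python
import random


def mix_with_replay(p25_items, base_items, base_fraction=0.3, seed=0):
    """Return a shuffled list in which about `base_fraction` of the items
    come from the base training data (assumed interpretation of "30%")."""
    if not 0.0 <= base_fraction < 1.0:
        raise ValueError("base_fraction must be in [0, 1)")
    rng = random.Random(seed)

    # Solve n_base / (n_p25 + n_base) = base_fraction for n_base.
    n_base = round(len(p25_items) * base_fraction / (1.0 - base_fraction))
    # Never ask for more base examples than are available.
    n_base = min(n_base, len(base_items))

    mixed = list(p25_items) + rng.sample(list(base_items), n_base)
    rng.shuffle(mixed)
    return mixed
```

Keeping a share of the original data in the fine-tuning set is a common way to retain base-domain accuracy while adapting to a new domain.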
+
+ ## Files
+
+ | File | Format | Size |
+ |------|--------|------|
+ | `model.safetensors` | SafeTensors | 205 MB |
+ | `config.json` | JSON | -- |
+ | `model.onnx` | ONNX fp32 | 196 MB |
+ | `model_int8.onnx` | ONNX int8 | 58 MB |
+ | `stats.npz` | NumPy | 2 KB |
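+
+ ## Usage
+
+ ```python
+ import numpy as np
+ import onnxruntime as ort
+
+ session = ort.InferenceSession("model_int8.onnx")
+ stats = np.load("stats.npz")
+
+ # raw_params is not defined in this snippet: it is assumed to be a
+ # (num_frames, 170) array of per-frame input parameters.
+ # Normalize with the stored mean and standard deviation.
+ features = ((raw_params - stats["mean"]) / stats["std"]).astype(np.float32)
+
+ # Add a batch dimension and pass the frame count as the sequence length.
+ log_probs, out_lengths = session.run(None, {
+     "features": features.reshape(1, -1, 170),
+     "lengths": np.array([features.shape[0]], dtype=np.int64),
+ })
+ ```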
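The usage snippet stops at `log_probs`; turning those into text requires a CTC decoding step. A minimal greedy decoder might look like the sketch below. It rests on several assumptions that the card does not confirm: that the output is laid out as (time, vocabulary) per utterance, that the blank token has index 0, and that the token list `vocab` is available from somewhere such as `config.json`. The function name `ctc_greedy_decode` is hypothetical.

```python
import numpy as np


def ctc_greedy_decode(log_probs, length, vocab, blank_id=0):
    """Greedy CTC decoding for one utterance.

    log_probs: array of shape (time, vocab_size), assumed layout.
    length:    number of valid frames to decode.
    vocab:     list mapping token ids to strings (hypothetical here).
    blank_id:  index of the CTC blank token (assumed to be 0).
    """
    # Pick the most likely token at each valid frame.
    best = np.argmax(log_probs[:length], axis=-1)

    tokens = []
    prev = None
    for token_id in best:
        token_id = int(token_id)
        # Emit a token only when it differs from the previous frame
        # (collapsing repeats) and is not the blank symbol.
        if token_id != prev and token_id != blank_id:
            tokens.append(vocab[token_id])
        prev = token_id
    return "".join(tokens)
```

With the outputs from the snippet above, and if these assumptions hold, a call would look like `ctc_greedy_decode(log_probs[0], int(out_lengths[0]), vocab)`.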
+
+ ## Limitations
+
+ - Pseudo-labeled training data may contain transcription errors.
+ - P25 coverage is primarily law enforcement, fire, and EMS from one region, so the model may not generalize to all agencies.
+ - A P25 fine-tuned version of the large-1024d model is in progress and is expected to substantially outperform this one.
+ - English only.