---
language:
- ca
- es
license: cc-by-4.0
base_model:
- projecte-aina/stt_ca-es_conformer_transducer_large
pipeline_tag: automatic-speech-recognition
---

# OVOS - STT Conformer Transducer Large ONNX (Catalan, Spanish)

This model is an ONNX export of [projecte-aina/stt_ca-es_conformer_transducer_large](https://huggingface.co/projecte-aina/stt_ca-es_conformer_transducer_large), intended for use on edge devices and in CPU-based inference environments.

# Requirements

The export is based on:

- [nemo-toolkit](https://github.com/NVIDIA-NeMo/NeMo) for loading/exporting the model
- [onnx-asr](https://github.com/istupakov/onnx-asr) for inference

The requirements can be installed with:

```bash
$ pip install "nemo-toolkit[asr]" onnx-asr
```

# Usage

```python
import onnx_asr

# Load the ONNX model from the Hugging Face Hub
model = onnx_asr.load_model("OpenVoiceOS/stt-ca-es-conformer-transducer-large-onnx")

# Transcribe a WAV file and print the recognized text
print(model.recognize("test.wav"))
```

# Export

The export follows [onnx-asr/convert-model-to-onnx](https://github.com/istupakov/onnx-asr?tab=readme-ov-file#convert-model-to-onnx):

```python
import nemo.collections.asr as nemo_asr
from pathlib import Path

# Load the original NeMo checkpoint
model = nemo_asr.models.ASRModel.from_pretrained("projecte-aina/stt_ca-es_conformer_transducer_large")

# Export the model to ONNX inside a dedicated directory
onnx_dir = Path("onnx-dir")
onnx_dir.mkdir(exist_ok=True)
model.export(str(Path(onnx_dir, "model.onnx")))

# Write the vocabulary as one "token index" pair per line,
# with the blank token "<blk>" appended last, as in the onnx-asr guide
with Path(onnx_dir, "vocab.txt").open("wt") as f:
    for i, token in enumerate([*model.tokenizer.vocab, "<blk>"]):
        f.write(f"{token} {i}\n")
```

# Licensing

The license is inherited from the original model: Creative Commons Attribution 4.0 International (CC BY 4.0). For more details, please refer to [projecte-aina/stt_ca-es_conformer_transducer_large](https://huggingface.co/projecte-aina/stt_ca-es_conformer_transducer_large).
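
# Checking the exported vocabulary

The export script writes `vocab.txt` as one `token index` pair per line. The following is a minimal sketch, using only the standard library, of how that file could be sanity-checked before the ONNX directory is used for inference. `read_vocab` and `check_vocab` are hypothetical helper names introduced here for illustration; they are not part of onnx-asr or NeMo, and the checks shown (contiguous indices, no empty tokens) are assumptions about what a well-formed file looks like rather than a documented validation procedure.

```python
from pathlib import Path


def read_vocab(path):
    """Parse a vocab file with one `token index` pair per line (hypothetical helper)."""
    vocab = {}
    with Path(path).open("rt", encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # tolerate blank lines, e.g. at the end of the file
            # Split on the last space so the token itself is kept intact.
            token, _, index_text = line.rpartition(" ")
            try:
                index = int(index_text)
            except ValueError:
                raise ValueError(f"line {line_no}: expected 'token index', got {line!r}")
            if index in vocab:
                raise ValueError(f"line {line_no}: duplicate index {index}")
            vocab[index] = token
    return vocab


def check_vocab(vocab):
    """Return a list of problems; an empty list means no problem was detected."""
    problems = []
    # Assumption: indices should run from 0 to len(vocab) - 1 without gaps.
    if sorted(vocab) != list(range(len(vocab))):
        problems.append("indices are not contiguous from 0")
    # Assumption: an empty token string indicates a damaged or truncated entry.
    empty = sorted(i for i, token in vocab.items() if token == "")
    if empty:
        problems.append(f"empty token at indices {empty}")
    return problems
```

For example, `check_vocab(read_vocab("onnx-dir/vocab.txt"))` returns an empty list when neither of the two assumed problems is present.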