---
license: apache-2.0
language: en
tags:
- automatic-speech-recognition
- openvino
- whisper
- int8
- quantized
base_model: openai/whisper-small
library_name: openvino
pipeline_tag: automatic-speech-recognition
---

# ov-whisper_small-int8-2026.0.0

[openai/whisper-small](https://huggingface.co/openai/whisper-small) exported to OpenVINO IR with **INT8 asymmetric weight compression** (group size 128). The model layout targets `openvino_genai.WhisperPipeline` and includes a stateful decoder (`-with-past`), tokenizer, and detokenizer.

## Quantization details

| Parameter | Value |
|-----------|-------|
| Source model | `openai/whisper-small` |
| Weight format | INT8 asymmetric (per-channel) |
| Group size | 128 |
| Encoder layers compressed | 74 / 74 (100%) |
| Decoder layers compressed | 122 / 122 (100%) |
| Task | `automatic-speech-recognition-with-past` |

## Toolchain

| Package | Version |
|---------|---------|
| Python | 3.11.9 |
| openvino | 2026.0.0 |
| openvino-genai | 2026.0.0.0 |
| openvino-tokenizers | 2026.0.0.0 |
| optimum-intel | 1.27.0 |
| optimum | 2.1.0 |
| nncf | 3.0.0 |
| transformers | 4.57.6 |
| torch | 2.11.0 |

## Usage

```python
import librosa
import numpy as np
import openvino_genai as ov_genai

pipe = ov_genai.WhisperPipeline("ov-whisper_small-int8-2026.0.0", "CPU")

# Load audio as 16 kHz float32 mono (e.g. via librosa)
samples, _ = librosa.load("audio.wav", sr=16000, mono=True)
samples = np.asarray(samples, dtype=np.float32)

result = pipe.generate(samples)
# The decoded results expose the transcriptions as a list of strings
print(result.texts[0])
```

Supported devices: `CPU`, `GPU`, `NPU` (tested on Intel Core Ultra 7 255H / Arc 140T / AI Boost).

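
If librosa is not available, a WAV file that is already 16 kHz, mono, and 16-bit PCM can be read with the standard library alone. The helper below is a minimal sketch: `load_wav_16k_mono` is an illustrative name, not part of this repository, and it rejects files that would need resampling or downmixing instead of converting them. It also assumes that `pipe.generate` accepts a plain list of floats, as the OpenVINO GenAI samples do.

```python
import array
import sys
import wave


def load_wav_16k_mono(path):
    """Read a 16 kHz, mono, 16-bit PCM WAV file as floats in [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        # Refuse anything that would need resampling or channel mixing.
        if wav.getframerate() != 16000:
            raise ValueError(f"expected 16000 Hz, got {wav.getframerate()} Hz")
        if wav.getnchannels() != 1:
            raise ValueError(f"expected mono, got {wav.getnchannels()} channels")
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM samples")
        frames = wav.readframes(wav.getnframes())

    pcm = array.array("h")  # signed 16-bit integers
    pcm.frombytes(frames)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV sample data is little-endian

    # Scale int16 values to the float range Whisper expects.
    return [sample / 32768.0 for sample in pcm]
```

Under those assumptions, `pipe.generate(load_wav_16k_mono("audio.wav"))` would stand in for the librosa-based loading step.
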
## Reproduce the export

```bash
pip install -r requirements.txt
python export_whisper_int8_ov.py \
    --model openai/whisper-small \
    --output ov-whisper_small-int8-2026.0.0 \
    --cache-dir ./cache_dir
```

Or equivalently with `optimum-cli` directly:

```bash
optimum-cli export openvino \
    -m openai/whisper-small \
    --task automatic-speech-recognition-with-past \
    --weight-format int8 \
    --group-size 128 \
    ov-whisper_small-int8-2026.0.0
```

## Validate

```bash
python validate_whisper_genai.py ov-whisper_small-int8-2026.0.0 --device CPU
```

## Files

- `openvino_encoder_model.bin/.xml` -- Whisper encoder (INT8)
- `openvino_decoder_model.bin/.xml` -- Whisper decoder with past/beam_idx (INT8)
- `openvino_tokenizer.bin/.xml` -- Tokenizer
- `openvino_detokenizer.bin/.xml` -- Detokenizer
- `config.json`, `generation_config.json` -- Model configuration
- `tokenizer.json`, `vocab.json`, `merges.txt` -- Tokenizer data
- `export_whisper_int8_ov.py` -- Export script used to produce this model
- `validate_whisper_genai.py` -- Smoke-test script
- `requirements.txt` -- Pinned Python dependencies
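
Before loading the pipeline, it can be useful to confirm that a download or export produced every model file listed above. The following is a minimal sketch: `EXPECTED_FILES` and `missing_model_files` are illustrative names, not part of the scripts shipped here, and the list simply mirrors the non-script entries of the file list.

```python
from pathlib import Path

# File names copied from the "Files" list of this model card.
EXPECTED_FILES = [
    "openvino_encoder_model.xml",
    "openvino_encoder_model.bin",
    "openvino_decoder_model.xml",
    "openvino_decoder_model.bin",
    "openvino_tokenizer.xml",
    "openvino_tokenizer.bin",
    "openvino_detokenizer.xml",
    "openvino_detokenizer.bin",
    "config.json",
    "generation_config.json",
    "tokenizer.json",
    "vocab.json",
    "merges.txt",
]


def missing_model_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty result from `missing_model_files("ov-whisper_small-int8-2026.0.0")` indicates that all listed files are present; it does not verify their contents.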