---
license: cc-by-4.0
language:
- en
pipeline_tag: automatic-speech-recognition
thumbnail: null
tags:
- automatic-speech-recognition
- speech
- audio
- Transducer
- TDT
- FastConformer
- Conformer
- pytorch
- NeMo
- hf-asr-leaderboard
- OpenVINO
widget:
- example_title: Librispeech sample 1
  src: https://cdn-media.huggingface.co/speech_samples/sample1.flac
- example_title: Librispeech sample 2
  src: https://cdn-media.huggingface.co/speech_samples/sample2.flac
base_model:
- nvidia/parakeet-tdt-0.6b-v2
---

# Parakeet TDT 0.6B V2 - OpenVINO

[![Discord](https://img.shields.io/badge/Discord-Join%20Chat-7289da.svg)](https://discord.gg/WNsvaCtmDe)
[![GitHub Repo stars](https://img.shields.io/github/stars/FluidInference/eddy?style=flat&logo=github)](https://github.com/FluidInference/eddy)

OpenVINO-optimized version of NVIDIA's Parakeet TDT 0.6B V2 model for high-performance automatic speech recognition on Intel NPUs and CPUs.

## Benchmark Results

- **Hardware**: Intel Core Ultra 7 155H (Meteor Lake) with Intel AI Boost NPU
- **Dataset**: LibriSpeech test-clean (2,620 files, 5.4 hours)
- **Software**: OpenVINO 2025.x

| Metric | Value |
|--------|-------|
| **Average WER** | 2.87% |
| **Median WER** | 0.00% |
| **Average CER** | 1.07% |
| **RTFx (NPU)** | 37.8× |
| **RTFx (CPU)** | 5-8× |
| **Total processing time** | 514.7s |
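
RTFx (inverse real-time factor) is the number of seconds of audio transcribed per second of processing time. As a quick sanity check, the sketch below recomputes it from the dataset duration and total processing time in the table, assuming the 514.7 s total refers to the NPU run. The `rtfx` helper is a hypothetical name used for illustration and is not part of eddy.

```python
def rtfx(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of processing time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Figures from the table above: 5.4 hours of audio, 514.7 s of processing.
# Assumption: the total processing time was measured on the NPU.
audio_seconds = 5.4 * 3600
npu_rtfx = rtfx(audio_seconds, 514.7)
print(f"RTFx: {npu_rtfx:.1f}x")  # prints "RTFx: 37.8x"
```

The result agrees with the reported NPU figure, up to rounding of the 5.4-hour duration.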

## Performance Comparison

| Implementation | Device | RTFx |
|----------------|--------|------|
| **eddy (OpenVINO)** | Intel Core Ultra 7 155H NPU | **37.8×** | 
| Parakeet (PyTorch) | Intel Arc 140V GPU | 19.8× |
| **eddy (OpenVINO)** | Intel Core Ultra 7 155H CPU | **5-8×** |

> **Note**: Benchmarked on an HP EliteBook Ultra G1i. The eddy NPU run is about 1.9× faster than PyTorch on the Intel Arc GPU (37.8× vs. 19.8× RTFx), with lower power consumption.

## Usage

Python usage is available via ctypes; see the [eddy repository](https://github.com/FluidInference/eddy) for details.

## Model Details

- **Parameters**: 600M
- **Architecture**: FastConformer-RNNT (4-model pipeline)
- **Language**: English only
- **Blank token ID**: 1024
- **Context window**: 10s chunks with 3s overlap
- **Features**: LSTM state continuity, token deduplication, per-token timestamps
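
To illustrate the context-window setting, here is a minimal sketch of how 10 s chunks with 3 s overlap could tile a recording, assuming a fixed stride of 7 s (chunk length minus overlap) and a shorter final chunk. The `chunk_spans` function is hypothetical and is not eddy's implementation; how tokens in the overlapping regions are deduplicated is not shown.

```python
from typing import List, Tuple


def chunk_spans(
    total_seconds: float,
    chunk_seconds: float = 10.0,
    overlap_seconds: float = 3.0,
) -> List[Tuple[float, float]]:
    """Return (start, end) times in seconds for overlapping chunks."""
    if overlap_seconds >= chunk_seconds:
        raise ValueError("overlap must be shorter than the chunk length")
    if total_seconds <= 0:
        return []

    stride = chunk_seconds - overlap_seconds  # 7 s with the defaults
    spans = []
    start = 0.0
    while True:
        # Clip the last chunk to the end of the recording.
        end = min(start + chunk_seconds, total_seconds)
        spans.append((start, end))
        if end >= total_seconds:
            break
        start += stride
    return spans


# A 25 s recording is covered by four chunks; neighbours share 3 s of audio.
for start, end in chunk_spans(25.0):
    print(f"{start:5.1f}s -> {end:5.1f}s")
```

With the defaults, each chunk after the first starts 7 s after the previous one, so every boundary region is seen twice and the decoder can reconcile tokens across it.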

## License

CC-BY-4.0. See [LICENSE](LICENSE) for details.

## Links

- **GitHub**: [FluidInference/eddy](https://github.com/FluidInference/eddy)
- **Base Model**: [nvidia/parakeet-tdt-0.6b-v2](https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2)
- **Documentation**: [Benchmark Results](https://github.com/FluidInference/eddy/blob/main/BENCHMARK_RESULTS.md)

## Acknowledgments

Based on NVIDIA's Parakeet TDT model. OpenVINO conversion and optimization by the FluidInference team.