---
license: openrail
tags:
- rknn
- rk3588
- text-to-speech
- on-device
- supertonic
pipeline_tag: text-to-speech
library_name: rknn-toolkit2
---

# Supertonic RKNN RK3588

This repository contains static-shape RKNN exports of the Supertonic text-to-speech
model for `rk3588`. The package is organized as a shape matrix so runtime code can
select the smallest shape variant that covers both the processed text length and the
required audio length.

## Contents

```text
models/
previews/
scripts/
config.json
docs/DELIVERY.md
conversion/convert_matrix.log
```

Each shape directory contains one RKNN file per module:

- `duration_predictor`
- `text_encoder`
- `vector_estimator`
- `vocoder`

## Shape Matrix

| Shape | Max text tokens | Approx body chars | Max audio | Package size |
| --- | ---: | ---: | ---: | ---: |
| `t64_l64` | 64 | 54-55 | 4.46 s | 201.44 MiB |
| `t128_l128` | 128 | 118-119 | 8.92 s | 204.76 MiB |
| `t256_l256` | 256 | 246-247 | 17.83 s | 213.22 MiB |
| `t384_l384` | 384 | 374-375 | 26.75 s | 223.88 MiB |

## Model Size

The learned weights are shared across shape variants; the RKNN files differ in compiled
graph shape and memory planning. These files are non-quantized floating-point RKNN
builds.

| Module | Parameters | Weight size |
| --- | ---: | ---: |
| `duration_predictor` | 0.865 M | 3.30 MiB |
| `text_encoder` | 9.001 M | 34.34 MiB |
| `vector_estimator` | 64.015 M | 244.20 MiB |
| `vocoder` | 25.338 M | 96.66 MiB |
| **Total** | **99.219 M** | **378.49 MiB** |
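
The weight sizes in the table are consistent with 4 bytes per parameter. That storage width is an inference from the numbers, not something the package states, so treat the sketch below as a sanity check rather than a description of the RKNN file layout. `PARAMS_M`, `BYTES_PER_PARAM`, and `weight_mib` are hypothetical names introduced for illustration.

```python
# Parameter counts in millions, copied from the table above.
PARAMS_M = {
    "duration_predictor": 0.865,
    "text_encoder": 9.001,
    "vector_estimator": 64.015,
    "vocoder": 25.338,
}

BYTES_PER_PARAM = 4  # assumption: 32-bit floats; not stated by the package


def weight_mib(params_millions: float) -> float:
    """Convert a parameter count in millions to MiB at BYTES_PER_PARAM each."""
    return params_millions * 1e6 * BYTES_PER_PARAM / 2**20


for name, params in PARAMS_M.items():
    print(f"{name}: {weight_mib(params):.2f} MiB")
print(f"total: {weight_mib(sum(PARAMS_M.values())):.2f} MiB")
```

Under that assumption the printed values reproduce the Weight size column to two decimals.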

## Runtime Selection

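Choose the smallest shape that covers both the processed text token length and the
latent length required by the predicted duration. In the formulas below, 44100 is the
audio sample rate in Hz and 3072 is the number of audio samples per latent frame. The
first formula gives the latent length needed for a given duration; the second gives the
longest audio a given latent length can hold.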

```text
latent_length = ceil(duration_seconds * 44100 / 3072)
max_duration = latent_length * 3072 / 44100
```
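
The selection rule can be sketched in Python as follows. This is a minimal illustration, not the code shipped under `scripts/`: `SHAPES`, `latent_length`, `max_duration`, and `select_shape` are hypothetical names, and the capacities assume that the `lNNN` suffix of each shape name is its maximum latent length, which matches the Max audio column of the Shape Matrix.

```python
import math
from typing import Optional

SAMPLE_RATE = 44100        # Hz, from the formula above
SAMPLES_PER_LATENT = 3072  # audio samples per latent frame, from the formula above

# (max text tokens, max latent length) per shape, taken from the Shape Matrix.
# Assumption: the "lNNN" suffix is the maximum latent length.
SHAPES = {
    "t64_l64": (64, 64),
    "t128_l128": (128, 128),
    "t256_l256": (256, 256),
    "t384_l384": (384, 384),
}


def latent_length(duration_seconds: float) -> int:
    """Latent frames needed to hold the given audio duration."""
    return math.ceil(duration_seconds * SAMPLE_RATE / SAMPLES_PER_LATENT)


def max_duration(latent_len: int) -> float:
    """Longest audio duration, in seconds, that fits in latent_len frames."""
    return latent_len * SAMPLES_PER_LATENT / SAMPLE_RATE


def select_shape(num_tokens: int, duration_seconds: float) -> Optional[str]:
    """Return the smallest shape covering both limits, or None if none fits."""
    needed = latent_length(duration_seconds)
    # Sort by capacity so the first match is the smallest covering shape.
    for name, (max_tokens, max_latent) in sorted(SHAPES.items(), key=lambda kv: kv[1]):
        if num_tokens <= max_tokens and needed <= max_latent:
            return name
    return None


print(select_shape(20, 3.0))  # short text, short audio
print(select_shape(20, 5.0))  # short text, but the audio needs more latent frames
```

A `None` result means the input exceeds every available shape and should be split before synthesis.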

For longer text, split the input into sentence or paragraph chunks and synthesize each
chunk separately instead of forcing a larger single fixed shape.
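
One way to do the splitting is a greedy sentence packer with a character budget, sketched below. It is an illustration under assumptions, not the package's own preprocessing: `split_into_chunks` is a hypothetical helper, the sentence boundary regex is a simple heuristic for `.`, `!`, and `?`, and the budget would be picked from the Approx body chars column of the Shape Matrix.

```python
import re


def split_into_chunks(text: str, max_chars: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole; callers need to
    handle that case separately (for example by splitting on commas).
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


text = "Hello from Supertonic. This is a longer paragraph. It has three sentences."
for chunk in split_into_chunks(text, max_chars=54):
    print(repr(chunk))
```

Each chunk can then go through shape selection on its own, so short chunks still run on the smallest shape.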

## Download And Run Example

This package ships with its own Python scripts under `scripts/`. Install the
Python environment and make sure the Supertonic ONNX assets are available:

```bash
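cd scripts
# Install the Python dependencies for the bundled scripts
uv sync
# Clone the Supertonic assets only if the ONNX directory is missing
test -d ../../assets/onnx || git clone https://huggingface.co/Supertone/supertonic-3 ../../assets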
```

Run a smoke test on an RK3588 device with `rknn-toolkit-lite2` installed:

```bash
cd scripts
uv run python benchmark_rknn.py \
  --rknn-dir .. \
  --onnx-dir ../../assets/onnx \
  --text-length 128 \
  --latent-length 128 \
  --text "Hello from Supertonic." \
  --lang en \
  --duration-source rknn \
  --total-step 4 \
  --warmup 1 \
  --repeat 3 \
  --save-dir results/rknn_smoke_t128_l128
```

To generate additional static shape variants:

```bash
cd scripts
uv run python convert_onnx_to_rknn.py \
  --onnx-dir ../../assets/onnx \
  --out-dir ../models \
  --shape-matrix 128x128 256x256
```

See `docs/DELIVERY.md` for the generated delivery checklist.

## License

The accompanying model is released under the OpenRAIL-M License. See `LICENSE`.