Supertonic RKNN RK3588

This repository contains static-shape RKNN exports of Supertonic for the RK3588. The package is organized as a shape matrix so that runtime code can select the smallest model that covers both the processed text length and the required audio length.

models/
previews/
scripts/
config.json
docs/DELIVERY.md
conversion/convert_matrix.log

Each shape directory contains one RKNN file per module:

duration_predictor
text_encoder
vector_estimator
vocoder
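
Before initializing the NPU, a loader can check that a shape directory is complete. The sketch below assumes each module is stored as `<module>.rknn` directly inside `models/<shape>/`. That file naming and the helper names `module_paths` and `missing_modules` are illustrative assumptions, not part of the shipped scripts, so adjust them to the actual layout.

```python
from pathlib import Path

# Module names as listed above. The ".rknn" file naming is an assumption.
MODULES = ("duration_predictor", "text_encoder", "vector_estimator", "vocoder")


def module_paths(models_dir, shape):
    """Map each module name to its assumed RKNN file path for one shape."""
    shape_dir = Path(models_dir) / shape
    return {name: shape_dir / f"{name}.rknn" for name in MODULES}


def missing_modules(models_dir, shape):
    """Return the module names whose assumed file does not exist on disk."""
    return [
        name
        for name, path in module_paths(models_dir, shape).items()
        if not path.is_file()
    ]
```

Calling `missing_modules("models", "t128_l128")` returns an empty list when all four files are present under the assumed names.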

Shape Matrix

Shape	Max text tokens	Approx body chars	Max audio	Package size
`t64_l64`	64	54-55	4.46 s	201.44 MiB
`t128_l128`	128	118-119	8.92 s	204.76 MiB
`t256_l256`	256	246-247	17.83 s	213.22 MiB
`t384_l384`	384	374-375	26.75 s	223.88 MiB

Model Size

The learned weights are shared across shape variants; the RKNN files differ only in compiled graph shape and memory planning. These files are non-quantized floating-point RKNN builds.

Module	Parameters	Weight size
`duration_predictor`	0.865 M	3.30 MiB
`text_encoder`	9.001 M	34.34 MiB
`vector_estimator`	64.015 M	244.20 MiB
`vocoder`	25.338 M	96.66 MiB
Total	99.219 M	378.49 MiB
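
The weight sizes in the table can be reproduced from the parameter counts if each parameter is assumed to occupy 4 bytes (32-bit float). The source does not state this, so treat it as an observation from the arithmetic. The helper name `weight_size_mib` is hypothetical.

```python
BYTES_PER_PARAM = 4  # assumption: 32-bit floats, not stated in the source
MIB = 1024 ** 2

# Parameter counts in millions, copied from the table above.
PARAMS_MILLIONS = {
    "duration_predictor": 0.865,
    "text_encoder": 9.001,
    "vector_estimator": 64.015,
    "vocoder": 25.338,
}


def weight_size_mib(params_millions, bytes_per_param=BYTES_PER_PARAM):
    """Convert a parameter count in millions to a size in MiB."""
    return params_millions * 1e6 * bytes_per_param / MIB


for name, params in PARAMS_MILLIONS.items():
    print(f"{name}: {weight_size_mib(params):.2f} MiB")

total = sum(PARAMS_MILLIONS.values())
print(f"total: {total:.3f} M parameters, {weight_size_mib(total):.2f} MiB")
```

Under the same arithmetic, 2 bytes per parameter would give roughly half the total, which is in the range of the per-shape package sizes in the Shape Matrix. The stored precision of the RKNN files is not documented here.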

Runtime Selection

Choose the smallest shape that covers both the processed text token length and the latent length required by the predicted duration.

latent_length = ceil(duration_seconds * 44100 / 3072)
max_duration = latent_length * 3072 / 44100
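
The selection rule can be sketched as a small lookup. This is a minimal sketch, assuming that 44100 is the output sample rate in Hz and 3072 is the number of audio samples per latent frame, as the formulas imply. The `SHAPES` list is copied from the Shape Matrix, and `select_shape` is a hypothetical helper, not a function from the shipped scripts.

```python
import math

SAMPLE_RATE = 44100        # assumed: audio samples per second
SAMPLES_PER_LATENT = 3072  # assumed: audio samples per latent frame

# (name, max text tokens, max latent length), ordered smallest first.
SHAPES = [
    ("t64_l64", 64, 64),
    ("t128_l128", 128, 128),
    ("t256_l256", 256, 256),
    ("t384_l384", 384, 384),
]


def latent_length(duration_seconds):
    """Latent frames needed to cover the given audio duration."""
    return math.ceil(duration_seconds * SAMPLE_RATE / SAMPLES_PER_LATENT)


def max_duration(latent_len):
    """Longest audio, in seconds, that fits in latent_len frames."""
    return latent_len * SAMPLES_PER_LATENT / SAMPLE_RATE


def select_shape(num_tokens, duration_seconds):
    """Return the smallest shape covering both limits, or None if none fits."""
    needed = latent_length(duration_seconds)
    for name, max_tokens, max_latent in SHAPES:
        if num_tokens <= max_tokens and needed <= max_latent:
            return name
    return None
```

For example, 20 tokens with a predicted duration of 5.0 s need 72 latent frames, so `select_shape(20, 5.0)` skips `t64_l64` and returns `t128_l128`. A `None` result means the input exceeds the largest shape and should be split.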

For longer text, split the input into sentence or paragraph chunks instead of forcing a single larger fixed shape.
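
One way to split long input is to pack whole sentences greedily into chunks under a character budget. This is a minimal sketch: the `chunk_text` helper is hypothetical, the sentence splitter is a simple punctuation regex, and the default budget of 118 characters is taken from the "Approx body chars" column for `t128_l128`. Because that column is approximate, the token count of each chunk should still be checked after text processing.

```python
import re


def chunk_text(text, max_chars=118):
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is returned as its own
    oversized chunk; the caller must split it further or pick a larger shape.
    """
    # Split after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately with the smallest shape that covers it, and the resulting audio segments concatenated.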

Download And Run Example

This package ships with its own Python scripts under scripts/. Install the Python environment and make sure the Supertonic ONNX assets are available:

cd scripts
uv sync
test -d ../../assets/onnx || git clone https://huggingface.co/Supertone/supertonic-3 ../../assets

Run a smoke test on an RK3588 device with rknn-toolkit-lite2 installed:

cd scripts
uv run python benchmark_rknn.py \
  --rknn-dir .. \
  --onnx-dir ../../assets/onnx \
  --text-length 128 \
  --latent-length 128 \
  --text "Hello from Supertonic." \
  --lang en \
  --duration-source rknn \
  --total-step 4 \
  --warmup 1 \
  --repeat 3 \
  --save-dir results/rknn_smoke_t128_l128
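
Outside the benchmark script, a single module can be loaded directly with the `RKNNLite` class from rknn-toolkit-lite2. The sketch below covers only loading and runtime initialization. The helper name `load_rknn_module` is hypothetical, and the input tensor order and shapes for each module are not documented here, so consult benchmark_rknn.py before calling inference.

```python
from pathlib import Path


def load_rknn_module(rknn_path):
    """Load one RKNN file and return an initialized RKNNLite runtime."""
    path = Path(rknn_path)
    if not path.is_file():
        raise FileNotFoundError(f"RKNN file not found: {path}")

    # Imported lazily so this file can still be imported on a development
    # host where rknn-toolkit-lite2 is not installed.
    from rknnlite.api import RKNNLite

    runtime = RKNNLite()
    # Both calls return 0 on success.
    if runtime.load_rknn(str(path)) != 0:
        raise RuntimeError(f"load_rknn failed for {path}")
    if runtime.init_runtime() != 0:
        runtime.release()
        raise RuntimeError(f"init_runtime failed for {path}")
    return runtime
```

After a successful load, run the module with `runtime.inference(inputs=[...])` and call `runtime.release()` when finished.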

To generate additional static shape variants:

cd scripts
uv run python convert_onnx_to_rknn.py \
  --onnx-dir ../../assets/onnx \
  --out-dir ../models \
  --shape-matrix 128x128 256x256

See docs/DELIVERY.md for the generated delivery checklist.

License

The accompanying model is released under the OpenRAIL-M License. See LICENSE.
