Supertonic RKNN RK3588
This repository contains static-shape RKNN exports of Supertonic for the RK3588.
The package is organized as a shape matrix so that runtime code can select the smallest
model that covers both the processed text length and the required audio length.
Contents
models/
previews/
scripts/
config.json
docs/DELIVERY.md
conversion/convert_matrix.log
Each shape directory contains one RKNN file per module:
- duration_predictor
- text_encoder
- vector_estimator
- vocoder
Shape Matrix
| Shape | Max text tokens | Approx body chars | Max audio | Package size |
|---|---|---|---|---|
| t64_l64 | 64 | 54-55 | 4.46 s | 201.44 MiB |
| t128_l128 | 128 | 118-119 | 8.92 s | 204.76 MiB |
| t256_l256 | 256 | 246-247 | 17.83 s | 213.22 MiB |
| t384_l384 | 384 | 374-375 | 26.75 s | 223.88 MiB |
Model Size
The learned weights are shared across all shape variants; the RKNN files differ in compiled graph shape and memory planning. These files are non-quantized floating-point (FP) RKNN builds.
| Module | Parameters | Weight size |
|---|---|---|
| duration_predictor | 0.865 M | 3.30 MiB |
| text_encoder | 9.001 M | 34.34 MiB |
| vector_estimator | 64.015 M | 244.20 MiB |
| vocoder | 25.338 M | 96.66 MiB |
| Total | 99.219 M | 378.49 MiB |
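As a sanity check on the table, the listed weight sizes are consistent with 4 bytes per parameter. The sketch below reproduces them under that assumption; the 4-byte figure is inferred from the numbers, not stated by the package, and it says nothing about the precision used inside the compiled RKNN files.

```python
# Parameter counts in millions, copied from the Model Size table.
PARAMS_M = {
    "duration_predictor": 0.865,
    "text_encoder": 9.001,
    "vector_estimator": 64.015,
    "vocoder": 25.338,
}

BYTES_PER_PARAM = 4   # assumption: 32-bit float storage, inferred from the table
MIB = 1024 ** 2       # bytes per MiB


def weight_size_mib(params_millions: float) -> float:
    """Convert a parameter count (in millions) to a size in MiB."""
    return params_millions * 1e6 * BYTES_PER_PARAM / MIB


if __name__ == "__main__":
    for name, params in PARAMS_M.items():
        print(f"{name}: {weight_size_mib(params):.2f} MiB")
    print(f"total: {weight_size_mib(sum(PARAMS_M.values())):.2f} MiB")
```

Because the parameter counts are rounded to three decimals, the last digit of a computed size can differ slightly from an exact byte count.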
Runtime Selection
Choose the smallest shape that covers both the processed text token length and the latent length required by the predicted duration.
latent_length = ceil(duration_seconds * 44100 / 3072)
max_duration = latent_length * 3072 / 44100
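The selection rule can be sketched in a few lines of Python. This is a minimal illustration, not the package's own runtime code: the function names and the `SHAPES` dictionary are hypothetical, and the per-shape limits are taken from the Shape Matrix (the `l64`, `l128`, ... suffix is read as the maximum latent length, which matches the listed maximum audio durations).

```python
import math

SAMPLE_RATE = 44100         # constant from the formula above
SAMPLES_PER_LATENT = 3072   # constant from the formula above

# Hypothetical lookup table: shape name -> (max text tokens, max latent length).
SHAPES = {
    "t64_l64": (64, 64),
    "t128_l128": (128, 128),
    "t256_l256": (256, 256),
    "t384_l384": (384, 384),
}


def latent_length(duration_seconds: float) -> int:
    """Number of latent frames needed to cover the given audio duration."""
    return math.ceil(duration_seconds * SAMPLE_RATE / SAMPLES_PER_LATENT)


def max_duration(latent_len: int) -> float:
    """Longest audio duration (seconds) that fits in latent_len frames."""
    return latent_len * SAMPLES_PER_LATENT / SAMPLE_RATE


def select_shape(num_text_tokens: int, duration_seconds: float):
    """Return the smallest shape covering both limits, or None if none fits."""
    needed = latent_length(duration_seconds)
    # Iterate from the smallest shape to the largest.
    for name, (max_tokens, max_latent) in sorted(SHAPES.items(), key=lambda kv: kv[1]):
        if num_text_tokens <= max_tokens and needed <= max_latent:
            return name
    return None


if __name__ == "__main__":
    print(select_shape(50, 3.0))   # short text, short audio
    print(select_shape(50, 5.0))   # short text, but audio needs more latent frames
    print(select_shape(400, 1.0))  # too many tokens for any shipped shape
```

A `None` result signals that the input exceeds every shipped shape and should be split before synthesis.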
For longer text, split the input into sentence or paragraph chunks rather than forcing a single larger fixed shape.
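One way to do the splitting is to cut on sentence boundaries and greedily pack sentences into chunks under a character budget. The sketch below is an assumption-laden illustration: `split_into_chunks` is a hypothetical helper, the regex only handles `.`, `!` and `?` as sentence terminators, and the default budget of 118 characters is the approximate body-character figure listed for `t128_l128`. Character count is only a proxy for the processed token length.

```python
import re


def split_into_chunks(text: str, max_chars: int = 118) -> list:
    """Greedily pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole, so callers
    still need to check each chunk against the selected shape.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "Hello from Supertonic. This is a second sentence. And a third one!"
    for chunk in split_into_chunks(sample, max_chars=50):
        print(repr(chunk))
```

Each chunk can then be synthesized separately with the smallest shape that covers it, and the resulting audio segments concatenated.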
Download And Run Example
This package ships with its own Python scripts under scripts/. Install the
Python environment and make sure the Supertonic ONNX assets are available:
cd scripts
uv sync
test -d ../../assets/onnx || git clone https://huggingface.co/Supertone/supertonic-3 ../../assets
Run a smoke test on an RK3588 device with rknn-toolkit-lite2 installed:
cd scripts
uv run python benchmark_rknn.py \
--rknn-dir .. \
--onnx-dir ../../assets/onnx \
--text-length 128 \
--latent-length 128 \
--text "Hello from Supertonic." \
--lang en \
--duration-source rknn \
--total-step 4 \
--warmup 1 \
--repeat 3 \
--save-dir results/rknn_smoke_t128_l128
To generate additional static shape variants:
cd scripts
uv run python convert_onnx_to_rknn.py \
--onnx-dir ../../assets/onnx \
--out-dir ../models \
--shape-matrix 128x128 256x256
See docs/DELIVERY.md for the generated delivery checklist.
License
The accompanying model is released under the OpenRAIL-M License. See LICENSE.