| <html lang="en"> |
|
|
| <head> |
| <meta charset="utf-8"> |
| <meta name="viewport" content="width=device-width" /> |
| <title>Next-gen Kaldi WebAssembly with sherpa-onnx for Text-to-speech</title> |
| <style> |
| h1,div { |
| text-align: center; |
| } |
| textarea { |
| width:100%; |
| } |
| .loading { |
| display: none !important; |
| } |
| .hidden { |
| display: none !important; |
| } |
| </style> |
| </head> |
|
|
| <body style="font-family: 'Source Sans Pro', sans-serif; background-color: #f9fafb; color: #333; display: flex; flex-direction: column; align-items: center; height: 100vh; margin: 0;"> |
| <h1> |
| Next-gen Kaldi + WebAssembly<br/> |
| Text-to-speech Demo with <a href="https://github.com/k2-fsa/sherpa-onnx">sherpa-onnx</a> |
| </h1> |
|
|
| <div style="width: 100%; max-width: 900px; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); flex: 1;"> |
| <div id="status">Loading...</div> |
|
|
| <div id="singleAudioContent" class="tab-content loading"> |
| <div id="speakerIdSection"> |
| <label for="speakerId" id="speakerIdLabel">Speaker ID: </label> |
| <input type="text" id="speakerId" name="speakerId" value="0" /> |
| <br/> |
| <br/> |
| </div> |
| <div id="referenceAudioSection" class="hidden"> |
| <label for="referenceAudio">Reference audio (.wav): </label> |
| <input type="file" id="referenceAudio" name="referenceAudio" accept=".wav,audio/wav" /> |
| <div style="font-size: 0.9rem; color: #6c757d;">Only <code>.wav</code> files are supported.</div> |
| <br/> |
| <br/> |
| </div> |
| <div id="referenceTextSection" class="hidden"> |
| <label for="referenceText">Reference transcript (must match the reference audio): </label> |
| <br/> |
| <textarea id="referenceText" rows="3" placeholder="Please enter the exact transcript of the reference audio"></textarea> |
| <br/> |
| <br/> |
| </div> |
| <label for="speed" id="speedLabel">Speed: </label> |
| <input type="range" id="speed" name="speed" min="0.4" max="3.5" step="0.1" value="1.0" /> |
| <span id="speedValue"></span> |
| <br/> |
| <br/> |
| <textarea id="text" rows="10" placeholder="Please enter your text here and click the Generate button"></textarea> |
| <br/> |
| <br/> |
| <button id="generateBtn" disabled>Generate</button> |
| <div id="generationStatus" style="display: none; margin-top: 0.75rem; font-size: 0.95rem; color: #6c757d;"></div> |
| </div> |
|
|
| <section style="flex: 1; overflow: auto;" id="sound-clips"> |
| </section> |
| </div> |
|
|
| |
| <div style="width: 100%; max-width: 900px; margin-top: 1.5rem; background: #fff; padding: 1.5rem; border-radius: 8px; box-shadow: 0 2px 8px rgba(0, 0, 0, 0.1); text-align: left; font-size: 0.9rem; color: #6c757d;"> |
| <h3>Description</h3> |
| <ul> |
| <li>Everything is <strong>open source</strong>: see the <a href="https://github.com/k2-fsa/sherpa-onnx">code</a>.</li> |
| <li>If you run into any issues, please either <a href="https://github.com/k2-fsa/sherpa-onnx/issues">file a ticket</a> or contact us via: |
| <ul> |
| <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#wechat">WeChat group</a></li> |
| <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#qq">QQ group</a></li> |
| <li><a href="https://k2-fsa.github.io/sherpa/social-groups.html#bilibili-b">Bilibili</a></li> |
| </ul> |
| </li> |
| </ul> |
| <h3>About This Demo</h3> |
| <ul> |
| <li><strong>Private and Secure:</strong> All processing runs locally on your device's CPU, inside your browser, on a single thread. No server is involved, which ensures privacy and security. You can disconnect from the Internet once this page has loaded.</li> |
| <li><strong>Efficient Resource Usage:</strong> No GPU is required, leaving GPU resources free for other in-browser workloads such as WebLLM.</li> |
| </ul> |
| <h3>Latest Updates</h3> |
| <ul> |
| <li>Updated the UI.</li> |
| <li>First working version.</li> |
| </ul> |
|
|
| <h3>Acknowledgement</h3> |
| <ul> |
| <li>The UI is based on <a href="https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm">https://huggingface.co/spaces/Banafo/Kroko-Streaming-ASR-Wasm</a>.</li> |
| </ul> |
| </div> |
|
|
|
|
| <script src="app-tts.js"></script> |
| </body> |
|
|