---
title: Wavlm Phonemizer Word Detection
emoji: 🚀
colorFrom: green
colorTo: gray
sdk: gradio
python_version: 3.12.3
sdk_version: 5.47.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: A simple space for word detection based on phonemes
---

# WavLM Demo

A few simple utility scripts that show how WavLM works and how to use it.
Everything is based on WavLM Base + Phonemizer FR-IT.

## Install

You need Python 3.12. Use either uv (recommended) or pip.

```shell
# Using uv (recommended)
uv sync

# With pip
pip install -r requirements.txt
```

## main.py

This is the main entry point. When run, it captures audio either from the microphone or from a sample file. It then runs and animates an inference, layer by layer and time step by time step.

![Animation of the inference](images/inference_animation.gif)

## app.py - Gradio interface

The Gradio interface is a web interface that demonstrates the word classification capabilities.

![View of the Gradio interface](images/gradio_space.png)

To launch Gradio, just run:

```shell
python app.py
```

And click on the web link!

### Advanced settings

The advanced settings give you fine-grained control over the generated answer. They let you work around the shortcomings of a simple argmax decoder.

![Alignment matrix](images/alignment_chart.webp)
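
To give a rough idea of what a simple argmax decoder does and where it can fall short, here is a minimal sketch. It assumes CTC-style per-frame phoneme scores with a blank symbol, which may not match what `app.py` actually does. The function name, vocabulary, and scores are all hypothetical and only serve as an illustration.

```python
from itertools import groupby

BLANK = 0  # assumption: index 0 is the CTC blank symbol


def argmax_decode(frame_scores, vocab, blank=BLANK):
    """Greedy decoding: best symbol per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring symbol index for each time step.
    best = [max(range(len(row)), key=row.__getitem__) for row in frame_scores]
    # Merge consecutive duplicates (the same phoneme held over several frames).
    collapsed = [idx for idx, _ in groupby(best)]
    # Remove blanks to obtain the final phoneme sequence.
    return [vocab[i] for i in collapsed if i != blank]


# Hypothetical vocabulary and scores, not taken from the real model.
vocab = ["_", "b", "o", "n"]
frame_scores = [
    [0.10, 0.80, 0.05, 0.05],  # "b"
    [0.10, 0.70, 0.10, 0.10],  # "b" again, merged with the previous frame
    [0.60, 0.10, 0.20, 0.10],  # blank
    [0.10, 0.10, 0.70, 0.10],  # "o"
    [0.50, 0.05, 0.05, 0.40],  # blank narrowly beats "n"
]

decoded = argmax_decode(frame_scores, vocab)
print(decoded)  # ['b', 'o']
```

In this toy example the final "n" is lost because the blank wins the last frame by a small margin. Greedy decoding keeps only the top choice per frame and has no way to recover a close runner-up, which is the kind of shortcoming finer-grained decoding settings are meant to address.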