luisomoreau/hey_reachy_wake_word_detection

luisomoreau committed on Dec 23, 2025

Commit 674282c

1 Parent(s): 7498673

Updated instructions to get started

Files changed (5)

.gitattributes +1 -0
.gitignore +5 -1
README.md +100 -1
assets/ei-data-acquisition.png +3 -0
assets/ei-model-info.png +3 -0

.gitattributes CHANGED

@@ -2,3 +2,4 @@ hey_reachy_wake_word_detection/models/*.eim filter=lfs diff=lfs merge=lfs -text
 hey_reachy_wake_word_detection/greetings-audio/*.wav filter=lfs diff=lfs merge=lfs -text
 *.gif filter=lfs diff=lfs merge=lfs -text
 *.mp4 filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text

.gitignore CHANGED

@@ -1,3 +1,7 @@
 __pycache__/
 *.egg-info/
-build/
+build/
+venv/
+.env
+.DS_Store
+.idea/

README.md CHANGED

@@ -30,6 +30,8 @@ This project implements a wake word detection system for the Reachy Mini robot.
 - Visual feedback when wake word is detected
 - Cross-platform support (macOS and Linux) - Only tested on MacOS Apple Silicon
+*Note: This project is meant as a base that you can freely modify. Don't expect `Alexa` or `OK Google`-grade accuracy. The model has only been trained on ~2.5h of data.*
 ## Technical Details
 ### Dataset Generation
@@ -40,16 +42,113 @@ The wake word and sounds dataset was generated using this repo: https://github.c
 - [Freesound LAION 640k dataset](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k) for background noise
 - [Gradium](https://gradium.ai/) for Reachy's voice
+![Data acquisition overview](/assets/ei-data-acquisition.png)
 ### Model Training
 The model was trained on Edge Impulse using roughly 2.5h of synthetic data.
-See the associated public project for more information: [Public project](https://studio.edgeimpulse.com/public/855375/latest).
+See the associated [public project](https://studio.edgeimpulse.com/public/855375/latest) for more information.
+![Edge Impulse model info](/assets/ei-model-info.png)
 ### Implementation
 The application uses:
 - Reachy Mini SDK for robot control
 - Edge Impulse Linux SDK for wake word detection
 - Sounddevice for audio input
 - FastAPI (via ReachyMiniApp) for the web interface
+## Setup
+### 1. Clone the repository
+```bash
+git clone git@hf.co:spaces/luisomoreau/hey_reachy_wake_word_detection
+cd hey_reachy_wake_word_detection
+```
+### 2. Create and activate virtual environment
+```bash
+python3.12 -m venv venv
+source venv/bin/activate  # Linux/MacOS
+# .\venv\Scripts\activate  # Windows
+```
+### 3. Install dependencies
+```bash
+pip install -e .
+```
+### 4. Launch the Reachy Mini Daemon
+Open the [Reachy Mini Desktop App](https://github.com/pollen-robotics/reachy-mini-desktop-app).
+You can also launch it from the command line:
+```bash
+reachy-mini-daemon
+```
+### 5. Run main.py
+```bash
+python hey_reachy_wake_word_detection/main.py
+```
+If everything works, you should see logs like the following:
+```
+python hey_reachy_wake_word_detection/main.py
+Running on: darwin arm64
+INFO:     Started server process [98489]
+INFO:     Waiting for application startup.
+INFO:     Application startup complete.
+INFO:     Uvicorn running on http://0.0.0.0:8042 (Press CTRL+C to quit)
+Starting HeyReachyWakeWordDetection app with sounddevice...
+Available audio input devices:
+  ID 1: Reachy Mini Audio (2 input channels)
+  ID 2: Reachy Mini Camera (2 input channels)
+  ID 3: Logitech BRIO (2 input channels)
+  ID 4: BlackHole 2ch (2 input channels)
+  ID 5: MacBook Pro Microphone (1 input channels)
+  ID 7: Microsoft Teams Audio (1 input channels)
+  ID 8: Descript Loopback Recorder (2 input channels)
+Detected system: darwin arm64
+Selected model: hey-reachy-wake-word-detection-mac-arm64.eim
+Using audio device: Reachy Mini Audio (ID: 1)
+Loaded runner for "Developer Relations / Hey Reachy - Wake word detection"
+Model expects: 24000Hz, 12000 samples
+Current threshold: 0.7
+selected Audio device: 1
+Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+Classification: {'hey_reachy': '0.8501', 'noise': '0.0000', 'other': '0.1499'}
+Detection score: 0.8501
+🎤 KEYWORD DETECTED - Executing actions...
+Playing sound: /Users/luisomoreau/workspace/reachy-mini/hey_reachy_wake_word_detection/hey_reachy_wake_word_detection/greetings-audio/speech_g4rDsF5OhcU.ogg
+^CINFO:     Shutting down
+INFO:     Waiting for application shutdown.
+INFO:     Application shutdown complete.
+INFO:     Finished server process [98489]
+Interrupted by user
+Error in stop: Lost connection with the server.
+App is stopping...
+```
+## Troubleshooting
+I am still unsure why, but I could not use `sample = mini.media.get_audio_sample()` to get the audio stream.
+I switched to sounddevice (the same library that should be behind `get_audio_sample()`).
+I also had issues when trying to use multiple threads. This is probably due to the implementation of the `classifier` function in the Edge Impulse Python SDK.
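
Note on the README additions: the sample log reports `Model expects: 24000Hz, 12000 samples` and `Current threshold: 0.7`, and the Troubleshooting notes mention problems with multiple threads. A minimal single-threaded sketch of that flow might look like the following. This is not the code in `main.py`: `iter_windows`, `detect`, and the `classify` callable are hypothetical names, `classify` stands in for whatever the Edge Impulse runner returns, and the non-overlapping windowing is an assumption.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Union

WINDOW_SAMPLES = 12000  # from the log: "Model expects: 24000Hz, 12000 samples"
THRESHOLD = 0.7         # from the log: "Current threshold: 0.7"
WAKE_LABEL = "hey_reachy"  # label name as printed in the "Classification:" lines

Scores = Dict[str, Union[str, float]]


def iter_windows(chunks: Iterable[List[int]],
                 window: int = WINDOW_SAMPLES) -> Iterator[List[int]]:
    """Regroup arbitrary-sized audio chunks into fixed-size windows.

    Assumption: windows do not overlap. Trailing samples that do not
    fill a whole window are dropped when the input ends.
    """
    buffer: List[int] = []
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= window:
            yield buffer[:window]
            del buffer[:window]


def detect(chunks: Iterable[List[int]],
           classify: Callable[[List[int]], Scores],
           threshold: float = THRESHOLD) -> Iterator[float]:
    """Yield the wake-word score of each window that reaches the threshold.

    `classify` is a hypothetical stand-in for the model call; it is invoked
    in the same thread that reads the audio, one window at a time.
    """
    for window in iter_windows(chunks):
        scores = classify(window)
        # The log prints scores as strings ('0.8501'); float() accepts both
        # strings and numbers.
        score = float(scores.get(WAKE_LABEL, 0.0))
        if score >= threshold:
            yield score
```

With a stub classifier that returns dictionaries shaped like the `Classification:` log lines, `list(detect(chunks, classify))` gives one score per window at or above the threshold. Keeping capture, classification, and the threshold check in one loop avoids sharing the classifier across threads.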

assets/ei-data-acquisition.png ADDED

Git LFS Details

SHA256: 6ddcef8c7886dcdfe9d539be28aaf2913cb550193c01ed4dcaf6d25428f97bbb
Pointer size: 132 Bytes
Size of remote file: 1.6 MB

assets/ei-model-info.png ADDED

Git LFS Details

SHA256: c35042c79cda1505ed07f1f0ceac99b37ad9b66d67f2d26edacbfb027c920a97
Pointer size: 131 Bytes
Size of remote file: 156 kB
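
The pointer sizes reported above follow from the Git LFS pointer format, which is a small text file with a `version` line, an `oid sha256:` line, and a `size` line. The sketch below rebuilds that text to show where 132 and 131 bytes come from. The exact byte sizes of the two PNGs are not shown on this page, so `1_600_000` and `156_000` are placeholder values; only their digit counts (7 and 6) affect the pointer length. `lfs_pointer` is a hypothetical helper name.

```python
def lfs_pointer(sha256_hex: str, size_bytes: int) -> str:
    """Build the text of a Git LFS pointer file (spec v1)."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{sha256_hex}\n"
        f"size {size_bytes}\n"
    )


# SHA256 values are copied from the "Git LFS Details" entries above.
# The sizes are placeholders with the right number of digits (assumption).
data_acquisition = lfs_pointer(
    "6ddcef8c7886dcdfe9d539be28aaf2913cb550193c01ed4dcaf6d25428f97bbb",
    1_600_000,  # "1.6 MB" -> 7 digits
)
model_info = lfs_pointer(
    "c35042c79cda1505ed07f1f0ceac99b37ad9b66d67f2d26edacbfb027c920a97",
    156_000,  # "156 kB" -> 6 digits
)

print(len(data_acquisition.encode("ascii")))  # 132
print(len(model_info.encode("ascii")))        # 131
```

The one-byte difference between the two pointers is just the extra digit in the larger file's `size` line.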