luisomoreau committed on
Commit 674282c · 1 Parent(s): 7498673

Updated instructions to get started

.gitattributes CHANGED
@@ -2,3 +2,4 @@ hey_reachy_wake_word_detection/models/*.eim filter=lfs diff=lfs merge=lfs -text
  hey_reachy_wake_word_detection/greetings-audio/*.wav filter=lfs diff=lfs merge=lfs -text
  *.gif filter=lfs diff=lfs merge=lfs -text
  *.mp4 filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
.gitignore CHANGED
@@ -1,3 +1,7 @@
  __pycache__/
  *.egg-info/
- build/
+ build/
+ venv/
+ .env
+ .DS_Store
+ .idea/
README.md CHANGED
@@ -30,6 +30,8 @@ This project implements a wake word detection system for the Reachy Mini robot.
  - Visual feedback when wake word is detected
  - Cross-platform support (macOS and Linux) - Only tested on macOS Apple Silicon

+ *Note: This project is meant as a base that you can freely modify. Don't expect `Alexa`- or `OK Google`-grade performance. The model has only been trained on ~2.5h of data.*
+
  ## Technical Details

  ### Dataset Generation
@@ -40,16 +42,113 @@ The wake word and sounds dataset was generated using this repo: https://github.c
  - [Freesound LAION 640k dataset](https://huggingface.co/datasets/benjamin-paine/freesound-laion-640k) for background noise
  - [Gradium](https://gradium.ai/) for Reachy's voice

+ ![Data acquisition overview](/assets/ei-data-acquisition.png)
+
  ### Model Training

  The model was trained on Edge Impulse using roughly 2.5h of synthetic data.

- See the associated public project for more information: [Public project](https://studio.edgeimpulse.com/public/855375/latest).
+ See the associated [public project](https://studio.edgeimpulse.com/public/855375/latest) for more information.
+
+ ![Edge Impulse model info](/assets/ei-model-info.png)

  ### Implementation

  The application uses:
+
  - Reachy Mini SDK for robot control
  - Edge Impulse Linux SDK for wake word detection
  - Sounddevice for audio input
  - FastAPI (via ReachyMiniApp) for the web interface
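
Review note: the startup log further down in this README reports `Detected system: darwin arm64` followed by `Selected model: hey-reachy-wake-word-detection-mac-arm64.eim`, which suggests the app picks an `.eim` file per platform. A minimal sketch of that kind of lookup follows. The names `KNOWN_MODELS`, `select_model`, and `detect_platform` are hypothetical and are not taken from the project's code, and only the one platform/model pair confirmed by the log is registered.

```python
import platform
import sys

# Only this entry is confirmed by the README's startup log. Any other
# platform/model pairing would have to be added by the reader.
KNOWN_MODELS = {
    ("darwin", "arm64"): "hey-reachy-wake-word-detection-mac-arm64.eim",
}


def select_model(system: str, machine: str) -> str:
    """Return the .eim file name registered for a (system, machine) pair."""
    key = (system.lower(), machine.lower())
    try:
        return KNOWN_MODELS[key]
    except KeyError:
        raise LookupError(f"No model registered for {key[0]} {key[1]}") from None


def detect_platform() -> tuple[str, str]:
    """Report the running system in the same 'darwin arm64' shape as the log."""
    return sys.platform, platform.machine()
```

Calling `select_model(*detect_platform())` on an unregistered platform raises `LookupError` instead of silently loading a model built for another architecture.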
+
+ ## Setup
+
+ ### 1. Clone the repository
+
+ ```bash
+ git clone git@hf.co:spaces/luisomoreau/hey_reachy_wake_word_detection
+ cd hey_reachy_wake_word_detection
+ ```
+
+ ### 2. Create and activate a virtual environment
+
+ ```bash
+ python3.12 -m venv venv
+ source venv/bin/activate # Linux/macOS
+ # .\venv\Scripts\activate # Windows
+ ```
+
+ ### 3. Install dependencies
+
+ ```bash
+ pip install -e .
+ ```
+
+ ### 4. Launch the Reachy Mini Daemon
+
+ Open the [Reachy Mini Desktop App](https://github.com/pollen-robotics/reachy-mini-desktop-app).
+
+ You can also launch it from the command line:
+
+ ```bash
+ reachy-mini-daemon
+ ```
+
+ ### 5. Run main.py
+
+ ```bash
+ python hey_reachy_wake_word_detection/main.py
+ ```
+
+ If everything works, you should see logs like the following:
+
+ ```
+ python hey_reachy_wake_word_detection/main.py
+ Running on: darwin arm64
+ INFO: Started server process [98489]
+ INFO: Waiting for application startup.
+ INFO: Application startup complete.
+ INFO: Uvicorn running on http://0.0.0.0:8042 (Press CTRL+C to quit)
+ Starting HeyReachyWakeWordDetection app with sounddevice...
+
+ Available audio input devices:
+ ID 1: Reachy Mini Audio (2 input channels)
+ ID 2: Reachy Mini Camera (2 input channels)
+ ID 3: Logitech BRIO (2 input channels)
+ ID 4: BlackHole 2ch (2 input channels)
+ ID 5: MacBook Pro Microphone (1 input channels)
+ ID 7: Microsoft Teams Audio (1 input channels)
+ ID 8: Descript Loopback Recorder (2 input channels)
+
+ Detected system: darwin arm64
+ Selected model: hey-reachy-wake-word-detection-mac-arm64.eim
+ Using audio device: Reachy Mini Audio (ID: 1)
+ Loaded runner for "Developer Relations / Hey Reachy - Wake word detection"
+ Model expects: 24000Hz, 12000 samples
+ Current threshold: 0.7
+ selected Audio device: 1
+ Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+ Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+ Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+ Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+ Classification: {'hey_reachy': '0.0000', 'noise': '1.0000', 'other': '0.0000'}
+ Classification: {'hey_reachy': '0.8501', 'noise': '0.0000', 'other': '0.1499'}
+ Detection score: 0.8501
+ 🎤 KEYWORD DETECTED - Executing actions...
+ Playing sound: /Users/luisomoreau/workspace/reachy-mini/hey_reachy_wake_word_detection/hey_reachy_wake_word_detection/greetings-audio/speech_g4rDsF5OhcU.ogg
+ ^CINFO: Shutting down
+ INFO: Waiting for application shutdown.
+ INFO: Application shutdown complete.
+ INFO: Finished server process [98489]
+
+ Interrupted by user
+ Error in stop: Lost connection with the server.
+ App is stopping...
+ ```
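
Review note: the log above prints each classification as a dict of string scores and reports `Current threshold: 0.7` before the `KEYWORD DETECTED` line. The decision step can be sketched as below. The helper names are hypothetical, and the use of `>=` instead of `>` is an assumption, since the log does not show which comparison the app uses.

```python
WAKE_LABEL = "hey_reachy"  # label name as printed in the log
THRESHOLD = 0.7            # "Current threshold: 0.7" in the log


def detection_score(classification: dict, label: str = WAKE_LABEL) -> float:
    """Convert the logged string score for `label` into a float (0.0 if absent)."""
    return float(classification.get(label, "0"))


def is_wake_word(classification: dict, threshold: float = THRESHOLD) -> bool:
    """Assumed rule: trigger when the wake-word score reaches the threshold."""
    return detection_score(classification) >= threshold


# The two kinds of lines seen in the log:
quiet = {"hey_reachy": "0.0000", "noise": "1.0000", "other": "0.0000"}
hit = {"hey_reachy": "0.8501", "noise": "0.0000", "other": "0.1499"}
print(is_wake_word(quiet), is_wake_word(hit))  # False True
```

Raising the threshold trades missed detections for fewer false triggers, which may matter for a model trained on only ~2.5h of data.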
+
+ ## Troubleshooting
+
+ I am still unsure why, but I could not use `sample = mini.media.get_audio_sample()` to get the audio stream.
+ I switched to sounddevice (the same library that should be behind `get_audio_sample()`).
+
+ I also had issues when trying to use multiple threads. This is probably due to the implementation of the `classifier` function in the Edge Impulse Python SDK.
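
Review note: one common way to sidestep threading problems with a classifier is to let audio callbacks only enqueue samples and have a single worker thread own every classifier call. The sketch below illustrates that pattern under several assumptions: `SingleThreadClassifier` is a hypothetical name, `classify` is a stand-in callable and not the Edge Impulse SDK API, and windows are cut back to back with no overlap, at the 12000-sample size reported in the log. It is not confirmed to resolve the issue described above.

```python
import queue
import threading

WINDOW_SAMPLES = 12000  # "Model expects: 24000Hz, 12000 samples" in the log


class SingleThreadClassifier:
    """Funnel audio blocks from any thread into one worker that owns the classifier."""

    def __init__(self, classify, window=WINDOW_SAMPLES):
        self._classify = classify      # stand-in for the real inference call
        self._window = window
        self._blocks = queue.Queue()   # thread-safe hand-off from producers
        self.results = []              # appended to by the worker thread only
        self._worker = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._worker.start()

    def push(self, block):
        """Safe to call from an audio callback: it only enqueues a copy."""
        self._blocks.put(list(block))

    def stop(self):
        self._blocks.put(None)         # sentinel that ends the worker loop
        self._worker.join()

    def _run(self):
        buffer = []
        while True:
            block = self._blocks.get()
            if block is None:
                break
            buffer.extend(block)
            # Classify every complete window and keep the remainder buffered.
            while len(buffer) >= self._window:
                window, buffer = buffer[:self._window], buffer[self._window:]
                self.results.append(self._classify(window))
```

With sounddevice, `push` would be called from the `callback(indata, frames, time, status)` function passed to `sounddevice.InputStream`, for example with one channel of `indata`. That wiring is left out here so the sketch runs without audio hardware.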
assets/ei-data-acquisition.png ADDED

Git LFS Details

  • SHA256: 6ddcef8c7886dcdfe9d539be28aaf2913cb550193c01ed4dcaf6d25428f97bbb
  • Pointer size: 132 Bytes
  • Size of remote file: 1.6 MB
assets/ei-model-info.png ADDED

Git LFS Details

  • SHA256: c35042c79cda1505ed07f1f0ceac99b37ad9b66d67f2d26edacbfb027c920a97
  • Pointer size: 131 Bytes
  • Size of remote file: 156 kB