# Reachy Mini Home Assistant Voice Assistant - Architecture Design
## 1. System Architecture Overview
```
┌──────────────────────────────────────────────────────────────────┐
│ Application Layer                                                │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │ Home         │  │ Web UI       │  │ Console      │            │
│  │ Assistant    │  │ (Gradio)     │  │ Interface    │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
└──────────────────────────────────────────────────────────────────┘
                                  ↓
┌──────────────────────────────────────────────────────────────────┐
│ Business Logic Layer                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │ Voice        │  │ Motion       │  │ State        │            │
│  │ Manager      │  │ Controller   │  │ Manager      │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│  ┌──────────────┐  ┌──────────────┐                              │
│  │ ESPHome      │  │ Event        │                              │
│  │ Handler      │  │ Dispatcher   │                              │
│  └──────────────┘  └──────────────┘                              │
└──────────────────────────────────────────────────────────────────┘
                                  ↓
┌──────────────────────────────────────────────────────────────────┐
│ Services Layer                                                   │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │ Wake Word    │  │ Audio        │  │ Motion       │            │
│  │ Detector     │  │ Processor    │  │ Queue        │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│  ┌──────────────────────────────────────────────────┐            │
│  │ ESPHome Protocol (Audio Streaming to/from HA)    │            │
│  └──────────────────────────────────────────────────┘            │
└──────────────────────────────────────────────────────────────────┘
                                  ↓
┌──────────────────────────────────────────────────────────────────┐
│ Hardware Abstraction Layer (HAL)                                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │ Audio        │  │ Motion       │  │ Camera       │            │
│  │ Adapter      │  │ Adapter      │  │ Adapter      │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│  ┌──────────────┐  ┌──────────────┐                              │
│  │ Reachy Mini  │  │ ESPHome      │                              │
│  │ SDK Wrapper  │  │ Protocol     │                              │
│  └──────────────┘  └──────────────┘                              │
└──────────────────────────────────────────────────────────────────┘
                                  ↓
┌──────────────────────────────────────────────────────────────────┐
│ Reachy Mini Hardware + Home Assistant                            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │ Microphone   │  │ Head Motors  │  │ Camera       │            │
│  │ Array (4)    │  │ (6 DOF)      │  │ (Wide)       │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│  ┌──────────────┐  ┌──────────────┐                              │
│  │ Speaker      │  │ Antennas     │                              │
│  │ (5W)         │  │ (2)          │                              │
│  └──────────────┘  └──────────────┘                              │
│                                                                  │
│  ┌──────────────────────────────────────────────────┐            │
│  │ Home Assistant (STT/TTS Processing)              │            │
│  └──────────────────────────────────────────────────┘            │
└──────────────────────────────────────────────────────────────────┘
```
## 2. Core Design Principles
### 2.1 Based on linux-voice-assistant
This project builds on the architecture of [OHF-Voice/linux-voice-assistant](https://github.com/OHF-Voice/linux-voice-assistant). Its main characteristics are:
- **STT/TTS handled by Home Assistant**: audio is streamed to Home Assistant over the ESPHome protocol, and HA performs speech recognition and synthesis
- **Local wake word detection**: microWakeWord or openWakeWord detects the wake word offline
- **ESPHome protocol communication**: all communication with Home Assistant goes through the ESPHome protocol
- **Enhanced motion control**: integrates Reachy Mini's motion control capabilities
### 2.2 Architectural Features
- **Modular design**: the audio, voice, motion, and ESPHome modules are independent of one another
- **Asynchronous processing**: asyncio is used so that audio, network, and motion work do not block each other
- **State management**: centralized state management (`ServerState`)
- **Event-driven**: modules communicate through events
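The event-driven, asyncio-based style listed above can be illustrated with a small dispatcher. This is only a sketch: the `EventDispatcher` class, its `subscribe`/`publish` methods, and the `"wake_word"` event name are hypothetical and are not defined anywhere in this document.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable

# Hypothetical handler type: an async function that receives an event payload.
Handler = Callable[[dict], Awaitable[None]]


class EventDispatcher:
    """Hypothetical minimal dispatcher mapping event names to async handlers."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    async def publish(self, event: str, payload: dict) -> int:
        """Run every handler registered for `event`; return how many ran."""
        handlers = list(self._handlers.get(event, []))
        await asyncio.gather(*(handler(payload) for handler in handlers))
        return len(handlers)


async def demo() -> list[str]:
    log: list[str] = []
    dispatcher = EventDispatcher()

    async def nod(payload: dict) -> None:
        log.append(f"nod for {payload['wake_word']}")

    async def start_stream(payload: dict) -> None:
        log.append("start streaming")

    dispatcher.subscribe("wake_word", nod)
    dispatcher.subscribe("wake_word", start_stream)
    await dispatcher.publish("wake_word", {"wake_word": "okay_nabu"})
    return log
```

With this shape, the motion and ESPHome modules react to the same event without importing each other, which is the decoupling the modular design aims for.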
## 3. Module Design
### 3.1 Audio Module (audio/)
**Responsibilities**:
- Audio device management (microphone, speaker)
- Audio recording and playback
- Audio format conversion (16 kHz mono PCM)
**Interface**:
```python
from abc import ABC, abstractmethod
from typing import Callable, List


class AudioAdapter(ABC):
    """Abstract base class for audio device adapters."""

    @abstractmethod
    async def list_input_devices(self) -> List["AudioDevice"]:
        """List the available audio input devices.

        AudioDevice is a project type that is not shown in this document.
        """

    @abstractmethod
    async def start_recording(
        self,
        device_id: str,
        callback: Callable[[bytes], None],
        sample_rate: int = 16000,
        channels: int = 1,
        block_size: int = 1024,
    ) -> None:
        """Start recording audio and pass each block to `callback`."""

    @abstractmethod
    async def play_audio(
        self,
        audio_data: bytes,
        device_id: str,
        sample_rate: int = 16000,
        channels: int = 1,
    ) -> None:
        """Play audio on the given output device."""


class MicrophoneArray(AudioAdapter):
    """Microphone array adapter (Reachy Mini's 4-microphone array).

    Only the constructor is shown; the abstract methods must still be
    implemented before the class can be instantiated.
    """

    def __init__(self, sample_rate: int = 16000, channels: int = 1):
        self.sample_rate = sample_rate
        self.channels = channels
        self._stream = None
        self._is_recording = False
        self._callback = None
        self._loop = None


class Speaker(AudioAdapter):
    """Speaker adapter (Reachy Mini's 5 W speaker). Constructor only."""

    def __init__(self, sample_rate: int = 16000):
        self.sample_rate = sample_rate
```
**Audio processor**:
```python
import inspect
from typing import Awaitable, Callable, Union

# Callbacks may be plain functions or coroutine functions.
AudioCallback = Callable[[bytes], Union[None, Awaitable[None]]]


class AudioProcessor:
    """Fans audio blocks out to wake word detection and streaming."""

    def __init__(
        self,
        sample_rate: int = 16000,
        channels: int = 1,
        block_size: int = 1024,
    ):
        self.sample_rate = sample_rate
        self.channels = channels
        self.block_size = block_size
        self._wake_word_callbacks: list[AudioCallback] = []
        self._stream_callbacks: list[AudioCallback] = []

    def add_wake_word_callback(self, callback: AudioCallback) -> None:
        """Register a wake word detection callback."""
        self._wake_word_callbacks.append(callback)

    def add_stream_callback(self, callback: AudioCallback) -> None:
        """Register an audio stream callback (audio sent to Home Assistant)."""
        self._stream_callbacks.append(callback)

    async def process_audio_chunk(self, audio_chunk: bytes) -> None:
        """Pass one audio block to the wake word callbacks, then the stream callbacks."""
        for callback in (*self._wake_word_callbacks, *self._stream_callbacks):
            result = callback(audio_chunk)
            # The main app registers coroutine functions, so await the
            # result when there is one; otherwise the coroutine never runs.
            if inspect.isawaitable(result):
                await result
```
### 3.2 Voice Module (voice/)
**Responsibilities**:
- Wake word detection (local, offline)
- STT/TTS is handled by Home Assistant (not in this module)
**Interface**:
```python
from abc import ABC, abstractmethod
from pathlib import Path

import numpy as np

# Library calls below are kept as in the original design; check them
# against the installed pymicro-wakeword / pyopen-wakeword versions.


class WakeWordDetector(ABC):
    """Abstract base class for wake word detectors."""

    @abstractmethod
    async def load_model(self, model_path: str) -> None:
        """Load the wake word model."""

    @abstractmethod
    async def process_audio(self, audio_chunk: bytes) -> bool:
        """Process one audio block; return True if the wake word was detected."""


class MicroWakeWordDetector(WakeWordDetector):
    """microWakeWord detector (lightweight, suited to a Raspberry Pi)."""

    def __init__(self, model_path: str):
        self.model = None
        self.features = None
        self.model_path = Path(model_path)
        self._confidence = 0.0
        self._loaded = False

    async def load_model(self, model_path: str) -> None:
        """Load the microWakeWord model."""
        from pymicro_wakeword import MicroWakeWord, MicroWakeWordFeatures

        self.features = MicroWakeWordFeatures()
        self.model = MicroWakeWord.from_config(model_path)
        self._loaded = True

    async def process_audio(self, audio_chunk: bytes) -> bool:
        """Process one block of 16-bit PCM audio."""
        if not self._loaded:
            return False  # load_model() has not been called yet
        # Convert int16 samples to float32 in [-1.0, 1.0).
        audio_array = np.frombuffer(audio_chunk, dtype=np.int16).astype(np.float32) / 32768.0
        for feature in self.features.process_streaming(audio_array):
            score = self.model.process_streaming(feature)
            if score is not None and score >= 0.5:
                return True
        return False


class OpenWakeWordDetector(WakeWordDetector):
    """openWakeWord detector (wider choice of wake words)."""

    def __init__(self, model_path: str):
        self.model = None
        self.features = None
        self.model_path = Path(model_path)
        self._confidence = 0.0
        self._loaded = False

    async def load_model(self, model_path: str) -> None:
        """Load the openWakeWord model."""
        from pyopen_wakeword import OpenWakeWord, OpenWakeWordFeatures

        self.features = OpenWakeWordFeatures.from_builtin()
        self.model = OpenWakeWord(model_path)
        self._loaded = True

    async def process_audio(self, audio_chunk: bytes) -> bool:
        """Process one block of 16-bit PCM audio."""
        if not self._loaded:
            return False  # load_model() has not been called yet
        audio_array = np.frombuffer(audio_chunk, dtype=np.int16).astype(np.float32) / 32768.0
        for feature in self.features.process_streaming(audio_array):
            for score in self.model.process_streaming(feature):
                if score >= 0.5:
                    return True
        return False
```
### 3.3 Motion Module (motion/)
**Responsibilities**:
- Head motion control (6 degrees of freedom)
- Antenna control (2 antennas)
- Motion queue management
- Speech-reactive motion
**Interface**:
```python
import asyncio
from abc import ABC, abstractmethod

import numpy as np
from scipy.spatial.transform import Rotation as R


def _pose_from_euler(x_deg: float, y_deg: float, z_deg: float) -> np.ndarray:
    """Build a 4x4 homogeneous pose from 'xyz' Euler angles in degrees."""
    pose = np.eye(4)
    pose[:3, :3] = R.from_euler("xyz", [x_deg, y_deg, z_deg], degrees=True).as_matrix()
    return pose


class MotionController(ABC):
    """Abstract base class for motion controllers."""

    @abstractmethod
    async def connect(self, host: str = "localhost") -> None:
        """Connect to the robot."""

    @abstractmethod
    async def wake_up(self) -> None:
        """Wake the robot up."""

    @abstractmethod
    async def turn_off(self) -> None:
        """Turn the robot off."""

    @abstractmethod
    async def move_head(self, pose: np.ndarray, duration: float = 1.0) -> None:
        """Move the head to the given pose."""

    @abstractmethod
    async def move_antennas(self, left: float, right: float, duration: float = 1.0) -> None:
        """Move the antennas."""

    @abstractmethod
    async def nod(self, count: int = 1, duration: float = 0.5) -> None:
        """Nod the head."""

    @abstractmethod
    async def shake(self, count: int = 1, duration: float = 0.5) -> None:
        """Shake the head."""

    @abstractmethod
    async def start_speech_reactive_motion(self) -> None:
        """Start speech-reactive motion."""

    @abstractmethod
    async def stop_speech_reactive_motion(self) -> None:
        """Stop speech-reactive motion."""


class ReachyMiniMotionController(MotionController):
    """Motion controller for Reachy Mini."""

    def __init__(self):
        self.reachy_mini = None
        self._connected = False
        self._speech_reactive = False
        self._speech_task = None

    async def connect(self, host: str = "localhost") -> None:
        """Connect to Reachy Mini."""
        # Imported lazily so this module loads without the SDK installed.
        from reachy_mini import ReachyMini

        self.reachy_mini = ReachyMini(host=host)
        self._connected = True

    async def wake_up(self) -> None:
        """Wake the robot up."""
        self.reachy_mini.wake_up()

    async def turn_off(self) -> None:
        """Turn the robot off."""
        self.reachy_mini.turn_off()

    async def move_head(self, pose: np.ndarray, duration: float = 1.0) -> None:
        """Move the head to the given pose."""
        # The SDK call is synchronous here, so run it in a worker thread
        # to keep the event loop (and audio handling) responsive.
        await asyncio.to_thread(self.reachy_mini.goto_target, head=pose, duration=duration)

    async def move_antennas(self, left: float, right: float, duration: float = 1.0) -> None:
        """Move the antennas."""
        await asyncio.to_thread(
            self.reachy_mini.goto_target, antennas=[left, right], duration=duration
        )

    async def nod(self, count: int = 1, duration: float = 0.5) -> None:
        """Nod: rotate +15 degrees, then -15 degrees, about the x axis."""
        for _ in range(count):
            await self.move_head(_pose_from_euler(15, 0, 0), duration=duration / 2)
            await self.move_head(_pose_from_euler(-15, 0, 0), duration=duration / 2)

    async def shake(self, count: int = 1, duration: float = 0.5) -> None:
        """Shake: rotate -20 degrees, then +20 degrees, about the z axis."""
        for _ in range(count):
            await self.move_head(_pose_from_euler(0, 0, -20), duration=duration / 2)
            await self.move_head(_pose_from_euler(0, 0, 20), duration=duration / 2)

    async def start_speech_reactive_motion(self) -> None:
        """Start speech-reactive motion (small movements while speaking)."""
        self._speech_reactive = True
        self._speech_task = asyncio.create_task(self._speech_reactive_loop())

    async def stop_speech_reactive_motion(self) -> None:
        """Stop speech-reactive motion."""
        self._speech_reactive = False
        if self._speech_task:
            self._speech_task.cancel()
            self._speech_task = None

    async def _speech_reactive_loop(self) -> None:
        """Speech-reactive motion loop."""
        loop = asyncio.get_running_loop()
        while self._speech_reactive:
            # Small sinusoidal sway of +/-3 degrees about the z axis.
            sway_deg = np.sin(loop.time() * 2) * 3
            await self.move_head(_pose_from_euler(0, 0, sway_deg), duration=0.1)
            await asyncio.sleep(0.1)
```
**Motion queue**:
```python
import asyncio
from typing import Optional

# Motion and MotionPriority are project types not defined in this document.


class MotionQueue:
    """Motion queue manager with three priority levels."""

    def __init__(self):
        self.high_priority = asyncio.Queue()
        self.medium_priority = asyncio.Queue()
        self.low_priority = asyncio.Queue()
        self.is_running = False
        self._current_motion = None
        self._task = None

    async def add_motion(self, motion: "Motion") -> None:
        """Add a motion to the queue that matches its priority."""
        if motion.priority == MotionPriority.HIGH:
            await self.high_priority.put(motion)
        elif motion.priority == MotionPriority.MEDIUM:
            await self.medium_priority.put(motion)
        elif motion.priority == MotionPriority.LOW:
            await self.low_priority.put(motion)

    async def start(self) -> None:
        """Start processing the motion queue."""
        self.is_running = True
        self._task = asyncio.create_task(self._process_queue())

    async def stop(self) -> None:
        """Stop processing the motion queue."""
        self.is_running = False
        if self._task:
            self._task.cancel()
            self._task = None

    async def _process_queue(self) -> None:
        """Execute queued motions one at a time."""
        while self.is_running:
            # Priority order: HIGH > MEDIUM > LOW
            motion = await self._get_next_motion()
            if motion is None:
                await asyncio.sleep(0.01)  # nothing queued; poll again shortly
                continue
            self._current_motion = motion
            try:
                await motion.execute()
            finally:
                self._current_motion = None

    async def _get_next_motion(self) -> Optional["Motion"]:
        """Return the next motion by priority, or None if all queues are empty."""
        for queue in (self.high_priority, self.medium_priority, self.low_priority):
            if not queue.empty():
                return queue.get_nowait()
        return None
```
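`MotionQueue` relies on `Motion` and `MotionPriority`, which this document does not define. The sketch below shows one plausible shape for them, assuming a motion is just a named async action with a priority; the field names and the `action` callable are hypothetical. It also shows the same HIGH > MEDIUM > LOW ordering using a single `asyncio.PriorityQueue`, as an alternative to three separate queues.

```python
import asyncio
from dataclasses import dataclass
from enum import IntEnum
from typing import Awaitable, Callable


class MotionPriority(IntEnum):
    """Hypothetical priority levels; lower value runs first."""
    HIGH = 0
    MEDIUM = 1
    LOW = 2


@dataclass
class Motion:
    """Hypothetical motion: a name, a priority, and an async action."""
    name: str
    priority: MotionPriority
    action: Callable[[], Awaitable[None]]

    async def execute(self) -> None:
        await self.action()


async def demo() -> list[str]:
    executed: list[str] = []

    def make(name: str, priority: MotionPriority) -> Motion:
        async def action() -> None:
            executed.append(name)
        return Motion(name, priority, action)

    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    motions = [
        make("speech_sway", MotionPriority.LOW),
        make("nod", MotionPriority.HIGH),
        make("look_left", MotionPriority.MEDIUM),
    ]
    for seq, motion in enumerate(motions):
        # (priority, seq) sorts by priority first, then by insertion order,
        # so Motion objects themselves are never compared.
        queue.put_nowait((motion.priority, seq, motion))
    while not queue.empty():
        _, _, motion = queue.get_nowait()
        await motion.execute()
    return executed
```

A single priority queue also lets the consumer block on `await queue.get()` instead of polling, at the cost of needing the sequence number as a tie-breaker.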
### 3.4 ESPHome Module (esphome/)
**Responsibilities**:
- ESPHome protocol implementation
- Communication with Home Assistant
- Audio streaming
- Event handling
**Interface**:
```python
import asyncio
import logging
from enum import Enum
from typing import Callable

logger = logging.getLogger(__name__)


class VoiceAssistantEventType(Enum):
    """Voice assistant event types.

    Numeric values are as given in this design. If they are meant to mirror
    the ESPHome native API enum, verify them against aioesphomeapi before use.
    """
    VOICE_ASSISTANT_START = 0
    VOICE_ASSISTANT_END = 1
    VOICE_ASSISTANT_ERROR = 2
    VOICE_ASSISTANT_STT_START = 3
    VOICE_ASSISTANT_STT_END = 4
    VOICE_ASSISTANT_TTS_START = 5
    VOICE_ASSISTANT_TTS_END = 6
    VOICE_ASSISTANT_WAKE_WORD_START = 9
    VOICE_ASSISTANT_WAKE_WORD_END = 10


class ESPHomeServer:
    """ESPHome protocol server."""

    def __init__(self, host: str = "0.0.0.0", port: int = 6053):
        self.host = host
        self.port = port
        self._server = None
        self._is_running = False
        self._clients = []
        self._audio_callback = None
        self._event_callback = None

    async def start(self) -> None:
        """Start the ESPHome server."""
        self._server = await asyncio.start_server(self._handle_client, self.host, self.port)
        self._is_running = True

    async def stop(self) -> None:
        """Stop the ESPHome server and close all client connections."""
        self._is_running = False
        for client in list(self._clients):
            client.close()
        self._clients.clear()
        if self._server:
            self._server.close()
            await self._server.wait_closed()

    def set_audio_callback(self, callback: Callable[[bytes], None]) -> None:
        """Set the audio callback (receives TTS audio from Home Assistant)."""
        self._audio_callback = callback

    def set_event_callback(
        self, callback: Callable[[VoiceAssistantEventType, dict], None]
    ) -> None:
        """Set the event callback (receives events from Home Assistant)."""
        self._event_callback = callback

    async def send_audio(self, audio_data: bytes) -> None:
        """Send audio data to Home Assistant (STT input)."""
        # Simplified: raw bytes are written as-is. The real ESPHome native
        # API expects framed protobuf messages.
        for client in list(self._clients):
            try:
                client.write(audio_data)
                await client.drain()
            except Exception as e:
                logger.error(f"Error sending audio to client: {e}")

    async def send_event(self, event_type: VoiceAssistantEventType, data: dict) -> None:
        """Send an event to Home Assistant."""
        # As designed here, this only invokes the local event callback;
        # nothing is written to the socket.
        if self._event_callback:
            self._event_callback(event_type, data)

    async def _handle_client(self, reader, writer) -> None:
        """Handle one client connection."""
        client_addr = writer.get_extra_info("peername")
        self._clients.append(writer)
        try:
            while self._is_running:
                data = await reader.read(4096)
                if not data:
                    break
                # Handle data from Home Assistant.
                # _process_data() is not defined in this document.
                await self._process_data(data)
        except Exception as e:
            logger.error(f"Error handling client {client_addr}: {e}")
        finally:
            # stop() may already have cleared the list.
            if writer in self._clients:
                self._clients.remove(writer)
            writer.close()
            await writer.wait_closed()


class VoiceSatelliteProtocol:
    """Voice satellite protocol handler."""

    def __init__(self, state: "ServerState"):
        self.state = state
        self._is_streaming = False
        self._refractory_period = 2.0  # seconds between accepted wake words
        self._last_wake_word_time = 0.0

    async def handle_audio(self, audio_chunk: bytes) -> None:
        """Forward an audio block to Home Assistant while streaming."""
        if self._is_streaming and self.state.esphome_server:
            await self.state.esphome_server.send_audio(audio_chunk)

    async def handle_wake_word(self) -> None:
        """Handle a wake word detection."""
        current_time = asyncio.get_running_loop().time()
        # Ignore detections that fall inside the refractory period.
        if current_time - self._last_wake_word_time < self._refractory_period:
            return
        self._last_wake_word_time = current_time
        # Send the wake word event to Home Assistant.
        if self.state.esphome_server:
            await self.state.esphome_server.send_event(
                VoiceAssistantEventType.VOICE_ASSISTANT_WAKE_WORD_END,
                {"wake_word": "detected"},
            )
        # Start streaming audio.
        self._is_streaming = True

    async def stop_streaming(self) -> None:
        """Stop streaming audio."""
        self._is_streaming = False
```
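The server above writes raw bytes to the socket, whereas the ESPHome native API exchanges framed messages. The following is a minimal sketch, assuming the unencrypted (plaintext) framing: a zero byte, then the payload length and the message type as variable-length integers, then the serialized protobuf payload. The function names are hypothetical, the message type numbers come from the ESPHome `api.proto` definitions and are not reproduced here, and the layout should be checked against the aioesphomeapi source before relying on it.

```python
def encode_varuint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf-style varint."""
    if value < 0:
        raise ValueError("varuint cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_frame(message_type: int, payload: bytes) -> bytes:
    """Wrap a serialized message in an assumed plaintext frame.

    Layout (assumption): 0x00 | varuint(len(payload)) | varuint(type) | payload
    """
    return b"\x00" + encode_varuint(len(payload)) + encode_varuint(message_type) + payload
```

In a framed design, `send_audio` would serialize each audio block into the corresponding protobuf message and pass it through a function like `encode_frame` before calling `writer.write`.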
### 3.5 Configuration Module (config/)
**Responsibilities**:
- Configuration file management
- User preference storage
- Runtime configuration
**Interface**:
```python
import json
from pathlib import Path
from typing import Any


class ConfigManager:
    """Configuration manager backed by a JSON file."""

    def __init__(self, config_path: str = "config.json"):
        self.config_path = Path(config_path)
        self.config = self.load_config()

    def load_config(self) -> dict:
        """Load the configuration file, or fall back to the defaults."""
        if self.config_path.exists():
            with open(self.config_path, "r", encoding="utf-8") as f:
                return json.load(f)
        return self.get_default_config()

    def save_config(self) -> None:
        """Save the configuration file."""
        with open(self.config_path, "w", encoding="utf-8") as f:
            json.dump(self.config, f, indent=2, ensure_ascii=False)

    def get_default_config(self) -> dict:
        """Return the default configuration."""
        return {
            "audio": {
                "input_device": None,
                "output_device": None,
                "sample_rate": 16000,
                "channels": 1,
                "block_size": 1024,
            },
            "voice": {
                "wake_word": "okay_nabu",
                "wake_word_dirs": ["wakewords"],
            },
            "motion": {
                "enabled": True,
                "speech_reactive": True,
            },
            "esphome": {
                "host": "0.0.0.0",
                "port": 6053,
                "name": "Reachy Mini",
            },
            "robot": {
                "host": "localhost",
                "wireless": False,
            },
        }

    def get(self, key: str, default: Any = None) -> Any:
        """Get a configuration value; dotted keys address nested sections."""
        value: Any = self.config
        for k in key.split("."):
            # Return the default as soon as any part of the path is missing.
            if not isinstance(value, dict) or k not in value:
                return default
            value = value[k]
        return value

    def set(self, key: str, value: Any) -> None:
        """Set a configuration value (dotted keys supported) and save it."""
        keys = key.split(".")
        config = self.config
        for k in keys[:-1]:
            config = config.setdefault(k, {})
        config[keys[-1]] = value
        self.save_config()
```
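The dotted-key lookup used by `get` and `set` can be tried without touching the filesystem. The helpers below are a standalone sketch of the same idea; `get_nested` and `set_nested` are hypothetical names that are not part of the project.

```python
from typing import Any


def get_nested(config: dict, key: str, default: Any = None) -> Any:
    """Look up a dotted key such as 'audio.sample_rate' in nested dicts."""
    value: Any = config
    for part in key.split("."):
        if not isinstance(value, dict) or part not in value:
            return default
        value = value[part]
    return value


def set_nested(config: dict, key: str, value: Any) -> None:
    """Set a dotted key, creating intermediate dicts as needed."""
    parts = key.split(".")
    node = config
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    node[parts[-1]] = value


config = {"audio": {"sample_rate": 16000}, "esphome": {"port": 6053}}
set_nested(config, "motion.speech_reactive", True)
sample_rate = get_nested(config, "audio.sample_rate")        # existing key
missing = get_nested(config, "audio.block_size", 1024)       # falls back to the default
```

Keeping the default separate from the traversal avoids surprises when a section is absent and the caller passes a dict as the default.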
### 3.6 State Management (state.py)
**Responsibilities**:
- Global state management
- Component lifecycle management
**Interface**:
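```python
from dataclasses import dataclass, field
from queue import Queue  # assumed thread-safe queue; the original does not say which Queue
from typing import Any, Callable, Optional

# Imports of the project types used below (ConfigManager, MicrophoneArray,
# Speaker, AudioProcessor, WakeWordDetector, MotionController, MotionQueue,
# ESPHomeServer, VoiceSatelliteProtocol) are omitted.


@dataclass
class ServerState:
    """Global server state."""
    name: str
    # Configuration
    config: Optional[ConfigManager] = None
    # Audio
    microphone: Optional[MicrophoneArray] = None
    speaker: Optional[Speaker] = None
    audio_processor: Optional[AudioProcessor] = None  # used by the main app
    audio_queue: Queue = field(default_factory=Queue)
    # Voice
    wake_word_detector: Optional[WakeWordDetector] = None
    active_wake_words: list = field(default_factory=list)
    # Motion
    motion_controller: Optional[MotionController] = None
    motion_queue: Optional[MotionQueue] = None
    # ESPHome
    esphome_server: Optional[ESPHomeServer] = None
    voice_satellite: Optional[VoiceSatelliteProtocol] = None
    # Status flags
    is_running: bool = False
    is_streaming: bool = False
    # Callbacks
    on_wake_word: Optional[Callable[..., Any]] = None
    on_stt_result: Optional[Callable[..., Any]] = None
    on_tts_audio: Optional[Callable[..., Any]] = None

    async def cleanup(self) -> None:
        """Release resources.

        stop_recording() and disconnect() are not part of the interfaces
        shown in sections 3.1 and 3.3; the concrete classes must provide them.
        """
        if self.microphone:
            await self.microphone.stop_recording()
        if self.motion_controller:
            await self.motion_controller.stop_speech_reactive_motion()
            await self.motion_controller.turn_off()
            await self.motion_controller.disconnect()
        if self.motion_queue:
            await self.motion_queue.stop()
        if self.esphome_server:
            await self.esphome_server.stop()
```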
### 3.7 Main Application (app.py)
**Responsibilities**:
- Application lifecycle management
- Component initialization and coordination
- Event handling
**Interface**:
```python
import asyncio
from typing import Optional


class ReachyMiniVoiceApp:
    """Main application class."""

    def __init__(
        self,
        name: str,
        config: ConfigManager,
        audio_input_device: Optional[str] = None,
        audio_output_device: Optional[str] = None,
        wake_model: Optional[str] = None,
        wake_word_dirs: Optional[list] = None,
        host: str = "0.0.0.0",
        port: int = 6053,
        robot_host: str = "localhost",
        wireless: bool = False,
        gradio: bool = False,
    ):
        self.name = name
        self.config = config
        self.audio_input_device = audio_input_device
        self.audio_output_device = audio_output_device
        self.wake_model = wake_model
        self.wake_word_dirs = wake_word_dirs
        self.host = host
        self.port = port
        self.robot_host = robot_host
        self.wireless = wireless
        self.gradio = gradio
        self.state = ServerState(name)
        self._is_running = False
        self._aiozc = None        # AsyncZeroconf instance, kept for shutdown
        self._mdns_info = None

    async def start(self) -> None:
        """Start the application."""
        # Initialize state. ServerState.initialize() is not shown in
        # section 3.6 and is expected to create the components.
        await self.state.initialize(self.config)
        # Wire up callbacks.
        self._setup_callbacks()
        # Start audio recording.
        await self.state.microphone.start_recording(
            self.audio_input_device,
            self._audio_callback,
            sample_rate=self.config.get("audio.sample_rate", 16000),
            channels=self.config.get("audio.channels", 1),
            block_size=self.config.get("audio.block_size", 1024),
        )
        # Start the ESPHome server.
        await self.state.esphome_server.start()
        # Register for mDNS discovery.
        await self._register_mdns()
        self._is_running = True
        # Keep running until stop() is called.
        while self._is_running:
            await asyncio.sleep(1)

    async def stop(self) -> None:
        """Stop the application."""
        self._is_running = False
        if self._aiozc:
            await self._aiozc.async_unregister_service(self._mdns_info)
            await self._aiozc.async_close()
            self._aiozc = None
        await self.state.cleanup()

    def _setup_callbacks(self) -> None:
        """Register the audio callbacks."""
        self.state.audio_processor.add_wake_word_callback(self._on_audio_chunk)
        self.state.audio_processor.add_stream_callback(self._on_stream_audio)

    async def _audio_callback(self, audio_chunk: bytes) -> None:
        """Audio recording callback."""
        await self.state.audio_processor.process_audio_chunk(audio_chunk)

    async def _on_audio_chunk(self, audio_chunk: bytes) -> None:
        """Run wake word detection on one audio block."""
        if self.state.wake_word_detector:
            detected = await self.state.wake_word_detector.process_audio(audio_chunk)
            if detected:
                await self._on_wake_word_detected()

    async def _on_stream_audio(self, audio_chunk: bytes) -> None:
        """Audio streaming callback (sends audio to Home Assistant)."""
        if self.state.voice_satellite:
            await self.state.voice_satellite.handle_audio(audio_chunk)

    async def _on_wake_word_detected(self) -> None:
        """React to a detected wake word."""
        # Nod to acknowledge.
        if self.state.motion_controller:
            await self.state.motion_controller.nod(count=1, duration=0.3)
        # Trigger the voice satellite.
        if self.state.voice_satellite:
            await self.state.voice_satellite.handle_wake_word()

    async def handle_tts_audio(self, audio_data: bytes) -> None:
        """Handle TTS audio from Home Assistant by playing it."""
        if self.state.speaker:
            await self.state.speaker.play_audio(
                audio_data,
                self.audio_output_device,
                sample_rate=self.config.get("audio.sample_rate", 16000),
                channels=self.config.get("audio.channels", 1),
            )

    async def handle_stt_result(self, text: str) -> None:
        """Handle an STT result from Home Assistant."""
        # Process the text (add custom logic here).
        pass

    async def _register_mdns(self) -> None:
        """Register the mDNS service so Home Assistant can discover the device."""
        from zeroconf import ServiceInfo
        from zeroconf.asyncio import AsyncZeroconf

        self._mdns_info = ServiceInfo(
            "_esphomelib._tcp.local.",
            f"{self.name}._esphomelib._tcp.local.",
            # Left empty as in the original design. A real deployment must
            # advertise the device's LAN address as packed bytes, for example
            # [socket.inet_aton("192.168.1.50")].
            addresses=[],
            port=self.port,
            properties={
                "version": "1.0",
                "name": self.name,
                "platform": "reachy_mini",
            },
        )
        # The async API avoids blocking the running event loop.
        self._aiozc = AsyncZeroconf()
        await self._aiozc.async_register_service(self._mdns_info)
```
## 4. Data Flow
### 4.1 Audio Input Flow
```
Microphone array (4 microphones)
         ↓ (16 kHz PCM)
Audio block (1024 samples)
         ↓
┌─────────────────┐
│ Wake word       │
│ detection       │
│ (micro/oww)     │
└────────┬────────┘
         │
         ↓ (wake word detected)
Trigger wake event
         │
         ↓
┌─────────────────┐
│ Start streaming │
│ (ESPHome)       │
└────────┬────────┘
         │
         ↓
┌─────────────────┐
│ Send to HA      │
│ (STT input)     │
└─────────────────┘
```
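The numbers in this flow fix the timing and bandwidth of the audio path. The short calculation below works them out, assuming 16-bit signed PCM samples (consistent with the `np.int16` conversion in the wake word detectors); the function names are illustrative only.

```python
SAMPLE_RATE_HZ = 16_000
CHANNELS = 1
BLOCK_SIZE_SAMPLES = 1024
BYTES_PER_SAMPLE = 2  # assumes 16-bit signed PCM


def block_duration_ms(block_size: int, sample_rate: int) -> float:
    """Duration of one audio block in milliseconds."""
    return 1000 * block_size / sample_rate


def block_size_bytes(block_size: int, channels: int, bytes_per_sample: int) -> int:
    """Size of one audio block in bytes."""
    return block_size * channels * bytes_per_sample


def stream_bitrate_kbps(sample_rate: int, channels: int, bytes_per_sample: int) -> float:
    """Uncompressed stream bitrate in kilobits per second."""
    return sample_rate * channels * bytes_per_sample * 8 / 1000


duration = block_duration_ms(BLOCK_SIZE_SAMPLES, SAMPLE_RATE_HZ)              # 64.0 ms
size = block_size_bytes(BLOCK_SIZE_SAMPLES, CHANNELS, BYTES_PER_SAMPLE)      # 2048 bytes
bitrate = stream_bitrate_kbps(SAMPLE_RATE_HZ, CHANNELS, BYTES_PER_SAMPLE)    # 256.0 kbit/s
```

Each block therefore adds 64 ms of buffering latency before wake word detection or streaming can see the audio, and the uncompressed stream to Home Assistant needs about 256 kbit/s.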
### 4.2 Audio Output Flow
```
Home Assistant (TTS output)
         ↓
┌─────────────────┐
│ ESPHome server  │
│ (receive audio) │
└────────┬────────┘
         │
         ↓
┌─────────────────┐
│ Play audio      │
│ (speaker)       │
└─────────────────┘
```
### 4.3 Motion Control Flow
```
Wake word detection / STT result / TTS event
         ↓
┌─────────────────┐
│ Motion queue    │
│ management      │
└────────┬────────┘
         │
         ↓
┌─────────────────┐
│ High priority   │
│ (wake word ack) │
└────────┬────────┘
         │
         ↓
┌─────────────────┐
│ Medium priority │
│ (user command)  │
└────────┬────────┘
         │
         ↓
┌─────────────────┐
│ Low priority    │
│ (speech motion) │
└────────┬────────┘
         │
         ↓
Execute motion
         │
         ↓
┌─────────────────┐
│ Reachy Mini     │
│ SDK             │
└─────────────────┘
```
## 5. Dependencies
### 5.1 Core Dependencies
```toml
dependencies = [
    # Reachy Mini SDK
    "reachy-mini",
    # Audio processing
    "sounddevice>=0.4.6",
    "numpy>=1.24.0",
    # Voice processing
    "pymicro-wakeword>=2.0.0,<3.0.0",
    "pyopen-wakeword>=1.0.0,<2.0.0",
    # ESPHome
    "aioesphomeapi>=42.0.0",
    "zeroconf>=0.100.0",
    # Motion control
    "scipy>=1.10.0",
    # Web UI (optional)
    "gradio>=4.0.0",
]
```
### 5.2 Optional Dependencies
```toml
[project.optional-dependencies]
wireless = [
"reachy-mini[wireless]",
]
vision = [
"pollen-vision",
"opencv-python>=4.8.0",
"mediapipe>=0.10.0",
]
dev = [
"pytest>=7.4.0",
"pytest-asyncio>=0.21.0",
"ruff>=0.1.0",
]
```
## 6. Performance Optimization
### 6.1 Audio Processing
- Use asynchronous I/O to reduce blocking
- Tune the audio block size (1024 samples)
- Use numpy to speed up numeric computation
- Preallocate buffers to reduce memory allocation
### 6.2 Motion Control
- Priority-based motion queue management
- Smooth motion interpolation
- Merging of batched motion commands
- Latency budget management
### 6.3 Network
- ESPHome connection pooling
- Batched message sending
- Audio data compression
- Heartbeat detection
## 7. Security Considerations
1. **Audio privacy**:
   - Do not store user audio (unless explicitly authorized)
   - Prefer local processing
   - Encrypt data in transit
2. **Motion safety**:
   - Angle limits
   - Speed limits
   - Collision detection
   - Emergency stop
3. **Network security**:
   - ESPHome authentication
   - TLS encryption
   - Firewall configuration
   - Access control
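The angle and speed limits listed under motion safety can be enforced with a clamp applied before any command reaches the robot. The sketch below is illustrative only: the function names are hypothetical, and the limit values in the example are placeholders, not Reachy Mini's actual mechanical limits.

```python
def clamp(value: float, low: float, high: float) -> float:
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def limit_step(
    current_deg: float,
    target_deg: float,
    max_speed_deg_s: float,
    dt_s: float,
    min_angle_deg: float = -40.0,   # placeholder limit, not from the robot's specs
    max_angle_deg: float = 40.0,    # placeholder limit, not from the robot's specs
) -> float:
    """Return the next commanded angle, honoring angle and speed limits.

    The target is first clamped to the allowed angle range, then the change
    from the current angle is capped at max_speed_deg_s * dt_s.
    """
    safe_target = clamp(target_deg, min_angle_deg, max_angle_deg)
    max_step = max_speed_deg_s * dt_s
    step = clamp(safe_target - current_deg, -max_step, max_step)
    return current_deg + step
```

Calling `limit_step` once per control tick moves the joint toward the target without exceeding the configured speed, and an out-of-range target is silently reduced to the nearest allowed angle.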
## 8. Deployment
### 8.1 Installation
```bash
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate
# Install the base dependencies
pip install -e .
# Install the optional dependencies (quoted so shells such as zsh do not expand the brackets)
pip install -e ".[wireless,vision,dev]"
```
### 8.2 Running
```bash
# Start the application
python -m reachy_mini_ha_voice
# Start with the Web UI
python -m reachy_mini_ha_voice --gradio
# Start the wireless version
python -m reachy_mini_ha_voice --wireless
```
### 8.3 Home Assistant Integration
1. Add the ESPHome integration in Home Assistant
2. Enter the Reachy Mini's IP address and port (6053)
3. Configure the STT/TTS services
4. Create automations and scripts