# Reachy Mini Home Assistant Voice Assistant - Architecture Design

## 1. System Architecture Overview

```
┌──────────────────────────────────────────────────────────┐
│ Application Layer                                        │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │ Home         │  │ Web UI       │  │ Console      │    │
│  │ Assistant    │  │ (Gradio)     │  │ Interface    │    │
│  └──────────────┘  └──────────────┘  └──────────────┘    │
└──────────────────────────────────────────────────────────┘
                              ↓
┌──────────────────────────────────────────────────────────┐
│ Business Logic Layer                                     │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │ Voice        │  │ Motion       │  │ State        │    │
│  │ Manager      │  │ Controller   │  │ Manager      │    │
│  └──────────────┘  └──────────────┘  └──────────────┘    │
│  ┌──────────────┐  ┌──────────────┐                      │
│  │ ESPHome      │  │ Event        │                      │
│  │ Handler      │  │ Dispatcher   │                      │
│  └──────────────┘  └──────────────┘                      │
└──────────────────────────────────────────────────────────┘
                              ↓
┌──────────────────────────────────────────────────────────┐
│ Services Layer                                           │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │ Wake Word    │  │ Audio        │  │ Motion       │    │
│  │ Detector     │  │ Processor    │  │ Queue        │    │
│  └──────────────┘  └──────────────┘  └──────────────┘    │
│  ┌──────────────────────────────────────────────────┐    │
│  │ ESPHome Protocol (Audio Streaming to/from HA)   │    │
│  └──────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────┘
                              ↓
┌──────────────────────────────────────────────────────────┐
│ Hardware Abstraction Layer (HAL)                         │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │ Audio        │  │ Motion       │  │ Camera       │    │
│  │ Adapter      │  │ Adapter      │  │ Adapter      │    │
│  └──────────────┘  └──────────────┘  └──────────────┘    │
│  ┌──────────────┐  ┌──────────────┐                      │
│  │ Reachy Mini  │  │ ESPHome      │                      │
│  │ SDK Wrapper  │  │ Protocol     │                      │
│  └──────────────┘  └──────────────┘                      │
└──────────────────────────────────────────────────────────┘
                              ↓
┌──────────────────────────────────────────────────────────┐
│ Reachy Mini Hardware + Home Assistant                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐    │
│  │ Microphone   │  │ Head Motors  │  │ Camera       │    │
│  │ Array (4)    │  │ (6 DOF)      │  │ (Wide)       │    │
│  └──────────────┘  └──────────────┘  └──────────────┘    │
│  ┌──────────────┐  ┌──────────────┐                      │
│  │ Speaker      │  │ Antennas     │                      │
│  │ (5W)         │  │ (2)          │                      │
│  └──────────────┘  └──────────────┘                      │
│                                                          │
│  ┌──────────────────────────────────────────────────┐    │
│  │ Home Assistant (STT/TTS Processing)              │    │
│  └──────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────┘
```

## 2. Core Design Principles

### 2.1 Based on linux-voice-assistant

This project follows the architecture of [OHF-Voice/linux-voice-assistant](https://github.com/OHF-Voice/linux-voice-assistant). Its main characteristics are:

- **STT/TTS handled by Home Assistant**: audio is streamed to Home Assistant over the ESPHome protocol, and Home Assistant performs speech recognition and synthesis.
- **Local wake word detection**: microWakeWord or openWakeWord detects the wake word offline.
- **ESPHome protocol communication**: all communication with Home Assistant uses the ESPHome protocol.
- **Enhanced motion control**: integrates the motion control capabilities of Reachy Mini.

### 2.2 Architectural Characteristics

- **Modular design**: the audio, voice, motion, and ESPHome modules are independent of one another.
- **Asynchronous processing**: asyncio provides high-performance asynchronous processing.
- **State management**: state is managed centrally (`ServerState`).
- **Event-driven**: components communicate through events.

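To illustrate the event-driven principle, here is a minimal sketch of an asyncio-friendly publish/subscribe hub. The `EventDispatcher` class, its method names, and the `"wake_word"` event name are hypothetical and chosen only for this example; the project's actual dispatcher may look different.

```python
import asyncio
import inspect
from collections import defaultdict
from typing import Any, Callable


class EventDispatcher:
    """Hypothetical publish/subscribe hub; not taken from the project."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[Any], Any]]] = defaultdict(list)

    def subscribe(self, event: str, handler: Callable[[Any], Any]) -> None:
        """Register a sync or async handler for an event name."""
        self._handlers[event].append(handler)

    async def publish(self, event: str, payload: Any = None) -> int:
        """Call every handler for `event`, awaiting the asynchronous ones."""
        for handler in self._handlers[event]:
            result = handler(payload)
            if inspect.isawaitable(result):
                await result
        return len(self._handlers[event])


def demo_dispatch() -> list[str]:
    """Publish one event to a sync and an async handler; return what they saw."""
    seen: list[str] = []

    async def async_handler(payload: Any) -> None:
        seen.append(f"async:{payload}")

    dispatcher = EventDispatcher()
    dispatcher.subscribe("wake_word", lambda payload: seen.append(f"sync:{payload}"))
    dispatcher.subscribe("wake_word", async_handler)
    asyncio.run(dispatcher.publish("wake_word", "okay_nabu"))
    return seen
```

Handlers run in subscription order, so a slow handler delays the ones after it; a real dispatcher might schedule them as tasks instead.
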
## 3. Module Design

### 3.1 Audio Module (audio/)

**Responsibilities**:
- Audio device management (microphone, speaker)
- Audio recording and playback
- Audio format conversion (16 kHz mono PCM)

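The format-conversion responsibility can be sketched with the standard library alone. The two helpers below are hypothetical (they are not part of the module's interface) and assume the input is interleaved floating-point samples in the range [-1.0, 1.0]; resampling to 16 kHz is deliberately left out.

```python
import struct
from typing import Sequence


def downmix_to_mono(interleaved: Sequence[float], channels: int) -> list[float]:
    """Average each frame of interleaved multi-channel samples into one value."""
    if channels < 1 or len(interleaved) % channels:
        raise ValueError("sample count must be a multiple of the channel count")
    return [
        sum(interleaved[i:i + channels]) / channels
        for i in range(0, len(interleaved), channels)
    ]


def float_to_pcm16(samples: Sequence[float]) -> bytes:
    """Convert floats in [-1.0, 1.0] to little-endian signed 16-bit PCM bytes."""
    # Scale to the int16 range and clip anything that overshoots.
    ints = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    return struct.pack(f"<{len(ints)}h", *ints)
```

A stereo capture would go through `downmix_to_mono(samples, 2)` first and then `float_to_pcm16(...)` before being handed to the wake word detector or streamed.
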
**Interface**:

```python
from abc import ABC, abstractmethod
from typing import Callable, List, Optional

# AudioDevice is assumed to be a small device-descriptor type defined elsewhere in audio/.


class AudioAdapter(ABC):
    """Abstract base class for audio device adapters."""

    @abstractmethod
    async def list_input_devices(self) -> List["AudioDevice"]:
        """List the available audio input devices."""

    @abstractmethod
    async def start_recording(
        self,
        device_id: Optional[str],
        callback: Callable[[bytes], None],
        sample_rate: int = 16000,
        channels: int = 1,
        block_size: int = 1024,
    ) -> None:
        """Start recording and pass each audio block to `callback`."""

    @abstractmethod
    async def stop_recording(self) -> None:
        """Stop recording (called from ServerState.cleanup())."""

    @abstractmethod
    async def play_audio(
        self,
        audio_data: bytes,
        device_id: Optional[str],
        sample_rate: int = 16000,
        channels: int = 1,
    ) -> None:
        """Play audio."""


class MicrophoneArray(AudioAdapter):
    """Microphone array adapter (Reachy Mini's 4-microphone array)."""

    def __init__(self, sample_rate: int = 16000, channels: int = 1):
        self.sample_rate = sample_rate
        self.channels = channels
        self._stream = None
        self._is_recording = False
        self._callback = None
        self._loop = None

    # Implementations of the abstract methods are omitted in this design sketch.


class Speaker(AudioAdapter):
    """Speaker adapter (Reachy Mini's 5 W speaker)."""

    def __init__(self, sample_rate: int = 16000):
        self.sample_rate = sample_rate

    # Implementations of the abstract methods are omitted in this design sketch.
```

**Audio processor**:

```python
import inspect
from typing import Awaitable, Callable, Union

# Callbacks may be plain functions or coroutine functions.
AudioCallback = Callable[[bytes], Union[None, Awaitable[None]]]


class AudioProcessor:
    """Processes audio chunks for wake word detection and streaming."""

    def __init__(
        self,
        sample_rate: int = 16000,
        channels: int = 1,
        block_size: int = 1024,
    ):
        self.sample_rate = sample_rate
        self.channels = channels
        self.block_size = block_size

        self._wake_word_callbacks: list[AudioCallback] = []
        self._stream_callbacks: list[AudioCallback] = []

    def add_wake_word_callback(self, callback: AudioCallback) -> None:
        """Add a wake word detection callback."""
        self._wake_word_callbacks.append(callback)

    def add_stream_callback(self, callback: AudioCallback) -> None:
        """Add an audio stream callback (audio sent to Home Assistant)."""
        self._stream_callbacks.append(callback)

    async def process_audio_chunk(self, audio_chunk: bytes) -> None:
        """Pass one audio chunk to the wake word callbacks, then the stream callbacks."""
        for callback in self._wake_word_callbacks + self._stream_callbacks:
            result = callback(audio_chunk)
            # The application registers coroutine functions, so await their results;
            # otherwise the coroutine would be created but never run.
            if inspect.isawaitable(result):
                await result
```

### 3.2 Voice Module (voice/)

**Responsibilities**:
- Wake word detection (local, offline)
- STT/TTS is handled by Home Assistant and is not part of this module

**Interface**:

```python
from abc import ABC, abstractmethod
from pathlib import Path

import numpy as np

WAKE_WORD_THRESHOLD = 0.5  # minimum score that counts as a detection


class WakeWordDetector(ABC):
    """Abstract base class for wake word detectors."""

    @abstractmethod
    async def load_model(self, model_path: str):
        """Load the wake word model."""

    @abstractmethod
    async def process_audio(self, audio_chunk: bytes) -> bool:
        """Process an audio chunk; return True if the wake word was detected."""


class MicroWakeWordDetector(WakeWordDetector):
    """microWakeWord detector (lightweight, suitable for Raspberry Pi)."""

    def __init__(self, model_path: str):
        self.model = None
        self.features = None
        self.model_path = Path(model_path)
        self._confidence = 0.0
        self._loaded = False

    async def load_model(self, model_path: str):
        """Load the microWakeWord model."""
        from pymicro_wakeword import MicroWakeWord, MicroWakeWordFeatures

        self.features = MicroWakeWordFeatures()
        self.model = MicroWakeWord.from_config(model_path)
        self._loaded = True

    async def process_audio(self, audio_chunk: bytes) -> bool:
        """Process an audio chunk of 16-bit PCM samples."""
        # Normalize int16 samples to floats in [-1.0, 1.0).
        audio_array = np.frombuffer(audio_chunk, dtype=np.int16).astype(np.float32) / 32768.0

        features = self.features.process_streaming(audio_array)
        for feature in features:
            score = self.model.process_streaming(feature)
            if score is not None and score >= WAKE_WORD_THRESHOLD:
                return True
        return False


class OpenWakeWordDetector(WakeWordDetector):
    """openWakeWord detector (wider choice of wake words)."""

    def __init__(self, model_path: str):
        self.model = None
        self.features = None
        self.model_path = Path(model_path)
        self._confidence = 0.0
        self._loaded = False

    async def load_model(self, model_path: str):
        """Load the openWakeWord model."""
        from pyopen_wakeword import OpenWakeWord, OpenWakeWordFeatures

        self.features = OpenWakeWordFeatures.from_builtin()
        self.model = OpenWakeWord(model_path)
        self._loaded = True

    async def process_audio(self, audio_chunk: bytes) -> bool:
        """Process an audio chunk of 16-bit PCM samples."""
        audio_array = np.frombuffer(audio_chunk, dtype=np.int16).astype(np.float32) / 32768.0

        features = self.features.process_streaming(audio_array)
        for feature in features:
            scores = self.model.process_streaming(feature)
            for score in scores:
                if score >= WAKE_WORD_THRESHOLD:
                    return True
        return False
```

### 3.3 Motion Module (motion/)

**Responsibilities**:
- Head motion control (6 degrees of freedom)
- Antenna control (2 antennas)
- Motion queue management
- Speech-reactive motion

**Interface**:

```python
import asyncio
from abc import ABC, abstractmethod

import numpy as np
from scipy.spatial.transform import Rotation as R


def _head_pose(x_deg: float, y_deg: float, z_deg: float) -> np.ndarray:
    """Build a 4x4 homogeneous pose from 'xyz' Euler angles in degrees."""
    pose = np.eye(4)
    pose[:3, :3] = R.from_euler('xyz', [x_deg, y_deg, z_deg], degrees=True).as_matrix()
    return pose


class MotionController(ABC):
    """Abstract base class for motion controllers."""

    @abstractmethod
    async def connect(self, host: str = 'localhost'):
        """Connect to the robot."""

    @abstractmethod
    async def disconnect(self):
        """Disconnect from the robot (called from ServerState.cleanup())."""

    @abstractmethod
    async def wake_up(self):
        """Wake up the robot."""

    @abstractmethod
    async def turn_off(self):
        """Turn off the robot."""

    @abstractmethod
    async def move_head(self, pose: np.ndarray, duration: float = 1.0):
        """Move the head to a pose."""

    @abstractmethod
    async def move_antennas(self, left: float, right: float, duration: float = 1.0):
        """Move the antennas."""

    @abstractmethod
    async def nod(self, count: int = 1, duration: float = 0.5):
        """Nod the head."""

    @abstractmethod
    async def shake(self, count: int = 1, duration: float = 0.5):
        """Shake the head."""

    @abstractmethod
    async def start_speech_reactive_motion(self):
        """Start speech-reactive motion."""

    @abstractmethod
    async def stop_speech_reactive_motion(self):
        """Stop speech-reactive motion."""


class ReachyMiniMotionController(MotionController):
    """Reachy Mini motion controller."""

    def __init__(self):
        self.reachy_mini = None
        self._connected = False
        self._speech_reactive = False
        self._speech_task = None

    async def connect(self, host: str = 'localhost'):
        """Connect to Reachy Mini."""
        from reachy_mini import ReachyMini

        self.reachy_mini = ReachyMini(host=host)
        self._connected = True

    async def disconnect(self):
        """Drop the SDK handle; no SDK call is made here."""
        self.reachy_mini = None
        self._connected = False

    async def wake_up(self):
        """Wake up the robot."""
        self.reachy_mini.wake_up()

    async def turn_off(self):
        """Turn off the robot."""
        self.reachy_mini.turn_off()

    async def move_head(self, pose: np.ndarray, duration: float = 1.0):
        """Move the head to a pose."""
        # goto_target is a synchronous call, so run it in a worker thread
        # to keep the event loop (and audio streaming) responsive.
        await asyncio.to_thread(self.reachy_mini.goto_target, head=pose, duration=duration)

    async def move_antennas(self, left: float, right: float, duration: float = 1.0):
        """Move the antennas."""
        await asyncio.to_thread(
            self.reachy_mini.goto_target, antennas=[left, right], duration=duration
        )

    async def nod(self, count: int = 1, duration: float = 0.5):
        """Nod the head: +15 degrees then -15 degrees about the x axis."""
        for _ in range(count):
            await self.move_head(_head_pose(15, 0, 0), duration=duration / 2)
            await self.move_head(_head_pose(-15, 0, 0), duration=duration / 2)

    async def shake(self, count: int = 1, duration: float = 0.5):
        """Shake the head: -20 degrees then +20 degrees about the z axis."""
        for _ in range(count):
            await self.move_head(_head_pose(0, 0, -20), duration=duration / 2)
            await self.move_head(_head_pose(0, 0, 20), duration=duration / 2)

    async def start_speech_reactive_motion(self):
        """Start speech-reactive motion (slight movement while speaking)."""
        self._speech_reactive = True
        self._speech_task = asyncio.create_task(self._speech_reactive_loop())

    async def stop_speech_reactive_motion(self):
        """Stop speech-reactive motion."""
        self._speech_reactive = False
        if self._speech_task:
            self._speech_task.cancel()
            self._speech_task = None

    async def _speech_reactive_loop(self):
        """Speech-reactive motion loop."""
        loop = asyncio.get_running_loop()
        while self._speech_reactive:
            # Small sinusoidal sway of +/-3 degrees about the z axis.
            sway = np.sin(loop.time() * 2) * 3
            await self.move_head(_head_pose(0, 0, sway), duration=0.1)
            await asyncio.sleep(0.1)
```

**Motion queue**:

```python
import asyncio
from typing import Optional

# Motion and MotionPriority are assumed to be defined elsewhere in motion/:
# a Motion exposes `priority` and an async `execute()` method.


class MotionQueue:
    """Motion queue manager."""

    def __init__(self):
        self.high_priority = asyncio.Queue()
        self.medium_priority = asyncio.Queue()
        self.low_priority = asyncio.Queue()
        self.is_running = False
        self._current_motion = None
        self._task = None

    async def add_motion(self, motion: "Motion"):
        """Add a motion to the queue that matches its priority."""
        if motion.priority == MotionPriority.HIGH:
            await self.high_priority.put(motion)
        elif motion.priority == MotionPriority.MEDIUM:
            await self.medium_priority.put(motion)
        elif motion.priority == MotionPriority.LOW:
            await self.low_priority.put(motion)

    async def start(self):
        """Start processing the motion queue."""
        self.is_running = True
        self._task = asyncio.create_task(self._process_queue())

    async def stop(self):
        """Stop processing the motion queue."""
        self.is_running = False
        if self._task:
            self._task.cancel()
            self._task = None

    async def _process_queue(self):
        """Process queued motions in priority order: HIGH > MEDIUM > LOW."""
        while self.is_running:
            motion = self._get_next_motion()

            if motion is None:
                # Nothing queued; poll again shortly.
                await asyncio.sleep(0.01)
                continue

            self._current_motion = motion
            try:
                await motion.execute()
            finally:
                self._current_motion = None

    def _get_next_motion(self) -> Optional["Motion"]:
        """Return the next motion, or None if all queues are empty."""
        for queue in (self.high_priority, self.medium_priority, self.low_priority):
            if not queue.empty():
                return queue.get_nowait()
        return None
```

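As an alternative to three queues plus polling, the same HIGH > MEDIUM > LOW ordering can be sketched with a single `asyncio.PriorityQueue`. The `Motion` and `MotionPriority` definitions below are hypothetical stand-ins, since the document does not define them, and `run_motions` exists only to demonstrate the ordering.

```python
import asyncio
import itertools
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Awaitable, Callable


class MotionPriority(IntEnum):
    """Hypothetical priority levels; the lowest value runs first."""
    HIGH = 0
    MEDIUM = 1
    LOW = 2


@dataclass(order=True)
class Motion:
    """Hypothetical motion item; only priority and sequence affect ordering."""
    priority: MotionPriority
    sequence: int
    name: str = field(compare=False)
    action: Callable[[], Awaitable[None]] = field(compare=False)


async def run_motions(requests: list[tuple[MotionPriority, str]]) -> list[str]:
    """Queue the requests, then execute them by priority (FIFO within a level)."""
    queue: asyncio.PriorityQueue = asyncio.PriorityQueue()
    counter = itertools.count()  # tie-breaker that preserves insertion order
    executed: list[str] = []

    def make_action(name: str) -> Callable[[], Awaitable[None]]:
        async def action() -> None:
            executed.append(name)  # a real action would call the robot SDK here
        return action

    for priority, name in requests:
        await queue.put(Motion(priority, next(counter), name, make_action(name)))

    while not queue.empty():
        motion = await queue.get()
        await motion.action()
        queue.task_done()
    return executed
```

With a priority queue the worker can simply `await queue.get()` instead of sleeping for 10 ms between checks, at the cost of needing an explicit sequence number to keep equal-priority motions in order.
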
### 3.4 ESPHome Module (esphome/)

**Responsibilities**:
- ESPHome protocol implementation
- Communication with Home Assistant
- Audio streaming
- Event handling

**Interface**:

```python
import asyncio
import logging
from enum import Enum
from typing import Callable

logger = logging.getLogger(__name__)


class VoiceAssistantEventType(Enum):
    """Voice assistant event types.

    Note: this is the simplified numbering used in this design. The ESPHome
    native API defines its own event values, which a real implementation
    should take from the protocol definition.
    """
    VOICE_ASSISTANT_START = 0
    VOICE_ASSISTANT_END = 1
    VOICE_ASSISTANT_ERROR = 2
    VOICE_ASSISTANT_STT_START = 3
    VOICE_ASSISTANT_STT_END = 4
    VOICE_ASSISTANT_TTS_START = 5
    VOICE_ASSISTANT_TTS_END = 6
    VOICE_ASSISTANT_WAKE_WORD_START = 9
    VOICE_ASSISTANT_WAKE_WORD_END = 10


class ESPHomeServer:
    """ESPHome protocol server."""

    def __init__(self, host: str = "0.0.0.0", port: int = 6053):
        self.host = host
        self.port = port
        self._server = None
        self._is_running = False
        self._clients = []
        self._audio_callback = None
        self._event_callback = None

    async def start(self):
        """Start the ESPHome server."""
        self._server = await asyncio.start_server(
            self._handle_client,
            self.host,
            self.port,
        )
        self._is_running = True

    async def stop(self):
        """Stop the ESPHome server."""
        self._is_running = False

        # Iterate over a copy: client handlers also modify the list.
        for client in list(self._clients):
            client.close()
        self._clients.clear()

        if self._server:
            self._server.close()
            await self._server.wait_closed()

    def set_audio_callback(self, callback: Callable[[bytes], None]):
        """Set the audio callback (receives TTS audio from Home Assistant)."""
        self._audio_callback = callback

    def set_event_callback(self, callback: Callable[[VoiceAssistantEventType, dict], None]):
        """Set the event callback (receives events from Home Assistant)."""
        self._event_callback = callback

    async def send_audio(self, audio_data: bytes):
        """Send audio data to Home Assistant (STT input)."""
        for client in list(self._clients):
            try:
                client.write(audio_data)
                await client.drain()
            except Exception as e:
                logger.error(f"Error sending audio to client: {e}")

    async def send_event(self, event_type: VoiceAssistantEventType, data: dict):
        """Send an event to Home Assistant."""
        if self._event_callback:
            self._event_callback(event_type, data)

    async def _handle_client(self, reader, writer):
        """Handle one client connection."""
        client_addr = writer.get_extra_info('peername')
        self._clients.append(writer)

        try:
            while self._is_running:
                data = await reader.read(4096)
                if not data:
                    break

                # Process data received from Home Assistant.
                await self._process_data(data)
        except Exception as e:
            logger.error(f"Error handling client {client_addr}: {e}")
        finally:
            # stop() may already have cleared the list.
            if writer in self._clients:
                self._clients.remove(writer)
            writer.close()
            await writer.wait_closed()

    async def _process_data(self, data: bytes):
        """Placeholder: ESPHome frame decoding belongs here.

        Until decoding is implemented, raw bytes go to the audio callback.
        """
        if self._audio_callback:
            self._audio_callback(data)


class VoiceSatelliteProtocol:
    """Voice satellite protocol handler."""

    def __init__(self, state: "ServerState"):
        self.state = state
        self._is_streaming = False
        self._refractory_period = 2.0  # seconds during which repeat detections are ignored
        self._last_wake_word_time = float("-inf")

    async def handle_audio(self, audio_chunk: bytes):
        """Handle an audio chunk (forward it to Home Assistant while streaming)."""
        if self._is_streaming and self.state.esphome_server:
            await self.state.esphome_server.send_audio(audio_chunk)

    async def handle_wake_word(self):
        """Handle a wake word detection."""
        current_time = asyncio.get_running_loop().time()

        # Ignore detections that fall inside the refractory period.
        if current_time - self._last_wake_word_time < self._refractory_period:
            return

        self._last_wake_word_time = current_time

        # Send the wake word event to Home Assistant.
        if self.state.esphome_server:
            await self.state.esphome_server.send_event(
                VoiceAssistantEventType.VOICE_ASSISTANT_WAKE_WORD_END,
                {"wake_word": "detected"},
            )

        # Start streaming audio.
        self._is_streaming = True

    async def stop_streaming(self):
        """Stop streaming audio."""
        self._is_streaming = False
```

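The server above writes raw bytes to the socket, whereas the ESPHome native API wraps every message in a frame. As a rough sketch of the plaintext (unencrypted) framing, which to my understanding consists of a zero preamble byte, the payload length as a varint, the message type as a varint, and then the protobuf payload, the helpers below encode and decode such frames. The function names are hypothetical, and the encrypted Noise transport is not covered.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf-style base-128 varint."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (value, next_offset) for the varint starting at `offset`."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7


def encode_frame(message_type: int, payload: bytes) -> bytes:
    """Build one plaintext frame: preamble, length, type, payload."""
    return b"\x00" + encode_varint(len(payload)) + encode_varint(message_type) + payload


def decode_frame(frame: bytes) -> tuple[int, bytes]:
    """Split one complete plaintext frame into (message_type, payload)."""
    if not frame or frame[0] != 0x00:
        raise ValueError("missing plaintext preamble byte")
    length, offset = decode_varint(frame, 1)
    message_type, offset = decode_varint(frame, offset)
    payload = frame[offset:offset + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return message_type, payload
```

In practice a library such as aioesphomeapi handles framing and the protobuf messages, so hand-written code like this is mainly useful for understanding or testing.
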
### 3.5 Configuration Module (config/)

**Responsibilities**:
- Configuration file management
- User preference storage
- Runtime configuration

**Interface**:

```python
import json
from pathlib import Path
from typing import Any


class ConfigManager:
    """Configuration manager."""

    def __init__(self, config_path: str = "config.json"):
        self.config_path = Path(config_path)
        self.config = self.load_config()

    def load_config(self) -> dict:
        """Load the configuration file, falling back to the defaults."""
        if self.config_path.exists():
            with open(self.config_path, 'r', encoding='utf-8') as f:
                return json.load(f)
        return self.get_default_config()

    def save_config(self):
        """Save the configuration file."""
        with open(self.config_path, 'w', encoding='utf-8') as f:
            json.dump(self.config, f, indent=2, ensure_ascii=False)

    def get_default_config(self) -> dict:
        """Return the default configuration."""
        return {
            "audio": {
                "input_device": None,
                "output_device": None,
                "sample_rate": 16000,
                "channels": 1,
                "block_size": 1024
            },
            "voice": {
                "wake_word": "okay_nabu",
                "wake_word_dirs": ["wakewords"]
            },
            "motion": {
                "enabled": True,
                "speech_reactive": True
            },
            "esphome": {
                "host": "0.0.0.0",
                "port": 6053,
                "name": "Reachy Mini"
            },
            "robot": {
                "host": "localhost",
                "wireless": False
            }
        }

    def get(self, key: str, default: Any = None) -> Any:
        """Get a configuration value; nested keys use dots, e.g. "audio.sample_rate"."""
        value: Any = self.config
        for k in key.split('.'):
            # Return the default as soon as any part of the path is missing.
            if not isinstance(value, dict) or k not in value:
                return default
            value = value[k]
        return value

    def set(self, key: str, value: Any):
        """Set a configuration value (nested keys supported) and save the file."""
        keys = key.split('.')
        config = self.config
        for k in keys[:-1]:
            config = config.setdefault(k, {})
        config[keys[-1]] = value
        self.save_config()
```

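One gap worth noting: `load_config` returns the file contents as they are, so a file that only sets a few keys loses every other default. A possible remedy, sketched below with a hypothetical `merge_with_defaults` helper that is not part of `ConfigManager`, is to layer the user's values over the defaults recursively.

```python
import copy
from typing import Any


def merge_with_defaults(defaults: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict with `overrides` layered over `defaults`, recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge key by key.
            merged[key] = merge_with_defaults(merged[key], value)
        else:
            # Scalars, lists, and new keys replace the default outright.
            merged[key] = copy.deepcopy(value)
    return merged
```

For example, a user file containing only `{"audio": {"channels": 2}}` would keep the default `sample_rate` and `esphome.port` while overriding the channel count.
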
### 3.6 State Management (state.py)

**Responsibilities**:
- Global state management
- Component lifecycle management

**Interface**:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from queue import Queue  # assumed thread-safe queue; audio drivers often call back from another thread
from typing import Callable, Optional


@dataclass
class ServerState:
    """Global server state."""
    name: str

    # Configuration
    config: Optional[ConfigManager] = None

    # Audio
    microphone: Optional[MicrophoneArray] = None
    speaker: Optional[Speaker] = None
    audio_processor: Optional[AudioProcessor] = None  # used by the main application
    audio_queue: Queue = field(default_factory=Queue)

    # Voice
    wake_word_detector: Optional[WakeWordDetector] = None
    active_wake_words: list = field(default_factory=list)

    # Motion
    motion_controller: Optional[MotionController] = None
    motion_queue: Optional[MotionQueue] = None

    # ESPHome
    esphome_server: Optional[ESPHomeServer] = None
    voice_satellite: Optional[VoiceSatelliteProtocol] = None

    # Status flags
    is_running: bool = False
    is_streaming: bool = False

    # Callbacks
    on_wake_word: Optional[Callable] = None
    on_stt_result: Optional[Callable] = None
    on_tts_audio: Optional[Callable] = None

    async def initialize(self, config: ConfigManager) -> None:
        """Minimal sketch: store the config and create the audio processor.

        The other components are assumed to be constructed and assigned
        by the application before it starts.
        """
        self.config = config
        self.audio_processor = AudioProcessor(
            sample_rate=config.get("audio.sample_rate", 16000),
            channels=config.get("audio.channels", 1),
            block_size=config.get("audio.block_size", 1024),
        )
        self.is_running = True

    async def cleanup(self):
        """Release resources."""
        if self.microphone:
            await self.microphone.stop_recording()

        if self.motion_controller:
            await self.motion_controller.stop_speech_reactive_motion()
            await self.motion_controller.turn_off()
            await self.motion_controller.disconnect()

        if self.motion_queue:
            await self.motion_queue.stop()

        if self.esphome_server:
            await self.esphome_server.stop()

        self.is_running = False
```

### 3.7 Main Application (app.py)

**Responsibilities**:
- Application lifecycle management
- Component initialization and coordination
- Event handling

**Interface**:

```python
import asyncio
from typing import Optional


class ReachyMiniVoiceApp:
    """Main application class."""

    def __init__(
        self,
        name: str,
        config: ConfigManager,
        audio_input_device: Optional[str] = None,
        audio_output_device: Optional[str] = None,
        wake_model: Optional[str] = None,
        wake_word_dirs: Optional[list] = None,
        host: str = "0.0.0.0",
        port: int = 6053,
        robot_host: str = "localhost",
        wireless: bool = False,
        gradio: bool = False,
    ):
        self.name = name
        self.config = config
        self.audio_input_device = audio_input_device
        self.audio_output_device = audio_output_device
        self.wake_model = wake_model
        self.wake_word_dirs = wake_word_dirs
        self.host = host
        self.port = port
        self.robot_host = robot_host
        self.wireless = wireless
        self.gradio = gradio

        self.state = ServerState(name)
        self._is_running = False
        self._zeroconf = None   # kept so that stop() can unregister the service
        self._mdns_info = None

    async def start(self):
        """Start the application."""
        # Initialize state.
        await self.state.initialize(self.config)

        # Set up callbacks.
        self._setup_callbacks()

        # Start audio recording.
        await self.state.microphone.start_recording(
            self.audio_input_device,
            self._audio_callback,
            sample_rate=self.config.get("audio.sample_rate", 16000),
            channels=self.config.get("audio.channels", 1),
            block_size=self.config.get("audio.block_size", 1024),
        )

        # Start the ESPHome server.
        await self.state.esphome_server.start()

        # Register for mDNS discovery.
        await self._register_mdns()

        self._is_running = True

        # Keep running until stop() is called.
        while self._is_running:
            await asyncio.sleep(1)

    async def stop(self):
        """Stop the application."""
        self._is_running = False
        if self._zeroconf is not None:
            await self._zeroconf.async_unregister_service(self._mdns_info)
            await self._zeroconf.async_close()
            self._zeroconf = None
        await self.state.cleanup()

    def _setup_callbacks(self):
        """Register the audio callbacks."""
        self.state.audio_processor.add_wake_word_callback(self._on_audio_chunk)
        self.state.audio_processor.add_stream_callback(self._on_stream_audio)

    async def _audio_callback(self, audio_chunk: bytes):
        """Audio recording callback."""
        await self.state.audio_processor.process_audio_chunk(audio_chunk)

    async def _on_audio_chunk(self, audio_chunk: bytes):
        """Wake word detection callback."""
        if self.state.wake_word_detector:
            detected = await self.state.wake_word_detector.process_audio(audio_chunk)
            if detected:
                await self._on_wake_word_detected()

    async def _on_stream_audio(self, audio_chunk: bytes):
        """Audio streaming callback (audio sent to Home Assistant)."""
        if self.state.voice_satellite:
            await self.state.voice_satellite.handle_audio(audio_chunk)

    async def _on_wake_word_detected(self):
        """Called when the wake word has been detected."""
        # Nod to acknowledge.
        if self.state.motion_controller:
            await self.state.motion_controller.nod(count=1, duration=0.3)

        # Trigger the voice satellite.
        if self.state.voice_satellite:
            await self.state.voice_satellite.handle_wake_word()

    async def handle_tts_audio(self, audio_data: bytes):
        """Handle TTS audio from Home Assistant by playing it on the speaker."""
        if self.state.speaker:
            await self.state.speaker.play_audio(
                audio_data,
                self.audio_output_device,
                sample_rate=self.config.get("audio.sample_rate", 16000),
                channels=self.config.get("audio.channels", 1),
            )

    async def handle_stt_result(self, text: str):
        """Handle an STT result from Home Assistant."""
        # Process the text (add custom logic here).
        pass

    async def _register_mdns(self):
        """Register the mDNS service so Home Assistant can discover the device."""
        from zeroconf import ServiceInfo
        from zeroconf.asyncio import AsyncZeroconf

        self._mdns_info = ServiceInfo(
            "_esphomelib._tcp.local.",
            f"{self.name}._esphomelib._tcp.local.",
            addresses=[],  # empty as in the original design; a deployment must advertise the host IP
            port=self.port,
            properties={
                "version": "1.0",
                "name": self.name,
                "platform": "reachy_mini",
            },
        )

        # The async API avoids blocking the event loop during registration.
        self._zeroconf = AsyncZeroconf()
        await self._zeroconf.async_register_service(self._mdns_info)
```

## 4. Data Flow

### 4.1 Audio Input Flow

```
Microphone array (4 microphones)
           ↓ (16 kHz PCM)
Audio chunk (1024 samples)
           ↓
┌─────────────────────┐
│ Wake word detection │
│ (micro/oww)         │
└──────────┬──────────┘
           │
           ↓ (wake word detected)
Trigger wake event
           │
           ↓
┌─────────────────────┐
│ Start streaming     │
│ (ESPHome)           │
└──────────┬──────────┘
           │
           ↓
┌─────────────────────┐
│ Send to HA          │
│ (STT input)         │
└─────────────────────┘
```

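The step from "wake word detected" to "start streaming" is gated by a refractory period (2.0 seconds in `VoiceSatelliteProtocol`). The sketch below isolates that gating logic so it can be tested without an event loop; `WakeWordGate` and its method names are hypothetical, and the caller supplies the current time.

```python
class WakeWordGate:
    """Hypothetical helper: ignores detections inside the refractory window."""

    def __init__(self, refractory_period: float = 2.0) -> None:
        self.refractory_period = refractory_period
        self.is_streaming = False
        self._last_trigger = float("-inf")  # so the first detection always passes

    def on_detection(self, now: float) -> bool:
        """Return True and start streaming if the detection is accepted."""
        if now - self._last_trigger < self.refractory_period:
            return False
        self._last_trigger = now
        self.is_streaming = True
        return True

    def on_pipeline_end(self) -> None:
        """Stop streaming once Home Assistant reports the end of the run."""
        self.is_streaming = False
```

Passing the timestamp in, rather than reading a clock inside the class, keeps the logic deterministic under test.
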
### 4.2 Audio Output Flow

```
Home Assistant (TTS output)
           ↓
┌─────────────────────┐
│ ESPHome server      │
│ (receive audio)     │
└──────────┬──────────┘
           │
           ↓
┌─────────────────────┐
│ Play audio          │
│ (speaker)           │
└─────────────────────┘
```

### 4.3 Motion Control Flow

```
Wake word detection / STT result / TTS event
             ↓
┌─────────────────────────┐
│ Motion queue management │
└────────────┬────────────┘
             │
             ↓
┌─────────────────────────┐
│ High-priority motion    │
│ (wake word ack)         │
└────────────┬────────────┘
             │
             ↓
┌─────────────────────────┐
│ Medium-priority motion  │
│ (user commands)         │
└────────────┬────────────┘
             │
             ↓
┌─────────────────────────┐
│ Low-priority motion     │
│ (speech reaction)       │
└────────────┬────────────┘
             │
             ↓
       Execute motion
             │
             ↓
┌─────────────────────────┐
│ Reachy Mini             │
│ SDK                     │
└─────────────────────────┘
```

## 5. Dependencies

### 5.1 Core Dependencies

```toml
dependencies = [
    # Reachy Mini SDK
    "reachy-mini",

    # Audio processing
    "sounddevice>=0.4.6",
    "numpy>=1.24.0",

    # Voice processing
    "pymicro-wakeword>=2.0.0,<3.0.0",
    "pyopen-wakeword>=1.0.0,<2.0.0",

    # ESPHome
    "aioesphomeapi>=42.0.0",
    "zeroconf>=0.100.0",

    # Motion control
    "scipy>=1.10.0",

    # Web UI (optional)
    "gradio>=4.0.0",
]
```

### 5.2 Optional Dependencies

```toml
[project.optional-dependencies]
wireless = [
    "reachy-mini[wireless]",
]

vision = [
    "pollen-vision",
    "opencv-python>=4.8.0",
    "mediapipe>=0.10.0",
]

dev = [
    "pytest>=7.4.0",
    "pytest-asyncio>=0.21.0",
    "ruff>=0.1.0",
]
```

## 6. Performance Optimization

### 6.1 Audio Processing
- Use asynchronous I/O to reduce blocking
- Tune the audio chunk size (1024 samples)
- Use numpy to accelerate numerical computation
- Preallocate buffers to reduce memory allocation

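To make the chunk-size choice concrete, the arithmetic below shows what 1024 samples means at the 16 kHz mono, 16-bit format used throughout this document. The helper names are hypothetical.

```python
def chunk_duration_ms(block_size: int, sample_rate: int) -> float:
    """Duration of one audio chunk in milliseconds."""
    return 1000.0 * block_size / sample_rate


def chunk_size_bytes(block_size: int, channels: int = 1, bytes_per_sample: int = 2) -> int:
    """Size of one chunk in bytes (16-bit PCM uses 2 bytes per sample)."""
    return block_size * channels * bytes_per_sample
```

A 1024-sample chunk at 16 kHz therefore lasts 64 ms and occupies 2048 bytes, which means about 15.6 chunks per second have to pass through wake word detection and streaming.
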
### 6.2 Motion Control
- Priority-based motion queue management
- Smooth motion interpolation
- Merging of batched motion commands
- Latency budget management

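The document does not say which interpolation scheme is used for smooth motion. One common choice, shown here purely as an illustration, is the minimum-jerk profile $s(\tau) = 10\tau^3 - 15\tau^4 + 6\tau^5$, which starts and ends with zero velocity and zero acceleration. Both function names are hypothetical.

```python
def minimum_jerk(tau: float) -> float:
    """Minimum-jerk progress for normalized time tau, clamped to [0, 1]."""
    tau = min(1.0, max(0.0, tau))
    return 10 * tau**3 - 15 * tau**4 + 6 * tau**5


def interpolate_angle(start: float, end: float, elapsed: float, duration: float) -> float:
    """Angle at time `elapsed` of a move from `start` to `end` lasting `duration`."""
    if duration <= 0:
        return end
    return start + (end - start) * minimum_jerk(elapsed / duration)
```

Sampling `interpolate_angle` at the control rate yields intermediate targets that ease in and out instead of starting and stopping abruptly.
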
### 6.3 Network
- ESPHome connection pooling
- Batched message sending
- Audio data compression
- Heartbeat detection

## 7. Security Considerations

1. **Audio privacy**:
   - Do not store user audio (unless explicitly authorized)
   - Prefer local processing
   - Encrypt data in transit

2. **Motion safety**:
   - Angle limits
   - Speed limits
   - Collision detection
   - Emergency stop

3. **Network security**:
   - ESPHome authentication
   - TLS encryption
   - Firewall configuration
   - Access control

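The angle and speed limits listed under motion safety can be enforced before a command reaches the robot. The sketch below is an assumption-laden example: the function name is hypothetical, and the default limits (±40 degrees, 90 degrees per second) are placeholder values, not Reachy Mini specifications.

```python
def clamp_target(
    current_deg: float,
    target_deg: float,
    dt: float,
    min_deg: float = -40.0,         # placeholder limit, not a hardware spec
    max_deg: float = 40.0,          # placeholder limit, not a hardware spec
    max_speed_deg_s: float = 90.0,  # placeholder limit, not a hardware spec
) -> float:
    """Return the next safe angle after applying the angle and speed limits."""
    # Angle limit: keep the target inside the allowed range.
    limited = max(min_deg, min(max_deg, target_deg))
    # Speed limit: move at most max_speed * dt degrees in this step.
    max_step = max_speed_deg_s * dt
    step = max(-max_step, min(max_step, limited - current_deg))
    return current_deg + step
```

Calling this once per control cycle turns an out-of-range or too-fast request into a sequence of small, bounded steps.
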
## 8. Deployment

### 8.1 Installation Steps

```bash
# Create a virtual environment
python -m venv .venv
source .venv/bin/activate

# Install the base dependencies
pip install -e .

# Install the optional dependencies (quoted so shells such as zsh do not expand the brackets)
pip install -e ".[wireless,vision,dev]"
```

### 8.2 Running

```bash
# Start the application
python -m reachy_mini_ha_voice

# Start with the Web UI
python -m reachy_mini_ha_voice --gradio

# Start the wireless version
python -m reachy_mini_ha_voice --wireless
```

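The flags above could be parsed with `argparse` along the following lines. Only `--gradio` and `--wireless` come from the commands shown; the `--host` and `--port` options and the `build_parser` name are assumptions added for illustration, mirroring the ESPHome defaults from the configuration module.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command-line parser for the voice assistant entry point."""
    parser = argparse.ArgumentParser(prog="reachy_mini_ha_voice")
    parser.add_argument("--gradio", action="store_true", help="start the Gradio web UI")
    parser.add_argument("--wireless", action="store_true", help="use the wireless robot variant")
    # Hypothetical options; defaults follow the ESPHome section of the default config.
    parser.add_argument("--host", default="0.0.0.0", help="address the ESPHome server binds to")
    parser.add_argument("--port", type=int, default=6053, help="ESPHome server port")
    return parser
```

The parsed values map directly onto the `gradio`, `wireless`, `host`, and `port` parameters of `ReachyMiniVoiceApp`.
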
### 8.3 Home Assistant Integration

1. Add the ESPHome integration in Home Assistant
2. Enter the IP address and port (6053) of the Reachy Mini
3. Configure the STT/TTS services
4. Create automations and scripts