Spaces: ChambreAgriculturePaysLoire/routeur_ia_api


Cyril Dupland committed on Mar 2

Commit 49420f1

1 Parent(s): 767c8ff

feat voice: update documentation for `trigger_on_push` mode to include JSON message format for triggering flush. Simplify response handling in LangGraphProcessor by sending complete responses in a single block instead of segments, enhancing efficiency in transcript delivery.


Files changed (2)

docs/VOICE_CLIENT_INTEGRATION.md +16 -0
services/voice/langgraph_processor.py +8 -63

docs/VOICE_CLIENT_INTEGRATION.md

@@ -464,6 +464,22 @@ callObject.on("left-meeting", () => stopTranscriptPolling());
 In `trigger_on_push` mode, the server buffers STT segments until the `flush`, but in parallel it sends each segment back **as soon as it is recognized** via the transport's application channel.
 - With SmallWebRTC: segments are sent over the `pipecat-app` **data channel**.
 - With Daily: segments arrive as **app-messages** (`callObject.on("app-message")`).

 In `trigger_on_push` mode, the server buffers STT segments until the `flush`, but in parallel it sends each segment back **as soon as it is recognized** via the transport's application channel.
+To trigger this `flush`, the client must send a **JSON application message with `type` equal to `"flush"`** on the same channel as the other application messages:
+- With SmallWebRTC: over the `pipecat-app` data channel:
+  ```javascript
+  dc.send(JSON.stringify({ type: "flush" }));
+  ```
+- With Daily: as an app-message:
+  ```javascript
+  callObject.sendAppMessage({ type: "flush" }, "*");
+  ```
+> ⚠️ Other values such as `"trigger_flush"` are **not** interpreted by the server and do not trigger the flush.
 - With SmallWebRTC: segments are sent over the `pipecat-app` **data channel**.
 - With Daily: segments arrive as **app-messages** (`callObject.on("app-message")`).
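
The documented rule, that only a message whose `type` is exactly `"flush"` triggers the flush, could be checked on the receiving side roughly as follows. This is a minimal sketch and not the repository's handler: `is_flush_message` is a hypothetical helper name, and it assumes the application message arrives either as a JSON string or bytes (data channel) or as an already-decoded dict (app-message).

```python
import json
from typing import Any, Union


def is_flush_message(raw: Union[str, bytes, dict]) -> bool:
    """Return True only for an application message whose `type` is exactly "flush".

    Hypothetical helper for illustration; the real server-side check may differ.
    """
    if isinstance(raw, (str, bytes)):
        try:
            message: Any = json.loads(raw)
        except ValueError:
            # Covers malformed JSON and undecodable bytes.
            return False
    else:
        message = raw
    # Anything that is not a JSON object, or has another `type`, is ignored.
    return isinstance(message, dict) and message.get("type") == "flush"
```

Under these assumptions, `{"type": "flush"}` is accepted while `{"type": "trigger_flush"}` is rejected, which matches the warning in the documentation change.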

services/voice/langgraph_processor.py

@@ -1,7 +1,7 @@
 """Pipecat FrameProcessor that routes transcribed text through the LangGraph agent."""
 import re
 import logging
-from typing import Optional, List
 from pipecat.processors.frame_processor import FrameProcessor
 from pipecat.frames.frames import (
@@ -56,19 +56,18 @@ class LangGraphProcessor(FrameProcessor):
                     clean = self._clean_response_for_tts(response)
                     logger.info("Sending to TTS (cleaned): %s", clean)
-                    segments = self._split_into_segments(clean)
-                    logger.info("Split response into %d segment(s)", len(segments))
                     await self.push_frame(LLMFullResponseStartFrame())
-                    for idx, segment in enumerate(segments, start=1):
-                        await self.push_frame(TextFrame(segment))
                         await self.push_frame(
                             OutputTransportMessageFrame(
                                 {
                                     "type": "assistant_segment",
-                                    "text": segment,
-                                    "segment_index": idx,
-                                    "total_segments": len(segments),
                                     "conversation_id": self.conversation_id,
                                 }
                             )
@@ -120,57 +119,3 @@ class LangGraphProcessor(FrameProcessor):
         clean = re.sub(r"\s+", " ", clean)
         return clean.strip()
-    # ------------------------------------------------------------------
-    # Segmentation helpers
-    # ------------------------------------------------------------------
-    @staticmethod
-    def _split_into_segments(text: str) -> List[str]:
-        """Split cleaned assistant text into sentence-like segments.
-        Uses simple punctuation-based splitting on `.`, `?`, `!` and
-        falls back to a single-segment list if no punctuation is found.
-        Very short segments are merged back into neighbours.
-        """
-        if not text:
-            return []
-        # Split on end-of-sentence punctuation followed by whitespace.
-        parts = re.split(r"(?<=[.?!])\s+", text)
-        parts = [p.strip() for p in parts if p and p.strip()]
-        if not parts:
-            return []
-        if len(parts) == 1:
-            return parts
-        # Merge very short trailing segments into the previous one to avoid noise.
-        merged: List[str] = []
-        buffer = ""
-        for idx, part in enumerate(parts):
-            if buffer:
-                candidate = buffer + " " + part
-            else:
-                candidate = part
-            # Heuristic: keep segments with at least ~10 characters,
-            # otherwise merge with the next piece.
-            if len(candidate) < 10 and idx < len(parts) - 1:
-                buffer = candidate
-                continue
-            if buffer and candidate is not buffer:
-                merged.append(candidate)
-                buffer = ""
-            else:
-                merged.append(part)
-                buffer = ""
-        if buffer:
-            if merged:
-                merged[-1] = merged[-1] + " " + buffer
-            else:
-                merged.append(buffer)
-        return merged

 """Pipecat FrameProcessor that routes transcribed text through the LangGraph agent."""
 import re
 import logging
+from typing import Optional
 from pipecat.processors.frame_processor import FrameProcessor
 from pipecat.frames.frames import (
                     clean = self._clean_response_for_tts(response)
                     logger.info("Sending to TTS (cleaned): %s", clean)
                     await self.push_frame(LLMFullResponseStartFrame())
+                    if clean:
+                        # Send the complete response to the TTS as a single block…
+                        await self.push_frame(TextFrame(clean))
+                        # …and, in parallel, to the client via a single transport message.
                         await self.push_frame(
                             OutputTransportMessageFrame(
                                 {
                                     "type": "assistant_segment",
+                                    "text": clean,
+                                    "segment_index": 1,
+                                    "total_segments": 1,
                                     "conversation_id": self.conversation_id,
                                 }
                             )
         clean = re.sub(r"\s+", " ", clean)
         return clean.strip()
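
Because the new code still emits `assistant_segment` messages with `segment_index` and `total_segments` (now always `1`), a client that reassembles segments should keep working unchanged. The sketch below illustrates one way a consumer might handle both the earlier multi-segment messages and the new single-block message. `AssistantSegmentCollector` is a hypothetical name, the field names are taken from the payload in the diff, and the sketch assumes messages arrive as decoded dicts.

```python
from typing import Dict, Optional


class AssistantSegmentCollector:
    """Collect `assistant_segment` messages and return the full text once complete.

    Hypothetical client-side helper; not part of the repository.
    """

    def __init__(self) -> None:
        # conversation_id -> {segment_index: text}
        self._pending: Dict[Optional[str], Dict[int, str]] = {}

    def add(self, message: dict) -> Optional[str]:
        """Store one message; return the joined text when all segments are in."""
        if message.get("type") != "assistant_segment":
            return None
        conversation_id = message.get("conversation_id")
        total = int(message.get("total_segments", 1))
        index = int(message.get("segment_index", 1))
        parts = self._pending.setdefault(conversation_id, {})
        parts[index] = message.get("text", "")
        if len(parts) < total:
            return None  # still waiting for more segments
        del self._pending[conversation_id]
        # Join in index order, whatever the arrival order was.
        return " ".join(parts[i] for i in sorted(parts))
```

With the single-block format, the first call to `add` already returns the complete text, since `total_segments` is `1`.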