Cyril Dupland
Add sources parameter to completion requests and update documentation: Enhance the CompletionRequest model to include an optional sources field for document filtering. Update related API examples and documentation to reflect this change, ensuring clarity on usage and implications for project document retrieval.
# API Usage Examples

## Table of contents

1. [Authentication](#authentication)
2. [Completion](#completion)
3. [Transcription](#transcription)
4. [Models and Agents](#models-and-agents)
5. [WebSocket](#websocket)
6. [Advanced examples](#advanced-examples)

## Points to note

- HTTP completion now uses `agent` (e.g. `V1`, `V2`).
- The voice pipeline temporarily keeps an internal field named `agent_type` in its events (`services/voice/voice_pipeline.py`) for compatibility. It is planned to be aligned with `agent` later.
## Authentication

### Obtain a JWT token

**Request:**

```bash
curl -X POST http://localhost:7860/auth/token \
  -H "Content-Type: application/json"
```

**Response:**

```json
{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiJ1c2VyIiwidHlwZSI6ImFjY2VzcyIsImV4cCI6MTcwNjEyMzQ1Nn0.abc123...",
  "token_type": "bearer",
  "expires_in": 3600
}
```

### Verify a token

**Request:**

```bash
curl -X GET http://localhost:7860/auth/verify \
  -H "Authorization: Bearer <votre-token>"
```

**Response:**

```json
{
  "valid": true,
  "user": {
    "sub": "user",
    "type": "access",
    "exp": 1706123456
  }
}
```
## Completion

### Simple completion (non-streaming)

**Request:**

```bash
curl -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Explique-moi la théorie de la relativité en 2 phrases",
    "model": "gpt-4o",
    "agent": "V2",
    "stream": false,
    "temperature": 0.7
  }'
```

**Response:**

```json
{
  "response": "La théorie de la relativité d'Einstein comprend deux parties: la relativité restreinte (1905) qui établit que la vitesse de la lumière est constante et que le temps et l'espace sont relatifs, et la relativité générale (1915) qui décrit la gravitation comme une courbure de l'espace-temps causée par la masse et l'énergie. Ces théories ont révolutionné notre compréhension de l'univers et sont confirmées par de nombreuses expériences.",
  "model": "gpt-4o",
  "agent": "V2",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 98,
    "total_tokens": 123
  },
  "metadata": {
    "message_count": 2
  }
}
```

### Completion with streaming (SSE)

**Request:**

```bash
curl -N -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Raconte-moi une courte histoire",
    "model": "gpt-3.5-turbo",
    "stream": true
  }'
```

**Response (Server-Sent Events):**

```
data: {"content": "Il", "done": false, "metadata": {"model": "gpt-3.5-turbo", "agent": "V2"}}
data: {"content": " était", "done": false, "metadata": {"model": "gpt-3.5-turbo", "agent": "V2"}}
data: {"content": " une", "done": false, "metadata": {"model": "gpt-3.5-turbo", "agent": "V2"}}
...
data: {"content": "", "done": true, "metadata": {"model": "gpt-3.5-turbo", "agent": "V2"}}
```
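The event stream above can be consumed line by line. The following is a minimal sketch, assuming each event arrives as a single `data: <json>` line as in the example; the helper names `iter_sse_events` and `collect_text` are illustrative and not part of the API.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line, stopping after `done: true`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        event = json.loads(line[len("data:"):].strip())
        yield event
        if event.get("done"):
            break  # the last event carries no content, only final metadata


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the `content` fragments of a stream into the full answer."""
    return "".join(event.get("content", "") for event in iter_sse_events(lines))
```

With `requests`, the decoded lines from `response.iter_lines()` could be passed directly to `collect_text`.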
### Project completion with a document filter

**Request:**

```bash
curl -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Fais une synthèse des enjeux du projet",
    "model": "mistral-large-latest",
    "agent": "V2",
    "stream": false,
    "project_id": "proj-abc123",
    "sources": [
      "8a5b1de7-4ac3-4f45-b0f9-2e9571f6f3d1",
      "f131e918-c17d-4e9f-82e1-67fa65b2366e"
    ]
  }'
```

`sources` is optional. If it is absent or empty (`[]`), no restriction is applied to the project's documents.
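Because an absent and an empty `sources` behave the same way, a client can simply omit the field when it has no filter. A minimal sketch of a payload builder follows; the function name `build_completion_payload` and its defaults are hypothetical, only the field names come from the request above.

```python
from typing import Optional, Sequence


def build_completion_payload(
    message: str,
    model: str,
    project_id: Optional[str] = None,
    sources: Optional[Sequence[str]] = None,
    agent: str = "V2",
    stream: bool = False,
) -> dict:
    """Build the JSON body for POST /completion."""
    payload = {"message": message, "model": model, "agent": agent, "stream": stream}
    if project_id is not None:
        payload["project_id"] = project_id
    if sources:
        # Only sent when non-empty: None and [] both mean "no restriction".
        payload["sources"] = list(sources)
    return payload
```

The resulting dict can be passed as `json=` to `requests.post`.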
#### Carbon footprint, latency, pricing, and equivalence fields

Responses now include carbon-impact metrics computed with ecologits, alongside latency, pricing, and equivalence fields.

- Non-streaming (`metadata` field):

```json
{
  "metadata": {
    "message_count": 4,
    "latency_s": 1.23,
    "emissions_kgCO2eq": 0.00042,
    "emissions_gCO2eq": 0.42,
    "pricing": {
      "currency": "EUR",
      "total_cost": 0.0031,
      "by_model": {
        "mistral-large-latest": {"input": 0.0005, "output": 0.0026, "total": 0.0031}
      }
    },
    "equivalences": {
      "water_liters": 0.3,
      "car_km": 0.002,
      "tgv_km": 0.01,
      "smartphone_charges": 0.04
    }
  }
}
```

- Streaming (last event, `metadata` field):

```json
{
  "content": "",
  "done": true,
  "metadata": {
    "model": "mistral-large-latest",
    "agent": "V2",
    "usage": {"input_tokens": 123, "output_tokens": 456, "total_tokens": 579},
    "usage_by_model": {
      "mistral-large-latest": {"input_tokens": 123, "output_tokens": 456, "total_tokens": 579}
    },
    "latency_s": 1.23,
    "emissions_kgCO2eq": 0.00042,
    "emissions_gCO2eq": 0.42,
    "pricing": {
      "currency": "EUR",
      "total_cost": 0.0031,
      "by_model": {
        "mistral-large-latest": {"input": 0.0005, "output": 0.0026, "total": 0.0031}
      }
    },
    "equivalences": {
      "water_liters": 0.3,
      "car_km": 0.002,
      "tgv_km": 0.01,
      "smartphone_charges": 0.04
    }
  }
}
```
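Since both the non-streaming response and the last streaming event expose the same `metadata` keys, one helper can read either. The sketch below is illustrative: `summarize_metadata` is a hypothetical name, and treating every field as possibly missing is an assumption made for robustness, not documented behaviour.

```python
def summarize_metadata(metadata: dict) -> dict:
    """Extract latency, emissions, and cost from a `metadata` object."""
    pricing = metadata.get("pricing") or {}
    grams = metadata.get("emissions_gCO2eq")
    kilograms = metadata.get("emissions_kgCO2eq")
    if grams is None and kilograms is not None:
        grams = kilograms * 1000  # derive grams when only kilograms are present
    return {
        "latency_s": metadata.get("latency_s"),
        "emissions_gCO2eq": grams,
        "total_cost": pricing.get("total_cost"),
        "currency": pricing.get("currency"),
        "models": sorted((pricing.get("by_model") or {}).keys()),
    }
```

For a non-streaming call this would be applied to `response.json()["metadata"]`; for a stream, to the `metadata` of the event whose `done` is `true`.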
### Completion with conversation history

**Request:**

```bash
curl -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Et en Python?",
    "model": "gpt-4o",
    "stream": false,
    "conversation_history": [
      {
        "role": "user",
        "content": "Comment faire une boucle en JavaScript?"
      },
      {
        "role": "assistant",
        "content": "En JavaScript, vous pouvez utiliser: for (let i = 0; i < 10; i++) { console.log(i); }"
      }
    ]
  }'
```
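The request above sends earlier turns explicitly in `conversation_history`. A minimal sketch of how a client might accumulate those turns between calls is shown below; it assumes the client is responsible for keeping the history (the source does not say whether the server also stores it), and `append_turn` is a hypothetical helper.

```python
from typing import Dict, List

Turn = Dict[str, str]


def append_turn(history: List[Turn], user_message: str, assistant_response: str) -> List[Turn]:
    """Return a new history with the latest exchange added, leaving the input untouched."""
    return history + [
        {"role": "user", "content": user_message},
        {"role": "assistant", "content": assistant_response},
    ]
```

After each completion, the returned list would be sent as `conversation_history` with the next `message`.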
### Using Mistral AI

**Request:**

```bash
curl -X POST http://localhost:7860/completion \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Quelle est la capitale de la France?",
    "model": "mistral-large-latest",
    "stream": false
  }'
```
## Transcription

### Transcribe an audio file

**Request:**

```bash
curl -X POST http://localhost:7860/transcription \
  -H "Authorization: Bearer <token>" \
  -F "file=@audio.mp3"
```

**Response:**

```json
{
  "text": "Bonjour, ceci est un test de transcription audio avec Whisper.",
  "language": "fr",
  "duration": 3.5,
  "model": "whisper-1"
}
```

### Transcribe a meeting with a dedicated model

**Request:**

```bash
curl -X POST http://localhost:7860/transcription/meeting \
  -H "Authorization: Bearer <token>" \
  -F "project_id=proj-abc123" \
  -F "max_duration_seconds=3600" \
  -F "file=@meeting_recording.mp3"
```

**Response (example):**

```json
{
  "text": "Compte rendu structuré de la réunion ...",
  "language": "fr",
  "duration": 45.2,
  "model": "gpt-4o-transcribe"
}
```

### Transcribe with a specified language

**Request:**

```bash
curl -X POST "http://localhost:7860/transcription?language=en" \
  -H "Authorization: Bearer <token>" \
  -F "file=@english_audio.wav"
```

### Supported audio formats

**Request:**

```bash
curl -X GET http://localhost:7860/transcription/supported-formats \
  -H "Authorization: Bearer <token>"
```

**Response:**

```json
{
  "supported_formats": ["mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"],
  "max_file_size_mb": 25,
  "model": "whisper-1",
  "languages": "Auto-detection or specify ISO-639-1 code"
}
```
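The `supported_formats` and `max_file_size_mb` values can be used to reject a file locally before uploading it. The sketch below is illustrative only: `check_audio_file` is a hypothetical helper, and it assumes one megabyte means 1024 × 1024 bytes, which the response does not specify.

```python
from pathlib import Path
from typing import List, Sequence


def check_audio_file(
    path: str,
    size_bytes: int,
    supported_formats: Sequence[str],
    max_file_size_mb: float,
) -> List[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(path).suffix.lower().lstrip(".")
    if extension not in supported_formats:
        problems.append(f"unsupported format: {extension or '(none)'}")
    # Assumption: 1 MB = 1024 * 1024 bytes.
    if size_bytes > max_file_size_mb * 1024 * 1024:
        problems.append(f"file larger than {max_file_size_mb} MB")
    return problems
```

In practice `size_bytes` could come from `os.path.getsize(path)` and the two limits from the `/transcription/supported-formats` response.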
## Models and Agents

### List available models

**Request:**

```bash
curl -X GET http://localhost:7860/models \
  -H "Authorization: Bearer <token>"
```

**Response:**

```json
{
  "models": [
    {
      "name": "gpt-4o",
      "provider": "openai",
      "description": "GPT-4 Omni - Most capable model",
      "supports_streaming": true,
      "context_window": 128000
    },
    {
      "name": "mistral-large-latest",
      "provider": "mistralai",
      "description": "Mistral Large - Top-tier reasoning",
      "supports_streaming": true,
      "context_window": 32000
    }
  ],
  "total": 8
}
```

### List available agents

**Request:**

```bash
curl -X GET http://localhost:7860/agents \
  -H "Authorization: Bearer <token>"
```

**Response:**

```json
{
  "agents": [
    {
      "type": "V1",
      "name": "V1",
      "description": "Current production orchestrated workflow",
      "available": true
    },
    {
      "type": "V2",
      "name": "V2",
      "description": "Isolated V2 workflow (default)",
      "available": true
    }
  ],
  "total": 2
}
```

### Health Check

**Request:**

```bash
curl -X GET http://localhost:7860/health
```

**Response:**

```json
{
  "status": "healthy",
  "version": "1.0.0",
  "title": "CAPL Routeur IA API",
  "environment": "development",
  "timestamp": "2024-01-24T10:30:00.000000"
}
```
## WebSocket

### WebSocket connection

**JavaScript:**

```javascript
const ws = new WebSocket('ws://localhost:7860/realtime/ws');

ws.onopen = () => {
  console.log('Connected');
};

ws.onmessage = (event) => {
  const data = JSON.parse(event.data);
  console.log('Received:', data);
};

ws.onerror = (error) => {
  console.error('WebSocket error:', error);
};

ws.onclose = () => {
  console.log('Disconnected');
};
```

### Send a message

```javascript
ws.send(JSON.stringify({
  type: 'message',
  payload: {
    text: 'Hello from client!'
  }
}));
```

### Ping/Pong (keep-alive)

```javascript
// Send a ping every 30 seconds
setInterval(() => {
  ws.send(JSON.stringify({
    type: 'ping',
    payload: {}
  }));
}, 30000);
```

### WebRTC signaling (example)

```javascript
// Send a WebRTC offer
ws.send(JSON.stringify({
  type: 'offer',
  payload: {
    sdp: 'v=0\r\no=- ...',
    type: 'offer'
  }
}));
```
## Advanced examples

### Python with requests

```python
import requests


class RouterIAClient:
    def __init__(self, base_url="http://localhost:7860"):
        self.base_url = base_url
        self.token = None

    def authenticate(self):
        response = requests.post(f"{self.base_url}/auth/token")
        self.token = response.json()["access_token"]
        return self.token

    def complete(self, message, model="gpt-4o", stream=False):
        """Return the JSON response, or an iterator of SSE lines when stream=True."""
        headers = {"Authorization": f"Bearer {self.token}"}
        data = {
            "message": message,
            "model": model,
            "stream": stream
        }
        response = requests.post(
            f"{self.base_url}/completion",
            headers=headers,
            json=data,
            stream=stream
        )
        if stream:
            # The generator lives in a separate method: a `yield` inside
            # complete() would turn every call, even non-streaming ones,
            # into a generator and the dict below would never be returned.
            return self._iter_lines(response)
        return response.json()

    @staticmethod
    def _iter_lines(response):
        """Yield each non-empty line of a streamed response as text."""
        for line in response.iter_lines():
            if line:
                yield line.decode('utf-8')

    def transcribe(self, audio_file_path):
        headers = {"Authorization": f"Bearer {self.token}"}
        with open(audio_file_path, 'rb') as f:
            files = {'file': f}
            response = requests.post(
                f"{self.base_url}/transcription",
                headers=headers,
                files=files
            )
        return response.json()

    def transcribe_meeting(self, audio_file_path, project_id="proj-default", max_duration_seconds=3600):
        """Transcribe a meeting recording with the dedicated gpt-4o-transcribe model."""
        headers = {"Authorization": f"Bearer {self.token}"}
        with open(audio_file_path, 'rb') as f:
            files = {'file': f}
            data = {'project_id': project_id, 'max_duration_seconds': max_duration_seconds}
            response = requests.post(
                f"{self.base_url}/transcription/meeting",
                headers=headers,
                files=files,
                data=data
            )
        return response.json()


# Usage
client = RouterIAClient()
client.authenticate()

# Simple completion
result = client.complete("Bonjour!")
print(result["response"])

# Streaming
for chunk in client.complete("Compte de 1 à 5", stream=True):
    print(chunk)

# Transcription
transcription = client.transcribe("audio.mp3")
print(transcription["text"])

# Meeting transcription
meeting_transcription = client.transcribe_meeting("meeting_audio.mp3")
print(meeting_transcription["text"])
```
### JavaScript/TypeScript with fetch

```typescript
class RouterIAClient {
  private baseUrl: string;
  private token: string | null = null;

  constructor(baseUrl: string = 'http://localhost:7860') {
    this.baseUrl = baseUrl;
  }

  async authenticate(): Promise<string> {
    const response = await fetch(`${this.baseUrl}/auth/token`, {
      method: 'POST'
    });
    const data = await response.json();
    // Use a local typed as string so the return type is not string | null
    const token: string = data.access_token;
    this.token = token;
    return token;
  }

  async complete(
    message: string,
    model: string = 'gpt-4o',
    stream: boolean = false
  ) {
    const response = await fetch(`${this.baseUrl}/completion`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.token}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ message, model, stream })
    });

    if (stream) {
      return this.handleStreamResponse(response);
    } else {
      return await response.json();
    }
  }

  private async *handleStreamResponse(response: Response) {
    const reader = response.body?.getReader();
    const decoder = new TextDecoder();
    if (!reader) return;

    // Network chunks can end in the middle of a line, so keep the
    // incomplete tail in a buffer until the rest arrives.
    let buffer = '';
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop() ?? '';
      for (const line of lines) {
        if (line.startsWith('data: ')) {
          const data = JSON.parse(line.slice(6));
          yield data;
        }
      }
    }
  }

  async transcribe(audioFile: File): Promise<any> {
    const formData = new FormData();
    formData.append('file', audioFile);
    const response = await fetch(`${this.baseUrl}/transcription`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.token}`
      },
      body: formData
    });
    return await response.json();
  }

  async transcribeMeeting(
    audioFile: File,
    projectId: string = 'proj-default',
    maxDurationSeconds: number = 3600
  ): Promise<any> {
    const formData = new FormData();
    formData.append('project_id', projectId);
    formData.append('max_duration_seconds', String(maxDurationSeconds));
    formData.append('file', audioFile);
    const response = await fetch(`${this.baseUrl}/transcription/meeting`, {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${this.token}`
      },
      body: formData
    });
    return await response.json();
  }
}

// Usage (top-level await requires an ES module context)
const client = new RouterIAClient();
await client.authenticate();

// Completion
const result = await client.complete('Bonjour!');
console.log(result.response);

// Streaming
for await (const chunk of await client.complete('Compte de 1 à 5', 'gpt-4o', true)) {
  console.log(chunk.content);
}

// Transcription (audioFile is a File, e.g. taken from an <input type="file"> element)
const transcription = await client.transcribe(audioFile);
console.log(transcription.text);

// Meeting transcription
const meetingTranscription = await client.transcribeMeeting(audioFile);
console.log(meetingTranscription.text);
```
### Error handling

```python
import requests
from requests.exceptions import RequestException

token = "<token>"  # placeholder: use the access_token returned by /auth/token

try:
    response = requests.post(
        "http://localhost:7860/completion",
        headers={"Authorization": f"Bearer {token}"},
        json={"message": "Test", "model": "gpt-4o"}
    )
    response.raise_for_status()
    result = response.json()
    print(result["response"])
except requests.exceptions.HTTPError as e:
    # Raised by raise_for_status() for 4xx and 5xx responses
    if e.response.status_code == 401:
        print("Token invalide ou expiré")
    elif e.response.status_code == 400:
        print("Requête invalide:", e.response.json())
    else:
        print(f"Erreur HTTP {e.response.status_code}")
except RequestException as e:
    # Any other requests failure: connection refused, timeout, DNS error, ...
    print(f"Erreur de connexion: {e}")
```
## Rate Limiting (to be implemented)

Recommendations for clients:

- Implement retries with exponential backoff
- Honor the `X-RateLimit-*` headers (coming soon)
- Cache responses whenever possible
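The retry recommendation above can be sketched as follows. This is one possible shape rather than a prescribed one: the function names, the default delays, and the jitter formula are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, List, Tuple, Type, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int, base: float = 0.5, factor: float = 2.0, cap: float = 30.0) -> List[float]:
    """Delays that grow geometrically (base, base*factor, ...) and never exceed `cap`."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


def call_with_retry(
    func: Callable[[], T],
    retries: int = 4,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
    jitter: Callable[[], float] = random.random,
) -> T:
    """Call `func`, retrying up to `retries` times on the given exception types."""
    for delay in backoff_delays(retries):
        try:
            return func()
        except retry_on:
            # Wait between 50% and 100% of the nominal delay so that
            # many clients do not all retry at the same instant.
            sleep(delay * (0.5 + 0.5 * jitter()))
    return func()  # final attempt: any exception now propagates to the caller
```

With `requests`, `retry_on` would typically be `(requests.exceptions.ConnectionError, requests.exceptions.Timeout)`, since those do not derive from the built-in `ConnectionError` and `TimeoutError`.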
## Best practices

1. **Security**: Never expose your token in client-side code
2. **Token management**: Refresh the token before it expires
3. **Streaming**: Use streaming for long responses
4. **Timeout**: Configure appropriate timeouts
5. **Retry**: Implement retry logic for network errors
6. **Logging**: Log errors on the client side for debugging
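For the token-management practice, the `expires_in` value returned by `/auth/token` is enough to refresh slightly ahead of expiry. The sketch below is a hypothetical helper, not part of the API: `TokenCache`, the 60-second safety margin, and the injected `fetch_token` callable are all assumptions.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Cache a token and fetch a new one shortly before the current one expires."""

    def __init__(
        self,
        fetch_token: Callable[[], dict],
        margin_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch_token   # must return {"access_token": ..., "expires_in": ...}
        self._margin = margin_s     # refresh this many seconds before expiry
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            data = self._fetch()
            self._token = data["access_token"]
            self._expires_at = now + data["expires_in"]
        return self._token
```

A caller could pass `lambda: requests.post("http://localhost:7860/auth/token", timeout=10).json()` as `fetch_token`, which also applies the timeout practice from the list.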