title: DriftCall Env
emoji: π§
colorFrom: indigo
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
short_description: OpenEnv - Indic voice concierge under schema drift.
license: apache-2.0
DriftCall
DriftCall is a voice-first Indic consumer-concierge RL environment where the agent must book flights, schedule rides, and file complaints across five mock vendor APIs while those APIs undergo deterministic mid-episode schema drift, pricing drift, T&C drift, policy drift, and auth drift. It is an OpenEnv-compliant REST environment plus an in-process Python trainer; a trained LoRA adapter for Gemma 3n E2B is published alongside.
- OpenEnv manifest: openenv.yaml
- Design spec: DESIGN.md
- Phase-C implementation plan: CLAUDE.md
- Per-module specs: docs/modules/
- Per-module test plans: docs/tests/
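To make "deterministic mid-episode schema drift" concrete, here is a minimal sketch. It is not the environment's actual implementation: the function names, the payload fields, and the hash-based choice of drift step are all assumptions made for illustration. The point is only that the drift step is a pure function of the seed, so replaying an episode with the same seed reproduces the same drift.

```python
import hashlib


def drift_step(seed: int, episode_len: int) -> int:
    """Pick the step at which drift fires, as a pure function of the seed.

    Hypothetical helper: the real env may schedule drift differently.
    """
    digest = hashlib.sha256(f"drift:{seed}".encode()).digest()
    # Keep the drift strictly mid-episode: never on step 0 or the last step.
    return 1 + int.from_bytes(digest[:4], "big") % (episode_len - 2)


def vendor_response(step: int, seed: int, episode_len: int = 10) -> dict:
    """Hypothetical flight-quote payload whose price field changes shape after drift."""
    if step < drift_step(seed, episode_len):
        return {"schema": "v1", "price_inr": 4500}
    # After drift the same information lives under a different key layout.
    return {"schema": "v2", "fare": {"amount": 4500, "currency": "INR"}}
```

An agent that hardcodes `price_inr` works on early steps and fails after the drift step, which is the behaviour the environment is designed to penalise.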
Architecture at a glance
- Env Space (this repo): FastAPI + OpenEnv REST on CPU-basic. Kokoro-82M TTS + faster-whisper-small ASR are baked into the image; no outbound network at runtime.
- Trainer: in-process GRPO (TRL 0.23+, Unsloth 2026.4.5+) on a single V100. Text-in / text-out; audio is an env-boundary concern.
- Demo Space: Gradio 5 on ZeroGPU, base Gemma 3n E2B + trained LoRA adapter switchable via a toggle.
Quickstart
# 1. Install the dev toolchain.
python3.11 -m venv .venv && source .venv/bin/activate
pip install -e '.[dev]'
# 2. Run the tests.
python3 -m pytest tests/ -v
# 3. Serve the env locally.
export DRIFTCALL_ENV_TOKEN=dev-local-token
uvicorn app:app --host 0.0.0.0 --port 7860
# 4. Validate against the OpenEnv schema.
openenv validate http://localhost:7860 --auth-bearer "$DRIFTCALL_ENV_TOKEN"
Notebook
notebooks/train_driftcall.ipynb is built from the numbered cells under
cells/. Rebuild with:
python3 notebooks/build_notebook.py
Weights & Biases (optional)
Training runs auto-log to wandb. Configure via env vars (override priority highest-to-lowest):
- Environment variables, set on the host or in your shell:
  export WANDB_API_KEY=<your-key-from-wandb.ai/authorize>
  export WANDB_PROJECT=driftcall   # default
  export WANDB_ENTITY=<your-team>  # optional
  export WANDB_MODE=online         # online | offline | disabled
- cells/_secrets.py hardcoded fallback, used when env vars are unset. Edit the constant in that file to rotate the key (private repo).
- None: init_wandb() raises at run time if WANDB_MODE != "disabled" and no API key is reachable.
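The priority order above can be sketched as a small resolver. This is an illustration, not the code behind `init_wandb()`: the function name `resolve_wandb_settings`, the `FALLBACK_API_KEY` stand-in for the constant in cells/_secrets.py, and the assumed default mode of `online` are all hypothetical.

```python
import os

# Stand-in for the hardcoded constant in cells/_secrets.py (assumption).
FALLBACK_API_KEY = ""


def resolve_wandb_settings(env=None, fallback_key=FALLBACK_API_KEY):
    """Resolve W&B settings: env vars first, then the fallback, else fail."""
    env = os.environ if env is None else env
    mode = env.get("WANDB_MODE", "online")  # default mode is an assumption
    # Environment variable wins; an empty value falls through to the fallback.
    api_key = env.get("WANDB_API_KEY") or fallback_key or None
    if mode != "disabled" and api_key is None:
        raise RuntimeError(
            "No W&B API key reachable; set WANDB_API_KEY or WANDB_MODE=disabled"
        )
    return {
        "mode": mode,
        "project": env.get("WANDB_PROJECT", "driftcall"),
        "entity": env.get("WANDB_ENTITY"),
        "api_key": api_key,
    }
```

Passing a plain dict as `env` keeps the resolver testable without touching the real process environment.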
Disable for local dev / CI:
export WANDB_MODE=disabled
Custom metrics logged each training step (training.md §3.3.3):
- train/beta_adaptive: current KL coefficient (mutated by AdaptiveKLCallback)
- train/kl_measured: measured KL between policy and reference
- train/kl_target: target KL (default = BETA_KL = 0.04)
- train/beta_clamped_to_min: 1 if β was floored at beta_min this step
- train/beta_clamped_to_max: 1 if β was capped at beta_max this step
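To show how these five metrics relate, here is a sketch of one adaptive-KL update. The update rule is an assumption: it uses the proportional controller common in PPO-style RLHF code, which may differ from what `AdaptiveKLCallback` does. The `beta_min`, `beta_max`, and `gain` defaults are placeholders; only the 0.04 target comes from the text above.

```python
def adaptive_kl_step(beta: float,
                     kl_measured: float,
                     kl_target: float = 0.04,
                     beta_min: float = 0.001,
                     beta_max: float = 1.0,
                     gain: float = 0.1) -> dict:
    """One controller step; returns the metrics that would be logged."""
    # Relative error between measured and target KL, clipped for stability.
    error = max(-0.2, min(0.2, kl_measured / kl_target - 1.0))
    # Raise beta when KL overshoots the target, lower it when KL undershoots.
    proposed = beta * (1.0 + gain * error)
    new_beta = max(beta_min, min(beta_max, proposed))
    return {
        "train/beta_adaptive": new_beta,
        "train/kl_measured": kl_measured,
        "train/kl_target": kl_target,
        "train/beta_clamped_to_min": int(proposed < beta_min),
        "train/beta_clamped_to_max": int(proposed > beta_max),
    }
```

A sustained run of `train/beta_clamped_to_max` = 1 would indicate that the policy keeps drifting from the reference even at the strongest allowed penalty.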
Run tags (set at wandb.init): stage{N}, gemma-3n-e2b, bf16 or fp16,
adaptive-kl or static-kl, seed{N}.
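The tag scheme above can be assembled with a helper like the following. `build_run_tags` is a hypothetical name, not a function from this repo. The resulting list is the kind of value passed to the `tags` argument of `wandb.init`.

```python
def build_run_tags(stage: int, precision: str, adaptive_kl: bool, seed: int) -> list[str]:
    """Assemble run tags in the order listed in the README (hypothetical helper)."""
    if precision not in ("bf16", "fp16"):
        raise ValueError(f"precision must be 'bf16' or 'fp16', got {precision!r}")
    return [
        f"stage{stage}",
        "gemma-3n-e2b",
        precision,
        "adaptive-kl" if adaptive_kl else "static-kl",
        f"seed{seed}",
    ]


tags = build_run_tags(stage=2, precision="bf16", adaptive_kl=True, seed=42)
```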
License
Apache License 2.0. See LICENSE (included at repo root when
this artifact is published as a Space).