
title: DriftCall Env
emoji: 🧭
colorFrom: indigo
colorTo: pink
sdk: docker
app_port: 7860
pinned: false
short_description: OpenEnv — Indic voice concierge under schema drift.
license: apache-2.0

DriftCall

DriftCall is a voice-first reinforcement-learning (RL) environment for an Indic consumer concierge. The agent must book flights, schedule rides, and file complaints across five mock vendor APIs while those APIs undergo deterministic mid-episode drift of five kinds: schema, pricing, T&C, policy, and auth. The project ships an OpenEnv-compliant REST environment and an in-process Python trainer; a trained LoRA adapter for Gemma 3n E2B is published alongside.

OpenEnv manifest: openenv.yaml
Design spec: DESIGN.md
Phase-C implementation plan: CLAUDE.md
Per-module specs: docs/modules/
Per-module test plans: docs/tests/
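
To make "deterministic mid-episode schema drift" concrete, here is a minimal sketch of the idea. It is not taken from the DriftCall codebase: `DriftEvent`, `plan_drift`, `vendor_response`, the field names `price` and `fare_inr`, and the flight ID are all hypothetical. The only property it illustrates is that the drift step is derived from a seed, so replaying the same seed produces the same drift.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftEvent:
    """Hypothetical description of one schema drift within an episode."""
    step: int        # episode step at which the drift takes effect
    old_field: str   # field name before the drift
    new_field: str   # field name after the drift


def plan_drift(seed: int, episode_len: int) -> DriftEvent:
    """Pick the drift step from a seeded RNG so replays of a seed drift identically."""
    if episode_len < 3:
        raise ValueError("episode_len must be at least 3")
    rng = random.Random(seed)
    # Never drift on the first or last step, so the agent sees both schemas.
    step = rng.randrange(1, episode_len - 1)
    return DriftEvent(step=step, old_field="price", new_field="fare_inr")


def vendor_response(event: DriftEvent, step: int, amount: int) -> dict:
    """Return a mock flight quote; the price field is renamed from the drift step on."""
    key = event.new_field if step >= event.step else event.old_field
    return {"flight": "HYPOTHETICAL-101", key: amount}
```

An agent that hardcodes `response["price"]` would fail from `event.step` onward, which is the kind of failure the environment is described as probing.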

Architecture at a glance

Env Space (this repo): FastAPI + OpenEnv REST on CPU-basic. Kokoro-82M TTS + faster-whisper-small ASR are baked into the image; no outbound network at runtime.
Trainer: in-process GRPO (TRL 0.23+, Unsloth 2026.4.5+) on a single V100. Text-in / text-out — audio is an env-boundary concern.
Demo Space: Gradio 5 on ZeroGPU, base Gemma 3n E2B + trained LoRA adapter switchable via a toggle.

Quickstart

# 1. Install the dev toolchain.
python3.11 -m venv .venv && source .venv/bin/activate
pip install -e '.[dev]'

# 2. Run the tests.
python3 -m pytest tests/ -v

# 3. Serve the env locally.
export DRIFTCALL_ENV_TOKEN=dev-local-token
uvicorn app:app --host 0.0.0.0 --port 7860

# 4. Validate against the OpenEnv schema.
openenv validate http://localhost:7860 --auth-bearer "$DRIFTCALL_ENV_TOKEN"
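
For calling the local server from Python rather than through the CLI, the sketch below builds an authenticated request using only the standard library. It assumes the server expects the token as an `Authorization: Bearer ...` header, which the `--auth-bearer` flag suggests but this README does not state. `build_env_request` is a hypothetical helper, and any endpoint path you pass to it must come from openenv.yaml, not from this example.

```python
import urllib.request


def build_env_request(base_url: str, path: str, token: str) -> urllib.request.Request:
    """Build a request carrying the env token as a bearer header (assumed scheme)."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
```

Pass the returned object to `urllib.request.urlopen(req, timeout=5)` once the server from step 3 is running.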

Notebook

notebooks/train_driftcall.ipynb is built from the numbered cells under cells/. Rebuild with:

python3 notebooks/build_notebook.py

Weights & Biases (optional)

Training runs auto-log to wandb. Configuration is resolved from these sources, in priority order from highest to lowest:

1. Environment variables, set on the host or in your shell:

export WANDB_API_KEY=<your-key-from-wandb.ai/authorize>
export WANDB_PROJECT=driftcall              # default
export WANDB_ENTITY=<your-team>             # optional
export WANDB_MODE=online                    # online | offline | disabled

2. cells/_secrets.py hardcoded fallback, used when the env vars are unset. Edit the constant in that file to rotate the key (private repo).
3. None: init_wandb() raises at run time if WANDB_MODE != "disabled" and no API key is reachable.
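
The resolution order described above can be sketched as a small function. This is an illustration, not the body of `init_wandb()`: `resolve_wandb_api_key` and its `fallback_key` parameter are hypothetical names, and treating an unset `WANDB_MODE` as `online` is an assumption.

```python
from typing import Mapping, Optional


def resolve_wandb_api_key(env: Mapping[str, str], fallback_key: Optional[str]) -> Optional[str]:
    """Return the API key to use, or None when logging is disabled.

    Priority: WANDB_API_KEY from the environment, then the hardcoded fallback.
    Raises if logging is enabled and neither source provides a key.
    """
    # Assumption: an unset WANDB_MODE behaves like "online".
    mode = env.get("WANDB_MODE", "online")
    if mode == "disabled":
        return None
    key = env.get("WANDB_API_KEY") or fallback_key
    if not key:
        raise RuntimeError("WANDB_MODE is not 'disabled' and no API key is reachable")
    return key
```

In real use, `env` would be `os.environ` and `fallback_key` the constant from cells/_secrets.py.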

Disable for local dev / CI:

export WANDB_MODE=disabled

Custom metrics logged each training step (training.md §3.3.3):

train/beta_adaptive — current KL coefficient (mutated by AdaptiveKLCallback)
train/kl_measured — measured KL between policy and reference
train/kl_target — target KL (default = BETA_KL = 0.04)
train/beta_clamped_to_min — 1 if β was floored at beta_min this step
train/beta_clamped_to_max — 1 if β was ceilinged at beta_max this step
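
To show how these five values relate, the sketch below performs one adaptive-KL update and returns the metrics under the names listed above. The update rule of `AdaptiveKLCallback` is not given in this README; the sketch substitutes the adaptive KL penalty heuristic from the PPO paper (double β when measured KL exceeds 1.5x the target, halve it when below target/1.5). The function name and the `beta_min` / `beta_max` defaults are hypothetical, and reporting the post-update β as `train/beta_adaptive` is an assumption.

```python
def adaptive_kl_step(
    beta: float,
    kl_measured: float,
    kl_target: float = 0.04,   # matches the documented default BETA_KL
    beta_min: float = 1e-3,    # hypothetical bound
    beta_max: float = 1.0,     # hypothetical bound
) -> dict:
    """Apply one PPO-style adaptive-KL update and return the logged metrics."""
    if kl_measured > 1.5 * kl_target:
        beta *= 2.0   # policy moved too far from the reference: penalize KL harder
    elif kl_measured < kl_target / 1.5:
        beta /= 2.0   # policy barely moved: relax the penalty

    # Clamp to [beta_min, beta_max] and record whether a bound was hit.
    clamped_to_min = int(beta < beta_min)
    clamped_to_max = int(beta > beta_max)
    beta = min(max(beta, beta_min), beta_max)

    return {
        "train/beta_adaptive": beta,
        "train/kl_measured": kl_measured,
        "train/kl_target": kl_target,
        "train/beta_clamped_to_min": clamped_to_min,
        "train/beta_clamped_to_max": clamped_to_max,
    }
```

The returned dict is shaped so that it could be passed directly to `wandb.log` once per training step.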

Run tags (set at wandb.init): stage{N}, gemma-3n-e2b, bf16 or fp16, adaptive-kl or static-kl, seed{N}.
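
A minimal sketch of how such a tag list might be assembled follows. `build_run_tags` and its parameters are hypothetical; only the tag formats come from the line above.

```python
def build_run_tags(stage: int, precision: str, adaptive_kl: bool, seed: int) -> list[str]:
    """Assemble the run tags in the documented order."""
    if precision not in ("bf16", "fp16"):
        raise ValueError(f"precision must be 'bf16' or 'fp16', got {precision!r}")
    return [
        f"stage{stage}",
        "gemma-3n-e2b",
        precision,
        "adaptive-kl" if adaptive_kl else "static-kl",
        f"seed{seed}",
    ]
```

The result can be handed to `wandb.init` through its `tags` argument.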

License

Apache License 2.0. See LICENSE (included at repo root when this artifact is published as a Space).