# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

A Hugging Face Spaces app that translates between the 418 languages listed in Table 9 (Section A.1) of the [MADLAD-400](https://arxiv.org/pdf/2309.04662) paper, using Google's MADLAD-400 3B Seq2Seq model (`google/madlad400-3b-mt`). Built with Gradio and deployed on HF Spaces. Falls back to CPU with a warning when no CUDA GPU is available.

## Commands

```bash
# Setup
uv venv --python 3.12
uv pip install -r requirements.txt
uv pip install -r requirements-dev.txt

# Run (launches on http://localhost:7860)
uv run app.py

# Lint and format
uv run ruff check .
uv run ruff format .

# Type check
uv run ty check

# Test
uv run pytest                     # all 56 tests (slow tests require CUDA + model download)
uv run pytest -m "not slow"       # 46 fast tests only
uv run pytest -m slow             # 10 slow model tests only (require CUDA)

# Generate language mapping (dev only)
uv run scripts/generate_langmap.py <path-to-paper.pdf>
```

## Architecture

**`app.py`** — Single-file application with a Google Translate-style layout.

- **Layout:** The top row has two symmetric, filterable, region-sorted language dropdowns (source defaults to "English (en)", target defaults to "French (fr)") with a swap button ("⇄") between them. Below that, the input textbox (autofocused) and the output textbox (with a copy button) sit side by side. The Translate button spans the full width below both textboxes and shows "Translating..." during processing. Ctrl+Enter submits from the input.
- **Source language:** The model auto-detects the source language; the source dropdown exists only for user reference and for the swap button.
- **Model loading:** Uses `@lru_cache` for lazy loading of the `google/madlad400-3b-mt` tokenizer and model. On ZeroGPU (`SPACES_ZERO_GPU=1`), `_maybe_eager_load()` places the model at module scope so the `spaces` hijack can pack weights and stream them into workers for fast cold starts. Off ZeroGPU (local, tests, cpu-basic), loading stays lazy, so importing the app never downloads the model.
- **Device and dtype:** Uses `float16` on CUDA and `float32` on CPU. MPS is not supported (it produces garbage output with T5 models).
- **Translation:** Prepends a target language token and a space to the input text (e.g., `<2fr> Hello`) before tokenization and generation.
- **GPU allocation:** The `@spaces.GPU` decorator allocates a GPU on HF Spaces infrastructure. Its `duration` is a callable (`_estimate_duration`) that scales the GPU reservation with `max_new_tokens × num_beams` (capped at 120s).
- **API:** The submit handler exposes a stable `/translate` API endpoint. The swap and Translate-button handlers use `api_visibility="private"`, so `/translate` is the only public endpoint.
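The device choice, prompt format, and duration cap described above can be sketched as three pure functions. This is a minimal illustration, not the code in `app.py`: the function names, the warning text, and the base and per-unit constants are assumptions. Only the `<2fr> Hello` format, the `float16`/`float32` pairing, and the 120s cap come from the description above.

```python
import warnings

MAX_GPU_SECONDS = 120  # cap stated in this document


def pick_device(cuda_available: bool) -> tuple[str, str]:
    """Return (device, dtype name): float16 on CUDA, float32 on CPU."""
    if cuda_available:
        return "cuda", "float16"
    # Hypothetical warning text; the app's actual message may differ.
    warnings.warn("No CUDA GPU available; falling back to CPU.", stacklevel=2)
    return "cpu", "float32"


def build_prompt(text: str, target_code: str) -> str:
    """Prepend the target language token and a space, e.g. '<2fr> Hello'."""
    return f"<2{target_code}> {text}"


def estimate_duration(max_new_tokens: int, num_beams: int) -> int:
    """Scale the GPU reservation with max_new_tokens * num_beams, capped."""
    base_seconds = 15        # hypothetical fixed overhead
    seconds_per_unit = 0.05  # hypothetical cost per (token x beam)
    estimate = base_seconds + seconds_per_unit * max_new_tokens * num_beams
    return min(MAX_GPU_SECONDS, int(estimate))
```

Keeping these pieces free of `torch` and `spaces` imports is one way to let the fast tests cover them without a GPU or a model download.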

**`langmap/`** — Package with `langid_mapping.py`, mapping 418 language tokens to `{"name": ..., "region": ...}` dicts. Auto-generated by `scripts/generate_langmap.py` from Table 9 (Section A.1) of the MADLAD-400 paper. Available languages at runtime are the intersection of this mapping and the model's vocabulary.
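The runtime intersection can be sketched as follows. This assumes the mapping is keyed by bare language codes (such as `en`) and that the model vocabulary stores them as `<2en>`-style tokens. The sample entries, region names, and helper names are hypothetical, and the real key format in `langid_mapping.py` may differ.

```python
from collections.abc import Container

# Hypothetical excerpt; the real mapping has 418 entries.
SAMPLE_MAPPING = {
    "fr": {"name": "French", "region": "Europe"},
    "en": {"name": "English", "region": "Europe"},
    "zz": {"name": "Placeholder", "region": "Other"},  # not in the vocab below
}

# Stand-in for the tokenizer vocabulary (token -> id).
SAMPLE_VOCAB = {"<2en>": 1, "<2fr>": 2, "▁Hello": 3}


def available_languages(mapping: dict, vocab: Container) -> dict:
    """Keep only the languages whose target token exists in the vocabulary."""
    return {code: info for code, info in mapping.items() if f"<2{code}>" in vocab}


def dropdown_choices(languages: dict) -> list[str]:
    """Build 'Name (code)' labels sorted by region, then by name."""
    ordered = sorted(languages.items(), key=lambda kv: (kv[1]["region"], kv[1]["name"]))
    return [f"{info['name']} ({code})" for code, info in ordered]


if __name__ == "__main__":
    langs = available_languages(SAMPLE_MAPPING, SAMPLE_VOCAB)
    print(dropdown_choices(langs))
```

With a Hugging Face tokenizer, the token-to-id dict returned by `tokenizer.get_vocab()` could serve as `vocab`.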

**`scripts/`** — `generate_langmap.py` parses the MADLAD-400 paper PDF (Table 9, pages 16-22) using pdfplumber and generates the static language mapping with region assignments. Dev-only tool; requires `requirements-dev.txt` dependencies.

**`tests/`** — 56 tests (46 fast, 10 slow). `test_langmap.py` has 10 fast tests for mapping validation (dict shape, regions, spot-checks). `test_app.py` has 36 fast tests (signatures, device fallback, ZeroGPU eager-load gating, GPU duration estimator, UI layout with symmetric dropdowns, swap button, textbox config including toolbar buttons and input autofocus, handler wiring, stable `translate` API endpoint with UI-only handlers kept private, no HTML elements, locale codes, no title) and 10 slow tests (translation with various parameters, language mapping). Slow tests require CUDA and model download; auto-skipped without CUDA.
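One common way to auto-skip `slow` tests without CUDA is a collection hook in `conftest.py`. The sketch below assumes that approach. The repository may do it differently (for example with `pytest.mark.skipif` on each test), and the helper name `cuda_available` is hypothetical.

```python
import importlib.util


def cuda_available() -> bool:
    """True only if torch is importable and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


def pytest_collection_modifyitems(config, items):
    """Mark every test carrying the `slow` marker as skipped when CUDA is absent."""
    if cuda_available():
        return
    # Imported here so this sketch can also be imported outside a pytest run.
    import pytest

    skip_slow = pytest.mark.skip(reason="slow tests require CUDA")
    for item in items:
        if "slow" in item.keywords:
            item.add_marker(skip_slow)
```

With a hook like this, `uv run pytest` on a CPU-only machine reports the slow tests as skipped instead of failing them.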

## Tooling

When working with Python, invoke the relevant `/astral:<skill>` for uv, ty, and ruff to ensure best practices are followed.

- **uv** — Python package manager. Used for venv creation and dependency installation. No `pyproject.toml` (HF Spaces requires `requirements.txt`). `requirements.txt` is the Spaces build manifest and omits `gradio`/`spaces` (provided by the Spaces runtime on every tier) and pins `torch` to a ZeroGPU-supported version; `requirements-dev.txt` adds `gradio`/`spaces` for local runs plus the dev tooling, so local setup installs both files.
- **Ruff** — linter and formatter (`ruff.toml`). Rules: `E`, `F`, `I`, `UP`, `W`. Line length: 120.
- **ty** — type checker (`ty.toml`). Python 3.12 target.
- **pytest** — test runner (`pytest.ini`). Custom `slow` marker for CUDA-dependent tests.