# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

A Hugging Face Spaces app that translates between the 418 languages listed in Table 9 (Section A.1) of the [MADLAD-400](https://arxiv.org/pdf/2309.04662) paper, using Google's MADLAD-400 3B seq2seq model. Built with Gradio and deployed on HF Spaces. Falls back to CPU with a warning when no CUDA GPU is available.
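
The CPU fallback can be pictured with the minimal sketch below. The names `pick_device` and `pick_dtype_name` are hypothetical and not taken from `app.py`; the dtype pairing follows the Architecture section (`bfloat16` on CUDA, `float32` on CPU). In the real app the flag would presumably come from `torch.cuda.is_available()`; here it is passed in so the sketch runs without PyTorch.

```python
import warnings


def pick_device(cuda_available: bool) -> str:
    """Return "cuda" when a CUDA GPU is present; otherwise warn and use "cpu"."""
    if cuda_available:
        return "cuda"
    # Hypothetical wording; the app's actual warning text is not shown in this file.
    warnings.warn("No CUDA GPU detected; falling back to CPU.")
    return "cpu"


def pick_dtype_name(device: str) -> str:
    """Name of the dtype used per device: bfloat16 on CUDA, float32 on CPU."""
    return "bfloat16" if device == "cuda" else "float32"


device = pick_device(cuda_available=False)  # emits the fallback warning
print(device, pick_dtype_name(device))
```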

## Commands

```bash
# Setup
uv venv --python 3.12
uv pip install -r requirements.txt
uv pip install -r requirements-dev.txt

# Run (launches on http://localhost:7860)
uv run app.py

# Lint and format
uv run ruff check .
uv run ruff format .

# Type check
uv run ty check

# Test
uv run pytest                     # all 68 tests (slow tests require CUDA + model download)
uv run pytest -m "not slow"       # 58 fast tests only
uv run pytest -m slow             # 10 model tests only (CUDA only)

# Generate language mapping (dev only)
uv run scripts/generate_langmap.py <path-to-paper.pdf>
```

## Architecture

**`app.py`** — Single-file application with a Google Translate-style layout.

- The top row has two symmetric, filterable, region-sorted language dropdowns (source defaults to "English (en)", target defaults to "French (fr)") with a swap button ("⇄") between them. Below that, the input textbox (autofocused) and the output textbox with a copy button sit side by side. The Translate button spans the full width below both textboxes and shows "Translating..." during processing. Ctrl+Enter submits from the input.
- The model auto-detects the source language; the source dropdown is for user reference and the swap button only, which an `info=` caption discloses.
- Each control carries an `info=` caption (caption text, not HTML/Markdown blocks): the target dropdown a quality-varies caveat, the input the Ctrl+Enter hint, the output model/arXiv/license provenance.
- Uses `@lru_cache` for lazy loading of the `google/madlad400-3b-mt` tokenizer and model. On ZeroGPU (`SPACES_ZERO_GPU=1`), `_maybe_eager_load()` places the model at module scope so the `spaces` hijack can pack weights and stream them into workers for fast cold starts. Off ZeroGPU (local, tests, cpu-basic) loading stays lazy, so importing the app never downloads the model.
- Uses `bfloat16` on CUDA and `float32` on CPU. T5/MADLAD is numerically unstable in `float16` (fp16's narrow range overflows to inf/NaN), and bf16 is the format T5 was trained in. MPS is not supported (it produces garbage output with T5 models).
- Translation prepends a target language token and a space to the input text (e.g., `<2fr> Hello`) before tokenization and generation. Whitespace-only input short-circuits to an empty string before the model loads.
- Decoding is greedy by default (deterministic); a non-default `temperature` enables sampling, and `num_beams > 1` uses beam search. A collapsed "Advanced" accordion exposes `max_new_tokens`/`num_beams`/`temperature` as `gr.Number` controls (no sliders; defaults mirror `translate()`, so the default surface stays greedy).
- Right-to-left target scripts (an explicit `RTL_CODES` token set; `region` is not a usable proxy) flip the output box to RTL via the Translate-button path.
- The `@spaces.GPU` decorator allocates GPU on HF Spaces infrastructure; its `duration` is a callable (`_estimate_duration`) that scales the GPU reservation with `max_new_tokens × num_beams` (capped at 120s).
- Both translate handlers (the private Translate-button click and the public submit) carry the advanced params, so Ctrl+Enter and the `/translate` API honor the accordion; the params keep defaults, so existing two-arg callers still work. The submit handler exposes a stable `/translate` API endpoint (returns a bare string); the swap and Translate-button handlers are `api_visibility="private"`, and both generation handlers use `show_progress="minimal"`. Only `/translate` is public.
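
The request path described above (target-token prefix, whitespace short-circuit, decoding-mode selection, and the capped GPU-duration estimate) might look roughly like the sketch below. It illustrates the described behavior and is not the code in `app.py`. The helper names, the default values (`max_new_tokens=256`, `temperature=1.0`), and the duration constants are assumptions; only the `<2xx> ` prefix, the greedy default, and the 120s cap come from the text. The keyword names `max_new_tokens`, `num_beams`, `do_sample`, and `temperature` are the ones accepted by `generate()` in Hugging Face Transformers.

```python
DEFAULT_TEMPERATURE = 1.0  # assumption: the app's real default is not stated here
MAX_DURATION_S = 120       # cap stated in the text


def build_prompt(text: str, target_code: str) -> str:
    """Prefix the input with the target language token, e.g. "<2fr> Hello"."""
    return f"<2{target_code}> {text}"


def generation_kwargs(max_new_tokens=256, num_beams=1, temperature=DEFAULT_TEMPERATURE):
    """Greedy by default; sampling only when temperature is changed."""
    kwargs = {"max_new_tokens": max_new_tokens, "num_beams": num_beams, "do_sample": False}
    if temperature != DEFAULT_TEMPERATURE:
        kwargs["do_sample"] = True
        kwargs["temperature"] = temperature
    return kwargs


def estimate_duration(max_new_tokens, num_beams, base_s=15.0, s_per_unit=0.05):
    """Scale the reservation with max_new_tokens * num_beams, capped at 120s.

    base_s and s_per_unit are made-up constants for illustration.
    """
    return int(min(MAX_DURATION_S, base_s + s_per_unit * max_new_tokens * num_beams))


def translate_sketch(text, target_code, generate, **params):
    """Return "" for whitespace-only input without ever touching the model."""
    if not text.strip():
        return ""
    return generate(build_prompt(text, target_code), **generation_kwargs(**params))


# Stand-in for the real tokenizer + model call: it just echoes the prompt.
echo = lambda prompt, **kwargs: prompt
print(translate_sketch("Hello", "fr", echo))
```

Keeping the parameter defaults on the function itself is what lets two-argument callers keep working while the accordion values flow through when supplied.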

**`langmap/`** — Package with `langid_mapping.py`, mapping 418 language tokens to `{"name": ..., "region": ...}` dicts. Auto-generated by `scripts/generate_langmap.py` from Table 9 (Section A.1) of the MADLAD-400 paper. Available languages at runtime are the intersection of this mapping and the model's vocabulary.
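
As an illustration of the runtime intersection, the sketch below filters a tiny stand-in mapping against a stand-in vocabulary. It assumes the mapping is keyed by bare codes such as `fr` and that the matching vocabulary token is `<2fr>`; the real key format, the region names, and the exact sort order may differ. In the app the vocabulary would presumably come from the tokenizer (for example `tokenizer.get_vocab()` in Transformers).

```python
# Tiny stand-in for langmap/langid_mapping.py; region values are illustrative.
LANGID_MAPPING = {
    "fr": {"name": "French", "region": "Europe"},
    "en": {"name": "English", "region": "Europe"},
    "xx": {"name": "Example", "region": "Other"},  # made-up entry, absent from the vocab
}

# Stand-in for the model vocabulary (token -> id).
VOCAB = {"<2en>": 0, "<2fr>": 1, "hello": 2}


def available_languages(mapping, vocab):
    """Keep only mapping entries whose language token exists in the vocabulary."""
    return {code: info for code, info in mapping.items() if f"<2{code}>" in vocab}


def dropdown_choices(languages):
    """Labels like "English (en)", sorted by region and then by name (assumed order)."""
    ordered = sorted(languages.items(), key=lambda kv: (kv[1]["region"], kv[1]["name"]))
    return [f'{info["name"]} ({code})' for code, info in ordered]


print(dropdown_choices(available_languages(LANGID_MAPPING, VOCAB)))
```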

**`scripts/`** — `generate_langmap.py` parses the MADLAD-400 paper PDF (Table 9, pages 16-22) using pdfplumber and generates the static language mapping with region assignments. Dev-only tool; requires `requirements-dev.txt` dependencies.

**`tests/`** — 68 tests (58 fast, 10 slow). `test_langmap.py` has 10 fast tests for mapping validation (dict shape, regions, spot-checks). `test_app.py` has 48 fast tests (signatures, device fallback, bfloat16/float32 dtype selection, ZeroGPU eager-load gating, GPU duration estimator and its signature-mirror contract, greedy-by-default decoding, whitespace-input short-circuit, RTL output direction on the button path, `requirements.txt` excludes platform packages, UI layout with symmetric dropdowns, swap button, textbox config including toolbar buttons and input autofocus, `info=` captions on dropdowns and textboxes, the Advanced accordion's `gr.Number` controls wired into both translate handlers, `show_progress="minimal"` on generation handlers, handler wiring, stable `translate` API endpoint carrying the advanced params with UI-only handlers kept private, no HTML elements, no sliders, locale codes, no title) and 10 slow tests (translation with various parameters, language mapping). Slow tests require CUDA and model download; auto-skipped without CUDA.
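
One common way to auto-skip `slow` tests without CUDA is a `conftest.py` collection hook, sketched below. Whether this repository uses a hook, a `skipif` marker, or a fixture is not stated, so treat the whole block as an assumption. The helpers `cuda_available` and `should_skip` are hypothetical; `pytest_collection_modifyitems`, `pytest.mark.skip`, and `item.add_marker` are standard pytest APIs.

```python
import importlib.util


def cuda_available() -> bool:
    """True only if torch is installed and reports a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


def should_skip(keywords, has_cuda: bool) -> bool:
    """A test is skipped when it is marked slow and no CUDA device is present."""
    return "slow" in keywords and not has_cuda


def pytest_collection_modifyitems(config, items):
    """pytest hook: attach a skip marker to slow tests when CUDA is missing."""
    import pytest

    has_cuda = cuda_available()
    skip_slow = pytest.mark.skip(reason="slow tests need CUDA and the model download")
    for item in items:
        if should_skip(item.keywords, has_cuda):
            item.add_marker(skip_slow)
```

With a hook like this, `uv run pytest` on a machine without CUDA reports the slow tests as skipped instead of failing on the model download.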

## Tooling

When working with Python, invoke the relevant `/astral:<skill>` for uv, ty, and ruff to ensure best practices are followed.

- **uv** — Python package manager. Used for venv creation and dependency installation. No `pyproject.toml` (HF Spaces requires `requirements.txt`). `requirements.txt` is the Spaces build manifest and omits `gradio`/`spaces` (provided by the Spaces runtime on every tier) and pins `torch` to a ZeroGPU-supported version; `requirements-dev.txt` adds `gradio`/`spaces` for local runs plus the dev tooling, so local setup installs both files.
- **Ruff** — linter and formatter (`ruff.toml`). Rules: `E`, `F`, `I`, `UP`, `W`. Line length: 120.
- **ty** — type checker (`ty.toml`). Python 3.12 target.
- **pytest** — test runner (`pytest.ini`). Custom `slow` marker for CUDA-dependent tests.