Daryl Lim, Claude Opus 4.8 (1M context) committed on
Commit · 3548d85
1 Parent(s): 879879e
test: cover info captions, show_progress, and sync CLAUDE.md counts
Add fast tests for the remaining untested UI changes: info= captions on the
dropdowns (incl. the source auto-detect disclosure) and textboxes, and
show_progress="minimal" on both generation handlers. Update CLAUDE.md to the
new counts (68 total / 58 fast / 10 slow) and document the captions, Advanced
gr.Number accordion, RTL output, empty-input short-circuit, and that the
public /translate endpoint now carries the advanced generation params.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
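The empty-input short-circuit and the advanced parameters on `/translate` mentioned above can be pictured with a minimal sketch. It assumes a hypothetical `translate_sketch` function and a stand-in `fake_generate` backend; neither name is from the repository, and the default values are placeholders rather than the app's real defaults.

```python
from typing import Callable


def build_prompt(text: str, target_code: str) -> str:
    """Prepend the target-language token plus a space, e.g. '<2fr> Hello'."""
    return f"<2{target_code}> {text}"


def fake_generate(prompt: str, **params) -> str:
    """Stand-in for the real model call (assumption); it echoes its inputs."""
    return f"{prompt} | {sorted(params.items())}"


def translate_sketch(
    text: str,
    target_code: str,
    max_new_tokens: int = 256,  # placeholder default (assumption)
    num_beams: int = 1,  # placeholder default (assumption)
    temperature: float = 1.0,  # placeholder default (assumption)
    generate: Callable[..., str] = fake_generate,
) -> str:
    # Whitespace-only input returns "" before any model work happens.
    if not text.strip():
        return ""
    return generate(
        build_prompt(text, target_code),
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
        temperature=temperature,
    )


# A two-argument call still works because the advanced params have defaults.
print(translate_sketch("Hello", "fr"))
print(repr(translate_sketch("   ", "fr")))
```

Giving the advanced parameters defaults is what lets an older two-argument caller keep working while newer callers pass the extra values.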
- CLAUDE.md +4 -4
- tests/test_app.py +22 -0
CLAUDE.md
CHANGED
@@ -25,8 +25,8 @@ uv run ruff format .
 uv run ty check

 # Test
-uv run pytest # all
-uv run pytest -m "not slow" #
+uv run pytest # all 68 tests (slow require CUDA + model download)
+uv run pytest -m "not slow" # 58 fast tests only
 uv run pytest -m slow # 10 model tests only (CUDA only)

 # Generate language mapping (dev only)
@@ -35,13 +35,13 @@ uv run scripts/generate_langmap.py <path-to-paper.pdf>

 ## Architecture

-**`app.py`** - Single-file application with a Google Translate-style layout: top row has two symmetric, filterable, region-sorted language dropdowns (source defaults to "English (en)", target defaults to "French (fr)") with a swap button between them; below that, input textbox (autofocused) and output textbox with copy button side by side. The Translate button spans full width below both textboxes (shows "Translating..." during processing). Ctrl+Enter submits from the input. The model auto-detects source language; the source dropdown is for user reference and the swap button only. Uses `@lru_cache` for lazy loading of the `google/madlad400-3b-mt` tokenizer and model. On ZeroGPU (`SPACES_ZERO_GPU=1`), `_maybe_eager_load()` places the model at module scope so the `spaces` hijack can pack weights and stream them into workers for fast cold starts; off-ZeroGPU (local, tests, cpu-basic) it stays lazy, so importing the app never downloads the model. Uses `bfloat16` on CUDA (T5/MADLAD is numerically unstable in `float16` - fp16's narrow range overflows to inf/NaN; bf16 is the format T5 was trained in), `float32` on CPU. MPS is not supported (produces garbage output with T5 models). Translation prepends a target language token with a space to the input text (e.g., `<2fr> Hello`) before tokenization and generation. Decoding is greedy by default (deterministic); a non-default `temperature` enables sampling, and `num_beams > 1` uses beam search. The `@spaces.GPU` decorator allocates GPU on HF Spaces infrastructure; its `duration` is a callable (`_estimate_duration`) that scales the GPU reservation with `max_new_tokens × num_beams` (capped at 120s). The submit handler exposes a stable `/translate` API endpoint; the swap and Translate-button handlers are `api_visibility="private"`. Only `/translate` is public.
+**`app.py`** - Single-file application with a Google Translate-style layout: top row has two symmetric, filterable, region-sorted language dropdowns (source defaults to "English (en)", target defaults to "French (fr)") with a swap button between them; below that, input textbox (autofocused) and output textbox with copy button side by side. The Translate button spans full width below both textboxes (shows "Translating..." during processing). Ctrl+Enter submits from the input. The model auto-detects source language; the source dropdown is for user reference and the swap button only, which an `info=` caption discloses. Each control carries an `info=` caption (caption text, not HTML/Markdown blocks): the target dropdown a quality-varies caveat, the input the Ctrl+Enter hint, the output model/arXiv/license provenance. Uses `@lru_cache` for lazy loading of the `google/madlad400-3b-mt` tokenizer and model. On ZeroGPU (`SPACES_ZERO_GPU=1`), `_maybe_eager_load()` places the model at module scope so the `spaces` hijack can pack weights and stream them into workers for fast cold starts; off-ZeroGPU (local, tests, cpu-basic) it stays lazy, so importing the app never downloads the model. Uses `bfloat16` on CUDA (T5/MADLAD is numerically unstable in `float16` - fp16's narrow range overflows to inf/NaN; bf16 is the format T5 was trained in), `float32` on CPU. MPS is not supported (produces garbage output with T5 models). Translation prepends a target language token with a space to the input text (e.g., `<2fr> Hello`) before tokenization and generation; whitespace-only input short-circuits to an empty string before the model loads. Decoding is greedy by default (deterministic); a non-default `temperature` enables sampling, and `num_beams > 1` uses beam search. A collapsed "Advanced" accordion exposes `max_new_tokens`/`num_beams`/`temperature` as `gr.Number` controls (no sliders; defaults mirror `translate()`, so the default surface stays greedy). Right-to-left target scripts (an explicit `RTL_CODES` token set - `region` is not a usable proxy) flip the output box to RTL via the Translate-button path. The `@spaces.GPU` decorator allocates GPU on HF Spaces infrastructure; its `duration` is a callable (`_estimate_duration`) that scales the GPU reservation with `max_new_tokens × num_beams` (capped at 120s). Both translate handlers (the private Translate-button click and the public submit) carry the advanced params, so Ctrl+Enter and the `/translate` API honor the accordion; the params keep defaults, so existing two-arg callers still work. The submit handler exposes a stable `/translate` API endpoint (returns a bare string); the swap and Translate-button handlers are `api_visibility="private"`, and both generation handlers use `show_progress="minimal"`. Only `/translate` is public.

 **`langmap/`** - Package with `langid_mapping.py`, mapping 418 language tokens to `{"name": ..., "region": ...}` dicts. Auto-generated by `scripts/generate_langmap.py` from Table 9 (Section A.1) of the MADLAD-400 paper. Available languages at runtime are the intersection of this mapping and the model's vocabulary.

 **`scripts/`** - `generate_langmap.py` parses the MADLAD-400 paper PDF (Table 9, pages 16-22) using pdfplumber and generates the static language mapping with region assignments. Dev-only tool; requires `requirements-dev.txt` dependencies.

-**`tests/`** -
+**`tests/`** - 68 tests (58 fast, 10 slow). `test_langmap.py` has 10 fast tests for mapping validation (dict shape, regions, spot-checks). `test_app.py` has 48 fast tests (signatures, device fallback, bfloat16/float32 dtype selection, ZeroGPU eager-load gating, GPU duration estimator and its signature-mirror contract, greedy-by-default decoding, whitespace-input short-circuit, RTL output direction on the button path, `requirements.txt` excludes platform packages, UI layout with symmetric dropdowns, swap button, textbox config including toolbar buttons and input autofocus, `info=` captions on dropdowns and textboxes, the Advanced accordion's `gr.Number` controls wired into both translate handlers, `show_progress="minimal"` on generation handlers, handler wiring, stable `translate` API endpoint carrying the advanced params with UI-only handlers kept private, no HTML elements, no sliders, locale codes, no title) and 10 slow tests (translation with various parameters, language mapping). Slow tests require CUDA and model download; auto-skipped without CUDA.

 ## Tooling

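The Architecture text describes a duration callable that scales with `max_new_tokens × num_beams` up to a 120s cap, and decoding that is greedy unless `temperature` or `num_beams` is changed. A rough sketch of both rules follows. The function names, the base and per-unit constants, and the default values are all assumptions for illustration; only the product rule, the 120s cap, and the greedy/sampling/beam distinction come from the text above.

```python
import math


def estimate_duration_sketch(
    text: str,
    target: str,
    max_new_tokens: int = 256,  # placeholder default (assumption)
    num_beams: int = 1,  # placeholder default (assumption)
    temperature: float = 1.0,  # unused here; kept to mirror the translate signature
    *,
    base_seconds: float = 15.0,  # hypothetical fixed overhead
    seconds_per_unit: float = 0.05,  # hypothetical cost per token-beam unit
    cap_seconds: int = 120,  # the cap stated in CLAUDE.md
) -> int:
    """Scale the reservation with max_new_tokens * num_beams, capped at cap_seconds."""
    work = max(1, int(max_new_tokens)) * max(1, int(num_beams))
    return min(cap_seconds, math.ceil(base_seconds + seconds_per_unit * work))


def generation_kwargs_sketch(
    max_new_tokens: int = 256,
    num_beams: int = 1,
    temperature: float = 1.0,
    default_temperature: float = 1.0,  # assumed value of the "default" temperature
) -> dict:
    """Greedy by default; sampling only when temperature differs from its default."""
    kwargs = {"max_new_tokens": max_new_tokens, "num_beams": num_beams, "do_sample": False}
    if temperature != default_temperature:
        kwargs.update(do_sample=True, temperature=temperature)
    return kwargs


print(estimate_duration_sketch("Hello", "fr"))
print(estimate_duration_sketch("Hello", "fr", max_new_tokens=2048, num_beams=4))
print(generation_kwargs_sketch(temperature=0.7))
```

The estimator takes the same leading parameters as the translate function so that it can be called with the handler's arguments unchanged, which is one plausible reading of the "signature-mirror contract" the tests check.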
tests/test_app.py
CHANGED
@@ -283,6 +283,20 @@ def test_dropdown_choices_include_locale_codes(demo):
     assert all("(" in label and ")" in label for label in labels), f"Expected locale codes in choices: {labels}"


+def test_dropdowns_have_info_captions(demo):
+    """Both dropdowns carry info= captions; the source one discloses that the source language
+    is auto-detected (it only feeds the swap button)."""
+    dropdowns = [b for b in demo.blocks.values() if type(b).__name__ == "Dropdown"]
+    assert all(d.info for d in dropdowns), "both dropdowns should carry info captions"
+    assert any("auto-detect" in (d.info or "").lower() for d in dropdowns), "source must disclose auto-detection"
+
+
+def test_textboxes_have_info_captions(demo):
+    """Input and output textboxes carry info= captions (Ctrl+Enter hint and model provenance)."""
+    textboxes = [b for b in demo.blocks.values() if type(b).__name__ == "Textbox"]
+    assert all(t.info for t in textboxes), "input and output textboxes should carry info captions"
+
+
 def test_demo_has_two_textboxes(demo):
     """UI should have input and output textboxes."""
     textboxes = [b for b in demo.blocks.values() if type(b).__name__ == "Textbox"]
@@ -421,6 +435,14 @@ def test_all_handlers_wired(demo):
     assert len(demo.fns) == 3, f"Expected 3 handlers, found {len(demo.fns)}"


+def test_generation_handlers_use_minimal_progress(demo):
+    """Both translate handlers use show_progress='minimal' so the multi-second generation does
+    not draw a heavy overlay over the output box."""
+    gen_fns = [fn for fn in demo.fns.values() if [type(i).__name__ for i in fn.inputs][:2] == ["Textbox", "Dropdown"]]
+    assert len(gen_fns) == 2
+    assert all(getattr(fn, "show_progress", None) == "minimal" for fn in gen_fns)
+
+
 def test_translate_endpoint_has_stable_api_name(demo):
     """The submit handler exposes a stable 'translate' API endpoint accepting text + target
     language + the advanced generation params, returning a single string."""
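The filter in `test_generation_handlers_use_minimal_progress` picks out handlers whose first two inputs are a `Textbox` followed by a `Dropdown`. The sketch below reproduces that selection against plain stub objects so it runs without Gradio. The stub classes, the `hidden` value, and the assumption that the swap handler takes two dropdowns are illustrative only and not taken from the app.

```python
from types import SimpleNamespace


# Stub component classes; only their class names matter to the filter.
class Textbox: ...
class Dropdown: ...
class Number: ...


def generation_handlers(fns: dict) -> list:
    """Keep handlers whose first two inputs are a Textbox then a Dropdown."""
    return [
        fn
        for fn in fns.values()
        if [type(i).__name__ for i in fn.inputs][:2] == ["Textbox", "Dropdown"]
    ]


# Hypothetical stand-in for demo.fns: one swap handler, two translate handlers.
advanced = [Number(), Number(), Number()]
fns = {
    0: SimpleNamespace(inputs=[Dropdown(), Dropdown()], show_progress="hidden"),
    1: SimpleNamespace(inputs=[Textbox(), Dropdown(), *advanced], show_progress="minimal"),
    2: SimpleNamespace(inputs=[Textbox(), Dropdown(), *advanced], show_progress="minimal"),
}

gen_fns = generation_handlers(fns)
print(len(gen_fns))
print(all(getattr(fn, "show_progress", None) == "minimal" for fn in gen_fns))
```

Matching on the leading input types rather than on handler names keeps the test independent of how the handlers are labelled, at the cost of breaking if another handler ever takes the same first two inputs.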