fix: harden generation params, fix swap RTL, polish from review
Address the adversarial self-review of the UI/API changes.
High — None/NaN/null safety on the now-public /translate:
- Add _normalize_params (None/NaN -> default, clamp to range) as the single
funnel; call it in translate() and in _estimate_duration (which ZeroGPU runs
*before* translate with the same uncast args). A cleared gr.Number arrives as
None and the public submit path passes values uncast, so without this a single
empty Advanced field crashed the endpoint and the duration callable.
- Make the empty-input guard None-safe: `not (text or "").strip()`.
- Drop the now-redundant int()/float() casts in _translate_with_loading.
- Type the numeric params as int|None / float|None to reflect Gradio's nulls.
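
As a rough illustration of the funnel described above, the sketch below restates the normalization, the duration estimate, and the None-safe blank check as standalone functions. It follows the defaults and ranges shown in this commit's diff (512/1/1.0, clamped to 1..1024, 1..8, 0.1..2.0), but the un-prefixed names `normalize_params`, `estimate_duration`, and `is_blank` are illustrative stand-ins, and Gradio and ZeroGPU are deliberately left out so it runs on its own.

```python
import math


def normalize_params(max_new_tokens, num_beams, temperature):
    """None/NaN -> default, then clamp to the ranges the Advanced controls advertise."""

    def _num(value, default):
        # A cleared number field arrives as None; NaN is treated the same way.
        return default if value is None or math.isnan(value) else value

    return (
        int(max(1, min(1024, _num(max_new_tokens, 512)))),
        int(max(1, min(8, _num(num_beams, 1)))),
        float(max(0.1, min(2.0, _num(temperature, 1.0)))),
    )


def estimate_duration(max_new_tokens=512, num_beams=1, temperature=1.0):
    """Normalize first, because this runs before translate() with the same raw args."""
    tokens, beams, _ = normalize_params(max_new_tokens, num_beams, temperature)
    return min(120, 30 + (tokens * beams) // 8)


def is_blank(text):
    """None-safe empty-input guard: None, "" and whitespace all count as blank."""
    return not (text or "").strip()


print(normalize_params(None, float("nan"), 9.0))  # (512, 1, 2.0)
print(estimate_duration(None, None, None))  # 94
print(is_blank(None), is_blank(" \n"), is_blank("Hello"))  # True True False
```

Because both callers share one normalizer, the GPU reservation and the actual generation cannot disagree about what a cleared field means.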
Medium — swap-button stale RTL:
- _swap_languages now emits gr.update(rtl/text_align) for both textboxes so
direction follows the swapped text (rtl is sticky; a prior RTL flip must reset).
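
A minimal sketch of the swap behaviour, assuming plain dicts in place of `gr.update(...)` so it runs without Gradio. `RTL_CODES` and `NAME_TO_CODE` here are small stand-in tables for illustration (the real ones live in `app.py` and the langmap), and `box_update` is a hypothetical helper, not a function from the commit.

```python
RTL_CODES = {"<2ar>"}  # stand-in subset, not the full set from app.py
NAME_TO_CODE = {"English (en)": "<2en>", "Arabic (ar)": "<2ar>"}  # stand-in mapping


def box_update(value, lang_name):
    """Build the update for one textbox: direction follows the language of its text."""
    rtl = NAME_TO_CODE.get(lang_name) in RTL_CODES
    return {"value": value, "rtl": rtl, "text_align": "right" if rtl else "left"}


def swap_languages(source_lang, target_lang, source_text, target_text):
    """Swap languages and text; always emit rtl explicitly so a stale flip is reset."""
    return (
        target_lang,
        source_lang,
        box_update(target_text, target_lang),  # old target text lands in the input box
        box_update(source_text, source_lang),  # old source text lands in the output box
    )


new_source, new_target, input_box, output_box = swap_languages(
    "English (en)", "Arabic (ar)", "Hello", "RTL-text"
)
print(input_box)   # {'value': 'RTL-text', 'rtl': True, 'text_align': 'right'}
print(output_box)  # {'value': 'Hello', 'rtl': False, 'text_align': 'left'}
```

Emitting `rtl=False` explicitly on the left-to-right branch is the point: since the flag is sticky, omitting it would leave a previous RTL state in place.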
Low polish:
- Gate sampling on abs(temperature - 1.0) > 1e-6 to absorb float spinner drift.
- Reword the temperature caption to describe both directions.
- Document the Ctrl+Enter-vs-button RTL divergence in the submit comment.
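
To make the tolerance gate concrete, here is a small sketch of the decision, assuming a hypothetical `should_sample` helper (in the commit the check is inlined in `translate()`). It uses the classic case where repeated 0.1 steps do not sum to exactly 1.0.

```python
def should_sample(temperature: float, num_beams: int = 1) -> bool:
    """Sample only for a genuinely non-default temperature, and never under beam search."""
    return num_beams == 1 and abs(temperature - 1.0) > 1e-6


drifted = sum([0.1] * 10)  # ten spinner steps of 0.1; not exactly 1.0 in binary floats
print(drifted == 1.0)  # False
print(should_sample(drifted))  # False: drift is absorbed, decoding stays greedy
print(should_sample(0.5))  # True: an explicit non-default temperature samples
print(should_sample(0.5, num_beams=4))  # False: beam search ignores temperature
```

An exact `temperature != 1.0` comparison would have switched the drifted value to sampling, silently making the default path non-deterministic.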
Tests (68 fast, +10): param forwarding into generate, _normalize_params
clamp/None/NaN, translate None/NaN coercion, _estimate_duration None-safety,
empty/None parametrized guard, swap RTL, RTL_CODES ⊆ langmap, gr.Number bounds,
/translate input order pinned by label, public-path assertion keyed on
api_visibility, textbox caption content, RTL LTR-branch value. Docs synced to
78/68/10.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- CLAUDE.md +4 -4
- README.md +2 -2
- app.py +68 -25
- tests/test_app.py +117 -9
|
@@ -25,8 +25,8 @@ uv run ruff format .
|
|
| 25 |
uv run ty check
|
| 26 |
|
| 27 |
# Test
|
| 28 |
-
uv run pytest # all
|
| 29 |
-
uv run pytest -m "not slow" #
|
| 30 |
uv run pytest -m slow # 10 model tests only (CUDA only)
|
| 31 |
|
| 32 |
# Generate language mapping (dev only)
|
|
@@ -35,13 +35,13 @@ uv run scripts/generate_langmap.py <path-to-paper.pdf>
|
|
| 35 |
|
| 36 |
## Architecture
|
| 37 |
|
| 38 |
-
**`app.py`** — Single-file application with a Google Translate-style layout: top row has two symmetric, filterable, region-sorted language dropdowns (source defaults to "English (en)", target defaults to "French (fr)") with a swap button ("⇄") between them; below that, input textbox (autofocused) and output textbox with copy button side by side. The Translate button spans full width below both textboxes (shows "Translating..." during processing). Ctrl+Enter submits from the input. The model auto-detects source language; the source dropdown is for user reference and the swap button only, which an `info=` caption discloses. Each control carries an `info=` caption (caption text, not HTML/Markdown blocks): the target dropdown a quality-varies caveat, the input the Ctrl+Enter hint, the output model/arXiv/license provenance. Uses `@lru_cache` for lazy loading of the `google/madlad400-3b-mt` tokenizer and model. On ZeroGPU (`SPACES_ZERO_GPU=1`), `_maybe_eager_load()` places the model at module scope so the `spaces` hijack can pack weights and stream them into workers for fast cold starts; off-ZeroGPU (local, tests, cpu-basic) it stays lazy, so importing the app never downloads the model. Uses `bfloat16` on CUDA (T5/MADLAD is numerically unstable in `float16` — fp16's narrow range overflows to inf/NaN; bf16 is the format T5 was trained in), `float32` on CPU. MPS is not supported (produces garbage output with T5 models). Translation prepends a target language token with a space to the input text (e.g., `<2fr> Hello`) before tokenization and generation; whitespace-only input short-circuits to an empty string before the model loads. Decoding is greedy by default (deterministic); a non-default `temperature` enables sampling, and `num_beams > 1` uses beam search. A collapsed "Advanced" accordion exposes `max_new_tokens`/`num_beams`/`temperature` as `gr.Number` controls (no sliders; defaults mirror `translate()`, so the default surface stays greedy). 
Right-to-left target scripts (an explicit `RTL_CODES` token set — `region` is not a usable proxy) flip the output box to RTL via the Translate-button
|
| 39 |
|
| 40 |
**`langmap/`** — Package with `langid_mapping.py`, mapping 418 language tokens to `{"name": ..., "region": ...}` dicts. Auto-generated by `scripts/generate_langmap.py` from Table 9 (Section A.1) of the MADLAD-400 paper. Available languages at runtime are the intersection of this mapping and the model's vocabulary.
|
| 41 |
|
| 42 |
**`scripts/`** — `generate_langmap.py` parses the MADLAD-400 paper PDF (Table 9, pages 16-22) using pdfplumber and generates the static language mapping with region assignments. Dev-only tool; requires `requirements-dev.txt` dependencies.
|
| 43 |
|
| 44 |
-
**`tests/`** —
|
| 45 |
|
| 46 |
## Tooling
|
| 47 |
|
|
|
|
| 25 |
uv run ty check
|
| 26 |
|
| 27 |
# Test
|
| 28 |
+
uv run pytest # all 78 tests (slow require CUDA + model download)
|
| 29 |
+
uv run pytest -m "not slow" # 68 fast tests only
|
| 30 |
uv run pytest -m slow # 10 model tests only (CUDA only)
|
| 31 |
|
| 32 |
# Generate language mapping (dev only)
|
|
|
|
| 35 |
|
| 36 |
## Architecture
|
| 37 |
|
| 38 |
+
**`app.py`** — Single-file application with a Google Translate-style layout: top row has two symmetric, filterable, region-sorted language dropdowns (source defaults to "English (en)", target defaults to "French (fr)") with a swap button ("⇄") between them; below that, input textbox (autofocused) and output textbox with copy button side by side. The Translate button spans full width below both textboxes (shows "Translating..." during processing). Ctrl+Enter submits from the input. The model auto-detects source language; the source dropdown is for user reference and the swap button only, which an `info=` caption discloses. Each control carries an `info=` caption (caption text, not HTML/Markdown blocks): the target dropdown a quality-varies caveat, the input the Ctrl+Enter hint, the output model/arXiv/license provenance. Uses `@lru_cache` for lazy loading of the `google/madlad400-3b-mt` tokenizer and model. On ZeroGPU (`SPACES_ZERO_GPU=1`), `_maybe_eager_load()` places the model at module scope so the `spaces` hijack can pack weights and stream them into workers for fast cold starts; off-ZeroGPU (local, tests, cpu-basic) it stays lazy, so importing the app never downloads the model. Uses `bfloat16` on CUDA (T5/MADLAD is numerically unstable in `float16` — fp16's narrow range overflows to inf/NaN; bf16 is the format T5 was trained in), `float32` on CPU. MPS is not supported (produces garbage output with T5 models). Translation prepends a target language token with a space to the input text (e.g., `<2fr> Hello`) before tokenization and generation; whitespace-only or `None` input short-circuits to an empty string before the model loads. The generation params are normalized in `translate()` via `_normalize_params` (`None`/`NaN` → default, then clamped to range) so the cast-less public path and the ZeroGPU duration callable can't crash on a cleared `gr.Number` field. 
Decoding is greedy by default (deterministic); a non-default `temperature` (tolerance-compared to absorb float spinner drift) enables sampling, and `num_beams > 1` uses beam search. A collapsed "Advanced" accordion exposes `max_new_tokens`/`num_beams`/`temperature` as `gr.Number` controls (no sliders; defaults mirror `translate()`, so the default surface stays greedy). Right-to-left target scripts (an explicit `RTL_CODES` token set — `region` is not a usable proxy) flip the output box to RTL via the Translate-button and swap paths; Ctrl+Enter/`/translate` return a bare string and stay LTR. The `@spaces.GPU` decorator allocates GPU on HF Spaces infrastructure; its `duration` is a callable (`_estimate_duration`) that scales the GPU reservation with `max_new_tokens × num_beams` (capped at 120s). Both translate handlers (the private Translate-button click and the public submit) carry the advanced params, so Ctrl+Enter and the `/translate` API honor the accordion; the params keep defaults, so existing two-arg callers still work. The submit handler exposes a stable `/translate` API endpoint (returns a bare string); the swap and Translate-button handlers are `api_visibility="private"`, and both generation handlers use `show_progress="minimal"`. Only `/translate` is public.
|
| 39 |
|
| 40 |
**`langmap/`** — Package with `langid_mapping.py`, mapping 418 language tokens to `{"name": ..., "region": ...}` dicts. Auto-generated by `scripts/generate_langmap.py` from Table 9 (Section A.1) of the MADLAD-400 paper. Available languages at runtime are the intersection of this mapping and the model's vocabulary.
|
| 41 |
|
| 42 |
**`scripts/`** — `generate_langmap.py` parses the MADLAD-400 paper PDF (Table 9, pages 16-22) using pdfplumber and generates the static language mapping with region assignments. Dev-only tool; requires `requirements-dev.txt` dependencies.
|
| 43 |
|
| 44 |
+
**`tests/`** — 78 tests (68 fast, 10 slow). `test_langmap.py` has 10 fast tests for mapping validation (dict shape, regions, spot-checks). `test_app.py` has 58 fast tests (signatures, device fallback, bfloat16/float32 dtype selection, ZeroGPU eager-load gating, GPU duration estimator and its signature-mirror contract + `None`-safety, greedy-by-default decoding, param forwarding into `generate`, `_normalize_params` None/NaN/clamp coercion, empty/`None`-input short-circuit, RTL output direction on the button and swap paths, `RTL_CODES` ⊆ langmap invariant, `requirements.txt` excludes platform packages, UI layout with symmetric dropdowns, swap button, textbox config including toolbar buttons and input autofocus, `info=` captions on dropdowns and textboxes spot-checked by content, the Advanced accordion's `gr.Number` controls and their bounds, advanced params reaching the public endpoint by `api_visibility` with the `/translate` input order pinned by label, `show_progress="minimal"` on generation handlers, handler wiring, stable `translate` API endpoint carrying the advanced params with UI-only handlers kept private, no HTML elements, no sliders, locale codes, no title) and 10 slow tests (translation with various parameters, language mapping). Slow tests require CUDA and model download; auto-skipped without CUDA.
|
| 45 |
|
| 46 |
## Tooling
|
| 47 |
|
|
@@ -39,6 +39,6 @@ The Gradio interface launches at `http://localhost:7860`.
|
|
| 39 |
uv run ruff check . # lint
|
| 40 |
uv run ruff format . # format
|
| 41 |
uv run ty check # type check
|
| 42 |
-
uv run pytest -m "not slow" #
|
| 43 |
-
uv run pytest # all
|
| 44 |
```
|
|
|
|
| 39 |
uv run ruff check . # lint
|
| 40 |
uv run ruff format . # format
|
| 41 |
uv run ty check # type check
|
| 42 |
+
uv run pytest -m "not slow" # 68 fast tests
|
| 43 |
+
uv run pytest # all 78 tests (slow require CUDA + model download)
|
| 44 |
```
|
|
@@ -3,6 +3,7 @@ Translation interface using the MADLAD-400 3B model.
|
|
| 3 |
Translates between 418 languages from the MADLAD-400 paper.
|
| 4 |
"""
|
| 5 |
|
|
|
|
| 6 |
import os
|
| 7 |
import time
|
| 8 |
import warnings
|
|
@@ -105,19 +106,40 @@ def _maybe_eager_load() -> None:
|
|
| 105 |
_load_model()
|
| 106 |
|
| 107 |
|
| 108 |
def _estimate_duration(
|
| 109 |
text: str,
|
| 110 |
target_language_name: str,
|
| 111 |
-
max_new_tokens: int = 512,
|
| 112 |
-
num_beams: int = 1,
|
| 113 |
-
temperature: float = 1.0,
|
| 114 |
) -> int:
|
| 115 |
-
"""Reserve GPU time scaled to the worst case: generation cost grows with the
|
| 116 |
-
|
| 117 |
-
|
| 118 |
-
|
| 119 |
-
translate() (zerogpu.md 'Sizing
|
| 120 |
-
|
| 121 |
return min(120, 30 + (max_new_tokens * num_beams) // 8)
|
| 122 |
|
| 123 |
|
|
@@ -125,16 +147,24 @@ def _estimate_duration(
|
|
| 125 |
def translate(
|
| 126 |
text: str,
|
| 127 |
target_language_name: str,
|
| 128 |
-
max_new_tokens: int = 512,
|
| 129 |
-
num_beams: int = 1,
|
| 130 |
-
temperature: float = 1.0,
|
| 131 |
) -> str:
|
| 132 |
# No-op on empty/whitespace input: skip the model entirely rather than feeding a bare
|
| 133 |
# "<2xx> " prompt (which would burn generation time and emit a stray token). Guard lives
|
| 134 |
# here, not in _translate_with_loading, so the public /translate and Ctrl+Enter paths
|
| 135 |
-
# (which call translate() directly) are covered too.
|
| 136 |
-
|
|
|
|
| 137 |
return ""
|
| 138 |
tokenizer = _load_tokenizer()
|
| 139 |
model = _load_model()
|
| 140 |
device = model.device
|
|
@@ -144,7 +174,7 @@ def translate(
|
|
| 144 |
if target_code is None:
|
| 145 |
raise ValueError(f"Unsupported language: {target_language_name}")
|
| 146 |
|
| 147 |
-
if num_beams > 1 and
|
| 148 |
gr.Info("Temperature has no effect when beam search is enabled (num_beams > 1).")
|
| 149 |
|
| 150 |
input_ids = tokenizer(target_code + " " + text, return_tensors="pt").input_ids.to(device)
|
|
@@ -152,7 +182,7 @@ def translate(
|
|
| 152 |
generate_kwargs: dict = {"input_ids": input_ids, "max_new_tokens": max_new_tokens, "num_beams": num_beams}
|
| 153 |
# Greedy by default (deterministic, higher-quality MT). Only sample when the user
|
| 154 |
# explicitly sets a non-default temperature; beam search (num_beams > 1) ignores it.
|
| 155 |
-
if num_beams == 1 and
|
| 156 |
generate_kwargs["do_sample"] = True
|
| 157 |
generate_kwargs["temperature"] = temperature
|
| 158 |
|
|
@@ -171,12 +201,13 @@ def translate(
|
|
| 171 |
def _translate_with_loading(
|
| 172 |
text: str,
|
| 173 |
target_language_name: str,
|
| 174 |
-
max_new_tokens: int = 512,
|
| 175 |
-
num_beams: int = 1,
|
| 176 |
-
temperature: float = 1.0,
|
| 177 |
) -> Generator[tuple[object, object], None, None]:
|
| 178 |
yield gr.update(value="Translating...", interactive=False), gr.update()
|
| 179 |
-
|
|
|
|
| 180 |
# Flip the output box to RTL for right-to-left target scripts so the translation reads
|
| 181 |
# correctly; reset to LTR otherwise (rtl is sticky across reruns). Only the button path
|
| 182 |
# carries this — the public /translate endpoint stays a bare str to keep its API stable.
|
|
@@ -188,9 +219,20 @@ def _translate_with_loading(
|
|
| 188 |
|
| 189 |
def _swap_languages(
|
| 190 |
source_lang: str, target_lang: str, source_text: str, target_text: str
|
| 191 |
-
) -> tuple[str, str,
|
| 192 |
-
"""Swap source/target languages and their text
|
| 193 |
-
|
| 194 |
|
| 195 |
|
| 196 |
def _build_demo() -> gr.Blocks:
|
|
@@ -262,7 +304,7 @@ def _build_demo() -> gr.Blocks:
|
|
| 262 |
minimum=0.1,
|
| 263 |
maximum=2.0,
|
| 264 |
step=0.1,
|
| 265 |
-
info="No effect at 1.0 (greedy) or when Beams > 1;
|
| 266 |
)
|
| 267 |
|
| 268 |
# UI-only handlers: kept off the public API surface (private) so only /translate is exposed.
|
|
@@ -284,7 +326,8 @@ def _build_demo() -> gr.Blocks:
|
|
| 284 |
# /translate exposes the advanced params too. They all have defaults, so existing
|
| 285 |
# two-arg callers (text, target) keep working; wiring them here also makes Ctrl+Enter
|
| 286 |
# honor the Advanced accordion, matching the Translate button. The endpoint returns a
|
| 287 |
-
# bare str
|
|
|
|
| 288 |
input_text.submit(
|
| 289 |
fn=translate,
|
| 290 |
inputs=[input_text, target_language, max_new_tokens, num_beams, temperature],
|
|
|
|
| 3 |
Translates between 418 languages from the MADLAD-400 paper.
|
| 4 |
"""
|
| 5 |
|
| 6 |
+
import math
|
| 7 |
import os
|
| 8 |
import time
|
| 9 |
import warnings
|
|
|
|
| 106 |
_load_model()
|
| 107 |
|
| 108 |
|
| 109 |
+
def _normalize_params(
|
| 110 |
+
max_new_tokens: float | None, num_beams: float | None, temperature: float | None
|
| 111 |
+
) -> tuple[int, int, float]:
|
| 112 |
+
"""Coerce the advanced generation params to safe values. A cleared ``gr.Number`` arrives as
|
| 113 |
+
``None`` (Gradio skips its bounds check for ``None``) and the public ``/translate`` path
|
| 114 |
+
passes values uncast, so this is the single funnel every caller — button, submit, API,
|
| 115 |
+
direct, and the ZeroGPU duration callable — goes through. ``None``/``NaN`` fall back to the
|
| 116 |
+
defaults; values are clamped to the ranges the Advanced ``gr.Number`` controls advertise."""
|
| 117 |
+
|
| 118 |
+
def _num(value: float | None, default: float) -> float:
|
| 119 |
+
return default if value is None or math.isnan(value) else value
|
| 120 |
+
|
| 121 |
+
return (
|
| 122 |
+
int(max(1, min(1024, _num(max_new_tokens, 512)))),
|
| 123 |
+
int(max(1, min(8, _num(num_beams, 1)))),
|
| 124 |
+
float(max(0.1, min(2.0, _num(temperature, 1.0)))),
|
| 125 |
+
)
|
| 126 |
+
|
| 127 |
+
|
| 128 |
def _estimate_duration(
|
| 129 |
text: str,
|
| 130 |
target_language_name: str,
|
| 131 |
+
max_new_tokens: int | None = 512,
|
| 132 |
+
num_beams: int | None = 1,
|
| 133 |
+
temperature: float | None = 1.0,
|
| 134 |
) -> int:
|
| 135 |
+
"""Reserve GPU time scaled to the worst case: generation cost grows with the number of
|
| 136 |
+
tokens generated and the beam width. Mirrors translate()'s signature (ZeroGPU calls the
|
| 137 |
+
duration callable with the decorated function's args, and runs it *before* translate(), so
|
| 138 |
+
it must tolerate the same cleared-field ``None`` values — normalize first). Conservative and
|
| 139 |
+
capped at 120s; calibrate from the perf_counter log in translate() (zerogpu.md 'Sizing
|
| 140 |
+
duration')."""
|
| 141 |
+
del text, target_language_name # only token/beam counts drive runtime
|
| 142 |
+
max_new_tokens, num_beams, _ = _normalize_params(max_new_tokens, num_beams, temperature)
|
| 143 |
return min(120, 30 + (max_new_tokens * num_beams) // 8)
|
| 144 |
|
| 145 |
|
|
|
|
| 147 |
def translate(
|
| 148 |
text: str,
|
| 149 |
target_language_name: str,
|
| 150 |
+
max_new_tokens: int | None = 512,
|
| 151 |
+
num_beams: int | None = 1,
|
| 152 |
+
temperature: float | None = 1.0,
|
| 153 |
) -> str:
|
| 154 |
# No-op on empty/whitespace input: skip the model entirely rather than feeding a bare
|
| 155 |
# "<2xx> " prompt (which would burn generation time and emit a stray token). Guard lives
|
| 156 |
# here, not in _translate_with_loading, so the public /translate and Ctrl+Enter paths
|
| 157 |
+
# (which call translate() directly) are covered too. (text or "") stays None-safe for an
|
| 158 |
+
# API caller that POSTs a null text field. Returns a str, so the contract holds.
|
| 159 |
+
if not (text or "").strip():
|
| 160 |
return ""
|
| 161 |
+
# Normalize the generation params here — translate() is the single source of truth. The
|
| 162 |
+
# public submit path passes gr.Number values uncast, and a cleared field arrives as
|
| 163 |
+
# None/NaN, so coerce and clamp before use (the duration callable normalizes identically).
|
| 164 |
+
max_new_tokens, num_beams, temperature = _normalize_params(max_new_tokens, num_beams, temperature)
|
| 165 |
+
# Compare with a tolerance so float spinner drift (e.g. ten 0.1 steps summing to 0.999…) doesn't trip sampling.
|
| 166 |
+
sampling = abs(temperature - 1.0) > 1e-6
|
| 167 |
+
|
| 168 |
tokenizer = _load_tokenizer()
|
| 169 |
model = _load_model()
|
| 170 |
device = model.device
|
|
|
|
| 174 |
if target_code is None:
|
| 175 |
raise ValueError(f"Unsupported language: {target_language_name}")
|
| 176 |
|
| 177 |
+
if num_beams > 1 and sampling:
|
| 178 |
gr.Info("Temperature has no effect when beam search is enabled (num_beams > 1).")
|
| 179 |
|
| 180 |
input_ids = tokenizer(target_code + " " + text, return_tensors="pt").input_ids.to(device)
|
|
|
|
| 182 |
generate_kwargs: dict = {"input_ids": input_ids, "max_new_tokens": max_new_tokens, "num_beams": num_beams}
|
| 183 |
# Greedy by default (deterministic, higher-quality MT). Only sample when the user
|
| 184 |
# explicitly sets a non-default temperature; beam search (num_beams > 1) ignores it.
|
| 185 |
+
if num_beams == 1 and sampling:
|
| 186 |
generate_kwargs["do_sample"] = True
|
| 187 |
generate_kwargs["temperature"] = temperature
|
| 188 |
|
|
|
|
| 201 |
def _translate_with_loading(
|
| 202 |
text: str,
|
| 203 |
target_language_name: str,
|
| 204 |
+
max_new_tokens: int | None = 512,
|
| 205 |
+
num_beams: int | None = 1,
|
| 206 |
+
temperature: float | None = 1.0,
|
| 207 |
) -> Generator[tuple[object, object], None, None]:
|
| 208 |
yield gr.update(value="Translating...", interactive=False), gr.update()
|
| 209 |
+
# translate() normalizes the params (None/NaN/clamp), so forward them as-is.
|
| 210 |
+
result = translate(text, target_language_name, max_new_tokens, num_beams, temperature)
|
| 211 |
# Flip the output box to RTL for right-to-left target scripts so the translation reads
|
| 212 |
# correctly; reset to LTR otherwise (rtl is sticky across reruns). Only the button path
|
| 213 |
# carries this — the public /translate endpoint stays a bare str to keep its API stable.
|
|
|
|
| 219 |
|
| 220 |
def _swap_languages(
|
| 221 |
source_lang: str, target_lang: str, source_text: str, target_text: str
|
| 222 |
+
) -> tuple[str, str, object, object]:
|
| 223 |
+
"""Swap source/target languages and their text, flipping each textbox's direction to follow
|
| 224 |
+
the text that lands in it. rtl is sticky across reruns, so a stale RTL flip left by a prior
|
| 225 |
+
translation must be reset. After the swap the input box holds the old target text and the
|
| 226 |
+
output box holds the old source text."""
|
| 227 |
+
name_to_code, _ = _build_language_mappings()
|
| 228 |
+
input_rtl = name_to_code.get(target_lang) in RTL_CODES # old target text moves into the input box
|
| 229 |
+
output_rtl = name_to_code.get(source_lang) in RTL_CODES # old source text moves into the output box
|
| 230 |
+
return (
|
| 231 |
+
target_lang,
|
| 232 |
+
source_lang,
|
| 233 |
+
gr.update(value=target_text, rtl=input_rtl, text_align="right" if input_rtl else "left"),
|
| 234 |
+
gr.update(value=source_text, rtl=output_rtl, text_align="right" if output_rtl else "left"),
|
| 235 |
+
)
|
| 236 |
|
| 237 |
|
| 238 |
def _build_demo() -> gr.Blocks:
|
|
|
|
| 304 |
minimum=0.1,
|
| 305 |
maximum=2.0,
|
| 306 |
step=0.1,
|
| 307 |
+
info="No effect at 1.0 (greedy) or when Beams > 1; below 1.0 is more focused, above 1.0 more random.",
|
| 308 |
)
|
| 309 |
|
| 310 |
# UI-only handlers: kept off the public API surface (private) so only /translate is exposed.
|
|
|
|
| 326 |
# /translate exposes the advanced params too. They all have defaults, so existing
|
| 327 |
# two-arg callers (text, target) keep working; wiring them here also makes Ctrl+Enter
|
| 328 |
# honor the Advanced accordion, matching the Translate button. The endpoint returns a
|
| 329 |
+
# bare str, so an RTL target submitted via Ctrl+Enter is NOT direction-flipped — that
|
| 330 |
+
# happens only on the Translate-button path (an accepted, documented UI divergence).
|
| 331 |
input_text.submit(
|
| 332 |
fn=translate,
|
| 333 |
inputs=[input_text, target_language, max_new_tokens, num_beams, temperature],
|
|
@@ -173,20 +173,99 @@ def test_translate_greedy_by_default_samples_on_custom_temperature():
|
|
| 173 |
assert sampled.get("do_sample") is True and sampled["temperature"] == 0.5
|
| 174 |
|
| 175 |
|
| 176 |
-
def
|
| 177 |
-
"""
|
| 178 |
import app
|
| 179 |
|
| 180 |
with (
|
| 181 |
patch("app._load_model") as load_model,
|
| 182 |
patch("app._load_tokenizer") as load_tokenizer,
|
| 183 |
):
|
| 184 |
-
result = app.translate(
|
| 185 |
assert result == ""
|
| 186 |
load_model.assert_not_called()
|
| 187 |
load_tokenizer.assert_not_called()
|
| 188 |
|
| 189 |
|
| 190 |
def test_translate_with_loading_flips_rtl_for_rtl_target():
|
| 191 |
"""The private button path marks the output RTL for right-to-left target languages and
|
| 192 |
resets to LTR otherwise (rtl is sticky across reruns)."""
|
|
@@ -204,7 +283,17 @@ def test_translate_with_loading_flips_rtl_for_rtl_target():
|
|
| 204 |
ltr = final_output("French (fr)", "<2fr>")
|
| 205 |
assert rtl["rtl"] is True and rtl["text_align"] == "right"
|
| 206 |
assert ltr["rtl"] is False and ltr["text_align"] == "left"
|
| 207 |
-
assert rtl["value"] == "out"
|
| 208 |
|
| 209 |
|
| 210 |
def test_requirements_excludes_platform_packages():
|
|
@@ -292,9 +381,13 @@ def test_dropdowns_have_info_captions(demo):
|
|
| 292 |
|
| 293 |
|
| 294 |
def test_textboxes_have_info_captions(demo):
|
| 295 |
-
"""Input
|
| 296 |
textboxes = [b for b in demo.blocks.values() if type(b).__name__ == "Textbox"]
|
| 297 |
assert all(t.info for t in textboxes), "input and output textboxes should carry info captions"
|
| 298 |
|
| 299 |
|
| 300 |
def test_demo_has_two_textboxes(demo):
|
|
@@ -416,18 +509,30 @@ def test_advanced_params_are_numbers(demo):
|
|
| 416 |
assert len(numbers) == 3, f"Expected 3 Number controls, found {len(numbers)}"
|
| 417 |
|
| 418 |
|
| 419 |
def test_advanced_params_wired_to_both_translate_handlers(demo):
|
| 420 |
"""Both translate handlers (button click + public submit) carry the three advanced Number
|
| 421 |
-
params after text + language
|
|
|
|
| 422 |
full_input_fns = [
|
| 423 |
fn
|
| 424 |
for fn in demo.fns.values()
|
| 425 |
if [type(i).__name__ for i in fn.inputs] == ["Textbox", "Dropdown", "Number", "Number", "Number"]
|
| 426 |
]
|
| 427 |
assert len(full_input_fns) == 2, "Expected both translate handlers to carry the 3 advanced params"
|
| 428 |
-
|
| 429 |
-
|
| 430 |
-
)
|
| 431 |
|
| 432 |
|
| 433 |
def test_all_handlers_wired(demo):
|
|
@@ -450,6 +555,9 @@ def test_translate_endpoint_has_stable_api_name(demo):
|
|
| 450 |
assert len(api_fns) == 1, "Expected exactly one handler with api_name='translate'"
|
| 451 |
fn = api_fns[0]
|
| 452 |
assert [type(i).__name__ for i in fn.inputs] == ["Textbox", "Dropdown", "Number", "Number", "Number"]
|
| 453 |
assert [type(o).__name__ for o in fn.outputs] == ["Textbox"]
|
| 454 |
|
| 455 |
|
|
|
|
| 173 |
assert sampled.get("do_sample") is True and sampled["temperature"] == 0.5
|
| 174 |
|
| 175 |
|
| 176 |
+
def _run_translate(text, target, **kwargs):
|
| 177 |
+
"""Call translate() against a mocked model/tokenizer and return generate()'s kwargs + result."""
|
| 178 |
+
import app
|
| 179 |
+
|
| 180 |
+
model = MagicMock()
|
| 181 |
+
model.device = torch.device("cpu")
|
| 182 |
+
model.generate.return_value = [[0]]
|
| 183 |
+
tokenizer = MagicMock()
|
| 184 |
+
tokenizer.decode.return_value = "out"
|
| 185 |
+
with (
|
| 186 |
+
patch("app._load_model", return_value=model),
|
| 187 |
+
patch("app._load_tokenizer", return_value=tokenizer),
|
| 188 |
+
patch("app._build_language_mappings", return_value=({"French (fr)": "<2fr>"}, ["French (fr)"])),
|
| 189 |
+
):
|
| 190 |
+
result = app.translate(text, target, **kwargs)
|
| 191 |
+
return model.generate.call_args.kwargs if model.generate.called else None, result
|
| 192 |
+
|
| 193 |
+
|
| 194 |
+
def test_translate_forwards_generation_params():
|
| 195 |
+
"""Non-default max_new_tokens/num_beams must reach model.generate; beam search must not sample."""
|
| 196 |
+
kwargs, _ = _run_translate("Hello", "French (fr)", max_new_tokens=10, num_beams=4)
|
| 197 |
+
assert kwargs["max_new_tokens"] == 10
|
| 198 |
+
assert kwargs["num_beams"] == 4
|
| 199 |
+
assert "do_sample" not in kwargs, "beam search must not enable sampling"
|
| 200 |
+
|
| 201 |
+
|
| 202 |
+
def test_normalize_params_clamps_and_defaults():
|
| 203 |
+
"""_normalize_params coerces None/NaN to defaults and clamps to the advertised ranges."""
|
| 204 |
+
import app
|
| 205 |
+
|
| 206 |
+
assert app._normalize_params(None, None, None) == (512, 1, 1.0)
|
| 207 |
+
nan = float("nan")
|
| 208 |
+
assert app._normalize_params(nan, nan, nan) == (512, 1, 1.0)
|
| 209 |
+
assert app._normalize_params(99999, 99, 9.0) == (1024, 8, 2.0) # clamp high
|
| 210 |
+
assert app._normalize_params(0, 0, 0.0) == (1, 1, 0.1) # clamp low
|
| 211 |
+
mnt, beams, temp = app._normalize_params(10.0, 4.0, 0.5)
|
| 212 |
+
assert (mnt, beams, temp) == (10, 4, 0.5)
|
| 213 |
+
assert type(mnt) is int and type(beams) is int and type(temp) is float
|
| 214 |
+
|
| 215 |
+
|
| 216 |
+
def test_translate_normalizes_invalid_params():
|
| 217 |
+
"""A cleared gr.Number arrives as None (and temperature can be NaN) on the public path;
|
| 218 |
+
translate() must coerce to defaults instead of crashing or corrupting sampling."""
|
| 219 |
+
kwargs, result = _run_translate(
|
| 220 |
+
"Hello", "French (fr)", max_new_tokens=None, num_beams=None, temperature=float("nan")
|
| 221 |
+
)
|
| 222 |
+
assert result == "out"
|
| 223 |
+
assert kwargs["max_new_tokens"] == 512 and kwargs["num_beams"] == 1
|
| 224 |
+
assert "do_sample" not in kwargs, "NaN temperature must fall back to greedy"
|
| 225 |
+
|
| 226 |
+
|
| 227 |
+
def test_estimate_duration_handles_none_params():
|
| 228 |
+
"""The ZeroGPU duration callable runs before translate() with the same uncast args, so a
|
| 229 |
+
cleared gr.Number (None) must not crash it."""
|
| 230 |
+
import app
|
| 231 |
+
|
| 232 |
+
assert isinstance(app._estimate_duration("hi", "French (fr)", None, None, None), int)
|
| 233 |
+
|
| 234 |
+
|
| 235 |
+
@pytest.mark.parametrize("blank", ["", " ", "\n\t", None])
|
| 236 |
+
def test_translate_skips_model_on_empty_input(blank):
|
| 237 |
+
"""Empty/whitespace/None input short-circuits to '' without loading or running the model."""
|
| 238 |
import app
|
| 239 |
|
| 240 |
with (
|
| 241 |
patch("app._load_model") as load_model,
|
| 242 |
patch("app._load_tokenizer") as load_tokenizer,
|
| 243 |
):
|
| 244 |
+
result = app.translate(blank, "French (fr)")
|
| 245 |
assert result == ""
|
| 246 |
load_model.assert_not_called()
|
| 247 |
load_tokenizer.assert_not_called()
|
| 248 |
|
| 249 |
|
| 250 |
+
def test_swap_flips_rtl_to_follow_text():
|
| 251 |
+
"""Swapping must move each textbox's direction with the text: after EN->Arabic then swap,
|
| 252 |
+
the input box (now holding the Arabic translation) goes RTL and the output box (now holding
|
| 253 |
+
the English source) resets to LTR."""
|
| 254 |
+
import gradio as gr
|
| 255 |
+
|
| 256 |
+
import app
|
| 257 |
+
|
| 258 |
+
name_to_code = {"English (en)": "<2en>", "Arabic (ar)": "<2ar>"}
|
| 259 |
+
with patch("app._build_language_mappings", return_value=(name_to_code, list(name_to_code))):
|
| 260 |
+
new_source, new_target, input_update, output_update = app._swap_languages(
|
| 261 |
+
"English (en)", "Arabic (ar)", "Hello", "RTL-text"
|
| 262 |
+
)
|
| 263 |
+
assert (new_source, new_target) == ("Arabic (ar)", "English (en)")
|
| 264 |
+
# input box now holds the Arabic translation -> RTL; output box holds the English source -> LTR
|
| 265 |
+
assert input_update == gr.update(value="RTL-text", rtl=True, text_align="right")
|
| 266 |
+
assert output_update == gr.update(value="Hello", rtl=False, text_align="left")
|
| 267 |
+
|
| 268 |
+
|
| 269 |
def test_translate_with_loading_flips_rtl_for_rtl_target():
|
| 270 |
"""The private button path marks the output RTL for right-to-left target languages and
|
| 271 |
resets to LTR otherwise (rtl is sticky across reruns)."""
|
|
|
|
     ltr = final_output("French (fr)", "<2fr>")
     assert rtl["rtl"] is True and rtl["text_align"] == "right"
     assert ltr["rtl"] is False and ltr["text_align"] == "left"
+    assert rtl["value"] == "out" and ltr["value"] == "out"  # both branches forward the result
+
+
+def test_rtl_codes_are_valid_langmap_tokens():
+    """Every RTL_CODES token must exist in the langmap, so a langmap regeneration that renames
+    or drops a token can't silently disable an RTL flip without failing this test."""
+    import app
+    from langmap.langid_mapping import langid_to_language
+
+    missing = app.RTL_CODES - set(langid_to_language)
+    assert not missing, f"RTL_CODES not in langmap: {missing}"


 def test_requirements_excludes_platform_packages():
|
|
|
|
 def test_textboxes_have_info_captions(demo):
+    """Input box carries the Ctrl+Enter hint; output box carries model/arXiv/license provenance."""
     textboxes = [b for b in demo.blocks.values() if type(b).__name__ == "Textbox"]
     assert all(t.info for t in textboxes), "input and output textboxes should carry info captions"
+    input_box = next(t for t in textboxes if t.interactive is not False)
+    output_box = next(t for t in textboxes if t.interactive is False)
+    assert "ctrl+enter" in input_box.info.lower()
+    assert "madlad400-3b-mt" in output_box.info


 def test_demo_has_two_textboxes(demo):
|
|
|
|
     assert len(numbers) == 3, f"Expected 3 Number controls, found {len(numbers)}"


+def test_advanced_params_have_safe_bounds(demo):
+    """The Number controls must keep their documented bounds: for the public /translate path,
+    Gradio's component preprocess is the server-side guard keeping params in range."""
+    numbers = {n.label: n for n in demo.blocks.values() if type(n).__name__ == "Number"}
+    assert numbers["Max new tokens"].minimum == 1 and numbers["Max new tokens"].maximum == 1024
+    assert numbers["Max new tokens"].precision == 0
+    assert numbers["Beams"].minimum == 1 and numbers["Beams"].maximum == 8
+    assert numbers["Beams"].precision == 0
+    assert numbers["Temperature"].minimum == 0.1 and numbers["Temperature"].maximum == 2.0
+
+
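The commit message describes `_normalize_params` as mapping None/NaN to a default and clamping to range. The sketch below shows one way such a funnel could look. It is not the implementation in `app.py`: the names `normalize` and `normalize_params` are hypothetical, the bounds are taken from the test, and the defaults (256, 4, 1.0) are assumptions made only for this example.

```python
import math


def normalize(value, default, lo, hi, cast=float):
    """Map None/NaN to `default`, then clamp into [lo, hi] and cast."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        value = default
    return cast(min(max(value, lo), hi))


def normalize_params(max_new_tokens=None, num_beams=None, temperature=None):
    """Bounds mirror the Number controls in the test; the defaults are assumed."""
    return (
        normalize(max_new_tokens, 256, 1, 1024, int),
        normalize(num_beams, 4, 1, 8, int),
        normalize(temperature, 1.0, 0.1, 2.0, float),
    )


# A cleared field (None), a NaN, and an out-of-range value all come back usable.
print(normalize_params(None, float("nan"), 5.0))
```

Routing every caller through one function of this shape is what lets both `translate()` and the duration estimate tolerate a cleared Advanced field.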
 def test_advanced_params_wired_to_both_translate_handlers(demo):
     """Both translate handlers (button click + public submit) carry the three advanced Number
+    params after text + language. Exactly one of them is the public /translate endpoint, so the
+    params demonstrably reach the public path (keyed on api_visibility, not a name coincidence)."""
     full_input_fns = [
         fn
         for fn in demo.fns.values()
         if [type(i).__name__ for i in fn.inputs] == ["Textbox", "Dropdown", "Number", "Number", "Number"]
     ]
     assert len(full_input_fns) == 2, "Expected both translate handlers to carry the 3 advanced params"
+    public = [fn for fn in full_input_fns if getattr(fn, "api_visibility", None) == "public"]
+    assert len(public) == 1, "exactly one full-input handler should be the public endpoint"
+    assert getattr(public[0], "api_name", None) == "translate", "the public one must be /translate"


 def test_all_handlers_wired(demo):
|
|
|
|
     assert len(api_fns) == 1, "Expected exactly one handler with api_name='translate'"
     fn = api_fns[0]
     assert [type(i).__name__ for i in fn.inputs] == ["Textbox", "Dropdown", "Number", "Number", "Number"]
+    # the three Number inputs are positionally indistinguishable by type, so pin their order by
+    # label; a num_beams/temperature swap in the inputs= list would otherwise pass silently.
+    assert [i.label for i in fn.inputs[2:]] == ["Max new tokens", "Beams", "Temperature"]
     assert [type(o).__name__ for o in fn.outputs] == ["Textbox"]
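The reason for pinning the input order by label can be shown with stand-in objects. In this sketch the `Number` dataclass is a hypothetical substitute for the Gradio component, carrying only a label; the three label strings are the ones asserted in the test.

```python
from dataclasses import dataclass


@dataclass
class Number:
    """Stand-in for a Gradio Number component; only the label matters here."""

    label: str


correct = [Number("Max new tokens"), Number("Beams"), Number("Temperature")]
swapped = [Number("Max new tokens"), Number("Temperature"), Number("Beams")]


def type_names(inputs):
    return [type(i).__name__ for i in inputs]


def labels(inputs):
    return [i.label for i in inputs]


# A type-only check cannot tell the two orderings apart...
assert type_names(correct) == type_names(swapped)
# ...whereas a label check distinguishes them.
assert labels(correct) != labels(swapped)
```

Because API clients pass arguments positionally, a silent reorder of this kind would send a beam count where a temperature is expected, so the label assertion guards the endpoint's contract rather than just its shape.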
|