Scripts
Automation scripts for quality checks, resource catalog validation, and search index generation.
Available scripts
validate_normalization.py: validate normalization seed TSV format and rules.
check_links.py: ensure markdown links are clickable (optional online reachability check).
validate_resource_catalog.py: validate resources/catalog/resources.json.
generate_resource_views.py: generate resources/*/README.md, resources/README.md, and docs/search/resources.json from the catalog.
sync_resources.py: collect new candidate Pashto resources from public endpoints into resources/catalog/pending_candidates.json.
Usage
Validate normalization seed file:
python scripts/validate_normalization.py data/processed/normalization_seed_v0.1.tsv
Validate resource catalog:
python scripts/validate_resource_catalog.py
Generate markdown and search index from catalog:
python scripts/generate_resource_views.py
Sync candidate resources for maintainer review:
python scripts/sync_resources.py --limit 20
Check markdown link format (offline):
python scripts/check_links.py
Check markdown links and verify URLs online:
python scripts/check_links.py --online
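Run the offline checks in one pass:
The commands above can be chained in a small wrapper. This is a minimal sketch and not part of the repository: `CHECKS`, `build_commands`, and `run_checks` are hypothetical names, the order of the checks is an assumption, and the sketch assumes each script exits with a non-zero status on failure, which this README does not confirm. `sync_resources.py` and the `--online` link check are left out because they contact external endpoints.

```python
import subprocess
import sys

# Script invocations copied from the usage section; the ordering is an assumption.
CHECKS = [
    ["scripts/validate_normalization.py", "data/processed/normalization_seed_v0.1.tsv"],
    ["scripts/validate_resource_catalog.py"],
    ["scripts/generate_resource_views.py"],
    ["scripts/check_links.py"],
]


def build_commands(python=sys.executable):
    """Prefix each script invocation with the Python interpreter to use."""
    return [[python, *args] for args in CHECKS]


def run_checks(runner=subprocess.run):
    """Run each check in order and stop at the first non-zero exit code.

    Assumes (hypothetically) that every script signals failure through its
    exit status. `runner` is injectable so the flow can be exercised without
    executing the real scripts.
    """
    for cmd in build_commands():
        result = runner(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0
```

To use it, call `sys.exit(run_checks())` from the repository root so that the relative `scripts/` and `data/` paths resolve.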