metadata
license: apache-2.0
language:
  - ps
  - en
tags:
  - pashto
  - pukhto
  - pushto
  - asr
  - tts
  - nlp
  - machine-translation
  - language-resources
  - low-resource-languages
  - speech-recognition

Pashto Language Resources Hub (Pukhto/Pashto)

Open-source Pashto language technology hub for datasets, models, benchmarks, ASR, TTS, NLP, and machine translation.

This repository curates verified Pashto resources and keeps validation and publishing workflows reproducible.

Quick Links

High-Intent Pages

Repository Structure

  • resources/: verified external resources with structured categories.
  • data/: normalization seeds, metadata, and data workflows.
  • asr/: ASR notes, baselines, and references.
  • tts/: TTS notes, baselines, and references.
  • benchmarks/: schemas, result templates, and evaluation guidance.
  • experiments/: reproducible run-card templates.
  • docs/: SEO, release, platform, and contribution documentation.

Resource Workflow

  1. A discovery job (.github/workflows/resource_sync.yml) updates the candidate feed.
  2. Automation promotes valid, non-duplicate candidates into resources/catalog/resources.json.
  3. Regeneration and validation then update the derived views and the search index.
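
The promotion step (step 2) can be sketched roughly as follows. This is an illustration only, not the repository's actual automation: the entry fields (title, url, category), the validity rule, and the helper names normalize_url, is_valid, and promote are all assumptions, since the real schema of resources.json is not shown here.

```python
from urllib.parse import urlsplit

# Hypothetical required fields; the real catalog schema may differ.
REQUIRED_FIELDS = ("title", "url", "category")


def normalize_url(url: str) -> str:
    """Lowercase scheme and host and drop a trailing slash so near-identical URLs compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}{query}"


def is_valid(candidate: dict) -> bool:
    """Assumed rule: every required field is a non-empty string and the URL is http(s)."""
    for field in REQUIRED_FIELDS:
        value = candidate.get(field)
        if not isinstance(value, str) or not value.strip():
            return False
    return urlsplit(candidate["url"].strip()).scheme.lower() in ("http", "https")


def promote(catalog: list[dict], candidates: list[dict]) -> list[dict]:
    """Return a new catalog with valid, non-duplicate candidates appended in feed order."""
    seen = {normalize_url(entry["url"]) for entry in catalog}
    merged = list(catalog)  # copy so the input catalog is left untouched
    for candidate in candidates:
        if not is_valid(candidate):
            continue
        key = normalize_url(candidate["url"])
        if key in seen:
            continue  # duplicate of an existing entry or an earlier candidate
        seen.add(key)
        merged.append(candidate)
    return merged
```

Keying duplicates on a normalized URL is one simple choice; a real pipeline might also compare titles or upstream identifiers.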

Core commands:

python scripts/validate_resource_catalog.py
python scripts/generate_resource_views.py
python scripts/check_links.py
python -m pytest -q
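
For local use, the four commands above could be chained so that a failure stops the run early. The sketch below is a hypothetical convenience wrapper, not a script that ships with the repository; it assumes the commands are meant to run from the repository root in the order listed.

```python
import subprocess
import sys

# Arguments passed to the current Python interpreter, in the order listed above.
CORE_COMMANDS = [
    ["scripts/validate_resource_catalog.py"],
    ["scripts/generate_resource_views.py"],
    ["scripts/check_links.py"],
    ["-m", "pytest", "-q"],
]


def run_core_checks(run=subprocess.run) -> int:
    """Run each core command in turn; return the first non-zero exit code, or 0 if all pass."""
    for args in CORE_COMMANDS:
        result = run([sys.executable, *args])
        if result.returncode != 0:
            return result.returncode  # stop at the first failing step
    return 0
```

Calling sys.exit(run_core_checks()) from a small entry-point script would propagate the failing exit code to the shell or to CI. The run parameter exists only so the function can be exercised without launching real processes.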

SEO and Discoverability

Releases

Contributing