---
license: apache-2.0
language:
- ps
- en
tags:
- pashto
- pukhto
- pushto
- asr
- tts
- nlp
- machine-translation
- language-resources
- low-resource-languages
- speech-recognition
---

# Pashto Language Resources Hub (Pukhto/Pashto)

Open-source Pashto language technology hub for datasets, models, benchmarks, ASR, TTS, NLP, and machine translation.

This repository curates verified Pashto resources and keeps validation and publishing workflows reproducible.

## Quick Links

- Search page: [Pashto Resource Search](https://musawer1214.github.io/pashto-language-resources/search/)
- Project site: [Pashto Language Resources Hub](https://musawer1214.github.io/pashto-language-resources/)
- Documentation hub: [docs/README.md](docs/README.md)
- GitHub: [Musawer1214/pashto-language-resources](https://github.com/Musawer1214/pashto-language-resources)
- Hugging Face mirror: [Musawer14/pashto-language-resources](https://huggingface.co/Musawer14/pashto-language-resources)

## High-Intent Pages

- [Pashto datasets](docs/pashto_datasets.md)
- [Pashto ASR resources](docs/pashto_asr.md)
- [Pashto TTS resources](docs/pashto_tts.md)

## Repository Structure

- `resources/`: verified external resources with structured categories.
- `data/`: normalization seeds, metadata, and data workflows.
- `asr/`: ASR notes, baselines, and references.
- `tts/`: TTS notes, baselines, and references.
- `benchmarks/`: schemas, result templates, and evaluation guidance.
- `experiments/`: reproducible run-card templates.
- `docs/`: SEO, release, platform, and contribution documentation.

## Resource Workflow

1. The discovery job (`.github/workflows/resource_sync.yml`) updates the candidate feed.
2. Automation promotes valid, non-duplicate candidates into `resources/catalog/resources.json`.
3. Regeneration and validation update the derived views and the search index.
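
The promotion step (step 2) can be pictured with a minimal sketch. This is not the repository's actual automation: the field names (`title`, `url`, `category`), the URL-based duplicate check, and the function names are all assumptions made for illustration, and the real catalog schema may differ.

```python
from urllib.parse import urlsplit

# Assumed minimal schema; the real catalog may require other fields.
REQUIRED_FIELDS = ("title", "url", "category")


def normalize_url(url: str) -> str:
    """Build a comparison key: lowercase scheme and host, no trailing slash, no fragment."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}{query}"


def is_valid(candidate: dict) -> bool:
    """A candidate is valid if every required field is a non-empty string and the URL is HTTP(S)."""
    has_fields = all(
        isinstance(candidate.get(field), str) and candidate[field].strip()
        for field in REQUIRED_FIELDS
    )
    return has_fields and candidate["url"].startswith(("http://", "https://"))


def promote(catalog: list[dict], candidates: list[dict]) -> list[dict]:
    """Return a new catalog with valid, non-duplicate candidates appended in order."""
    seen = {normalize_url(entry["url"]) for entry in catalog}
    merged = list(catalog)
    for candidate in candidates:
        if not is_valid(candidate):
            continue  # skip incomplete or malformed entries
        key = normalize_url(candidate["url"])
        if key in seen:
            continue  # skip entries already in the catalog or promoted earlier
        seen.add(key)
        merged.append(candidate)
    return merged
```

Returning a new list rather than mutating the catalog in place keeps the step easy to test and to re-run.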

Core commands (run from the repository root):

```bash
python scripts/validate_resource_catalog.py
python scripts/generate_resource_views.py
python scripts/check_links.py
python -m pytest -q
```
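
To run these checks in one go and stop at the first failure, a small wrapper along the following lines could be used. It is a sketch only: the `run_all` helper and the `COMMANDS` list are hypothetical and not part of the repository, and it assumes the scripts signal failure through a non-zero exit code.

```python
import subprocess
import sys

# The four commands listed above, run with the current Python interpreter.
COMMANDS = [
    [sys.executable, "scripts/validate_resource_catalog.py"],
    [sys.executable, "scripts/generate_resource_views.py"],
    [sys.executable, "scripts/check_links.py"],
    [sys.executable, "-m", "pytest", "-q"],
]


def run_all(commands: list[list[str]]) -> int:
    """Run each command in order; return the first non-zero exit code, or 0 if all pass."""
    for command in commands:
        result = subprocess.run(command)
        if result.returncode != 0:
            return result.returncode
    return 0
```

Calling `sys.exit(run_all(COMMANDS))` from the repository root would then propagate the first failing exit code to the shell or CI job.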

## SEO and Discoverability

- SEO playbook: [docs/discoverability_seo.md](docs/discoverability_seo.md)
- GitHub topics checklist: [docs/github_topics_checklist.md](docs/github_topics_checklist.md)
- Backlink strategy: [docs/backlink_strategy.md](docs/backlink_strategy.md)
- Platform sync policy: [docs/platform_sync_policy.md](docs/platform_sync_policy.md)
- Search UI source: [docs/search/index.html](docs/search/index.html)
- Citation metadata: [CITATION.cff](CITATION.cff)

## Releases

- Release notes index: [docs/releases/README.md](docs/releases/README.md)
- Latest release notes: [v1.1.1](docs/releases/v1.1.1.md)
- Changelog: [CHANGELOG.md](CHANGELOG.md)

## Contributing

- Contribution guide: [CONTRIBUTING.md](CONTRIBUTING.md)
- Community communication: [community/COMMUNICATION.md](community/COMMUNICATION.md)
- Resource guidelines: [docs/dataset_guidelines.md](docs/dataset_guidelines.md)