inspirebek committed on
Commit 32c24b3 · verified · 1 Parent(s): f70adb5

docs: add model card

Files changed (1)
  1. README.md +30 -1
README.md CHANGED
@@ -2,7 +2,17 @@
 language:
 - uz
 - en
-license: apache-2.0
+license: cc-by-nc-4.0
+datasets:
+- yakhyo/uz-wiki
+- tahrirchi/uz-books-v2
+- tahrirchi/uz-crawl
+- saillab/alpaca_uzbek_taco
+- behbudiy/alpaca-cleaned-uz
+- UAzimov/uzbek-instruct-llm
+- CohereLabs/aya_collection_language_split
+- med-alex/qa_mt_ru_to_uzn
+- med-alex/qa_mt_tr_to_uzn
 library_name: gguf
 pipeline_tag: text-generation
 base_model: inspirebek/qwen3-4b-uzbek-v2
@@ -51,6 +61,25 @@ ollama run hf.co/inspirebek/qwen3-4b-uzbek-v2-GGUF:Q4_K_M
 
 converted from the bf16 merged model via `llama.cpp`'s `convert_hf_to_gguf.py` → `llama-quantize`. no calibration data (k-quants are statistics-only).
 
+## datasets
+
+**stage a — fluency (continued pretraining):**
+
+- [`yakhyo/uz-wiki`](https://huggingface.co/datasets/yakhyo/uz-wiki) · MIT
+- [`tahrirchi/uz-books-v2`](https://huggingface.co/datasets/tahrirchi/uz-books-v2) · MIT
+- [`tahrirchi/uz-crawl`](https://huggingface.co/datasets/tahrirchi/uz-crawl) · Apache-2.0
+
+**stage b — instruct (sft):**
+
+- [`saillab/alpaca_uzbek_taco`](https://huggingface.co/datasets/saillab/alpaca_uzbek_taco) · CC-BY-NC-4.0
+- [`behbudiy/alpaca-cleaned-uz`](https://huggingface.co/datasets/behbudiy/alpaca-cleaned-uz) · CC-BY-4.0
+- [`UAzimov/uzbek-instruct-llm`](https://huggingface.co/datasets/UAzimov/uzbek-instruct-llm) · Apache-2.0
+- [`CohereLabs/aya_collection_language_split`](https://huggingface.co/datasets/CohereLabs/aya_collection_language_split) · Apache-2.0
+- [`med-alex/qa_mt_ru_to_uzn`](https://huggingface.co/datasets/med-alex/qa_mt_ru_to_uzn) · unspecified
+- [`med-alex/qa_mt_tr_to_uzn`](https://huggingface.co/datasets/med-alex/qa_mt_tr_to_uzn) · unspecified
+
+> ⚠️ licensing note: `saillab/alpaca_uzbek_taco` is cc-by-nc-4.0, which restricts commercial use of derivative models. downstream users who need a fully permissive license should retrain without that subset.
+
 ## sibling formats
 
 - [`inspirebek/qwen3-4b-uzbek-v2`](https://huggingface.co/inspirebek/qwen3-4b-uzbek-v2)
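
Review note: the licensing note in this commit singles out one dataset, but the list also marks two datasets as "unspecified". The sketch below shows one way to check which training sets constrain the model's license. It assumes the license labels in the model card are accurate; `DATASET_LICENSES`, `NON_COMMERCIAL`, and `audit` are hypothetical names introduced here for illustration and are not part of the repository.

```python
# Hypothetical table: dataset names and license labels copied from the
# model card's "datasets" section. None stands for "unspecified".
DATASET_LICENSES = {
    "yakhyo/uz-wiki": "MIT",
    "tahrirchi/uz-books-v2": "MIT",
    "tahrirchi/uz-crawl": "Apache-2.0",
    "saillab/alpaca_uzbek_taco": "CC-BY-NC-4.0",
    "behbudiy/alpaca-cleaned-uz": "CC-BY-4.0",
    "UAzimov/uzbek-instruct-llm": "Apache-2.0",
    "CohereLabs/aya_collection_language_split": "Apache-2.0",
    "med-alex/qa_mt_ru_to_uzn": None,
    "med-alex/qa_mt_tr_to_uzn": None,
}

# Assumption: only this label is treated as non-commercial in this check.
NON_COMMERCIAL = {"CC-BY-NC-4.0"}


def audit(licenses):
    """Return (non_commercial, unspecified) dataset names, each sorted."""
    non_commercial = sorted(
        name for name, lic in licenses.items() if lic in NON_COMMERCIAL
    )
    unspecified = sorted(name for name, lic in licenses.items() if lic is None)
    return non_commercial, unspecified


non_commercial, unspecified = audit(DATASET_LICENSES)
print("non-commercial:", non_commercial)
print("unspecified:", unspecified)
```

Under these assumptions the non-commercial set contains only `saillab/alpaca_uzbek_taco`, which matches the licensing note, while the two `med-alex` datasets remain open questions that the note does not address.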