Hugging Face
stukenov's Collections
EkiTil: Bilingual Kazakh-Russian Language Models
SozKZ Misc: TTS, Sentiment & Other
SozKZ Vocab: Kazakh Tokenizers
SozKZ Corpora: Kazakh Training Datasets
SozKZ MoE: Mixture of Experts
SozKZ GEC: Kazakh Grammar Error Correction
SozKZ Core: Kazakh Language Models
SozKZ Vocab: Kazakh Tokenizers
Updated Mar 25
BPE and SentencePiece tokenizers trained on Kazakh text — 32K vocabularies
stukenov/sozkz-vocab-bpe-32k-kk-base-v1 • Text Generation • Updated Mar 25
stukenov/sozkz-vocab-sp-32k-kk-t5-v1 • Updated Mar 25
stukenov/kzcalm-sp-tokenizer-4k-kk-v1 • Updated Mar 25