SozKZ Core Llama 300M — Kazakh GEC v3
A grammatical error correction (GEC) model for the Kazakh language.
Model Details
| Parameter | Value |
|---|---|
| Architecture | LlamaForCausalLM (decoder-only) |
| Parameters | 325M |
| Base model | stukenov/sozkz-core-llama-300m-kk-base-v1 |
| Training data | sozkz-corpus-synthetic-kk-gec-v1 |
| Training | 5 epochs, LR=2e-5, BS=128, cosine schedule |
| Clean ratio | 50% (error pairs + identity pairs) |
| Hardware | 4× RTX 4090 (vast.ai), 48 min |
| License | MIT |
Format
The model uses a tag-based correction format:
<tag> erroneous text
→ corrected text
Available Tags
| Tag | Error Type | Example |
|---|---|---|
| `<грамматика>` | General grammar | catch-all |
| `<сингармонизм>` | Vowel harmony | әкімшілігінда → әкімшілігінде |
| `<септік>` | Case suffixes | гранитоидтерке → гранитоидтерге |
| `<тәуелдік>` | Possessive | түпкы → түпкі |
| `<жіктік>` | Personal endings | тарасың → тарады |
| `<шылау>` | Postpositions | білікге үшін → білік үшін |
| `<көптік>` | Plural | Студенттар → Студенттер |
| `<болымсыз>` | Negation | жарамеған → жарамаған |
| `<шақ>` | Tense | кетееді → кетеді |
| `<жалғау>` | Suffixes | general suffix errors |
| `<қате>` | Typos/noise | Астна → Астана |
| `<таза>` | Clean (no error) | should return input unchanged |
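As a minimal sketch of how the tag list and the prompt format fit together, the helper below validates a tag against the table above and builds the `<tag> text` + newline + `→ ` prompt. The names `KNOWN_TAGS` and `build_prompt` are hypothetical and not part of the released model code.

```python
# Tags copied from the "Available Tags" table; the set name is hypothetical.
KNOWN_TAGS = {
    "грамматика", "сингармонизм", "септік", "тәуелдік", "жіктік", "шылау",
    "көптік", "болымсыз", "шақ", "жалғау", "қате", "таза",
}


def build_prompt(tag: str, text: str) -> str:
    """Build a prompt in the tag-based format, rejecting unknown tags.

    Accepts the tag with or without angle brackets.
    """
    tag = tag.strip("<>")
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown tag: {tag!r}")
    # The model is expected to continue after the arrow with the corrected text.
    return f"<{tag}> {text.strip()}\n→ "


if __name__ == "__main__":
    print(repr(build_prompt("көптік", "Студенттар университетте оқиды.")))
```

Rejecting unknown tags up front is a design choice of this sketch: it surfaces typos in the tag name before any generation is attempted.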
Usage
from transformers import AutoModelForCausalLM, GPT2TokenizerFast
from huggingface_hub import hf_hub_download
import torch

model_id = "stukenov/sozkz-core-llama-300m-kk-gec-v3"

# Load the tokenizer from tokenizer.json and reuse EOS as the padding token.
tok_file = hf_hub_download(repo_id=model_id, filename="tokenizer.json")
tokenizer = GPT2TokenizerFast(tokenizer_file=tok_file)
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model.eval()


def correct(tag, text):
    """Correct `text` for the error type `tag`; return the input if nothing is generated."""
    prompt = f"<{tag}> {text}\n→ "
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=480)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            # Allow roughly the prompt length plus a small margin for the correction.
            max_new_tokens=len(inputs["input_ids"][0]) + 30,
            temperature=0.3, top_p=0.9, do_sample=True,
            repetition_penalty=1.1,
            pad_token_id=tokenizer.eos_token_id,
        )
    # The decoded sequence includes the prompt, so keep only what follows the arrow.
    result = tokenizer.decode(out[0], skip_special_tokens=True)
    if "→ " in result:
        result = result.split("→ ", 1)[1]
    # Cut the output where the model starts a new tagged example or a blank line.
    for stop in ["\n<", "\n\n"]:
        if stop in result:
            result = result[:result.index(stop)]
    return result.strip() or text


# Example
print(correct("көптік", "Студенттар университетте оқиды."))
# Intended output (sampling is enabled, so the actual result may vary):
# → Студенттер университетте оқиды.
Evaluation (v3 vs previous versions)
| Metric | v1 (1ep) | v2a (1ep) | v3 (5ep) |
|---|---|---|---|
| Accuracy | 9.0% | 6.6% | 15.8% |
| Wrong | 85.2% | 78.4% | 70.8% |
| False positives on clean input | 84% | 82% | 76% |
Best per-tag accuracy for v3: көптік (plural) 32%, болымсыз (negation) 30%, жіктік (personal endings) 26%.
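The card does not define how Accuracy, Wrong, and the false-positive rate are computed. The sketch below shows one possible exact-match reading, stated as an assumption: an error example counts as correct when the prediction equals the reference, as unchanged when the model returns the input, and as wrong otherwise, while a false positive is any edit to a clean example. A third "unchanged" outcome is assumed because Accuracy and Wrong in the table do not sum to 100%. `Example` and `score` are hypothetical names.

```python
from typing import Iterable, NamedTuple


class Example(NamedTuple):
    source: str      # text given to the model
    reference: str   # expected output; equals source for clean examples
    prediction: str  # what the model returned


def score(examples: Iterable[Example]) -> dict:
    """Exact-match scoring under ASSUMED metric definitions (not from the card)."""
    examples = list(examples)
    errors = [e for e in examples if e.source != e.reference]
    clean = [e for e in examples if e.source == e.reference]

    correct = sum(e.prediction == e.reference for e in errors)
    unchanged = sum(e.prediction == e.source for e in errors)
    wrong = len(errors) - correct - unchanged
    false_pos = sum(e.prediction != e.source for e in clean)

    def pct(n: int, d: int) -> float:
        return round(100.0 * n / d, 1) if d else 0.0

    return {
        "accuracy": pct(correct, len(errors)),
        "wrong": pct(wrong, len(errors)),
        "fp_clean": pct(false_pos, len(clean)),
    }


if __name__ == "__main__":
    demo = [
        Example("Студенттар", "Студенттер", "Студенттер"),  # corrected
        Example("Астна", "Астана", "Астна"),                 # left unchanged
        Example("x", "x", "y"),                              # clean text altered
    ]
    print(score(demo))
```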
Limitations
- Accuracy is still low (~16% in the evaluation above, 1% on the 100-example custom benchmark below); this is an early research model
- False positive rate is high (76%): the model changes correct text in most cases
- Occasional non-Kazakh output (Chinese/Greek characters from base Llama)
- Best suited for research, not production use
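One way to work around the occasional non-Kazakh output is to check a correction for characters from unexpected scripts and fall back to the input when any are found. The following is a minimal sketch using only the standard library; the script list and the names `has_unexpected_script` and `guard` are assumptions for illustration, not part of the model's own pipeline.

```python
import unicodedata

# Unicode name prefixes for the scripts mentioned in the limitations
# (Chinese and Greek); extend as needed. This list is an assumption.
UNEXPECTED_SCRIPTS = ("CJK", "GREEK")


def has_unexpected_script(text: str) -> bool:
    """Return True if any character's Unicode name starts with a flagged prefix."""
    for ch in text:
        name = unicodedata.name(ch, "")  # "" for unnamed characters such as newlines
        if name.startswith(UNEXPECTED_SCRIPTS):
            return True
    return False


def guard(original: str, corrected: str) -> str:
    """Keep the correction unless it contains flagged characters."""
    return original if has_unexpected_script(corrected) else corrected


if __name__ == "__main__":
    print(guard("Астна", "Астана"))    # correction kept
    print(guard("Астна", "Аст中на"))   # falls back to the input
```

Falling back to the input trades recall for safety: a flagged output is discarded entirely instead of being partially repaired.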
Part of SozKZ Project
This model is part of the SozKZ project for Kazakh language AI.
Citation
@misc{sozkz-gec-v3-2026,
title={SozKZ Core Llama 300M Kazakh GEC v3},
author={Stukenov, Saken},
year={2026},
url={https://huggingface.co/stukenov/sozkz-core-llama-300m-kk-gec-v3}
}
Benchmark Results
Evaluated on a custom 100-example GEC test set (pure model inference, no pre- or post-processing pipeline).
| Category | Score |
|---|---|
| Spelling | 0/30 (0%) |
| Grammar | 1/20 (5%) |
| Punctuation | 0/15 (0%) |
| Mixed | 0/20 (0%) |
| Identity preservation | 0/15 (0%) |
| Total | 1/100 (1%) |
Leaderboard (100-example custom benchmark)
| Model | Total | Spelling/30 | Grammar/20 | Punctuation/15 | Mixed/20 | Identity/15 |
|---|---|---|---|---|---|---|
| sozkz-core-llama-600m-kk-gec-v1 | 47% | 15 | 12 | 3 | 2 | 15/15 |
| sozkz-fix-qwen-500m-kk-gec-v3 | 38% | 0 | 16 | 9 | 0 | 13/15 |
| sozkz-core-llama-300m-kk-gec-v4 | 37% | 9 | 6 | 4 | 3 | 15/15 |
| sozkz-fix-qwen-500m-kk-gec-v1 | 35% | 0 | 12 | 8 | 0 | 15/15 |
| sozkz-fix-qwen-500m-kk-gec-v2 | 30% | 0 | 11 | 7 | 0 | 12/15 |
| sozkz-core-llama-1b-kk-gec-v1 | 16% | 2 | 6 | 1 | 0 | 7/15 |
| sozkz-fix-qwen-500m-kk-gec-v4 | 5% | 0 | 1 | 4 | 0 | 0/15 |
| sozkz-fix-mt5b-kk-gec-run13-v1 | 5% | 0 | 2 | 0 | 0 | 3/15 |
| sozkz-nllb-1b-kk-gec-v1 | 1% | 0 | 1 | 0 | 0 | 0/15 |
| sozkz-nllb-1b-kk-pretrain-v1 | 1% | 0 | 1 | 0 | 0 | 0/15 |
| sozkz-core-llama-300m-kk-gec-v3 | 1% | 0 | 1 | 0 | 0 | 0/15 |
| sozkz-core-llama-300m-kk-gec-v1/v2a/v2b | 0–1% | 0 | 0 | 0 | 0 | 0–1 |
| sozkz-fix-mt5-50m-kk-gec-v1 | 0% | 0 | 0 | 0 | 0 | 0/15 |
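In every row with exact counts, the Total column matches the sum of the five category counts over the 100 benchmark examples. The sketch below reproduces that arithmetic for two rows; the English category keys and the names `CATEGORY_SIZES` and `total_percent` are hypothetical.

```python
# Category sizes from the leaderboard column headers: 30 + 20 + 15 + 20 + 15 = 100.
CATEGORY_SIZES = {
    "spelling": 30,
    "grammar": 20,
    "punctuation": 15,
    "mixed": 20,
    "identity": 15,
}


def total_percent(counts: dict) -> float:
    """Return the overall score in percent from per-category correct counts."""
    for name, n in counts.items():
        if not 0 <= n <= CATEGORY_SIZES[name]:
            raise ValueError(f"{name}: {n} is outside 0..{CATEGORY_SIZES[name]}")
    return 100.0 * sum(counts.values()) / sum(CATEGORY_SIZES.values())


if __name__ == "__main__":
    # Counts taken from the sozkz-core-llama-600m-kk-gec-v1 row.
    print(total_percent({"spelling": 15, "grammar": 12, "punctuation": 3,
                         "mixed": 2, "identity": 15}))
    # Counts taken from the sozkz-core-llama-300m-kk-gec-v3 row.
    print(total_percent({"spelling": 0, "grammar": 1, "punctuation": 0,
                         "mixed": 0, "identity": 0}))
```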