---
language:
- en
tags:
- text-generation
- grammar-correction
- essay-generation
- small-language-model
- parameter-golf
license: mit
---

# Syndra: Grammar & Essay SLM

A small language model (~30MB) fine-tuned for grammar correction and essay generation. Built from scratch in one day on an RTX 3050 4GB GPU.

## Model details

| Property | Value |
|---|---|
| Parameters | 16.08M |
| File size | ~30MB |
| Architecture | 4-layer transformer decoder |
| Heads | 4 attention heads |
| Dimension | 256 |
| Tokenizer | GPT-2 BPE (tiktoken) |
| Val loss | 1.7955214977264404 |
| Task | Grammar correction + Essay generation |

## How to use

```python
import torch
from model import GPT, GPTConfig
import tiktoken

# Load model
ckpt = torch.load('model.pt', map_location='cpu')
config = GPTConfig(**ckpt['model_args'])
model = GPT(config)
sd = {k: v.float() for k, v in ckpt['model'].items()}
model.load_state_dict(sd, strict=False)
model.eval()

enc = tiktoken.get_encoding('gpt2')

# Grammar correction
prompt = "### Task: Grammar Correction\n### Input: she go to school\n### Output:"
ids = enc.encode(prompt)
x = torch.tensor(ids).unsqueeze(0)
with torch.no_grad():
    out = model.generate(x, max_new_tokens=100, temperature=0.3, top_k=40)
print(enc.decode(out[0].tolist()))

# Essay generation
prompt = "### Task: Essay Writing\n### Topic: The importance of books\n### Essay:"
ids = enc.encode(prompt)
x = torch.tensor(ids).unsqueeze(0)
with torch.no_grad():
    out = model.generate(x, max_new_tokens=400, temperature=0.75, top_k=40)
print(enc.decode(out[0].tolist()))
```

## Training

- Base pretrained on TinyStories (10,000 steps)
- Fine-tuned on JFLEG grammar dataset + curated essays (3,000 steps)
- Hardware: RTX 3050 4GB VRAM
- Framework: PyTorch / nanoGPT architecture

## Built for

OpenAI Parameter Golf competition, targeting sub-16MB models with competitive bits-per-byte scores on enwik8.
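
For reference, the bits-per-byte metric mentioned above can be derived from a token-level loss. The sketch below is a minimal illustration, not the competition's official scoring code. It assumes the loss is a mean cross-entropy in nats per token (PyTorch's default for `cross_entropy`). The function name `bits_per_byte` and its parameters are hypothetical. The numbers in the usage example are made up for illustration and are not results for this model.

```python
import math


def bits_per_byte(loss_nats_per_token: float, n_tokens: int, n_bytes: int) -> float:
    """Convert a mean cross-entropy loss (nats per token) to bits per byte.

    Assumption: the loss is averaged over tokens and uses the natural log.
    n_tokens is the number of predicted tokens in the evaluated text and
    n_bytes is the length of that same text in bytes (e.g. UTF-8 encoded).
    """
    if n_tokens <= 0 or n_bytes <= 0:
        raise ValueError("n_tokens and n_bytes must be positive")
    # nats -> bits: divide by ln(2)
    bits_per_token = loss_nats_per_token / math.log(2)
    # total bits spent on the text, spread over its bytes
    return bits_per_token * n_tokens / n_bytes


# Illustrative values only: a loss of ln(2) nats/token is exactly 1 bit/token,
# and 1000 tokens covering 4000 bytes gives 0.25 bits per byte.
example = bits_per_byte(math.log(2), n_tokens=1000, n_bytes=4000)
print(example)
```

Because the conversion depends on how many bytes each token covers, a loss measured on one dataset does not translate into a bits-per-byte score on another; the token and byte counts have to come from the same evaluated text.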