---
license: apache-2.0
tags:
- build-small-hackathon
- pgsm
- exactstate-memory
- non-transformer
- language-model
- surprisal
- fineweb-edu
- tiny-model
- tiny-titan
- well-tuned
datasets:
- HuggingFaceFW/fineweb-edu
---

# PGSM Text Surprisal Editor Model

This repository contains the trained model weights used by the Hugging Face Space:

https://huggingface.co/spaces/build-small-hackathon/pgsm-text-surprisal-editor

## Model Summary

PGSM Text Surprisal Editor is powered by a compact non-Transformer language model based on a custom ExactState Memory / PGSM architecture.

The model is used to score whole-word surprisal by evaluating how predictable each removed word is from its left and right context.

## Architecture

- Architecture: PGSM / ExactState Memory
- Transformer blocks: 0
- Self-attention layers: 0
- Parameters: approximately 4 million
- Vocabulary: approximately 2k tokens
- Model file: `final_infer.pt`

This model does not use Transformer self-attention. Context is propagated through learned state transitions rather than pairwise attention computations.

## Training

The model was fully trained by the author on approximately 19 billion tokens from FineWeb-Edu.

Training details:

- Training source: FineWeb-Edu
- Training scale: approximately 19B tokens
- Training type: full custom training by the author
- Base architecture: PGSM / ExactState Memory
- Off-the-shelf Transformer checkpoint used: none
- Final inference weights: `final_infer.pt`

## Intended Use

This model is intended for the PGSM Text Surprisal Editor Space, where it powers whole-word surprisal heatmaps for pasted text.

The model is designed for experimentation, visualization, and language-analysis demos rather than production writing assistance or factual generation.

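
As a rough illustration of what a whole-word surprisal score is, the sketch below aggregates per-token log-probabilities into one value per word, measured in bits. It is a minimal sketch under stated assumptions, not the Space's actual scoring code: the function name `word_surprisal_bits`, its inputs, and the choice to sum sub-token surprisal within a word are all hypothetical, and the log-probabilities would have to come from the model's own scoring routine, which is not shown here.

```python
import math
from typing import List, Sequence


def word_surprisal_bits(
    token_logprobs: Sequence[float],
    token_to_word: Sequence[int],
) -> List[float]:
    """Sum per-token surprisal into per-word surprisal, in bits.

    token_logprobs: natural-log probability assigned to each token
                    (assumed to be supplied by the language model).
    token_to_word:  index of the word each token belongs to, starting at 0.
    """
    if len(token_logprobs) != len(token_to_word):
        raise ValueError("token_logprobs and token_to_word must have equal length")

    n_words = max(token_to_word) + 1 if token_to_word else 0
    totals = [0.0] * n_words
    for logp, word_idx in zip(token_logprobs, token_to_word):
        # Surprisal of one token is -log2(p); dividing by ln(2) converts nats to bits.
        totals[word_idx] += -logp / math.log(2)
    return totals


# Hypothetical example: word 0 is a single token with p = 0.5,
# word 1 is split into two tokens with p = 0.25 and p = 0.5.
example = word_surprisal_bits(
    [math.log(0.5), math.log(0.25), math.log(0.5)],
    [0, 1, 1],
)
```

In this made-up example the first word scores about 1 bit and the second about 3 bits. A higher value means the word was less predictable, which corresponds to a "hotter" cell in a heatmap.
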
## Limitations

- Very small model size compared with mainstream LLMs
- Compact vocabulary
- Designed for surprisal visualization, not general-purpose chat
- Outputs should be treated as model-analysis signals, not factual judgments
- Training and evaluation details are summarized here for hackathon review

## Hackathon Context

This model supports the Hugging Face Build Small Hackathon submission:

- Track: Thousand Token Wood
- Badges: Tiny Titan, Well-Tuned, Off the Grid, Field Notes

The key goal is to demonstrate a very small, fully trained, non-Transformer language model running locally inside a Hugging Face Space.