NEO-1-mini

A decoder-only Transformer (GPT-style), trained entirely from scratch by MDFJ / mdfjbots.

Architecture

Parameter    Value
Layers       4
Heads        4
d_model      128
d_ff         512
Vocabulary   1,370
Context      256 tokens
Parameters   ~1-2M
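
The approximate parameter count can be sanity-checked from the other values in the table. The sketch below is a back-of-the-envelope estimate, not the model's actual accounting. It assumes a standard GPT-style layout: learned positional embeddings, two LayerNorms per block plus a final one, bias terms ignored, and an output head that may or may not share weights with the token embedding. None of these details are confirmed by the model card, and `estimate_params` is a hypothetical helper.

```python
def estimate_params(n_layers=4, d_model=128, d_ff=512, vocab=1370,
                    context=256, tied_head=True):
    """Rough parameter count for a GPT-style decoder (biases ignored)."""
    tok_emb = vocab * d_model            # token embedding matrix
    pos_emb = context * d_model          # learned positions (assumption)
    attn = 4 * d_model * d_model         # Q, K, V and output projections
    ffn = 2 * d_model * d_ff             # up- and down-projection
    norms = 2 * 2 * d_model              # two LayerNorms (scale + shift) per block
    per_layer = attn + ffn + norms
    final_norm = 2 * d_model             # final LayerNorm (assumption)
    head = 0 if tied_head else vocab * d_model  # untied output projection
    return tok_emb + pos_emb + n_layers * per_layer + final_norm + head


print(estimate_params(tied_head=True))   # 996864
print(estimate_params(tied_head=False))  # 1172224
```

Under these assumptions the estimate is roughly 1.0M to 1.2M parameters, which is consistent with the ~1-2M figure listed above.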

Quick Use

from huggingface_hub import hf_hub_download
from neo_mini.train import load_checkpoint
from neo_mini.tokenizer import NeoTokenizer
from neo_mini.generate import generate

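# Download the checkpoint and tokenizer from the Hub (cached locally after the first call).
model, _ = load_checkpoint(hf_hub_download("MDFJ/neo-1-mini", "model_best.pt"))
tok = NeoTokenizer.load(hf_hub_download("MDFJ/neo-1-mini", "neo_tokenizer.model"))

# Generate a continuation for the prompt and print it.
print(generate(model, tok, "hola, ¿cómo estás?"))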
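
Because the context window is 256 tokens, a long prompt may need to be trimmed before generation. Whether `generate` already handles this is not stated here, so the following is only an illustrative sketch. `clip_to_context` and its `max_new_tokens` parameter are hypothetical names, and the function works on a plain list of token ids rather than on any `neo_mini` API.

```python
def clip_to_context(token_ids, context=256, max_new_tokens=64):
    """Keep only the most recent tokens so prompt + new tokens fit the context."""
    budget = context - max_new_tokens    # room left for the prompt
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than the context length")
    return token_ids[-budget:]           # shorter prompts are returned unchanged


long_prompt = list(range(300))           # stand-in for 300 token ids
clipped = clip_to_context(long_prompt)
print(len(clipped))                      # 192
```

Trimming from the left keeps the most recent tokens, which is the usual choice for autoregressive generation.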

License

Apache 2.0 — MDFJ / mdfjbots
