geoffsee/auto-g-nano-10m

@@ -1,10 +1,39 @@
 ---
 tags:
-- model_hub_mixin
-- pytorch_model_hub_mixin
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Code: [More Information Needed]
-- Paper: [More Information Needed]
-- Docs: [More Information Needed]

 ---
+language: en
+license: mit
+datasets:
+- tinyshakespeare
 tags:
+- text-generation
+- gpt
+- nano-gpt
+- pytorch
 ---
+# auto-g-nano
+This is a minimal, decoder-only Transformer (nanoGPT-style) trained from scratch on the Tiny Shakespeare dataset.
+## Model Details
+- **Architecture**: Decoder-only Transformer
+- **Parameters**: ~10.8M
+- **Vocabulary Size**: 65
+- **Embedding Dimension**: 384
+- **Heads**: 6
+- **Layers**: 6
+- **Block Size**: 256
+## How to Use
+You can use this model directly with the `GPT` class from this repository.
+```python
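+import torch
+from model import GPT
+
+model = GPT.from_pretrained("geoffsee/auto-g-nano")
+# Generate text, starting from a single token with id 0
+context = torch.zeros((1, 1), dtype=torch.long)
+# The result is printed as returned; if it is a tensor of token ids,
+# decode it with the repository's character vocabulary.
+print(model.generate(context, max_new_tokens=100))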
+```
+## Training Data
+Trained on the [Tiny Shakespeare](https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt) dataset.
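
The figures under Model Details can be sanity-checked with a little arithmetic. The sketch below estimates the parameter count from the listed vocabulary size, embedding dimension, layer count, and block size. The layer layout is an assumption, not something the card confirms: bias-free query/key/value projections, a 4x feed-forward expansion, two LayerNorms per block, a learned position embedding, and an untied output head with bias. The function name `gpt_param_count` is hypothetical. The number of heads does not enter the count, since the heads split the same projection matrices.

```python
def gpt_param_count(vocab_size, n_embd, n_layer, block_size):
    """Estimate parameters for an assumed nanoGPT-style layout (not read from the repo)."""
    # Attention: bias-free query/key/value projections plus an output projection with bias.
    attn = 3 * n_embd * n_embd + (n_embd * n_embd + n_embd)
    # Feed-forward: expand to 4 * n_embd, then project back, both with bias.
    hidden = 4 * n_embd
    mlp = (n_embd * hidden + hidden) + (hidden * n_embd + n_embd)
    # Two LayerNorms per block, each with a weight and a bias vector.
    norms = 2 * 2 * n_embd
    blocks = n_layer * (attn + mlp + norms)
    # Token and position embedding tables.
    embeddings = vocab_size * n_embd + block_size * n_embd
    # Final LayerNorm and an untied output head with bias.
    head = 2 * n_embd + (n_embd * vocab_size + vocab_size)
    return blocks + embeddings + head


total = gpt_param_count(vocab_size=65, n_embd=384, n_layer=6, block_size=256)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the estimate lands at roughly 10.8M, in line with the stated parameter count. A different layout (for example, tied input and output embeddings) would shift the total slightly.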