ishathombre/monolingual-hindi-from-scratch


ishathombre committed on Jun 6, 2025

Commit b054188 · verified · 1 Parent(s): def9b62

Update README.md

Files changed (1)

README.md +32 -1


@@ -7,4 +7,35 @@ base_model:
 - google-bert/bert-base-uncased
 pipeline_tag: fill-mask
 library_name: transformers
----

+---
+# BERT from Scratch (1 Epoch, Training Loss: 4.13)
+This repository contains the scripts used to train a BERT model from scratch with a custom tokenizer (64,000-token vocabulary). The model can be found here: https://huggingface.co/ishathombre/monolingual-hindi-from-scratch
+- **Training:** 1 epoch
+- **Masked Language Modeling (MLM) loss:** 4.13
+- **Tokenizer:** Custom-trained on the IIT Madras Hindi monolingual dataset; vocabulary size 64,000
+- **Architecture:**
+  - Maximum position embeddings: 512
+  - Hidden size: 312
+  - Number of attention heads: 12
+  - Number of transformer layers: 4
+  - Intermediate (feed-forward) size: 1200
+  - Type vocabulary size: 2 (for segment embeddings)
+The model is uploaded for checkpointing, experimentation, and community feedback.
+## Intended Use
+- Research on training dynamics
+- Continued pretraining
+- Fine-tuning for downstream tasks (with caution)
+## Limitations
+- Limited training (only 1 epoch)
+- Not yet evaluated on downstream tasks
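
The figures in the README above are enough for two back-of-the-envelope checks: an approximate parameter count and the perplexity implied by the reported loss. The sketch below is not taken from the repository's scripts. It assumes a standard BERT layout (as in the Hugging Face `BertForMaskedLM` class: no pooler, MLM decoder weights tied to the word embeddings) and assumes the loss of 4.13 is a natural-log cross-entropy. The helper names `linear`, `layer_norm`, and `estimate_bert_mlm_params` are made up for illustration.

```python
import math

# Hyperparameters copied from the README above.
VOCAB_SIZE = 64_000
HIDDEN_SIZE = 312
NUM_LAYERS = 4
INTERMEDIATE_SIZE = 1200
MAX_POSITIONS = 512
TYPE_VOCAB_SIZE = 2
MLM_LOSS = 4.13
# The number of attention heads (12) only splits the hidden size into
# 12 x 26 slices; it does not change the parameter count.


def linear(n_in, n_out):
    """Parameters of a dense layer with bias."""
    return n_in * n_out + n_out


def layer_norm(n):
    """Parameters of a LayerNorm (one scale and one shift per feature)."""
    return 2 * n


def estimate_bert_mlm_params():
    """Estimate parameter counts, ASSUMING a standard BERT layout
    with a tied MLM head and no pooler (an assumption, not confirmed
    by the README)."""
    h, i = HIDDEN_SIZE, INTERMEDIATE_SIZE
    # Word, position, and segment embedding tables plus one LayerNorm.
    embeddings = (VOCAB_SIZE + MAX_POSITIONS + TYPE_VOCAB_SIZE) * h + layer_norm(h)
    per_layer = (
        4 * linear(h, h)   # query, key, value, and attention output projections
        + layer_norm(h)
        + linear(h, i)     # feed-forward up-projection
        + linear(i, h)     # feed-forward down-projection
        + layer_norm(h)
    )
    # MLM head: dense + LayerNorm, plus a per-token output bias.
    # The decoder matrix is assumed tied to the word embeddings, so it adds nothing.
    mlm_head = linear(h, h) + layer_norm(h) + VOCAB_SIZE
    return {
        "embeddings": embeddings,
        "encoder": NUM_LAYERS * per_layer,
        "mlm_head": mlm_head,
    }


parts = estimate_bert_mlm_params()
total = sum(parts.values())
# Perplexity is exp(loss) when the loss is a natural-log cross-entropy.
perplexity = math.exp(MLM_LOSS)

for name, count in parts.items():
    print(f"{name:>10}: {count:,}")
print(f"{'total':>10}: {total:,}")
print(f"perplexity: {perplexity:.1f}")
```

Under these assumptions the total comes to roughly 24.9 million parameters, about 20.1 million of which sit in the embedding tables because of the 64,000-token vocabulary. A loss of 4.13 corresponds to a masked-token perplexity of about 62. The actual checkpoint may differ if its layout deviates from the standard one.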