---
datasets:
- ai4bharat/IndicCorpV2
language:
- hi
base_model:
- google-bert/bert-base-uncased
pipeline_tag: fill-mask
library_name: transformers
---
# BERT from Scratch (1 Epoch, Training Loss: 4.13)
A BERT model trained from scratch on Hindi text, using a custom tokenizer with a 64,000-token vocabulary.
- **Training:** 1 epoch
- **Masked Language Modeling (MLM) loss:** 4.13
- **Tokenizer:** Custom-trained on the IIT Madras Hindi monolingual dataset, with a vocabulary size of 64,000
- **Architecture:**
  - Maximum position embeddings: 512
  - Hidden size: 312
  - Number of attention heads: 12
  - Number of transformer layers: 4
  - Intermediate (feed-forward) size: 1200
  - Type vocabulary size: 2 (for segment embeddings)
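
To give a feel for what the architecture above implies, here is a rough parameter-count sketch in plain Python. It assumes the standard BERT layout (word, position, and segment embedding tables followed by a LayerNorm; each layer with four attention projections, a two-layer feed-forward block, and two LayerNorms, all with biases). The `BertShape` class and helper functions are hypothetical names used only for illustration, and the MLM head and pooler are left out, so the figure is an estimate rather than the exact size of this checkpoint.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BertShape:
    """Hyperparameters listed in this model card (class name is illustrative)."""
    vocab_size: int = 64_000
    hidden_size: int = 312
    num_layers: int = 4
    num_heads: int = 12
    intermediate_size: int = 1_200
    max_positions: int = 512
    type_vocab_size: int = 2


def embedding_params(s: BertShape) -> int:
    """Word + position + segment tables, plus one LayerNorm (weight and bias)."""
    h = s.hidden_size
    tables = (s.vocab_size + s.max_positions + s.type_vocab_size) * h
    layer_norm = 2 * h
    return tables + layer_norm


def layer_params(s: BertShape) -> int:
    """One transformer layer, assuming the standard BERT block with biases."""
    h, i = s.hidden_size, s.intermediate_size
    attention = 4 * (h * h + h)        # query, key, value, output projections
    ffn = (h * i + i) + (i * h + h)    # up-projection and down-projection
    layer_norms = 2 * (2 * h)          # one after attention, one after the FFN
    return attention + ffn + layer_norms


shape = BertShape()

# The hidden size must split evenly across attention heads.
assert shape.hidden_size % shape.num_heads == 0
head_dim = shape.hidden_size // shape.num_heads

emb = embedding_params(shape)
enc = shape.num_layers * layer_params(shape)
total = emb + enc

print(f"per-head dimension : {head_dim}")
print(f"embedding params   : {emb:,}")
print(f"encoder params     : {enc:,}")
print(f"total (no MLM head): {total:,}")
print(f"embedding share    : {emb / total:.1%}")
```

Under these assumptions, roughly four fifths of the parameters sit in the embedding tables, which follows from pairing a 64,000-token vocabulary with a small hidden size.
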
The model is uploaded for checkpointing, experimentation, and community feedback.
## Intended Use
- Research on training dynamics
- Continued pretraining
- Fine-tuning for downstream tasks (with caution)
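
For quick experimentation, the checkpoint can be queried through the Transformers `fill-mask` pipeline. The sketch below is a minimal example, not a recommended evaluation setup: the helper names `top_predictions` and `main`, the `___` placeholder convention, and the sample sentence are all illustrative assumptions. The mask token is read from the tokenizer rather than hard-coded, since the custom tokenizer's special tokens are not documented here.

```python
MODEL_ID = "ishathombre/monolingual-hindi-from-scratch"


def top_predictions(text_with_blank: str, top_k: int = 5):
    """Return (token, score) pairs for the single '___' blank in the text.

    Hypothetical helper: '___' is replaced by whatever mask token the
    model's own tokenizer defines.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    mask_token = fill_mask.tokenizer.mask_token
    results = fill_mask(text_with_blank.replace("___", mask_token), top_k=top_k)
    return [(r["token_str"], r["score"]) for r in results]


def main() -> None:
    # Sample sentence (an assumption): "The capital of India is ___."
    for token, score in top_predictions("भारत की राजधानी ___ है।"):
        print(f"{token}\t{score:.4f}")
```

Calling `main()` downloads the model on first use. Given the single epoch of training, the predictions are best treated as a sanity check rather than a measure of quality.
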
## Limitations
- Undertrained: only 1 epoch of training, ending at an MLM loss of 4.13
- Not yet evaluated on downstream tasks
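
To put the reported loss in context, the short calculation below converts it to a perplexity and compares it with a uniform guess over the vocabulary. It assumes the 4.13 figure is a mean cross-entropy in nats (the usual convention for PyTorch-based training), which this card does not state explicitly.

```python
import math

mlm_loss = 4.13        # training loss reported in this card
vocab_size = 64_000    # tokenizer vocabulary size reported in this card

# Perplexity over masked tokens, assuming the loss is in nats.
perplexity = math.exp(mlm_loss)

# Loss of a model that guesses uniformly over the whole vocabulary.
uniform_loss = math.log(vocab_size)

print(f"masked-token perplexity: {perplexity:.1f}")
print(f"uniform-guess loss     : {uniform_loss:.2f}")
```

Under that assumption the model is far better than chance, but a perplexity in the low sixties on masked tokens is still consistent with an early checkpoint.
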