datasets:
  - ai4bharat/IndicCorpV2
language:
  - hi
base_model:
  - google-bert/bert-base-uncased
pipeline_tag: fill-mask
library_name: transformers

BERT from Scratch (1 Epoch, Training Loss: 4.13)

BERT model trained from scratch using a custom tokenizer with a 64,000-token vocabulary.

Training: 1 epoch
Masked Language Modeling (MLM) loss: 4.13
Tokenizer: Custom-trained on the IIT Madras Hindi monolingual dataset, vocabulary size 64,000
Architecture:
  - Maximum position embeddings: 512
  - Hidden size: 312
  - Number of attention heads: 12
  - Number of transformer layers: 4
  - Intermediate (feed-forward) size: 1200
  - Type vocabulary size: 2 (for segment embeddings)
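
As a rough sanity check on the size these settings imply, the sketch below counts encoder parameters from the values listed above. It assumes the standard BERT layout (word, position, and segment embedding tables followed by a LayerNorm; four biased attention projections and two feed-forward projections per layer, each sub-block with its own LayerNorm) and leaves out the pooler and the MLM head. The dictionary keys follow the argument names of `transformers.BertConfig`; the helper functions are hypothetical and are not part of this repository.

```python
# Architecture values copied from the card above.
# Key names match transformers.BertConfig arguments; the helpers below are illustrative only.
config = {
    "vocab_size": 64_000,
    "max_position_embeddings": 512,
    "hidden_size": 312,
    "num_attention_heads": 12,
    "num_hidden_layers": 4,
    "intermediate_size": 1200,
    "type_vocab_size": 2,
}


def head_dim(cfg):
    """Per-head width; the hidden size must divide evenly across heads."""
    hidden, heads = cfg["hidden_size"], cfg["num_attention_heads"]
    if hidden % heads != 0:
        raise ValueError("hidden_size must be a multiple of num_attention_heads")
    return hidden // heads


def embedding_params(cfg):
    """Word + position + segment tables, plus LayerNorm weight and bias."""
    h = cfg["hidden_size"]
    rows = cfg["vocab_size"] + cfg["max_position_embeddings"] + cfg["type_vocab_size"]
    return rows * h + 2 * h


def layer_params(cfg):
    """One transformer layer, assuming the standard BERT block structure."""
    h, ff = cfg["hidden_size"], cfg["intermediate_size"]
    attention = 4 * (h * h + h)                  # query, key, value, output (weights + biases)
    feed_forward = (h * ff + ff) + (ff * h + h)  # up-projection and down-projection
    layer_norms = 2 * (2 * h)                    # one LayerNorm after each sub-block
    return attention + feed_forward + layer_norms


def encoder_params(cfg):
    """Embeddings plus all transformer layers (pooler and MLM head excluded)."""
    return embedding_params(cfg) + cfg["num_hidden_layers"] * layer_params(cfg)


if __name__ == "__main__":
    print("head dim:        ", head_dim(config))
    print("embedding params:", embedding_params(config))
    print("per-layer params:", layer_params(config))
    print("encoder params:  ", encoder_params(config))
```

Under these assumptions the encoder comes to about 24.7M parameters, of which roughly 20.1M sit in the embedding tables, so the 64,000-token vocabulary accounts for most of the model's size.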

The model is uploaded as an early checkpoint for experimentation and community feedback.

Intended Use