ai4bharat/IndicCorpV2
How to use ishathombre/monolingual-hindi-from-scratch with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("fill-mask", model="ishathombre/monolingual-hindi-from-scratch")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM
tokenizer = AutoTokenizer.from_pretrained("ishathombre/monolingual-hindi-from-scratch")
model = AutoModelForMaskedLM.from_pretrained("ishathombre/monolingual-hindi-from-scratch")

A BERT model trained from scratch using a custom tokenizer with a 64,000-token vocabulary.
It is uploaded for checkpointing, experimentation, and community feedback.
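For quick experimentation, the fill-mask pipeline shown above can be wrapped in a small helper that returns the top candidates for a masked position. This is a minimal sketch, not part of the model card: the helper name `top_predictions` and the example sentence are illustrative assumptions, and the mask token is read from the tokenizer rather than hard-coded because the custom tokenizer's special tokens are not documented here.

```python
def top_predictions(fill_mask, text, k=3):
    """Return (token, score) pairs for the k most likely fillers of the mask in `text`.

    `fill_mask` is any callable with the fill-mask pipeline interface:
    it takes a string with one mask token and returns a list of dicts
    containing "token_str" and "score".
    """
    results = fill_mask(text, top_k=k)
    return [(r["token_str"], round(r["score"], 4)) for r in results]


def main():
    # Imported here so the helper above can be reused without loading the model.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="ishathombre/monolingual-hindi-from-scratch")

    # Read the mask token from the tokenizer instead of assuming "[MASK]".
    mask = pipe.tokenizer.mask_token

    # Hypothetical example sentence: "The capital of India is <mask>."
    sentence = f"भारत की राजधानी {mask} है।"

    for token, score in top_predictions(pipe, sentence):
        print(token, score)


if __name__ == "__main__":
    main()
```

Since the checkpoint is shared for feedback, the predictions may be noisy; comparing the top few candidates across several sentences gives a rough sense of what the model has learned.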
Base model
google-bert/bert-base-uncased