alina0195/ro-modernBERT-phase-d-nowwm

Romanian ModernBERT-base produced by continual pretraining of ModernBERT-base on the Romanian FineWeb2 corpus with a custom 42k Romanian tokenizer.

This is the Phase D checkpoint, trained without whole-word masking. The model was context-extended to 8192 tokens in Phase C, then cooled down in Phase D with a 1-sqrt learning-rate schedule.
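For readers unfamiliar with the term, a 1-sqrt cooldown is usually taken to mean that the learning rate falls from its peak as `1 - sqrt(progress)` over the cooldown window. The sketch below illustrates that shape only. The function name, `peak_lr`, `cooldown_steps`, and `final_lr` are hypothetical; this card does not state the actual hyperparameters or how the schedule was configured in Composer.

```python
import math


def one_minus_sqrt_cooldown(step, cooldown_steps, peak_lr, final_lr=0.0):
    """Learning rate at `step` of a 1-sqrt cooldown (illustrative sketch).

    `step` counts from the start of the cooldown phase. All arguments are
    hypothetical placeholders, not values taken from this model's training run.
    """
    if cooldown_steps <= 0:
        raise ValueError("cooldown_steps must be positive")
    # Fraction of the cooldown completed, clamped to [0, 1].
    progress = min(max(step / cooldown_steps, 0.0), 1.0)
    # Decay factor goes from 1 at the start to 0 at the end of the cooldown.
    decay = 1.0 - math.sqrt(progress)
    return final_lr + (peak_lr - final_lr) * decay


# Example with made-up numbers: the rate drops quickly at first, then flattens.
for s in (0, 250, 500, 1000):
    print(s, one_minus_sqrt_cooldown(s, cooldown_steps=1000, peak_lr=1e-3))
```

Compared with a linear decay, this shape spends more of the cooldown at low learning rates, since half of the drop has already happened a quarter of the way through.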

  • Backbone: ModernBERT-base (22 layers, hidden 768, 12 heads)
  • Tokenizer: custom Romanian BPE, vocab=42240
  • Max sequence length: 8192
  • Training framework: MosaicML Composer
  • Weights: Safetensors, F32, 0.1B params
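
A minimal sketch of how the checkpoint might be queried for masked-token prediction with the Hugging Face `transformers` library. It assumes the repository ships a masked-LM head and a tokenizer that defines a mask token, and that the installed `transformers` version supports ModernBERT; none of this is confirmed by the card. The helper name `predict_masked` and the `{mask}` placeholder convention are hypothetical.

```python
MODEL_ID = "alina0195/ro-modernBERT-phase-d-nowwm"


def predict_masked(text_with_mask_slot, top_k=5, model_id=MODEL_ID):
    """Return the top_k candidate tokens for the '{mask}' slot in the text.

    Assumes the checkpoint loads with AutoModelForMaskedLM and that the
    tokenizer defines a mask token (assumptions, not stated in the card).
    """
    # Imported lazily so this module can be imported without the heavy deps.
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForMaskedLM.from_pretrained(model_id)
    model.eval()

    # Substitute the tokenizer's own mask token rather than hard-coding one.
    text = text_with_mask_slot.format(mask=tokenizer.mask_token)
    inputs = tokenizer(text, return_tensors="pt")

    with torch.no_grad():
        logits = model(**inputs).logits

    # Locate the first mask position and rank the vocabulary at that position.
    is_mask = inputs["input_ids"][0] == tokenizer.mask_token_id
    mask_index = is_mask.nonzero(as_tuple=True)[0][0]
    top_ids = logits[0, mask_index].topk(top_k).indices.tolist()
    return [tokenizer.decode([token_id]).strip() for token_id in top_ids]
```

Calling `predict_masked("Capitala României este {mask}.")` would download the weights on first use and return a list of candidate tokens.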