OLMo-2 300M Pretrained

A model pretrained from random initialization, using the OLMo-2 architecture scaled down to ~300M parameters.

Training Data

  • FineWeb (60%) — ODC-By
  • Wikipedia JA (40%) — CC BY-SA 4.0
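
One way to picture the 60/40 mixture above is as weighted sampling over the two sources. The following is a minimal sketch, not the training pipeline used for this model: the labels `fineweb` and `wikipedia_ja` and the helper `sample_sources` are hypothetical, and the card does not say whether the percentages are measured in tokens or in documents.

```python
import random
from collections import Counter

# Mixture weights taken from the list above; the keys are illustrative labels only.
MIX = {"fineweb": 0.6, "wikipedia_ja": 0.4}


def sample_sources(n, seed=0):
    """Draw n source labels according to MIX (hypothetical helper)."""
    rng = random.Random(seed)  # local RNG so the draw is reproducible
    names = list(MIX)
    weights = [MIX[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


if __name__ == "__main__":
    n = 10_000
    counts = Counter(sample_sources(n))
    for name in MIX:
        # Empirical share of each source; it should land close to its weight.
        print(name, counts[name] / n)
```

With enough draws, the empirical shares approach 0.6 and 0.4, which is all the stated mixture guarantees under this reading.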

Architecture

  • Base: OLMo-2 config (allenai/OLMo-2-0425-1B)
  • Parameters: ~300M
  • hidden_size: 1024, layers: 16, heads: 16
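
The listed dimensions fix a few derived quantities. The sketch below computes the per-head dimension and the attention projection parameter count, under the assumptions that every layer uses full multi-head attention (as many key/value heads as query heads) and bias-free Q, K, V and output projections. The MLP width and vocabulary size are not stated in this card, so the total parameter count is deliberately not estimated; the function names are hypothetical.

```python
# Values copied from the architecture list above.
HIDDEN_SIZE = 1024
NUM_LAYERS = 16
NUM_HEADS = 16


def head_dim(hidden_size, num_heads):
    """Per-head dimension; the hidden size must split evenly across heads."""
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    return hidden_size // num_heads


def attention_projection_params(hidden_size, num_layers):
    """Weights in the Q, K, V and output projections across all layers.

    Assumes four bias-free hidden_size x hidden_size matrices per layer
    (an assumption, not confirmed by the model card).
    """
    return 4 * hidden_size * hidden_size * num_layers


if __name__ == "__main__":
    print("head_dim:", head_dim(HIDDEN_SIZE, NUM_HEADS))
    print("attention projection params:",
          attention_projection_params(HIDDEN_SIZE, NUM_LAYERS))
```

Under these assumptions the attention projections account for only part of the ~300M total; the remainder sits in the MLP blocks, embeddings, and normalization weights.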
Safetensors: model size 0.5B params, tensor type BF16