slm-bahasa-id / tokenizer_config.json
romizone
Upload SLM Bahasa Indonesia
9815efc verified
{
  "tokenizer_class": "BPETokenizer",
  "vocab_size": 4000,
  "model_type": "bpe",
  "special_tokens": {
    "<PAD>": 0,
    "<UNK>": 1,
    "<BOS>": 2,
    "<EOS>": 3
  },
  "do_lower_case": true,
  "language": "id"
}
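
A minimal sketch of how this config might be consumed, using only Python's standard `json` module. The `BPETokenizer` class itself is not shown in this file, so its API is unknown; the helper names `load_config`, `normalize`, and `wrap_ids` are hypothetical, and the sanity checks (unique special-token IDs that fall below `vocab_size`) are assumptions about what a loader would want, not something the config states.

```python
import json

# Config text copied from tokenizer_config.json above.
CONFIG_TEXT = """
{
  "tokenizer_class": "BPETokenizer",
  "vocab_size": 4000,
  "model_type": "bpe",
  "special_tokens": {"<PAD>": 0, "<UNK>": 1, "<BOS>": 2, "<EOS>": 3},
  "do_lower_case": true,
  "language": "id"
}
"""


def load_config(text: str) -> dict:
    """Parse the config and run basic sanity checks (hypothetical helper)."""
    config = json.loads(text)
    ids = list(config["special_tokens"].values())
    # Assumption: special-token IDs must be unique and inside the vocabulary.
    if len(set(ids)) != len(ids):
        raise ValueError("duplicate special-token IDs")
    if any(i < 0 or i >= config["vocab_size"] for i in ids):
        raise ValueError("special-token ID outside vocab_size")
    return config


def normalize(text: str, config: dict) -> str:
    """Lowercase the input when do_lower_case is set (hypothetical helper)."""
    return text.lower() if config.get("do_lower_case") else text


def wrap_ids(token_ids: list[int], config: dict) -> list[int]:
    """Surround a token-ID sequence with <BOS> and <EOS> (hypothetical helper)."""
    special = config["special_tokens"]
    return [special["<BOS>"], *token_ids, special["<EOS>"]]


config = load_config(CONFIG_TEXT)
```

With these values, four of the 4000 vocabulary slots are taken by special tokens, which would leave 3996 IDs for byte-pair merges and base symbols if the special tokens count toward `vocab_size` (the file does not say whether they do).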