djsamseng
Fix: Use SentencePiece directly instead of AlbertTokenizer, which strips some important Khmer characters
6c85eb7 verified
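
The commit message does not say how the characters are lost. One plausible cause, assumed here, is the accent-stripping step that `AlbertTokenizer` applies when `keep_accents` is false: NFKD normalization followed by removal of every code point with a nonzero combining class. The sketch below reproduces only that step with the standard library; `strip_combining_marks` is a hypothetical helper name, not part of any tokenizer API.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    """Apply NFKD, then drop every code point with a nonzero combining class.

    Assumption: this mirrors the accent-stripping preprocessing that is
    suspected of removing Khmer characters; it is not the tokenizer itself.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# The word "Khmer" in Khmer script: KHA, COENG, MO, vowel sign AE, RO.
# COENG (U+17D2) marks the following consonant as a subscript.
word = "\u1781\u17d2\u1798\u17c2\u179a"
stripped = strip_combining_marks(word)

# Report which code points did not survive the stripping step.
lost = [f"U+{ord(ch):04X}" for ch in word if ch not in stripped]
print("lost code points:", lost)
```

If that assumption holds, loading the `.model` file with `sentencepiece.SentencePieceProcessor` and calling `encode` on the raw text would skip this preprocessing, which fits the approach the commit describes.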