---
license: apache-2.0
language:
- multilingual
- en
- zh
- yue
tags:
- distilbert
- mlm
- coreml
- ios
- grammar-correction
- code-switching
- fast
library_name: coremltools
base_model: distilbert/distilbert-base-multilingual-cased
---

# DistilBERT MLM Multilingual - CoreML (Fast)

This is a CoreML conversion of [distilbert-base-multilingual-cased](https://huggingface.co/distilbert/distilbert-base-multilingual-cased) with the Masked Language Model (MLM) head, optimized for iOS deployment.

## Why DistilBERT?

DistilBERT is a smaller, faster version of BERT. The DistilBERT paper reports that it retains 97% of BERT's language understanding while being 40% smaller and 60% faster; those figures are for the English model. For this multilingual conversion:

- **Smaller**: ~258MB vs ~340MB (about 24% smaller; the reduction is below 40% because distillation halves the transformer layers but keeps the full multilingual embedding table)
- **Faster** inference (the paper reports 60% faster for the English model)
- Same 104 language support

## Model Description

- **Base Model**: distilbert-base-multilingual-cased
- **Task**: Masked Language Modeling (MLM)
- **Languages**: 104 languages including English, Chinese, Cantonese
- **Format**: CoreML (.mlmodelc)
- **Size**: ~258MB compiled

## Intended Use

Grammar correction that preserves code-switching (mixed-language text). Ideal for mobile keyboards, where inference speed matters.
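
This card does not specify how the MLM output is turned into a correction. One common pattern is to mask a suspect token, rank the vocabulary by the logits at that position, and keep the original token if it ranks among the top candidates. The sketch below illustrates that idea in plain Swift; the helper names `topKTokenIds` and `suggestedTokenId`, the top-k rule, and the toy logits are all assumptions for illustration, not part of this model or of CoreML.

```swift
// Hypothetical helper: vocabulary indices of the k highest logits
// for a single masked position (one score per vocabulary token).
func topKTokenIds(logits: [Float], k: Int) -> [Int] {
    let ranked = logits.indices.sorted { logits[$0] > logits[$1] }
    return Array(ranked.prefix(max(0, k)))
}

// Hypothetical decision rule: keep the original token if it is among
// the top-k candidates, otherwise suggest the best-scoring candidate.
func suggestedTokenId(original: Int, logits: [Float], k: Int = 5) -> Int {
    let candidates = topKTokenIds(logits: logits, k: k)
    guard let best = candidates.first else { return original }
    return candidates.contains(original) ? original : best
}

// Toy logits over a 6-token vocabulary (not real model output).
let toyLogits: [Float] = [0.1, 2.5, -1.0, 3.2, 0.0, 1.7]
print(topKTokenIds(logits: toyLogits, k: 3))                   // [3, 1, 5]
print(suggestedTokenId(original: 5, logits: toyLogits, k: 3))  // 5 (kept)
print(suggestedTokenId(original: 2, logits: toyLogits, k: 3))  // 3 (replaced)
```

Keeping the original token whenever it is a plausible candidate is a conservative choice: it avoids rewriting valid words from a second language, which matters for code-switched text. The token IDs returned here would still need to be mapped back to strings through `vocab.txt`.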
### Comparison with BERT

| Model | Size | Speed | Quality |
|-------|------|-------|---------|
| BERT-base-multilingual | ~340MB | Baseline | 100% |
| **DistilBERT-multilingual** | ~258MB | ~2x faster | ~97% |

## Model Files

- `vocab.txt` - WordPiece vocabulary (119,547 tokens)
- `distilbert_mlm.mlmodelc/` - Compiled CoreML model for iOS

## Technical Details

- **Architecture**: DistilBERT (6 layers, 768 hidden, 12 attention heads)
- **Parameters**: ~134M (vs ~177M for multilingual BERT-base)
- **Max Sequence Length**: 128 tokens
- **Compute Units**: CPU (for iOS background app compatibility)
- **Minimum iOS**: 15.0

## Usage in iOS

```swift
import CoreML

// Load the compiled model. `modelURL` must point to distilbert_mlm.mlmodelc.
let config = MLModelConfiguration()
config.computeUnits = .cpuOnly
let model = try MLModel(contentsOf: modelURL, configuration: config)

// Prepare inputs (DistilBERT doesn't use token_type_ids).
// Assumption: both inputs are Int32 arrays of shape [1, 128]; confirm
// against model.modelDescription.inputDescriptionsByName.
let shape: [NSNumber] = [1, 128]
let inputIds = try MLMultiArray(shape: shape, dataType: .int32)      // tokenized input with [MASK] tokens
let attentionMask = try MLMultiArray(shape: shape, dataType: .int32) // 1 for real tokens, 0 for padding
// Fill both arrays from your WordPiece tokenizer output before predicting.

// Run inference
let input = try MLDictionaryFeatureProvider(dictionary: [
    "input_ids": MLFeatureValue(multiArray: inputIds),
    "attention_mask": MLFeatureValue(multiArray: attentionMask)
])
let output = try model.prediction(from: input)
let logits = output.featureValue(for: "logits")?.multiArrayValue
```

## License

This model is released under the **Apache 2.0 License**.

## Attribution

- **Original Model**: [distilbert-base-multilingual-cased](https://huggingface.co/distilbert/distilbert-base-multilingual-cased)
- **DistilBERT**: Created by Hugging Face
- **Conversion**: CoreML conversion using [coremltools](https://github.com/apple/coremltools)
- **Converted by**: goodpixelltd

## Citation

```bibtex
@article{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
  journal={arXiv preprint arXiv:1910.01108},
  year={2019}
}
```