---
license: apache-2.0
library_name: executorch
tags:
- android
- ios
- on-device
- pytorch
- react-native
- smollm
- llama
base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
---

# SmolLM2-1.7B-Executorch-Q8DA4W

This repository contains the `smollm2_1_7b_q8da4w.pte` model, exported for use with [ExecuTorch](https://pytorch.org/executorch).

## Details
- **Base Model**: [HuggingFaceTB/SmolLM2-1.7B-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct)
- **Format**: `.pte` (ExecuTorch)
- **Quantization**: Q8DA4W (4-bit weights in linear layers, 8-bit dynamically quantized activations)
- **Architecture**: `llama` (compatible with the Llama export pipeline)
- **File Size**: ~1.7 GB
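
To make the Q8DA4W label concrete, here is a minimal pure-Python sketch of the idea: weights are quantized ahead of time to 4-bit integers with one scale per group, and activations are quantized to 8-bit integers at run time from the values actually seen. The function names, the group size of 2, and the symmetric scheme for both tensors are assumptions chosen for illustration; the actual export may differ (for example, a different group size, zero points for activations, or packed storage).

```python
def quantize_weights_int4(weights, group_size):
    """Symmetric 4-bit quantization with one scale per group of weights."""
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        # Clamp to the signed 4-bit range [-8, 7].
        codes.extend(max(-8, min(7, round(w / scale))) for w in group)
    return codes, scales


def quantize_activations_int8(values):
    """Symmetric 8-bit quantization; the scale is computed per call (dynamic)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale


def quantized_dot(activations, weights, group_size):
    """Approximate a dot product using integer codes plus floating-point scales."""
    a_codes, a_scale = quantize_activations_int8(activations)
    w_codes, w_scales = quantize_weights_int4(weights, group_size)
    total = 0.0
    for g, w_scale in enumerate(w_scales):
        start = g * group_size
        end = start + group_size
        # Integer accumulation within a group, rescaled once per group.
        acc = sum(a * w for a, w in zip(a_codes[start:end], w_codes[start:end]))
        total += acc * a_scale * w_scale
    return total


# Illustrative values only, not taken from the model.
x = [0.5, -1.1, 0.25, 2.0]
w = [0.12, -0.2, 0.3, 0.05]
exact = sum(a * b for a, b in zip(x, w))
approx = quantized_dot(x, w, group_size=2)
print(f"float: {exact:.4f}  quantized: {approx:.4f}")
```

The quantized result tracks the float result only approximately; that small loss of precision is the trade-off for a smaller file and cheaper integer arithmetic on device.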

## Features
- 🚀 Optimized for mobile/edge devices
- 📱 Compatible with `react-native-executorch`
- 💡 Small footprint (1.7B parameters, ~1.7 GB file) suited to resource-constrained environments
- 🗣️ Instruct-tuned for conversational AI

## Usage
This model is ready to use in mobile applications (iOS/Android) via the ExecuTorch runtime or `react-native-executorch`.

1. Download `smollm2_1_7b_q8da4w.pte` and the tokenizer files (`tokenizer.json`, `vocab.json`, `merges.txt`).
2. Place them in your app's asset folder.
3. Load the model with the ExecuTorch runtime.
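
Before bundling, it can help to confirm that steps 1 and 2 left all four files in place. The following is a minimal desktop-side sketch using only the Python standard library; the helper name `missing_assets` and the folder name `assets` are hypothetical, so adjust them to your project layout.

```python
from pathlib import Path

# File names taken from the steps above.
REQUIRED_FILES = (
    "smollm2_1_7b_q8da4w.pte",
    "tokenizer.json",
    "vocab.json",
    "merges.txt",
)


def missing_assets(asset_dir):
    """Return the required file names that are absent or empty in asset_dir."""
    root = Path(asset_dir)
    missing = []
    for name in REQUIRED_FILES:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "assets" is a hypothetical folder name; point it at your own asset folder.
    missing = missing_assets("assets")
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("All model and tokenizer files are present.")
```

This only checks that the files exist and are non-empty; it does not validate their contents or load the model.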

## Notes
- SmolLM2 uses a **byte-level BPE tokenizer** (similar to GPT-2), not the SentencePiece tokenizer used by Llama 1 and 2.
- The tokenizer files are `tokenizer.json`, `vocab.json`, and `merges.txt`.
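
To illustrate what "byte-level BPE" means in practice, the sketch below reproduces the GPT-2 style byte-to-character mapping and applies a tiny merge list. The two merge rules are toy values invented for this example, not entries from this repository's `merges.txt`, and the regex pre-splitting step that real tokenizers perform is omitted.

```python
def bytes_to_unicode():
    """GPT-2 style table mapping each of the 256 byte values to a printable character."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are remapped to code points above 255.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return dict(zip(byte_values, map(chr, code_points)))


def to_byte_symbols(text):
    """Encode text as UTF-8 and map every byte to its stand-in character."""
    table = bytes_to_unicode()
    return [table[b] for b in text.encode("utf-8")]


def apply_merges(symbols, merges):
    """Repeatedly merge the adjacent pair with the best (lowest) rank."""
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(symbols)
    while len(symbols) > 1:
        pairs = [(symbols[i], symbols[i + 1]) for i in range(len(symbols) - 1)]
        best = min(pairs, key=lambda p: ranks.get(p, float("inf")))
        if best not in ranks:
            break  # No known merge applies to any remaining pair.
        merged, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                merged.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# Toy merge rules for illustration only.
toy_merges = [("h", "i"), ("Ġ", "hi")]
print(to_byte_symbols(" hi"))                            # ['Ġ', 'h', 'i']
print(apply_merges(to_byte_symbols(" hi"), toy_merges))  # ['Ġhi']
print(to_byte_symbols("é"))                              # ['Ã', '©']
```

Because every input is first reduced to bytes, any string can be tokenized without an unknown-token fallback. The leading `Ġ` in vocabulary entries is simply the stand-in character for a space.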