---
license: apache-2.0
library_name: executorch
tags:
- android
- ios
- on-device
- pytorch
- react-native
- smollm
- llama
base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
---

# SmolLM2-1.7B-Executorch-Q8DA4W

This repository contains the `smollm2_1_7b_q8da4w.pte` model, exported for use with [ExecuTorch](https://pytorch.org/executorch).

## Details

- **Base Model**: [HuggingFaceTB/SmolLM2-1.7B-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct)
- **Format**: `.pte` (ExecuTorch)
- **Quantization**: Q8DA4W (4-bit linear weights, 8-bit dynamic activations)
- **Architecture**: llama (compatible with the Llama export pipeline)
- **File Size**: ~1.7 GB

## Features

- 🚀 Optimized for mobile/edge devices
- 📱 Compatible with `react-native-executorch`
- 💡 SmolLM2 is efficient and fast in resource-constrained environments
- 🗣️ Instruct-tuned for conversational AI

## Usage

This model is ready to use in mobile applications (iOS/Android) via the ExecuTorch runtime or `react-native-executorch`.

1. Download `smollm2_1_7b_q8da4w.pte` and the tokenizer files (`tokenizer.json`, `vocab.json`, `merges.txt`).
2. Place them in your app's asset folder.
3. Load the model with the ExecuTorch runtime.

## Notes

- SmolLM2 uses a **byte-level BPE tokenizer** (similar to GPT-2), not SentencePiece like Llama.
- The tokenizer files are `tokenizer.json`, `vocab.json`, and `merges.txt`.
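
Before bundling the app, it can help to confirm that every file named in the Usage steps actually made it into the asset folder. The following is a minimal sketch, not part of ExecuTorch or `react-native-executorch`: the helper name `missing_assets` and the constant `REQUIRED_ASSETS` are hypothetical, and the asset folder path is whatever your project uses.

```python
from pathlib import Path

# File names taken from the Usage steps in this README.
REQUIRED_ASSETS = (
    "smollm2_1_7b_q8da4w.pte",
    "tokenizer.json",
    "vocab.json",
    "merges.txt",
)


def missing_assets(asset_dir):
    """Return the required file names that are absent from asset_dir.

    Hypothetical helper for illustration; it only checks that each file
    exists and does not validate the file contents.
    """
    root = Path(asset_dir)
    return [name for name in REQUIRED_ASSETS if not (root / name).is_file()]
```

Calling `missing_assets("path/to/assets")` returns an empty list when all four files are in place, and otherwise lists the names that still need to be downloaded or copied.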