---
license: apache-2.0
library_name: executorch
tags:
- android
- ios
- on-device
- pytorch
- react-native
- smollm
- llama
base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
---

# SmolLM2-1.7B-Executorch-Q8DA4W

This repository contains `smollm2_1_7b_q8da4w.pte`, an export of SmolLM2-1.7B-Instruct for use with [ExecuTorch](https://pytorch.org/executorch).

## Details
- **Base Model**: [HuggingFaceTB/SmolLM2-1.7B-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct)
- **Format**: `.pte` (ExecuTorch)
- **Quantization**: Q8DA4W (4-bit weights in linear layers, 8-bit dynamically quantized activations)
- **Architecture**: llama (compatible with the Llama export pipeline)
- **File Size**: ~1.7 GB
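
To make the Q8DA4W label concrete, here is a minimal pure-Python sketch of the general idea: weights are quantized ahead of time to 4-bit integers with one scale per group, activations are quantized to 8-bit integers at run time, and the dot product is accumulated in integers before rescaling. This is an illustration under assumptions, not the kernel ExecuTorch runs: the function names, the group size of 32, and the rounding and zero-point details are chosen for clarity and are not taken from this export.

```python
def quantize_weights_4bit(weights, group_size=32):
    """Symmetric group-wise quantization to integers in [-8, 7], one scale per group."""
    q_groups, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / 7 if max_abs > 0 else 1.0
        q_groups.append([max(-8, min(7, round(w / scale))) for w in group])
        scales.append(scale)
    return q_groups, scales


def quantize_activations_8bit(activations):
    """Asymmetric quantization to [-128, 127]; the range is measured at run time."""
    lo = min(0.0, *activations)  # keep 0.0 representable
    hi = max(0.0, *activations)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(a / scale) + zero_point)) for a in activations]
    return q, scale, zero_point


def quantized_dot(weights, activations, group_size=32):
    """Dot product with integer accumulation per weight group, then float rescaling."""
    q_w, w_scales = quantize_weights_4bit(weights, group_size)
    q_a, a_scale, a_zp = quantize_activations_8bit(activations)
    total = 0.0
    for g, (group, w_scale) in enumerate(zip(q_w, w_scales)):
        offset = g * group_size
        acc = sum(qw * (q_a[offset + i] - a_zp) for i, qw in enumerate(group))
        total += acc * w_scale * a_scale
    return total


if __name__ == "__main__":
    w = [0.12, -0.5, 0.33, 0.07]
    x = [1.5, -0.25, 0.8, 2.0]
    print("float dot:    ", sum(wi * xi for wi, xi in zip(w, x)))
    print("quantized dot:", quantized_dot(w, x, group_size=4))
```

The two printed values should be close but not identical; the gap is the quantization error that the smaller file size is traded against.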

## Features
- 🚀 Optimized for mobile/edge devices
- 📱 Compatible with `react-native-executorch`
- 💡 Small 1.7B-parameter model suited to resource-constrained environments
- 🗣️ Instruct-tuned for conversational AI

## Usage
This model is ready to be used in mobile applications (iOS/Android) via the ExecuTorch runtime or `react-native-executorch`.

1. Download `smollm2_1_7b_q8da4w.pte` and the tokenizer files (`tokenizer.json`, `vocab.json`, `merges.txt`).
2. Place them in your app's asset folder.
3. Load the model with the ExecuTorch runtime or `react-native-executorch`.
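
Before bundling, it can help to confirm that every file from step 1 actually made it into the asset folder. The helper below is a small sketch using only the Python standard library; `missing_assets` and the `assets` path are hypothetical names for illustration, not part of ExecuTorch or `react-native-executorch`.

```python
from pathlib import Path

MODEL_FILE = "smollm2_1_7b_q8da4w.pte"
TOKENIZER_FILES = ("tokenizer.json", "vocab.json", "merges.txt")


def missing_assets(asset_dir):
    """Return the names of required files that are not present in asset_dir."""
    root = Path(asset_dir)
    required = (MODEL_FILE, *TOKENIZER_FILES)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # "assets" is a placeholder; point it at your app's real asset folder.
    missing = missing_assets("assets")
    print("All files present" if not missing else f"Missing: {missing}")
```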

## Notes
- SmolLM2 uses a **byte-level BPE tokenizer** (similar to GPT-2), not a SentencePiece tokenizer like Llama 2.
- The tokenizer files are `tokenizer.json`, `vocab.json`, and `merges.txt`.
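
As a rough illustration of what "byte-level BPE" means, the toy sketch below splits text into UTF-8 bytes and then applies merge rules in order. The merge rules in the example are made up and are not SmolLM2's actual merges; real GPT-2-style tokenizers also pre-split the text and map bytes to printable characters, which is omitted here.

```python
def byte_level_bpe(text, merges):
    """Split text into single UTF-8 bytes, then apply each merge rule in listed order."""
    tokens = [bytes([b]) for b in text.encode("utf-8")]
    for left, right in merges:
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == left and tokens[i + 1] == right:
                merged.append(left + right)  # fuse the adjacent pair into one token
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens


if __name__ == "__main__":
    toy_merges = [(b"l", b"l"), (b"h", b"e"), (b"he", b"ll")]  # hypothetical rules
    print(byte_level_bpe("hello", toy_merges))
    print(byte_level_bpe("héllo", toy_merges))
```

Because the base vocabulary is the 256 possible byte values, any input string can be encoded without an unknown token; characters outside the merge table simply stay as individual bytes.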