blackcloud1199 / SmolLM2-1.7B-Executorch-Q8DA4W


	---
	license: apache-2.0
	library_name: executorch
	tags:
	- android
	- ios
	- on-device
	- pytorch
	- react-native
	- smollm
	- llama
	base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
	---

	# SmolLM2-1.7B-Executorch-Q8DA4W

	This repository contains the `smollm2_1_7b_q8da4w.pte` model, exported for use with [ExecuTorch](https://pytorch.org/executorch).

	## Details
	- Base Model: [HuggingFaceTB/SmolLM2-1.7B-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B-Instruct)
	- Format: `.pte` (ExecuTorch)
	- Quantization: Q8DA4W (4-bit linear weights, 8-bit dynamic activations)
	- Architecture: llama (compatible with the Llama export pipeline)
	- File Size: ~1.7 GB
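
To make the Q8DA4W label more concrete, here is a minimal pure-Python sketch of the two ideas it names: group-wise 4-bit quantization of weights, and 8-bit quantization of activations with a scale computed at run time. The function names, the group size of 4, and the symmetric rounding scheme are assumptions chosen for illustration. This is not the ExecuTorch kernel, which may differ in group size, zero-point handling, and bit packing.

```python
def quantize_weights_4bit(weights, group_size=4):
    """Symmetric group-wise 4-bit quantization (illustrative only).

    Each group of `group_size` weights shares one float scale, and every
    weight is rounded to an integer in [-8, 7].
    """
    q, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        q.extend(max(-8, min(7, round(w / scale))) for w in group)
    return q, scales


def quantize_activations_8bit(activations):
    """Dynamic symmetric 8-bit quantization: the scale comes from the live values."""
    max_abs = max(abs(a) for a in activations)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(a / scale))) for a in activations]
    return q, scale


def quantized_dot(weights, activations, group_size=4):
    """Approximate dot(weights, activations) from integer products per group."""
    qw, w_scales = quantize_weights_4bit(weights, group_size)
    qa, a_scale = quantize_activations_8bit(activations)
    total = 0.0
    for g, w_scale in enumerate(w_scales):
        lo = g * group_size
        hi = lo + group_size
        int_sum = sum(w * a for w, a in zip(qw[lo:hi], qa[lo:hi]))
        # Rescale the integer accumulator back to floating point.
        total += int_sum * w_scale * a_scale
    return total


# Made-up example values, not taken from the model.
weights = [0.12, -0.50, 0.33, 0.07, 1.20, -0.90, 0.40, 0.05]
activations = [1.0, 2.0, -1.5, 0.5, 0.25, -0.75, 3.0, 1.0]
exact = sum(w * a for w, a in zip(weights, activations))
approx = quantized_dot(weights, activations)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The weight scales are fixed at export time, while the activation scale is recomputed for each input, which is what "dynamic" refers to in this sketch.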

	## Features
	- 🚀 Optimized for mobile/edge devices
	- 📱 Compatible with `react-native-executorch`
	- 💡 SmolLM2 is efficient and fast for resource-constrained environments
	- 🗣️ Instruct-tuned for conversational AI

	## Usage
	This model is ready to be used in mobile applications (iOS/Android) via the ExecuTorch runtime or `react-native-executorch`.

	1. Download `smollm2_1_7b_q8da4w.pte` and the tokenizer files (`tokenizer.json`, `vocab.json`, `merges.txt`).
	2. Place them in your app's asset folder.
	3. Load the model with the ExecuTorch runtime.
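
Steps 1 and 2 can be scripted. The sketch below uses `hf_hub_download` from the `huggingface_hub` package to fetch whichever files are not yet in an asset folder. It assumes the tokenizer files are hosted in this same repository; if they are not, point `repo_id` at the repository that has them. The helper names and the `app/assets` path mentioned afterwards are hypothetical.

```python
from pathlib import Path

REPO_ID = "blackcloud1199/SmolLM2-1.7B-Executorch-Q8DA4W"
REQUIRED_FILES = [
    "smollm2_1_7b_q8da4w.pte",
    "tokenizer.json",
    "vocab.json",
    "merges.txt",
]


def missing_files(asset_dir):
    """Return the required file names not yet present in asset_dir."""
    asset_dir = Path(asset_dir)
    return [name for name in REQUIRED_FILES if not (asset_dir / name).is_file()]


def download_missing(asset_dir, repo_id=REPO_ID):
    """Download any missing files into asset_dir (needs network access)."""
    # Imported lazily so missing_files() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    for name in missing_files(asset_dir):
        hf_hub_download(repo_id=repo_id, filename=name, local_dir=str(asset_dir))
```

Calling `download_missing("app/assets")` would fetch the files into a hypothetical `app/assets` folder, and `missing_files("app/assets")` can then confirm that nothing is left to download. Loading the `.pte` file itself (step 3) happens inside the app through the ExecuTorch runtime and is not covered here.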

	## Notes
	- SmolLM2 uses a byte-level BPE tokenizer (similar to GPT-2), not the SentencePiece tokenizer used by Llama 2.
	- Tokenizer files are: `tokenizer.json`, `vocab.json`, `merges.txt`
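
"Byte-level" means the text is first encoded as UTF-8 bytes, and each byte is mapped to a printable character before any BPE merges are applied. The sketch below reproduces the byte-to-character table from the GPT-2 reference tokenizer. That SmolLM2 uses exactly this table is an assumption based on the "similar to GPT-2" note above, and the merge step driven by `merges.txt` is not shown. The name `to_byte_level` is hypothetical.

```python
def bytes_to_unicode():
    """Map each of the 256 byte values to a printable character (GPT-2 scheme)."""
    # Bytes that are already printable keep their own code point.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Remaining bytes (space, control characters, ...) move above U+00FF.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(c) for b, c in zip(byte_values, code_points)}


BYTE_TO_CHAR = bytes_to_unicode()


def to_byte_level(text):
    """Return the byte-level string that BPE merges would then operate on."""
    return "".join(BYTE_TO_CHAR[b] for b in text.encode("utf-8"))


print(to_byte_level("Hello world"))  # the space byte is shown as 'Ġ'
print(to_byte_level("café"))         # 'é' is two UTF-8 bytes, so two symbols
```

Because every possible byte has a symbol, any input string can be represented without an unknown-token fallback, which is one practical difference from vocabulary-only schemes.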