---
language:
- en
license: apache-2.0
library_name: transformers
base_model: HuggingFaceTB/SmolVLM2-2.2B-Instruct
tags:
- vision
- multimodal
- llama-cpp
- gguf
- quantized
---

# SmolVLM2-2.2B-Instruct GGUF

GGUF quantizations of [HuggingFaceTB/SmolVLM2-2.2B-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM2-2.2B-Instruct) for use with llama.cpp and Ollama.

## Model Description

SmolVLM2 is a compact 2.2B-parameter vision-language model from Hugging Face with video understanding capabilities. It is designed to run quickly and efficiently while maintaining strong performance on vision-language tasks.

## Key Features

- **Compact & Fast** - Only 2.2B parameters, runs efficiently on consumer hardware
- **Vision & Video** - Understands both images and video frames
- **Instruction-tuned** - Optimized for following user instructions
- **Apache 2.0** - Fully open source

## Available Quantizations

| Filename | Quant | Size | Description |
|----------|-------|------|-------------|
| SmolVLM2-2.2B-Instruct-Q4_K_M.gguf | Q4_K_M | 1.0 GB | Best balance of quality and speed (recommended) |
| SmolVLM2-2.2B-Instruct-Q8_0.gguf | Q8_0 | 1.8 GB | Higher quality |
| SmolVLM2-2.2B-Instruct.gguf | F16 | 3.4 GB | Unquantized 16-bit (half-precision) weights |

## Usage

### With Ollama

```bash
# Pull and run (Q4_K_M by default)
ollama run richardyoung/smolvlm2-2.2b-instruct

# Or choose a specific quantization
ollama run richardyoung/smolvlm2-2.2b-instruct:q8_0
ollama run richardyoung/smolvlm2-2.2b-instruct:f16
```
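
The same model can also be queried from a script. The following is a minimal sketch, assuming an Ollama server is running at its default local address (`http://localhost:11434`) and that the model has already been pulled. It uses Ollama's `/api/generate` endpoint, which accepts base64-encoded images in an `images` list. The helper names `build_payload` and `describe_image` are illustrative, not part of any library.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "richardyoung/smolvlm2-2.2b-instruct"


def build_payload(image_bytes: bytes, prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming request body carrying one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def describe_image(path: str, prompt: str = "Describe this image:") -> str:
    """Send an image file to the local Ollama server and return the reply text."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_image("your_image.jpg")` should return the model's description as a string. To target another quantization, pass a tagged model name such as `richardyoung/smolvlm2-2.2b-instruct:q8_0` to `build_payload`.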

### With llama.cpp

```bash
# Download a quantization
wget https://huggingface.co/richardyoung/SmolVLM2-2.2B-Instruct-GGUF/resolve/main/SmolVLM2-2.2B-Instruct-Q4_K_M.gguf

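# Run with llama.cpp. Recent builds handle image input through llama-mtmd-cli,
# which also needs the model's multimodal projector (mmproj) file.
# mmproj.gguf is a placeholder path; the projector is not listed in the table above.
./llama-mtmd-cli -m SmolVLM2-2.2B-Instruct-Q4_K_M.gguf --mmproj mmproj.gguf \
  --image your_image.jpg -p "Describe this image:"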
```
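
If `wget` is not available, the download step can be done from Python. This is a sketch using only the standard library; it rebuilds the `resolve/main` URL shown above from the filenames in the quantization table. The names `GGUF_FILES`, `gguf_url`, and `download` are illustrative helpers, not an existing API.

```python
import urllib.request

REPO = "richardyoung/SmolVLM2-2.2B-Instruct-GGUF"

# Filenames copied from the "Available Quantizations" table.
GGUF_FILES = {
    "Q4_K_M": "SmolVLM2-2.2B-Instruct-Q4_K_M.gguf",
    "Q8_0": "SmolVLM2-2.2B-Instruct-Q8_0.gguf",
    "F16": "SmolVLM2-2.2B-Instruct.gguf",
}


def gguf_url(quant: str = "Q4_K_M") -> str:
    """Return the direct download URL for the requested quantization."""
    filename = GGUF_FILES[quant.upper()]
    return f"https://huggingface.co/{REPO}/resolve/main/{filename}"


def download(quant: str = "Q4_K_M") -> str:
    """Download the GGUF file into the current directory and return its name."""
    filename = GGUF_FILES[quant.upper()]
    urllib.request.urlretrieve(gguf_url(quant), filename)
    return filename
```

For example, `download("q8_0")` would fetch the 1.8 GB Q8_0 file; an unknown quantization name raises `KeyError`.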

## Technical Requirements

- **Minimum:** 4 GB RAM, any modern CPU
- **Recommended:** 8 GB RAM or Apple Silicon Mac

## Chat Template

SmolVLM2 uses the ChatML format:
```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
{assistant_response}<|im_end|>
```
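
When building prompts by hand, the template above can be rendered with a small helper. This is a minimal sketch that follows the text layout exactly as shown in this section; `format_chatml` is a hypothetical name, and the sketch does not cover how image inputs are attached, which depends on the runtime (Ollama and llama.cpp normally apply the template themselves).

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the ChatML layout shown above."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = format_chatml(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Describe this image:"},
    ]
)
```

Here `prompt` ends with an open `<|im_start|>assistant` turn, so generation should stop when the model emits `<|im_end|>`.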

## Links

- **Original Model:** [HuggingFaceTB/SmolVLM2-2.2B-Instruct](https://huggingface.co/HuggingFaceTB/SmolVLM2-2.2B-Instruct)
- **Ollama:** [richardyoung/smolvlm2-2.2b-instruct](https://ollama.com/richardyoung/smolvlm2-2.2b-instruct)

## Credits

- **Original Model:** Hugging Face
- **Quantization:** Richard Young ([deepneuro.ai](https://deepneuro.ai/richard))

## License

Apache 2.0