---
license: mit
base_model: deepseek-ai/DeepSeek-OCR-2
tags:
- ocr
- vision-language
- fp8
- quantized
- deepseek
library_name: transformers
---
# DeepSeek-OCR-2-FP8
FP8 dynamically quantized version of [deepseek-ai/DeepSeek-OCR-2](https://huggingface.co/deepseek-ai/DeepSeek-OCR-2) for faster inference.
## Model Details
- **Base Model**: deepseek-ai/DeepSeek-OCR-2
- **Architecture**: deepseek_vl_v2 (3B parameters)
- **Quantization**: FP8 Dynamic (llmcompressor)
- **Model Size**: ~3.5 GB (vs. ~6 GB in BF16)
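
The size figures above can be sanity-checked with back-of-the-envelope arithmetic. This is a rough sketch, assuming exactly 3 billion parameters and decimal gigabytes; `estimate_gb` is a hypothetical helper, not part of any library.

```python
def estimate_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate checkpoint size in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 3e9  # assumption: the "3B parameters" figure taken as exact

bf16_gb = estimate_gb(NUM_PARAMS, 2)  # BF16 stores 2 bytes per parameter
fp8_gb = estimate_gb(NUM_PARAMS, 1)   # FP8 stores 1 byte per parameter

print(f"BF16: ~{bf16_gb:.1f} GB, all-FP8 lower bound: ~{fp8_gb:.1f} GB")
```

The all-FP8 figure is a lower bound. The reported ~3.5 GB is plausibly higher because only `Linear` layers are quantized (with `lm_head` excluded), so the remaining tensors stay in their original precision, and the quantization scales add a small overhead.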
## Quantization
Quantized using [llmcompressor](https://github.com/vllm-project/llmcompressor) with FP8_DYNAMIC scheme:
```python
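from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

# `model` is the base model (deepseek-ai/DeepSeek-OCR-2), loaded beforehand.

# Quantize every Linear layer to FP8, keeping lm_head in its original precision.
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_DYNAMIC",
    ignore=["lm_head"],
)

# FP8_DYNAMIC needs no calibration dataset, so oneshot runs on the model alone.
oneshot(model=model, recipe=recipe)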
```
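
To make the scheme name more concrete, here is a minimal pure-Python sketch of the arithmetic behind it. It assumes FP8_DYNAMIC behaves as the llmcompressor documentation describes: weights get one scale per output channel, fixed at quantization time, while activations get one scale per token, recomputed at runtime. It also assumes the E4M3 format (3 mantissa bits, maximum magnitude 448). The helpers `round_to_e4m3`, `quantize_row`, and `dequantize_row` are hypothetical names for illustration; real inference uses fused GPU kernels, not this code.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite magnitude in the E4M3 format


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest FP8 E4M3 value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP8_E4M3_MAX)
    # frexp returns mag = m * 2**e with 0.5 <= m < 1, so floor(log2(mag)) = e - 1.
    # Exponents below -6 fall in the subnormal range, which shares one step size.
    exponent = max(math.frexp(mag)[1] - 1, -6)
    step = 2.0 ** (exponent - 3)  # 3 mantissa bits: 8 steps per power of two
    return sign * min(round(mag / step) * step, FP8_E4M3_MAX)


def quantize_row(values):
    """Scale one row so its largest magnitude maps to 448, then round to FP8."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / FP8_E4M3_MAX if max_abs > 0 else 1.0
    return [round_to_e4m3(v / scale) for v in values], scale


def dequantize_row(quantized, scale):
    """Map FP8 values back to the original range."""
    return [q * scale for q in quantized]


# Weights: one scale per output channel (row), computed once at quantization time.
weight = [[0.5, -2.0, 1.0], [0.03, 0.01, -0.02]]
quantized_weight = [quantize_row(row) for row in weight]

# Activations: one scale per token (row), recomputed on every forward pass.
activations = [[10.0, -0.3, 4.0]]
quantized_activations = [quantize_row(row) for row in activations]

restored = [dequantize_row(q, s) for q, s in quantized_weight]
print(restored)
```

Because each row is rescaled to use the full FP8 range, the second weight row (with much smaller values) keeps roughly the same relative precision as the first.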
## Usage
```python
from transformers import AutoModel, AutoTokenizer

# trust_remote_code=True is required: the deepseek_vl_v2 architecture
# ships as custom code in the model repository.
model = AutoModel.from_pretrained(
    "richarddavison/DeepSeek-OCR-2-FP8",
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "richarddavison/DeepSeek-OCR-2-FP8",
    trust_remote_code=True,
)
```
## Requirements
- transformers==4.46.3
- torch>=2.0
- flash-attn (recommended)
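
A quick way to check an environment against the list above is sketched below. The helper names (`release_tuple`, `installed_version`, `check_requirements`) are hypothetical, and the version parsing is deliberately simple: it reads only the leading numeric part of a version string, so it is an approximation and not a full PEP 440 comparison.

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple:
    """Leading numeric part of a version, e.g. '2.3.1+cu121' -> (2, 3, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    parts = tuple(int(p) for p in match.group(0).split(".")) if match else ()
    return parts + (0,) * (3 - len(parts))  # pad so '2.0' compares like '2.0.0'


def installed_version(package: str):
    """Installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(versions: dict) -> list:
    """Return human-readable problems for a {package: version or None} mapping."""
    problems = []
    tf = versions.get("transformers")
    if tf is None or release_tuple(tf) != (4, 46, 3):
        problems.append(f"transformers==4.46.3 required, found {tf}")
    torch_version = versions.get("torch")
    if torch_version is None or release_tuple(torch_version) < (2, 0, 0):
        problems.append(f"torch>=2.0 required, found {torch_version}")
    if versions.get("flash-attn") is None:
        problems.append("flash-attn not found (recommended, not required)")
    return problems


if __name__ == "__main__":
    found = {name: installed_version(name)
             for name in ("transformers", "torch", "flash-attn")}
    for problem in check_requirements(found):
        print("WARNING:", problem)
```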
## License
MIT (same as base model)