	---
	language:
	- de
	license: apache-2.0
	library_name: transformers
	base_model: lightonai/LightOnOCR-2-1B-base
	tags:
	- ocr
	- vision-language
	- lightonocr
	- document-understanding
	- german
	- shorthand
	- manuscript
	- medieval
	datasets:
	- medieval-data/german-shorthand-line
	pipeline_tag: image-text-to-text
	---

	# LightOnOCR-2-1B for German (Line-Level)

	<p align="center">
	<img src="https://huggingface.co/lightonai/LightOnOCR-2-1B-base/resolve/main/lightonocr-banner.png" alt="LightOnOCR Banner" width="600"/>
	</p>

	This model is a fine-tuned version of [lightonai/LightOnOCR-2-1B-base](https://huggingface.co/lightonai/LightOnOCR-2-1B-base), trained specifically for line-level OCR of German shorthand manuscripts.

	## Model Description

	- Base Model: [lightonai/LightOnOCR-2-1B-base](https://huggingface.co/lightonai/LightOnOCR-2-1B-base)
	- Training Data: [medieval-data/german-shorthand-line](https://huggingface.co/datasets/medieval-data/german-shorthand-line)
	- Task: Line-level text transcription from document images
	- Language: German (de)
	- Architecture: Vision-Language Model (1B parameters)

	This is a line-level model - it expects cropped line images as input, not full pages. Each image should contain a single line of text.

	## Evaluation Results

	Evaluated on 50 samples from the test set:

	| Metric | Base Model | Finetuned | Improvement |
	|--------|------------|-----------|-------------|
	| CER (%) | 381.26 | 21.89 | +359.37 |
	| WER (%) | 494.99 | 37.41 | +457.58 |
	| Perfect Matches | 0 | 0 | +0 |

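	Lower CER/WER is better; a higher number of perfect matches is better. The Improvement column is the absolute reduction in percentage points. Error rates above 100% are possible because inserted characters or words count as errors, so an output much longer than the reference can contain more errors than the reference has characters or words.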
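The evaluation script behind the table is not shown here, so the following is only a minimal sketch of how CER and WER are commonly computed, assuming plain Levenshtein distance, whitespace tokenization for words, and non-empty references. The function names and the toy strings are illustrative and are not taken from the test set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate in percent, relative to the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def wer(ref, hyp):
    """Word error rate in percent, using whitespace tokenization (an assumption)."""
    ref_words = ref.split()
    return 100.0 * edit_distance(ref_words, hyp.split()) / len(ref_words)


# Toy strings for illustration only.
reference = "Ist vom Lagerdienst"
close_hypothesis = "Ist vom Lagerdinst"
long_hypothesis = "10:00 AM " * 10  # repetitive output, far longer than the reference

print(f"CER close: {cer(reference, close_hypothesis):.2f}")
print(f"WER close: {wer(reference, close_hypothesis):.2f}")
print(f"CER long:  {cer(reference, long_hypothesis):.2f}")
```

The last call shows how a repetitive, overlong output pushes the error rate well above 100%, which is consistent with the kind of base-model outputs listed in the examples.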

	### Example Outputs

	| # | Ground Truth | Base Model | Finetuned |
	|---|--------------|------------|-----------|
	| 1 | (Haupt der seligen Irmeng. gefunden. Im ... | 12/12/1998 10:00 AM 10:00 AM 10:00 AM 10... | (Haupt der seitdem Jänner 12 20 bei Daue... |
	| 2 | Schw. Reinh.: Ist vom Lagerdienst freige... | Schw. Reinh. : 2d 9.20 16 09 J. 6 | Schw. Reinh.: Ist vom Lagerdienst frei g... |
	| 3 | Klage daß im Naz.heim den Kranken die Ko... | `$$ \begin{aligned} & \text { 22 e 2 haz....` | Klage daß im Naz.heim den Kranken die Ko... |
	| 4 | Irene: Stimmung sehr verschieden. Kommen... | | Irene: Stimmung sehr verschiedenes. Münd... |
	| 5 | Zwei Schwestern Calabrien: M. Cristina u... | 226 Kolabrie: M. Cisneros, Urode | Zwei Schwestern Katalrien: M. Cristina u... |

	✓ = exact match

	## Usage

	### Installation

	```bash
	# Requires transformers from source
	pip install git+https://github.com/huggingface/transformers
	pip install pillow torch
	```

	### Python Usage

	```python
	import torch
	from transformers import LightOnOcrForConditionalGeneration, LightOnOcrProcessor
	from PIL import Image

	# Load model and processor
	model_id = "wjbmattingly/LightOnOCR-2-1B-german-shorthand-line"
	device = "cuda" if torch.cuda.is_available() else "cpu"
	dtype = torch.bfloat16 if device == "cuda" else torch.float32

	processor = LightOnOcrProcessor.from_pretrained(model_id)
	model = LightOnOcrForConditionalGeneration.from_pretrained(
	    model_id,
	    torch_dtype=dtype,
	).to(device)

	# Load your line image
	image = Image.open("your_image.jpg").convert("RGB")

	# Prepare input
	messages = [{"role": "user", "content": [{"type": "image"}]}]
	text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

	inputs = processor(
	    text=[text],
	    images=[[image]],
	    return_tensors="pt",
	    padding=True,
	    size={"longest_edge": 700},
	).to(device)
	inputs["pixel_values"] = inputs["pixel_values"].to(dtype)

	# Generate transcription
	with torch.no_grad():
	    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

	# Decode output
	input_length = inputs["input_ids"].shape[1]
	generated_ids = outputs[0, input_length:]
	transcription = processor.decode(generated_ids, skip_special_tokens=True)

	print(transcription)
	```

	### Batch Inference

	```python
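	# Reuses processor, model, device, and dtype from the Python Usage example above.
	# Requires the datasets library: pip install datasets
	from datasets import load_dataset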

	# Load dataset
	dataset = load_dataset("medieval-data/german-shorthand-line", split="train[:10]")

	# Process batch
	images = [[img.convert("RGB")] for img in dataset["image"]]
	messages = [{"role": "user", "content": [{"type": "image"}]}]
	text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	texts = [text] * len(images)

	inputs = processor(
	    text=texts,
	    images=images,
	    return_tensors="pt",
	    padding=True,
	    size={"longest_edge": 700},
	).to(device)
	inputs["pixel_values"] = inputs["pixel_values"].to(dtype)

	outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
	predictions = processor.batch_decode(outputs[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True)

	for pred, gt in zip(predictions, dataset["text"]):
	    print(f"Prediction: {pred}")
	    print(f"Ground Truth: {gt}")
	    print()
	```
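The batch example above sends ten images through the model in a single call. For larger collections, one option is to split the inputs into fixed-size chunks and run the same processor and `generate` steps once per chunk. The helper below is a hypothetical sketch of that chunking only: `batched` and the file names are illustrative, and a suitable batch size depends on available memory.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical file names standing in for cropped line images.
line_files = [f"line_{i:03d}.jpg" for i in range(10)]

for chunk in batched(line_files, 4):
    # In real use, each chunk would be loaded and passed to the processor
    # and model.generate exactly as in the batch example.
    print(len(chunk), chunk[0], chunk[-1])
```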

	## Training Details

	- Base Model: [lightonai/LightOnOCR-2-1B-base](https://huggingface.co/lightonai/LightOnOCR-2-1B-base)
	- Training Method: Fine-tuning with frozen language model backbone
	- Optimizer: AdamW (fused)
	- Learning Rate: 6e-5 with linear decay
	- Precision: bfloat16
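
The list above gives a peak learning rate of 6e-5 with linear decay but not the number of training steps, any warmup, or the final learning rate. As a rough illustration only, the sketch below assumes no warmup and a decay to zero over a hypothetical `total_steps`; the function name and the step counts are not from the training setup.

```python
def linear_decay_lr(step, total_steps, peak_lr=6e-5):
    """Learning rate at `step`, decaying linearly from peak_lr to 0 (assumed endpoint)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so steps outside [0, total_steps] stay within the schedule.
    progress = min(max(step, 0), total_steps) / total_steps
    return peak_lr * (1.0 - progress)


# Hypothetical run length, for illustration only.
for step in (0, 250, 500, 1000):
    print(step, linear_decay_lr(step, total_steps=1000))
```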

	## Limitations

	- This model is trained on line-level images. For full-page transcription, you need to first segment the page into individual lines.
	- Performance may vary on document styles not represented in the training data.
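
Page segmentation itself is outside the scope of this model. Assuming a separate layout or line-detection step already supplies one bounding box per line, the hypothetical helper below shows one way to turn those boxes into slightly padded crop regions that stay inside the page. The page size, the boxes, and the padding value are made up for illustration.

```python
def padded_crop_box(line_box, page_size, pad=4):
    """Expand a (left, upper, right, lower) box by `pad` pixels, clamped to the page."""
    left, upper, right, lower = line_box
    width, height = page_size
    return (
        max(0, left - pad),
        max(0, upper - pad),
        min(width, right + pad),
        min(height, lower + pad),
    )


# Hypothetical page size (width, height) and line boxes from a segmentation step.
page_size = (1200, 1800)
line_boxes = [(100, 0, 1100, 60), (100, 70, 1198, 130)]

crop_boxes = [padded_crop_box(box, page_size) for box in line_boxes]
print(crop_boxes)
```

Each resulting tuple uses the `(left, upper, right, lower)` order that Pillow's `Image.crop` expects, so a page image can be cut into line images and passed to the model one line at a time.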

	## Citation

	If you use this model, please cite:

	```bibtex
	@misc{lightonocr2_finetuned_2026,
	title = {LightOnOCR Fine-tuned for German},
	author = {William Mattingly},
	year = {2026},
	howpublished = {\url{https://huggingface.co/wjbmattingly/LightOnOCR-2-1B-german-shorthand-line}}
	}
	```

	And the original LightOnOCR paper:

	```bibtex
	@misc{lightonocr2_2026,
	title = {LightOnOCR: A 1B End-to-End Multilingual Vision-Language Model for State-of-the-Art OCR},
	author = {Said Taghadouini and Adrien Cavaill\`{e}s and Baptiste Aubertin},
	year = {2026},
	howpublished = {\url{https://arxiv.org/pdf/2601.14251}}
	}
	```

	## Acknowledgments

	- [LightOn AI](https://www.lighton.ai/) for the excellent LightOnOCR base model
	- The creators of the [medieval-data/german-shorthand-line](https://huggingface.co/datasets/medieval-data/german-shorthand-line) dataset