Instructions for using coder3101/LFM2.5-VL-1.6B-heretic with libraries and local apps.

Libraries

How to use coder3101/LFM2.5-VL-1.6B-heretic with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="coder3101/LFM2.5-VL-1.6B-heretic")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForMultimodalLM

processor = AutoProcessor.from_pretrained("coder3101/LFM2.5-VL-1.6B-heretic")
model = AutoModelForMultimodalLM.from_pretrained("coder3101/LFM2.5-VL-1.6B-heretic")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use coder3101/LFM2.5-VL-1.6B-heretic with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "coder3101/LFM2.5-VL-1.6B-heretic"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "coder3101/LFM2.5-VL-1.6B-heretic",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
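
The same request can also be sent from Python. The following is a minimal sketch using only the standard library, assuming the server exposes the OpenAI-compatible /v1/chat/completions endpoint shown in the curl call above and returns the reply under choices[0].message.content. The helper names build_chat_payload and chat are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "coder3101/LFM2.5-VL-1.6B-heretic"


def build_chat_payload(prompt: str, image_url: str, model: str = MODEL_ID) -> dict:
    """Build the same JSON body that the curl example sends."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def chat(base_url: str, payload: dict, timeout: float = 60.0) -> str:
    """POST the payload to /v1/chat/completions and return the reply text."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: the reply follows the OpenAI chat completion response shape.
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
# payload = build_chat_payload("Describe this image in one sentence.", "<image URL>")
# print(chat("http://localhost:8000", payload))
```

Only the base URL changes between servers: port 8000 matches the vLLM example, while the SGLang examples listen on port 30000.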

Use Docker

docker model run hf.co/coder3101/LFM2.5-VL-1.6B-heretic

SGLang

How to use coder3101/LFM2.5-VL-1.6B-heretic with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "coder3101/LFM2.5-VL-1.6B-heretic" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "coder3101/LFM2.5-VL-1.6B-heretic",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "coder3101/LFM2.5-VL-1.6B-heretic" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use coder3101/LFM2.5-VL-1.6B-heretic with Docker Model Runner:
```
docker model run hf.co/coder3101/LFM2.5-VL-1.6B-heretic
```

coder3101

Upload reproduce/README.md with huggingface_hub

a145630 verified 14 days ago

Reproduction guide

This directory contains the necessary information and assets to reproduce the results obtained during this Heretic run.

Local code

This system installed Heretic from a local directory or wheel. Uncommitted or experimental code may have been executed.

Reproducibility cannot be guaranteed in this environment.

Models

Base model: LiquidAI/LFM2.5-VL-1.6B (Commit: 919fde3)

Datasets

Good prompts: mlabonne/harmless_alpaca (Commit: 02c6a92)
Bad prompts: mlabonne/harmful_behaviors (Commit: 01cead0)
Good evaluation prompts: mlabonne/harmless_alpaca (Commit: 02c6a92)
Bad evaluation prompts: mlabonne/harmful_behaviors (Commit: 01cead0)

Selected trial

Trial number: 184
KL divergence: 0.011400
Refusals: 8/100
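
To make the two numbers above concrete, here is a small sketch of the metrics themselves. This guide does not describe how Heretic computes them, so the code only illustrates the textbook definitions: KL divergence between two discrete probability distributions, and the refusal count expressed as a rate. The function names and the example distributions are hypothetical.

```python
import math


def kl_divergence(p: list[float], q: list[float]) -> float:
    """Return D_KL(P || Q) in nats for two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0.0:
            return math.inf  # P puts mass where Q has none
        total += p_i * math.log(p_i / q_i)
    return total


def refusal_rate(refusals: int, prompts: int) -> float:
    """Fraction of evaluation prompts that were refused."""
    return refusals / prompts


# Hypothetical token distributions for an original and a modified model.
original = [0.70, 0.20, 0.10]
modified = [0.68, 0.21, 0.11]
print(f"KL divergence: {kl_divergence(original, modified):.6f}")
print(f"Refusal rate:  {refusal_rate(8, 100):.0%}")  # the 8/100 reported above
```

A KL divergence of zero means the two distributions are identical, so a small value such as the one reported for trial 184 indicates that the compared distributions are close.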

System

Python: 3.12.3 (CPython, GCC 13.3.0) [Virtualenv/Venv]
Operating system: Linux-6.17.0-1021-nvidia-aarch64-with-glibc2.39 (aarch64)
CPU: Cortex-X925

Accelerators

CUDA: Detected 1 device(s) (121.69 GB total VRAM)
- CUDA Version: 13.0
- Driver Version: 580.159.03
Devices:
- CUDA 0: NVIDIA GB10 (121.69 GB)

Environment

Heretic: v1.3.0 (Origin: Local)
PyTorch: 2.12.0+cu130
Other dependencies: See requirements.txt.
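
Before attempting a reproduction, it can help to compare the local environment against the values listed above. The sketch below is one possible way to do that with the standard library, plus an optional PyTorch import. The EXPECTED values are copied from this guide; the function names describe_system and mismatches are illustrative, and the check covers only a few fields rather than the full system description.

```python
import platform


def describe_system() -> dict[str, str]:
    """Collect a few of the fields reported in the System and Environment sections."""
    info = {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "machine": platform.machine(),
    }
    try:
        import torch  # optional: only reported when PyTorch is installed

        info["torch"] = torch.__version__
    except ImportError:
        info["torch"] = "not installed"
    return info


# Values taken from this guide.
EXPECTED = {
    "python": "3.12.3",
    "implementation": "CPython",
    "machine": "aarch64",
    "torch": "2.12.0+cu130",
}


def mismatches(actual: dict[str, str], expected: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {field: (expected, actual)} for every field that differs."""
    return {
        key: (value, actual.get(key, "missing"))
        for key, value in expected.items()
        if actual.get(key) != value
    }


for field, (want, have) in mismatches(describe_system(), EXPECTED).items():
    print(f"{field}: expected {want}, found {have}")
```

An empty output means the checked fields match; it does not establish that the rest of the system (driver, CUDA version, CPU) is identical.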

Contents of this directory

requirements.txt: The exact versions of all Python packages.
config.toml: The exact configuration used, including the RNG seed.
LiquidAI--LFM2--5-VL-1--6B.jsonl: The Optuna study journal containing the history of all trials.
SHA256SUMS: Cryptographic hashes for all weight files.
reproduce.json: A machine-readable file containing all reproducibility information.

How to reproduce

Ensure your system matches the specifications in the System section above. Exact reproducibility is only guaranteed if all aspects of your system are identical to the one the model was originally generated on.
Install the exact version of Heretic indicated in the Environment section above, from its original source.
Install the packages listed in requirements.txt: pip install -r requirements.txt
Install the correct version of PyTorch: pip install torch==2.12.0+cu130 --index-url https://download.pytorch.org/whl/cu130
Place the provided config.toml in your working directory.
Run Heretic without any additional arguments: heretic
Wait for the run to finish, then select trial 184 and export the model.
Verify that the weight files have been exactly reproduced by comparing their SHA-256 hashes against those in SHA256SUMS: sha256sum -c SHA256SUMS (if the model was uploaded to Hugging Face, the hashes shown online can be compared instead)
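
Where sha256sum is not available, the final verification step can be approximated in Python. This is a minimal sketch, assuming SHA256SUMS uses the usual sha256sum layout of one "<hex digest>  <file name>" entry per line, with file names relative to the directory that contains SHA256SUMS. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large weight files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sums(sums_file: Path) -> dict[str, bool]:
    """Return {file name: True if its hash matches the entry in sums_file}."""
    results = {}
    for line in sums_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.removeprefix("*")  # sha256sum marks binary mode with a leading '*'
        target = sums_file.parent / name
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results


# Example: report every file listed in SHA256SUMS in the current directory.
if Path("SHA256SUMS").is_file():
    for name, ok in verify_sums(Path("SHA256SUMS")).items():
        print(f"{name}: {'OK' if ok else 'FAILED'}")
```

A missing file is reported as FAILED rather than raising an error, which mirrors how a checksum run continues past individual failures.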

To use the included Optuna study journal LiquidAI--LFM2--5-VL-1--6B.jsonl, place it in the checkpoints directory (usually checkpoints/) before running Heretic.

This allows you to export other models from the Pareto front, or to run additional trials without having to re-run the stored trials.
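
For readers unfamiliar with the term, the sketch below shows what a Pareto front means for trials like these. It assumes that both reported quantities, KL divergence and refusals, are to be minimized; this guide does not state the objectives explicitly. Trial 184 uses the values reported above, while the other trials are hypothetical and exist only for illustration. The code does not read the Optuna journal.

```python
from typing import NamedTuple


class Trial(NamedTuple):
    number: int
    kl_divergence: float
    refusals: int


def dominates(a: Trial, b: Trial) -> bool:
    """a dominates b if it is no worse on both objectives and better on at least one."""
    no_worse = a.kl_divergence <= b.kl_divergence and a.refusals <= b.refusals
    better = a.kl_divergence < b.kl_divergence or a.refusals < b.refusals
    return no_worse and better


def pareto_front(trials: list[Trial]) -> list[Trial]:
    """Keep every trial that no other trial dominates."""
    return [t for t in trials if not any(dominates(o, t) for o in trials)]


trials = [
    Trial(184, 0.0114, 8),  # the selected trial from this run
    Trial(1, 0.0050, 40),   # hypothetical: closer to the base model, more refusals
    Trial(2, 0.0300, 3),    # hypothetical: fewer refusals, larger divergence
    Trial(3, 0.0400, 20),   # hypothetical: worse than trial 184 on both objectives
]

for trial in pareto_front(trials):
    print(trial)
```

Every trial on the front represents a different trade-off between the two objectives, which is why more than one model can reasonably be exported from the same study.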