Tool calling and thinking not working in llama.cpp

by crzynik - opened May 28

Discussion

crzynik

May 28

•

edited May 28

Running the model in llama.cpp, I am seeing two issues:

Thinking output, including the <think> tags, is emitted as regular output instead of as a reasoning trace.
Tool calling is not working.

I tried setting chat-template = chatml, but this did not help. It is unclear whether llama.cpp needs additional startup arguments or whether fixes are needed.
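
As a client-side stopgap for the first issue, the reasoning trace can be separated from the answer after the fact. This is only a minimal sketch, assuming the server returns the raw text with a single leading <think>...</think> block inline in the message content; the helper name `split_reasoning` is hypothetical and not part of llama.cpp.

```python
import re

# Matches one leading <think>...</think> block; DOTALL lets the trace span lines.
_THINK_RE = re.compile(r"^\s*<think>(.*?)</think>\s*", re.DOTALL)


def split_reasoning(content: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw completion string.

    Assumption: the reasoning trace, if present, is a single block at the
    start of the content. Anything else is returned unchanged as the answer.
    """
    match = _THINK_RE.match(content)
    if match is None:
        return "", content
    return match.group(1).strip(), content[match.end():]


if __name__ == "__main__":
    raw = "<think>The user wants a greeting.</think>\nHello!"
    reasoning, answer = split_reasoning(raw)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

This does not address the underlying template problem; it only keeps the trace out of the text shown to the user.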

kalle07

May 28

In "BenchLocal" it does much worse than gemma-4-e2b in all tests:
no tool use, poor bug fixing, poor math reasoning, and so on.
Sorry, next try, guys ;)
Your smart lfm2.5_vl 0.6/1.6 models for images are fine.

brunocasado

May 28

•

edited May 28

Unfortunately, it is not working properly with llama.cpp + Hermes Agent.

The understanding is very poor, to be honest. I ask in Portuguese and it answers with something unrelated.

toriset

May 28

Same here, but it seems to have some potential if tool calls worked.

alfredo-ottomate

May 28

The chat template is busted, but I cannot fix it tonight.
Reasoning works with this unrelated one (for Qwen) https://huggingface.co/froggeric/Qwen-Fixed-Chat-Templates/blob/main/chat_template.jinja
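
For anyone trying a replacement template, here is a rough sketch of how the server command could be assembled. It assumes the template has already been saved locally as `chat_template.jinja` (a hypothetical path) and that your llama.cpp build provides the `--jinja` and `--chat-template-file` options of `llama-server`; the helper `build_server_command` is made up for illustration.

```python
import shlex


def build_server_command(model_ref: str, template_path: str) -> list[str]:
    """Build an argv list that starts llama-server with a custom chat template.

    Assumptions: `llama-server` is on PATH, and `template_path` points to a
    Jinja chat template saved locally (hypothetical file name below).
    """
    return [
        "llama-server",
        "-hf", model_ref,                       # model reference on Hugging Face
        "--jinja",                              # enable Jinja template processing
        "--chat-template-file", template_path,  # override the embedded template
    ]


if __name__ == "__main__":
    cmd = build_server_command(
        "LiquidAI/LFM2.5-8B-A1B-GGUF:Q4_K_M", "chat_template.jinja"
    )
    # Print a copy-pastable shell command instead of launching the server.
    print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd)` as is; printing it first makes it easy to check the flags against your build's `--help` output.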

aminya

May 28

•

edited May 29

This chat template from https://huggingface.co/nathanrchn/LFM2.5-8B-A1B-GGUF-fixed-v2 seems to work

{# List of tools: [ #}
{{- bos_token -}}
{%- set preserve_thinking = preserve_thinking | default(false) -%}
{%- macro format_arg_value(arg_value) -%}
  {%- if arg_value is string -%}
    {{- "'" + arg_value + "'" -}}
  {%- elif arg_value is mapping -%}
    {{- arg_value | tojson -}}
  {%- else -%}
    {{- arg_value | string -}}
  {%- endif -%}
{%- endmacro -%}
{%- macro parse_content(content) -%}
  {%- if content is string -%}
    {{- content -}}
  {%- else -%}
    {%- set _ns = namespace(result="") -%}
    {%- for item in content -%}
      {%- if item["type"] == "image" -%}
        {%- set _ns.result = _ns.result + "<image>" -%}
      {%- elif item["type"] == "text" -%}
        {%- set _ns.result = _ns.result + item["text"] -%}
      {%- else -%}
        {%- set _ns.result = _ns.result + item | tojson -%}
      {%- endif -%}
    {%- endfor -%}
    {{- _ns.result -}}
  {%- endif -%}
{%- endmacro -%}
{%- macro render_tool_calls(tool_calls) -%}
  {%- set tool_calls_ns = namespace(tool_calls=[]) -%}
  {%- for tool_call in tool_calls -%}
    {%- set func_name = tool_call["function"]["name"] -%}
    {%- set func_args = tool_call["function"]["arguments"] -%}
    {%- set args_ns = namespace(arg_strings=[]) -%}
    {%- for arg_name, arg_value in func_args.items() -%}
      {%- set args_ns.arg_strings = args_ns.arg_strings + [arg_name + "=" + format_arg_value(arg_value)] -%}
    {%- endfor -%}
    {%- set tool_calls_ns.tool_calls = tool_calls_ns.tool_calls + [func_name + "(" + (args_ns.arg_strings | join(", ")) + ")"] -%}
  {%- endfor -%}
  {{- "<|tool_call_start|>[" + (tool_calls_ns.tool_calls | join(", ")) + "]<|tool_call_end|>" -}}
{%- endmacro -%}
{%- set ns = namespace(system_prompt="", last_user_index=-1) -%}
{%- if messages[0]["role"] == "system" -%}
  {%- if messages[0].get("content") -%}
    {%- set ns.system_prompt = parse_content(messages[0]["content"]) -%}
  {%- endif -%}
  {%- set messages = messages[1:] -%}
{%- endif -%}
{%- if tools -%}
  {%- set ns.system_prompt = ns.system_prompt + ("\n\n" if ns.system_prompt else "") + "Today's date: " + strftime_now("%Y-%m-%d") + "\n\nList of tools: " + (tools | tojson) -%}
{%- endif -%}
{%- if ns.system_prompt -%}
  {{- "<|im_start|>system\n" + ns.system_prompt + "<|im_end|>\n" -}}
{%- endif -%}
{%- for message in messages -%}
  {%- if message["role"] == "user" -%}
    {%- set ns.last_user_index = loop.index0 -%}
  {%- endif -%}
{%- endfor -%}
{%- for message in messages -%}
  {{- "<|im_start|>" + message.role + "\n" -}}
  {%- if message.role == "assistant" -%}
    {%- generation -%}
    {%- if message.thinking is defined and (preserve_thinking or loop.index0 > ns.last_user_index) -%}
      {{- "<think>" + message.thinking + "</think>" -}}
    {%- endif -%}
    {%- set _cfm_tag = "CONTINUE_FINAL_MESSAGE_TAG " -%}
    {%- set _has_cfm = false -%}
    {%- if message.content is defined -%}
      {%- set content = parse_content(message.content) -%}
      {%- if not (preserve_thinking or loop.index0 > ns.last_user_index) -%}
        {%- if "</think>" in content -%}
          {%- set content = content.split("</think>")[-1] | trim -%}
        {%- endif -%}
      {%- endif -%}
      {%- if message.tool_calls is defined and content.endswith(_cfm_tag) -%}
        {%- set _has_cfm = true -%}
        {%- set _trunc_len = (content | length) - (_cfm_tag | length) -%}
        {{- content[:_trunc_len] -}}
      {%- else -%}
        {{- content -}}
      {%- endif -%}
    {%- endif -%}
    {%- if message.tool_calls is defined -%}
      {{- render_tool_calls(message.tool_calls) -}}
    {%- endif -%}
    {%- if _has_cfm -%}
      {{- _cfm_tag -}}
    {%- endif -%}
    {{- "<|im_end|>\n" -}}
    {%- endgeneration -%}
  {%- else %}
    {%- if message.get("content") -%}
      {{- parse_content(message["content"]) -}}
    {%- endif -%}
    {{- "<|im_end|>\n" -}}
  {%- endif %}
{%- endfor -%}
{%- if add_generation_prompt -%}
  {{- "<|im_start|>assistant\n" -}}
{%- endif -%}
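
To make the tool-call format in this template easier to see, here is a small Python mirror of its `format_arg_value` and `render_tool_calls` macros. It is an illustrative sketch, not code from the linked repository, and it assumes that `json.dumps` is close enough to the template engine's `tojson` filter for simple arguments (escaping details may differ between engines).

```python
import json


def format_arg_value(value):
    # Mirrors the macro: strings are single-quoted, mappings become JSON,
    # everything else is converted with str().
    if isinstance(value, str):
        return "'" + value + "'"
    if isinstance(value, dict):
        return json.dumps(value)
    return str(value)


def render_tool_calls(tool_calls):
    # Each call is rendered as name(arg=value, ...); the calls are then
    # wrapped in a bracketed list between the two special tokens.
    rendered = []
    for call in tool_calls:
        name = call["function"]["name"]
        args = call["function"]["arguments"]
        arg_strings = [f"{key}={format_arg_value(val)}" for key, val in args.items()]
        rendered.append(f"{name}({', '.join(arg_strings)})")
    return "<|tool_call_start|>[" + ", ".join(rendered) + "]<|tool_call_end|>"


if __name__ == "__main__":
    # Hypothetical tool call, used only to show the output shape.
    example = [
        {"function": {"name": "get_weather", "arguments": {"city": "Paris", "days": 3}}}
    ]
    print(render_tool_calls(example))
    # <|tool_call_start|>[get_weather(city='Paris', days=3)]<|tool_call_end|>
```

Note that both the macro and this mirror expect `arguments` to be a mapping. A client that sends the arguments as a JSON string would need to parse them first.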

alfredo-ottomate

May 28

Nice find man. It's not perfect, but it works very well.

chdelacr

May 29

Issues reported in llama.cpp: https://github.com/ggml-org/llama.cpp/issues?q=is%3Aissue%20state%3Aopen%20lfm2.5-8b

tarek-liquid

Liquid AI org May 29

Tool calling has been fixed, and the GGUFs have been updated. Please pull the latest files.

tarek-liquid changed discussion status to closed May 29
