How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="blockblockblock/LFM2.5-8B-A1B-uncensored-abliterated")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("blockblockblock/LFM2.5-8B-A1B-uncensored-abliterated")
model = AutoModelForCausalLM.from_pretrained("blockblockblock/LFM2.5-8B-A1B-uncensored-abliterated")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

LFM2.5-8B-A1B-uncensored-abliterated

[Figure: refusals base]

AdvBench refusals (n=100):

  • Original: 39/100
  • This model: 0/100
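
The card does not say how a response was judged to be a refusal. A minimal sketch of one common way to tally such counts, assuming simple case-insensitive substring matching; `REFUSAL_MARKERS`, `is_refusal`, and `refusal_rate` are hypothetical names introduced here, and the marker list is illustrative rather than the one used for the numbers above:

```python
# Hypothetical refusal markers; the card does not list the ones actually used.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i am unable")


def is_refusal(response: str) -> bool:
    """Return True if the response contains any marker (case-insensitive)."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(responses: list[str]) -> tuple[int, int]:
    """Return (number of refusals, number of responses)."""
    refused = sum(is_refusal(r) for r in responses)
    return refused, len(responses)


if __name__ == "__main__":
    sample = ["I'm sorry, but I can't help with that.", "Sure, here is a summary."]
    refused, total = refusal_rate(sample)
    print(f"{refused}/{total}")
```

Substring matching is cheap but coarse: it can miss refusals phrased differently and can flag helpful answers that happen to contain a marker, so counts obtained this way are best read as approximate.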

Method

  • Dual-path abliteration targeting the LFM2.5 hybrid architecture (18 conv + 6 attention layers)
  • 50 modules modified across all 24 layers (conv + attention paths)
  • Refusal direction: harmful - harmless
  • Per-module weights: attn=4.0, conv=3.0, ffn=2.5, in_proj=2.0
  • 51 prompts for precise direction estimation
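
The list above names a difference-based refusal direction and per-module weights but not the exact procedure. The following is a toy sketch of the general idea on plain Python lists, not the author's implementation. It assumes the direction is the normalized difference of mean activations and that a per-module weight acts as a `scale` on the projection that is subtracted; both are assumptions, and `refusal_direction` and `ablate_matrix` are hypothetical names.

```python
import math

Vector = list[float]
Matrix = list[Vector]


def mean_vector(rows: Matrix) -> Vector:
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful: Matrix, harmless: Matrix) -> Vector:
    """Unit vector along mean(harmful) - mean(harmless) (assumed definition)."""
    diff = [a - b for a, b in zip(mean_vector(harmful), mean_vector(harmless))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def ablate_matrix(weight: Matrix, r: Vector, scale: float = 1.0) -> Matrix:
    """Return W - scale * r (r^T W) for a (d_out x d_in) weight.

    With scale == 1.0 the output of the returned matrix has no component
    along r. Treating the card's per-module weights as `scale` is an assumption.
    """
    d_out, d_in = len(weight), len(weight[0])
    # r^T W: one coefficient per input column.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    return [[weight[i][j] - scale * r[i] * proj[j] for j in range(d_in)]
            for i in range(d_out)]


if __name__ == "__main__":
    harmful = [[2.0, 0.0], [4.0, 0.0]]   # toy activations, not real data
    harmless = [[1.0, 0.0], [1.0, 0.0]]
    r = refusal_direction(harmful, harmless)
    print(ablate_matrix([[1.0, 2.0], [3.0, 4.0]], r))
```

In practice this would be done on framework tensors per module; the pure-Python form is used here only to keep the arithmetic visible.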

License: LFM 1.0 License


Base Model Card (LiquidAI/LFM2.5-8B-A1B)

LFM2.5-8B-A1B

LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.

  • On-device personal assistant: Designed to power real-life applications, chain tool calls, and follow complex instructions on all devices.
  • Compressed performance: Competitive with much larger dense and MoE models on instruction following and agentic tasks.
  • Unmatched throughput: Fastest in its size class on both CPU and GPU inference, with day-one support for llama.cpp, MLX, vLLM, and SGLang.

Find more information about LFM2.5-8B-A1B in our blog post (www.liquid.ai/blog/lfm2-5-8b-a1b).

Model Details

  • Model size: 8B params
  • Tensor type: F32 / BF16
  • Architecture: Hybrid (18 conv + 6 attention layers)
  • Paper: LFM2 Technical Report (arXiv:2511.23404)
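
As a rough guide to what the size and tensor types listed above imply for storage, the weight footprint can be estimated as parameter count times bytes per parameter. This is a back-of-the-envelope sketch: it treats the rounded 8B figure as exact, covers weights only (no activations or KV cache), and `weight_memory_gb` is a hypothetical helper.

```python
# Bytes per parameter for the tensor types listed in the card.
BYTES_PER_PARAM = {"F32": 4, "BF16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("F32", "BF16"):
        print(dtype, weight_memory_gb(8e9, dtype), "GB")
```

Under these assumptions the weights take about 32 GB in F32 and about 16 GB in BF16.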

License

This model uses the LFM 1.0 License.

Citation

@article{liquidAI20268BA1B,
  author  = {Liquid AI},
  title   = {LFM2.5-8B-A1B: Personal Assistant On Your Laptop},
  journal = {Liquid AI Blog},
  year    = {2026},
  note    = {www.liquid.ai/blog/lfm2-5-8b-a1b},
}

@article{liquidai2025lfm2,
  title   = {LFM2 Technical Report},
  author  = {Liquid AI},
  journal = {arXiv preprint arXiv:2511.23404},
  year    = {2025}
}