LFM2.5-8B-A1B — Uncensored
A fully uncensored version of LiquidAI/LFM2.5-8B-A1B, produced by a two-stage pipeline: abliteration followed by LoRA supervised fine-tuning. It reaches a 0% refusal rate on 100 AdvBench prompts, with no increase in refusals on a set of 40 benign prompts.
Intended for: security research, red-teaming, jailbreak benchmarking, and AI safety study. Not intended for production deployment or harmful use.
Benchmark Results
Evaluated on AdvBench (100 harmful behaviors) and an over-refusal set (40 benign prompts).
Harmful prompt refusal rate (↓ lower is more uncensored)
| Model | Refused / 100 | Refusal Rate |
|---|---|---|
| LFM2.5-8B-A1B (original) | 24 / 100 | 24.0% |
| LFM2.5-8B-A1B (abliterated only) | 17 / 100 | 17.0% |
| LFM2.5-8B-A1B-Uncensored (this model) | 0 / 100 | 0.0% |
Over-refusal rate on benign prompts (↓ lower is better)
| Model | Refused / 40 | Refusal Rate |
|---|---|---|
| LFM2.5-8B-A1B (original) | 0 / 40 | 0.0% |
| LFM2.5-8B-A1B (abliterated only) | 1 / 40 | 2.5% |
| LFM2.5-8B-A1B-Uncensored (this model) | 0 / 40 | 0.0% |
Zero refusals on harmful prompts. Zero regressions on benign prompts.
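The percentages in both tables follow directly from the raw counts. The short sketch below recomputes them. The dictionary layout and the helper name `refusal_rate` are illustrative assumptions, not part of the author's evaluation code, and the sketch does not show how a response was judged to be a refusal.

```python
# Recompute the refusal rates reported in the tables from the raw counts.
# The data layout and helper name are illustrative assumptions.

def refusal_rate(refused: int, total: int) -> float:
    """Return the refusal rate as a percentage, rounded to one decimal."""
    return round(100.0 * refused / total, 1)

# (refused, total) per model, copied from the two tables.
harmful = {
    "original": (24, 100),
    "abliterated only": (17, 100),
    "this model": (0, 100),
}
benign = {
    "original": (0, 40),
    "abliterated only": (1, 40),
    "this model": (0, 40),
}

for name in harmful:
    h = refusal_rate(*harmful[name])
    b = refusal_rate(*benign[name])
    print(f"{name:18s} harmful: {h:5.1f}%  benign: {b:4.1f}%")
```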
Pipeline
Stage 1 — Abliteration (training-free)
Based on Arditi et al., "Refusal in Language Models Is Mediated by a Single Direction" (2024).
- Collect residual stream activations layer-by-layer for 40 harmful and 40 harmless prompts
- Compute per-layer refusal direction: `r = normalize(mean_harmful - mean_harmless)`
- Orthogonalize all residual-stream output projections in layers 9–23 against `r`: `W_new = W - outer(r, r.T @ W)`
Targeted projections: self_attn.out_proj, conv.out_proj, feed_forward.down_proj, feed_forward.switch_mlp.down_proj (all 32 experts).
Result: 24% → 17% refusal rate.
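The two formulas in this stage can be sketched in a few lines. This is a minimal illustration and not the author's implementation: it uses NumPy instead of MLX, random arrays stand in for real residual-stream activations, and the function names `refusal_direction` and `orthogonalize` are hypothetical. It assumes each weight matrix has shape `(d_model, d_in)`, so its rows index the residual-stream dimension.

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of mean activations.

    Both inputs have shape (num_prompts, d_model).
    """
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)

def orthogonalize(W, r):
    """Remove the component of W's output that lies along r.

    W has shape (d_model, d_in) and writes into the residual stream;
    r is a unit vector of shape (d_model,). For a 1-D r, `r @ W`
    is the same product as `r.T @ W` in the formula.
    """
    return W - np.outer(r, r @ W)

rng = np.random.default_rng(0)
d_model, d_in = 8, 16  # toy sizes, not the model's real dimensions

# Synthetic stand-ins for activations of 40 harmful and 40 harmless prompts.
harmful = rng.normal(size=(40, d_model)) + 1.0
harmless = rng.normal(size=(40, d_model))

r = refusal_direction(harmful, harmless)
W = rng.normal(size=(d_model, d_in))
W_new = orthogonalize(W, r)

# After the edit, no input can produce output along r.
x = rng.normal(size=d_in)
print(abs(r @ (W_new @ x)))  # on the order of floating-point error
```

Applying `orthogonalize` twice with the same `r` leaves the matrix unchanged, which is a quick sanity check that the projection was applied correctly.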
Stage 2 — LoRA SFT
Fine-tuned the 4-bit quantized base with LoRA adapters on 80 direct-response training pairs generated from the abliterated model:
| Setting | Value |
|---|---|
| Base model | LFM2.5-8B-A1B-MLX-4bit |
| LoRA rank | 16 |
| LoRA scale | 20.0 |
| Layers | Last 16 of 24 |
| Trainable params | 98M / 8.4B (1.2%) |
| Training pairs | 80 (AdvBench-style) |
| Iterations | 600 |
| Learning rate | 1e-4 |
| Peak memory | 7.4 GB |
After training, the adapters were fused into the base weights and the model was dequantized back to bfloat16.
Result: 17% → 0% refusal rate.
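As a rough illustration of what "fusing" an adapter means, the sketch below folds a rank-16 update with scale 20.0 into a dense weight and checks that the fused layer matches the adapted one. It is written under stated assumptions: NumPy stands in for MLX, the layer dimensions are made up, the helper names are hypothetical, and the scale is assumed to multiply the low-rank product directly. Quantization and dequantization are not modeled.

```python
import numpy as np

def lora_param_count(d_in, d_out, rank):
    """Trainable values in one adapter (A: d_in x rank, B: rank x d_out)."""
    return rank * (d_in + d_out)

def fuse_lora(W, A, B, scale):
    """Fold a low-rank update into a dense weight.

    W: (d_out, d_in) frozen weight, A: (d_in, rank), B: (rank, d_out).
    Assumes the adapted layer computes x @ W.T + scale * (x @ A) @ B.
    """
    return W + scale * (A @ B).T

rng = np.random.default_rng(0)
# rank and scale come from the settings table; the dimensions are toy values.
d_in, d_out, rank, scale = 32, 24, 16, 20.0

W = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(d_in, rank)) * 0.01
B = rng.normal(size=(rank, d_out)) * 0.01

x = rng.normal(size=(4, d_in))
adapted = x @ W.T + scale * (x @ A) @ B   # adapter kept separate
fused = x @ fuse_lora(W, A, B, scale).T   # adapter folded into W

print(np.allclose(adapted, fused))
print(lora_param_count(d_in, d_out, rank))
```

Because each adapter adds only `rank * (d_in + d_out)` values per adapted matrix, the trainable share stays small relative to the full model.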
Model Details
| Property | Value |
|---|---|
| Base model | LiquidAI/LFM2.5-8B-A1B |
| Architecture | Hybrid Conv + GQA + MoE |
| Parameters | 8.3B total / 1.5B active |
| Layers | 24 (18 conv + 6 attention) |
| Experts | 32 total, top-4 routing |
| Context | 128K tokens |
| Format | MLX bfloat16 safetensors |
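A few derived figures may help when sizing hardware. The sketch below uses only the numbers in the table. The memory estimate covers the bfloat16 weights alone (2 bytes per parameter) and ignores the KV cache, activations, and runtime overhead, so treat it as a lower bound.

```python
# Figures copied from the Model Details table.
total_params = 8.3e9
active_params = 1.5e9
num_experts, experts_per_token = 32, 4
num_layers, conv_layers, attn_layers = 24, 18, 6

# The layer breakdown should add up to the stated total.
assert conv_layers + attn_layers == num_layers

active_fraction = active_params / total_params
expert_fraction = experts_per_token / num_experts
bf16_gb = total_params * 2 / 1e9  # 2 bytes per bfloat16 value

print(f"active parameters per token: {active_fraction:.1%}")
print(f"experts used per token:      {expert_fraction:.1%}")
print(f"bfloat16 weights:            {bf16_gb:.1f} GB")
```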
Usage (MLX)
```python
from mlx_lm import load, generate
from mlx_lm.sample_utils import make_sampler, make_logits_processors

# Download (if needed) and load the weights and tokenizer.
model, tokenizer = load("sahilchachra/LFM2.5-8B-A1B-Uncensored")

# Wrap the user message in the model's chat template.
messages = [{"role": "user", "content": "Your prompt here"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

# Low-temperature top-k sampling with a mild repetition penalty.
response = generate(
    model, tokenizer,
    prompt=prompt,
    max_tokens=500,
    sampler=make_sampler(temp=0.2, top_k=80),
    logits_processors=make_logits_processors(repetition_penalty=1.05),
)
print(response)
```
Limitations & Warnings
- Residual capability loss possible: LoRA training on a narrow dataset may degrade performance on tasks outside the training distribution. The author's testing found general reasoning and coding unaffected, but no benchmark results for those tasks are reported here.
- Not fine-tuned for new knowledge: the model has no new information; the fine-tuning only removes refusal behavior.
- Responsible use: published for safety research and red-teaming. The authors do not endorse harmful use of this model.
Citation
```bibtex
@article{arditi2024refusal,
  title={Refusal in Language Models Is Mediated by a Single Direction},
  author={Arditi, Andy and Obeso, Oscar and Syed, Aaquib and Paleka, Daniel and Panickssery, Nina and Gurnee, Wes and Nanda, Neel},
  journal={arXiv preprint arXiv:2406.11717},
  year={2024}
}

@article{liquidai2025lfm25,
  title={LFM 2.5: Series of Liquid Foundation Models},
  author={LiquidAI},
  year={2025}
}
```
Created with UncensorLLMs