---
base_model: unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
- deepseek
- deepseek-r1
- cybersecurity
- red-team
- blue-team
- security
- reasoning
- chain-of-thought
- edge-ai
- gguf
- ollama
- llama-cpp
- lora
- 4-bit
- llm-security
- penetration-testing
- threat-analysis
- mcp-security
- ai-safety
- offline
- air-gapped
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
model_type: qwen2
datasets:
- Nguuma/security-slm-dataset
metrics:
- name: Security Reasoning Score (post fine-tune)
  type: custom
  value: 8.0
- name: Security Reasoning Score (baseline)
  type: custom
  value: 3.4
- name: Think-block activation rate
  type: custom
  value: 1.0
model-index:
- name: security-slm-unsloth-1.5b
  results:
  - task:
      type: text-generation
      name: Security Reasoning
    dataset:
      type: custom
      name: Security SLM Dataset
      args: security-domain
    metrics:
    - name: Eval Score (baseline)
      type: custom_security_reasoning
      value: 3.4
    - name: Eval Score (fine-tuned)
      type: custom_security_reasoning
      value: 8.0
    - name: Think-block activation rate
      type: custom_think_rate
      value: 1.0
---
# security-slm-unsloth-1.5b — Edge-Deployable Security Reasoning Model
**Developed by:** Nguuma
**License:** Apache-2.0
**Base model:** unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit
**Quantized format:** GGUF Q4_K_M (~1.2 GB RAM at inference)
> A security-focused small language model that **thinks before it answers** — fine-tuned for AI-native Blue/Red team operations, deployable on a 4 GB RAM machine with no GPU required.

Trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth).
---
## Quickstart — Copy & Run
```python
# pip install llama-cpp-python huggingface_hub
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the fine-tuned GGUF from Hugging Face (~1.2 GB, one-time)
model_path = hf_hub_download(
    repo_id="Nguuma/security-slm-unsloth-1.5b",
    filename="security-slm-finetuned.gguf",
    local_dir="./models",
)

# Load the model (runs on CPU, no GPU required)
llm = Llama(
    model_path=model_path,
    n_ctx=2048,
    n_threads=4,  # adjust to your CPU core count
    verbose=False,
)

# Ask a security question
response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a Cybersecurity assistant with Blue and Red team security reasoning. Think step by step before answering.",
        },
        {
            "role": "user",
            "content": 'An AI agent received this tool-call response: {"file": "../../../../etc/passwd"}. Is this a path traversal attack? What should the agent do?',
        },
    ],
    max_tokens=512,
    temperature=0.7,
    top_p=0.9,
)
print(response["choices"][0]["message"]["content"])
```
> **Prefer Ollama?** One command: `ollama run hf.co/Nguuma/security-slm-unsloth-1.5b`
---
## Why security-slm-unsloth-1.5b?
Most security-aware LLMs require cloud APIs, expose sensitive queries to third parties, and run on expensive hardware. **security-slm-unsloth-1.5b runs entirely offline on commodity hardware.** It is a reasoning-capable SLM purpose-built for the 2026 AI threat landscape, covering attack classes that general-purpose models have little or no training signal for: MCP tool poisoning, agentic lateral movement, Crescendo jailbreaks, and LLM-assisted SSRF. It also covers financial fraud detection, ransomware incident response, CVE/CWE reasoning, MITRE ATT&CK TTP mapping, and regulatory compliance reasoning (NDPR, GDPR, PCI-DSS).
---
## Model Description
security-slm-unsloth-1.5b is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, specialised in cybersecurity reasoning across offensive and defensive contexts. It preserves the base model's chain-of-thought (`<think>`) reasoning behaviour and redirects it toward security-domain problems: threat analysis, attack simulation, detection logic, and AI-specific attack patterns emerging in 2025–2026.
| Property | Value |
|---|---|
| Base architecture | Qwen2 / DeepSeek-R1-Distill |
| Parameters | 1.5B |
| Training dataset | curated security samples across 11 domains |
| Training epochs | 5 |
| Final training loss | 1.69 |
| Eval score (pre-fine-tune) | 3.4 / 10 |
| Eval score (post-fine-tune) | 8.0 / 10 |
| Improvement | **+135%** |
| Think-block activation rate | **100%** |
| GGUF RAM footprint | ~1.2 GB (Q4_K_M) |
| LoRA rank | r=16 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
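
To give a feel for what rank 16 on these seven projection modules means, the sketch below counts the adapter's trainable parameters. The layer dimensions are assumptions taken from the public Qwen2.5-1.5B configuration (hidden size 1536, MLP size 8960, 28 layers, 2 key/value heads of dimension 128). They are not stated in this card, so check them against the base model's `config.json` before relying on the result.

```python
# Rough count of LoRA trainable parameters for r=16 on the listed modules.
# ASSUMPTION: the dimensions below follow the public Qwen2.5-1.5B config;
# they are not taken from this model card.
HIDDEN = 1536
INTERMEDIATE = 8960
NUM_LAYERS = 28
KV_DIM = 2 * 128  # 2 key/value heads, head dimension 128
RANK = 16

# (input features, output features) of each adapted linear layer
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(shapes, rank, num_layers):
    """Each adapted matrix adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes.values())
    return per_layer * num_layers


total = lora_params(MODULE_SHAPES, RANK, NUM_LAYERS)
print(f"{total:,} trainable LoRA parameters")  # 18,464,768 under these assumptions
```

Under these assumed dimensions the adapter trains roughly 1% of the base model's parameters, which is why it can be shipped separately from the full weights.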
---
## Files in This Repository
| File | Description |
|---|---|
| `*.gguf` | Q4_K_M quantized model — use with Ollama or llama.cpp |
| `adapter_model.safetensors` | LoRA adapter weights (~30 MB) — use with Transformers + PEFT |
| `adapter_config.json` | LoRA configuration |
| `tokenizer*` | Tokenizer files |
---
## Quickstart
### Ollama (recommended — one command)
```bash
ollama run hf.co/Nguuma/security-slm-unsloth-1.5b
```
Or pull first then run:
```bash
ollama pull hf.co/Nguuma/security-slm-unsloth-1.5b
ollama run hf.co/Nguuma/security-slm-unsloth-1.5b
```
### Ollama with custom Modelfile
Save this as `Modelfile`, then run `ollama create security-slm -f Modelfile && ollama run security-slm`:
```
FROM hf.co/Nguuma/security-slm-unsloth-1.5b
SYSTEM """You are a Cybersecurity assistant with Blue and Red team security reasoning. Think step by step before answering."""
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_predict 512
PARAMETER num_ctx 2048
```
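
Once the model is registered with Ollama, it can also be queried from a script. The following is a minimal sketch using only the Python standard library, assuming Ollama is listening on its default local address (`http://localhost:11434`) and that the model was created under the name `security-slm` as in the command above. The helper names `build_payload` and `ask` are hypothetical.

```python
import json
import urllib.request

# ASSUMPTION: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/chat"


def build_payload(question, model="security-slm"):
    """Build a non-streaming chat request mirroring the Modelfile settings."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,
        "options": {
            "temperature": 0.7,
            "top_p": 0.9,
            "num_predict": 512,
            "num_ctx": 2048,
        },
    }


def ask(question, model="security-slm"):
    """POST the request to Ollama and return the assistant's reply text."""
    data = json.dumps(build_payload(question, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)["message"]["content"]


# Example call (requires a running Ollama server):
# print(ask("Is ../../../../etc/passwd in a tool-call response a path traversal attempt?"))
```

The system prompt is omitted from the request because the Modelfile already sets it. The `options` block repeats the Modelfile parameters only so the script behaves the same if pointed at the plain `hf.co/...` model name.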
### llama.cpp
```bash
# Download the GGUF
huggingface-cli download Nguuma/security-slm-unsloth-1.5b --include "*.gguf" --local-dir ./
# Run
./llama-cli -m security-slm-finetuned.gguf \
--prompt "Analyse this log entry for signs of prompt injection: ..." \
-n 512
```
### Transformers + PEFT (LoRA adapter)
```python
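# pip install transformers peft bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# The base checkpoint is pre-quantized to 4-bit with bitsandbytes,
# which typically requires a CUDA GPU (unlike the GGUF path above).
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit",
    device_map="auto",
)
# Attach the LoRA adapter weights from this repository
model = PeftModel.from_pretrained(base, "Nguuma/security-slm-unsloth-1.5b")
tokenizer = AutoTokenizer.from_pretrained("Nguuma/security-slm-unsloth-1.5b")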
```
---
## Prompt Format
This model uses the ChatML format. Always include a system prompt and open the assistant turn with `<think>` to trigger chain-of-thought reasoning:
```
<|im_start|>system
You are a Cybersecurity assistant with Blue and Red team security reasoning. Think step by step before answering.
<|im_end|>
<|im_start|>user
A user's AI agent received this tool-call response: {"file": "../../../../etc/passwd"}.
Is this a path traversal attack? What should the agent do?
<|im_end|>
<|im_start|>assistant
<think>
```
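
For raw-completion APIs that do not apply a chat template, the layout above can be assembled by hand. This is a minimal sketch: `build_chatml_prompt` is a hypothetical helper, the line breaks mirror the template shown above, and the reasoning tag is assumed to be DeepSeek-R1's `<think>`.

```python
def build_chatml_prompt(system, user, open_think=True):
    """Render a single-turn ChatML prompt in the layout shown above.

    ASSUMPTION: the reasoning block is opened with DeepSeek-R1's <think> tag.
    """
    prompt = (
        f"<|im_start|>system\n{system}\n<|im_end|>\n"
        f"<|im_start|>user\n{user}\n<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    if open_think:
        # Opening the tag nudges the model to start with its reasoning chain.
        prompt += "<think>\n"
    return prompt


prompt = build_chatml_prompt(
    "You are a Cybersecurity assistant with Blue and Red team security reasoning. "
    "Think step by step before answering.",
    "Analyse this log entry for signs of prompt injection: ...",
)
print(prompt)
```

The resulting string can be passed to a plain completion call, for example `llm(prompt, max_tokens=512)` with llama-cpp-python. Chat-style APIs such as `create_chat_completion` normally apply the template stored in the GGUF, so this helper is only needed for raw prompts.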
The model will complete the `<think>` block with its reasoning chain, then deliver a structured answer.
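
When the reasoning and the final answer need to be stored or displayed separately (for example, logging the reasoning for audit while showing only the answer), the output can be split on the closing tag. A minimal sketch, assuming the reasoning is delimited by DeepSeek-R1 style `<think>` ... `</think>` tags; `split_reasoning` is a hypothetical helper name.

```python
def split_reasoning(text):
    """Return (reasoning, answer) from a response that may contain a think block.

    Handles outputs where the opening <think> tag was part of the prompt
    and therefore only the closing tag appears in the generated text.
    """
    close_tag = "</think>"
    if close_tag not in text:
        # No reasoning block found: treat the whole output as the answer.
        return "", text.strip()
    reasoning, answer = text.split(close_tag, 1)
    reasoning = reasoning.replace("<think>", "", 1)
    return reasoning.strip(), answer.strip()


sample = (
    "<think>\nThe path climbs out of the working directory.\n</think>\n"
    "Yes, treat it as path traversal."
)
reasoning, answer = split_reasoning(sample)
print("Reasoning:", reasoning)
print("Answer:", answer)
```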
---
## Training Dataset
Fine-tuned on curated security samples covering **2026 AI-native threat categories** not present in standard security benchmarks. Every scenario is authored as a matched red/blue pair: the same threat modelled from both attacker and defender perspectives.
| Domain | Description |
|---|---|
| **MCP Attacks** | Model Context Protocol exploitation, tool-call injection, context poisoning |
| **Prompt Hijacking** | Crescendo attacks, payload splitting, indirect injection chains |
| **Agentic Security** | Lateral movement between AI agents, privilege escalation in tool-use pipelines |
| **Cloud-Native AI** | RAG poisoning, SSRF via LLM agents, S3 misconfiguration exploitation |
| **Guardrail Bypass** | Base64, Unicode homoglyph, and encoding-based evasion techniques |
| **Financial Fraud** | Transaction fraud patterns, account takeover, money mule detection, payment interception, card skimming, SIM swap, deepfake-enabled identity fraud |
| **CVE/CWE Reasoning** | Vulnerability root cause analysis (CWE-89, CWE-79, CWE-287, CWE-502, CWE-918), CVE exploit chains mapped to fintech and cloud stacks |
| **MITRE ATT&CK TTP** | Technique extraction from incident logs, kill chain mapping, Sigma/KQL detection rule generation across T1566, T1078, T1190, T1486, T1003 and more |
| **Ransomware IR** | Triage and family identification (LockBit, BlackCat/ALPHV, Cl0p, Akira), containment playbooks, recovery sequencing for critical infrastructure |
| **Regulatory Compliance** | NDPR, GDPR, PCI-DSS v4.0, ISO 27001, and sector-specific cybersecurity frameworks — breach notification obligations, gap analysis, self-assessment reasoning |
| **AI Attack Detection** | AI-generated phishing detection, deepfake audio/video fraud, RAG pipeline poisoning, banking chatbot prompt injection, LLM-assisted BEC |
All samples include preserved `<think>` reasoning blocks, which is critical for security work where auditability matters.
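
Because coverage of CWE and ATT&CK identifiers is curated, it can help to check the IDs a response cites against a local reference list. The sketch below extracts identifiers from model output and separates those in a small lookup table from those that need manual verification. The table holds only the IDs named above, with shortened versions of the MITRE names; `extract_ids` is a hypothetical helper.

```python
import re

# Short names for the technique and weakness IDs listed in the table above.
ATTACK_TECHNIQUES = {
    "T1566": "Phishing",
    "T1078": "Valid Accounts",
    "T1190": "Exploit Public-Facing Application",
    "T1486": "Data Encrypted for Impact",
    "T1003": "OS Credential Dumping",
}
CWE_WEAKNESSES = {
    "CWE-89": "SQL Injection",
    "CWE-79": "Cross-site Scripting",
    "CWE-287": "Improper Authentication",
    "CWE-502": "Deserialization of Untrusted Data",
    "CWE-918": "Server-Side Request Forgery (SSRF)",
}

# Matches ATT&CK IDs (with optional sub-technique) and CWE IDs.
ID_PATTERN = re.compile(r"\b(T\d{4}(?:\.\d{3})?|CWE-\d+)\b")


def extract_ids(text):
    """Return (known, unknown) identifiers in order of appearance."""
    known, unknown = [], []
    for match in ID_PATTERN.findall(text):
        base = match.split(".")[0]  # drop a sub-technique suffix such as .001
        name = ATTACK_TECHNIQUES.get(base) or CWE_WEAKNESSES.get(base)
        if name:
            known.append((match, name))
        else:
            unknown.append(match)
    return known, unknown


output = "Initial access via T1566.001, then T1078; root cause CWE-918. Also T9999."
known, unknown = extract_ids(output)
print(known)
print(unknown)
```

Anything that lands in `unknown` is not necessarily wrong. It only means the ID is outside this small table and should be checked against the official ATT&CK and CWE listings.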
---
## Evaluation Results
The same 10 standardised prompts were run against the base model and the fine-tuned model:
| Metric | Baseline | Fine-Tuned | Change |
|---|---|---|---|
| Average score (/ 10) | 3.4 | 8.0 | **+135%** |
| `<think>` block rate | 20% | **100%** | +80 pp |
| Average response length | 341 words | 272 words | -20% (more concise) |
| Technical depth markers | 1–2 / 5 | 4–5 / 5 | ~3× |
**Scoring rubric (10 pts total):** Reasoning presence (3) · Reasoning depth (3) · Technical specificity (2) · Response coverage (2)
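
The arithmetic behind the table can be sketched as follows. The component weights come from the rubric above; the key names, the clamping behaviour, and the use of a closing `</think>` tag to detect a reasoning block are assumptions for illustration, since the card does not describe how each component is awarded.

```python
# Maximum points per rubric component, as listed above (total: 10).
# Key names are hypothetical labels for the four components.
RUBRIC = {
    "reasoning_presence": 3,
    "reasoning_depth": 3,
    "technical_specificity": 2,
    "response_coverage": 2,
}


def total_score(points):
    """Sum per-component points, clamping each to the range [0, rubric maximum]."""
    return sum(min(max(points.get(name, 0), 0), cap) for name, cap in RUBRIC.items())


def relative_change(baseline, tuned):
    """Percentage change from the baseline score to the fine-tuned score."""
    return (tuned - baseline) / baseline * 100


def think_block_rate(responses):
    """Fraction of responses containing a closed reasoning block."""
    if not responses:
        return 0.0
    return sum("</think>" in r for r in responses) / len(responses)


print(sum(RUBRIC.values()))              # 10
print(round(relative_change(3.4, 8.0)))  # 135
```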
---
## Key Features
- **Offline-first** — No API calls, no data exfiltration risk. Safe for sensitive security environments.
- **Edge-deployable** — Runs on a 4 GB RAM laptop via Ollama or llama.cpp. No GPU required.
- **100% chain-of-thought** — Every response includes a `<think>` reasoning chain. The model shows its work.
- **2026 threat coverage** — Trained on AI-native attack classes absent from standard model training: MCP, agentic lateral movement, Crescendo, LLM SSRF.
- **Financial fraud reasoning** — Covers transaction fraud, account takeover, payment interception, and deepfake-enabled identity fraud with detection logic and playbooks.
- **CVE/CWE + ATT&CK native** — Reasons from vulnerability root cause (CWE) through exploit chain to MITRE ATT&CK technique mapping and Sigma detection rule generation.
- **Ransomware IR** — Triage, containment, and recovery playbooks for LockBit, BlackCat/ALPHV, Cl0p, and Akira targeting financial and critical infrastructure.
- **Compliance-aware** — Reasons through NDPR, GDPR, PCI-DSS v4.0, and ISO 27001 breach notification and gap analysis scenarios.
- **Dual-use** — Blue team (detection, triage, policy) and Red team (simulation, adversarial testing).
- **Quantized & portable** — Q4_K_M GGUF, ~1.2 GB. Fits on a USB drive.
---
## Use Cases
### Blue Team / Defensive Security
- Analyse suspicious logs and network events for indicators of compromise
- Draft detection rules (Sigma, YARA, KQL) from attack descriptions
- Explain CVEs, map them to CWE root causes, and surface remediation paths
- Map incident evidence to MITRE ATT&CK tactics and techniques
- Assess security posture of AI/LLM deployments (RAG pipelines, agentic systems)
- Generate incident response playbooks for ransomware and financial fraud
- Detect AI-generated phishing, deepfake-enabled fraud, and BEC patterns
- Reason through NDPR, GDPR, and PCI-DSS breach notification obligations
### Red Team / Offensive Security
- Simulate adversarial prompts and injection chains for AI system testing
- Reason through attack paths against cloud-native AI infrastructure
- Generate phishing and social engineering scenario templates for awareness training
- Enumerate MCP and agentic attack surfaces
- Model financial fraud techniques (account takeover, payment interception) for red team exercises
### Financial Sector Security
- Detect and reason about transaction fraud patterns: fan-out transfers, velocity anomalies, mule account activation
- Triage payment fraud: card skimming, SIM swap, USSD interception, deepfake voice KYC bypass
- Ransomware containment and recovery sequencing for core banking and payment infrastructure
- Map financial sector breaches to MITRE ATT&CK and generate SIEM detection rules
- Compliance gap analysis against PCI-DSS v4.0, ISO 27001, and sector-specific frameworks
### AI Security Research
- Study how reasoning models behave on adversarial security inputs
- Benchmark SLM security knowledge against larger frontier models
- Prototype lightweight security copilots for air-gapped environments
- Explore AI-native threat modelling for LLM/agent pipelines
### Education & CTF
- Walk through security concepts with chain-of-thought explanations
- Assist with Capture the Flag challenge reasoning
- Train junior analysts on threat patterns with guided step-by-step analysis
---
## Limitations
- Trained on domain-specific samples — a focused specialist, not a general security encyclopedia
- CVE/CWE and MITRE ATT&CK coverage is curated, not exhaustive — verify against NVD and ATT&CK Navigator for production use
- Ransomware IR playbooks are generalist starting points; adjust containment steps to your specific infrastructure
- Regulatory compliance reasoning (NDPR, GDPR, PCI-DSS) is advisory — consult qualified legal/compliance professionals for binding decisions
- Not a substitute for professional penetration testing or incident response
- Intended for **authorised security testing, research, and education only**
---
## Responsible Use
This model is designed for **defensive security, authorised red team exercises, CTF competitions, and security education**. Do not use it to conduct unauthorised access, develop malware, or attack systems you do not own or have explicit permission to test.
---
## Citation
```bibtex
@misc{nguuma2026securityslm,
title = {security-slm-unsloth-1.5b: Edge-Deployable Reasoning Model for AI-Native Security Intelligence},
author = {Nguuma},
year = {2026},
howpublished = {HuggingFace},
url = {https://huggingface.co/Nguuma/security-slm-unsloth-1.5b}
}
```
---
*Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) on Google Colab. Reasoning architecture based on DeepSeek-R1.*