---
base_model: unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit
tags:
  - text-generation-inference
  - transformers
  - unsloth
  - qwen2
  - trl
  - deepseek
  - deepseek-r1
  - cybersecurity
  - red-team
  - blue-team
  - security
  - reasoning
  - chain-of-thought
  - edge-ai
  - gguf
  - ollama
  - llama-cpp
  - lora
  - 4-bit
  - llm-security
  - penetration-testing
  - threat-analysis
  - mcp-security
  - ai-safety
  - offline
  - air-gapped
license: apache-2.0
language:
  - en
library_name: transformers
pipeline_tag: text-generation
model_type: qwen2
datasets:
  - Nguuma/security-slm-dataset
metrics:
  - name: Security Reasoning Score (post fine-tune)
    type: custom
    value: 8.0
  - name: Security Reasoning Score (baseline)
    type: custom
    value: 3.4
  - name: Think-block activation rate
    type: custom
    value: 1.0
model-index:
  - name: security-slm-unsloth-1.5b
    results:
      - task:
          type: text-generation
          name: Security Reasoning
        dataset:
          type: custom
          name: Security SLM Dataset
          args: security-domain
        metrics:
          - name: Eval Score (baseline)
            type: custom_security_reasoning
            value: 3.4
          - name: Eval Score (fine-tuned)
            type: custom_security_reasoning
            value: 8.0
          - name: Think-block activation rate
            type: custom_think_rate
            value: 1.0
---

# security-slm-unsloth-1.5b — Edge-Deployable Security Reasoning Model

- **Developed by:** Nguuma
- **License:** Apache-2.0
- **Base model:** unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit
- **Quantized format:** GGUF Q4_K_M (~1.2 GB RAM at inference)

> A security-focused small language model that **thinks before it answers** — fine-tuned for AI-native Blue/Red team operations, deployable on a 4 GB RAM machine with no GPU required.

Trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) — [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

---

## Quickstart — Copy & Run

```python
# pip install llama-cpp-python huggingface_hub
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the fine-tuned GGUF from HuggingFace (~1.2 GB, one-time)
model_path = hf_hub_download(
    repo_id="Nguuma/security-slm-unsloth-1.5b",
    filename="security-slm-finetuned.gguf",
    local_dir="./models",
)

# Load — runs on CPU, no GPU required
llm = Llama(
    model_path=model_path,
    n_ctx=2048,
    n_threads=4,   # adjust to your CPU core count
    verbose=False,
)

# Ask a security question
response = llm.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a Cybersecurity assistant with Blue and Red team security reasoning. Think step by step before answering.",
        },
        {
            "role": "user",
            "content": 'An AI agent received this tool-call response: {"file": "../../../../etc/passwd"}. Is this a path traversal attack? What should the agent do?',
        },
    ],
    max_tokens=512,
    temperature=0.7,
    top_p=0.9,
)

print(response["choices"][0]["message"]["content"])
```

> **Prefer Ollama?** One command: `ollama run hf.co/Nguuma/security-slm-unsloth-1.5b`

---

## Why security-slm-unsloth-1.5b?

Many security-focused LLM workflows depend on cloud APIs, which exposes sensitive queries to third parties and often assumes expensive hardware. **security-slm-unsloth-1.5b runs entirely offline on commodity hardware.** It is a reasoning-capable SLM built for the 2026 AI threat landscape, and it covers attack classes that general-purpose models see little training signal for: MCP tool poisoning, agentic lateral movement, Crescendo jailbreaks, LLM-assisted SSRF, financial fraud detection, ransomware incident response, CVE/CWE reasoning, MITRE ATT&CK TTP mapping, and regulatory compliance reasoning (NDPR, GDPR, PCI-DSS).

---

## Model Description

security-slm-unsloth-1.5b is a fine-tuned version of DeepSeek-R1-Distill-Qwen-1.5B, specialised in cybersecurity reasoning across offensive and defensive contexts. It preserves the base model's chain-of-thought (`<think>`) reasoning behaviour and redirects it toward security-domain problems: threat analysis, attack simulation, detection logic, and AI-specific attack patterns emerging in 2025–2026.

| Property | Value |
|---|---|
| Base architecture | Qwen2 / DeepSeek-R1-Distill |
| Parameters | 1.5B |
| Training dataset | curated security samples across 11 domains |
| Training epochs | 5 |
| Final training loss | 1.69 |
| Eval score (pre-fine-tune) | 3.4 / 10 |
| Eval score (post-fine-tune) | 8.0 / 10 |
| Improvement | **+135%** |
| Think-block activation rate | **100%** |
| GGUF RAM footprint | ~1.2 GB (Q4_K_M) |
| LoRA rank | r=16 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |

---

## Files in This Repository

| File | Description |
|---|---|
| `*.gguf` | Q4_K_M quantized model — use with Ollama or llama.cpp |
| `adapter_model.safetensors` | LoRA adapter weights (~30 MB) — use with Transformers + PEFT |
| `adapter_config.json` | LoRA configuration |
| `tokenizer*` | Tokenizer files |

---

## Quickstart

### Ollama (recommended — one command)

```bash
ollama run hf.co/Nguuma/security-slm-unsloth-1.5b
```

Or pull first then run:

```bash
ollama pull hf.co/Nguuma/security-slm-unsloth-1.5b
ollama run hf.co/Nguuma/security-slm-unsloth-1.5b
```

### Ollama with custom Modelfile

Save this as `Modelfile`, then run `ollama create security-slm -f Modelfile && ollama run security-slm`:

```
FROM hf.co/Nguuma/security-slm-unsloth-1.5b

SYSTEM """You are a Cybersecurity assistant with Blue and Red team security reasoning. Think step by step before answering."""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_predict 512
PARAMETER num_ctx 2048
```
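
Once the custom model exists, it can also be queried programmatically. The sketch below assumes Ollama is serving its REST API on the default local address (`http://localhost:11434`) and that the model was created under the name `security-slm` as in the command above. The helper names `build_chat_payload` and `ask` are illustrative, not part of this repository.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"

SYSTEM_PROMPT = (
    "You are a Cybersecurity assistant with Blue and Red team security "
    "reasoning. Think step by step before answering."
)


def build_chat_payload(question: str, model: str = "security-slm") -> dict:
    """Build a non-streaming chat request that mirrors the Modelfile settings."""
    return {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        "options": {
            "temperature": 0.7,
            "top_p": 0.9,
            "num_predict": 512,
            "num_ctx": 2048,
        },
    }


def ask(question: str, url: str = OLLAMA_CHAT_URL) -> str:
    """POST the payload to Ollama and return the assistant's reply text."""
    body = json.dumps(build_chat_payload(question)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.loads(response.read())["message"]["content"]
```

Calling `ask("Analyse this log entry for signs of prompt injection: ...")` returns the full reply, including the reasoning block, as one string. Passing `model="hf.co/Nguuma/security-slm-unsloth-1.5b"` to `build_chat_payload` should work the same way without a custom Modelfile.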

### llama.cpp

```bash
# Download the GGUF
huggingface-cli download Nguuma/security-slm-unsloth-1.5b --include "*.gguf" --local-dir ./

# Run
./llama-cli -m security-slm-finetuned.gguf \
  --prompt "Analyse this log entry for signs of prompt injection: ..." \
  -n 512
```

### Transformers + PEFT (LoRA adapter)

```python
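# pip install transformers peft accelerate bitsandbytes
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# The 4-bit base checkpoint loads through bitsandbytes, which typically needs a CUDA GPU
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/deepseek-r1-distill-qwen-1.5b-unsloth-bnb-4bit",
    device_map="auto",
)
# Attach the LoRA adapter weights stored in this repository
model = PeftModel.from_pretrained(base, "Nguuma/security-slm-unsloth-1.5b")
tokenizer = AutoTokenizer.from_pretrained("Nguuma/security-slm-unsloth-1.5b")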
```
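
For a sense of how small the adapter is relative to the base model, the sketch below counts the trainable LoRA parameters implied by the settings listed under Model Description (r=16 on seven projection modules). LoRA adds two low-rank matrices per target module, so each module contributes `r * (in_features + out_features)` parameters. The layer shapes are assumptions based on the publicly documented DeepSeek-R1-Distill-Qwen-1.5B configuration; check them against the base model's `config.json` before relying on the result.

```python
# Assumed shapes for one decoder layer of the base model (verify in config.json).
HIDDEN = 1536         # hidden_size (assumed)
KV_DIM = 256          # 2 key/value heads x 128 head dim (assumed)
INTERMEDIATE = 8960   # intermediate_size (assumed)
NUM_LAYERS = 28       # num_hidden_layers (assumed)

# (in_features, out_features) for each LoRA target module.
PROJECTION_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_param_count(rank: int, shapes=PROJECTION_SHAPES, num_layers=NUM_LAYERS) -> int:
    """Count LoRA parameters: rank * (in + out) per module, summed over all layers."""
    per_layer = sum(rank * (fan_in + fan_out) for fan_in, fan_out in shapes.values())
    return per_layer * num_layers


trainable = lora_param_count(rank=16)
print(f"{trainable:,} trainable LoRA parameters")
```

Under these assumed shapes the adapter comes to roughly 18.5 million parameters, a small fraction of the 1.5B base weights, which is why it can be shipped separately from the GGUF file.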

---

## Prompt Format

This model uses the ChatML format. Always include a system prompt and open the assistant turn with `<think>` to trigger chain-of-thought reasoning:

```
<|im_start|>system
You are a Cybersecurity assistant with Blue and Red team security reasoning. Think step by step before answering.
<|im_end|>
<|im_start|>user
A user's AI agent received this tool-call response: {"file": "../../../../etc/passwd"}.
Is this a path traversal attack? What should the agent do?
<|im_end|>
<|im_start|>assistant
<think>
```

The model will complete the `<think>` block with its reasoning chain, then deliver a structured answer.
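
A minimal sketch of building this prompt string in Python follows. It reproduces the template exactly as shown above; the function name `build_prompt` is illustrative and not part of any library.

```python
SYSTEM_PROMPT = (
    "You are a Cybersecurity assistant with Blue and Red team security "
    "reasoning. Think step by step before answering."
)


def build_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Render the system and user turns, then open the assistant turn with <think>."""
    return (
        f"<|im_start|>system\n{system_prompt}\n<|im_end|>\n"
        f"<|im_start|>user\n{user_message}\n<|im_end|>\n"
        "<|im_start|>assistant\n<think>\n"
    )


prompt = build_prompt("Is ../../../../etc/passwd in a tool-call response a path traversal attack?")
print(prompt)
```

The resulting string is meant for raw completion calls, for example `llm(prompt, max_tokens=512, stop=["<|im_end|>"])` with llama-cpp-python. Chat-style calls such as `create_chat_completion` apply a template themselves, so this helper is only needed when you want explicit control over the opening `<think>` tag.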

---

## Training Dataset

Fine-tuned on curated security samples covering **2026 AI-native threat categories** not present in standard security benchmarks. Every scenario is authored as a matched red/blue pair: the same threat modelled from both attacker and defender perspectives.

| Domain | Description |
|---|---|
| **MCP Attacks** | Model Context Protocol exploitation, tool-call injection, context poisoning |
| **Prompt Hijacking** | Crescendo attacks, payload splitting, indirect injection chains |
| **Agentic Security** | Lateral movement between AI agents, privilege escalation in tool-use pipelines |
| **Cloud-Native AI** | RAG poisoning, SSRF via LLM agents, S3 misconfiguration exploitation |
| **Guardrail Bypass** | Base64, Unicode homoglyph, and encoding-based evasion techniques |
| **Financial Fraud** | Transaction fraud patterns, account takeover, money mule detection, payment interception, card skimming, SIM swap, deepfake-enabled identity fraud |
| **CVE/CWE Reasoning** | Vulnerability root cause analysis (CWE-89, CWE-79, CWE-287, CWE-502, CWE-918), CVE exploit chains mapped to fintech and cloud stacks |
| **MITRE ATT&CK TTP** | Technique extraction from incident logs, kill chain mapping, Sigma/KQL detection rule generation across T1566, T1078, T1190, T1486, T1003 and more |
| **Ransomware IR** | Triage and family identification (LockBit, BlackCat/ALPHV, Cl0p, Akira), containment playbooks, recovery sequencing for critical infrastructure |
| **Regulatory Compliance** | NDPR, GDPR, PCI-DSS v4.0, ISO 27001, and sector-specific cybersecurity frameworks — breach notification obligations, gap analysis, self-assessment reasoning |
| **AI Attack Detection** | AI-generated phishing detection, deepfake audio/video fraud, RAG pipeline poisoning, banking chatbot prompt injection, LLM-assisted BEC |

All samples include preserved `<think>` reasoning blocks — critical for security work where auditability matters.
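
To make use of that auditability at inference time, the reasoning can be separated from the final answer and logged on its own. The helper below is a sketch under one assumption: the reasoning ends with a closing `</think>` tag, and the opening `<think>` tag may be missing from the completion when the prompt already supplied it. The name `split_reasoning` is illustrative.

```python
def split_reasoning(completion: str) -> tuple[str, str]:
    """Split a completion into (reasoning, answer) around the </think> tag.

    Returns an empty reasoning string when no closing tag is present.
    """
    head, closing_tag, tail = completion.partition("</think>")
    if not closing_tag:
        return "", completion.strip()
    # Drop the opening tag (and anything before it) if the completion includes one.
    reasoning = head.split("<think>", 1)[-1].strip()
    return reasoning, tail.strip()


reasoning, answer = split_reasoning(
    "<think>\nThe path climbs out of the working directory.\n</think>\nYes, treat it as path traversal."
)
print("REASONING:", reasoning)
print("ANSWER:", answer)
```

Storing the two parts separately lets a reviewer audit the reasoning chain without it being shown to end users by default.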

---

## Evaluation Results

The same 10 standardised prompts were run against the base model and the fine-tuned model:

| Metric | Baseline | Fine-Tuned | Change |
|---|---|---|---|
| Average score (/ 10) | 3.4 | 8.0 | **+135%** |
| `<think>` block rate | 20% | **100%** | +80pp |
| Average response length | 341 words | 272 words | -20% (more concise) |
| Technical depth markers | 1–2 / 5 | 4–5 / 5 | ~3× |

**Scoring rubric (10 pts total):** Reasoning presence (3) · Reasoning depth (3) · Technical specificity (2) · Response coverage (2)
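
The rubric totals and the figures in the Change column can be reproduced with a few lines of arithmetic. The sketch below is not the original evaluation harness; the criterion keys and the per-criterion points in the usage example are hypothetical, while the maxima and the averaged scores come from the table and rubric above.

```python
# Maximum points per criterion, taken from the scoring rubric.
RUBRIC_MAX = {
    "reasoning_presence": 3,
    "reasoning_depth": 3,
    "technical_specificity": 2,
    "response_coverage": 2,
}


def total_score(points: dict) -> float:
    """Sum per-criterion points after checking each against its rubric maximum."""
    for name, value in points.items():
        if name not in RUBRIC_MAX:
            raise KeyError(f"unknown criterion: {name}")
        if not 0 <= value <= RUBRIC_MAX[name]:
            raise ValueError(f"{name} must be between 0 and {RUBRIC_MAX[name]}")
    return float(sum(points.values()))


def relative_change(baseline: float, tuned: float) -> float:
    """Percentage change from the baseline value to the fine-tuned value."""
    return (tuned - baseline) / baseline * 100


# Hypothetical per-criterion points for a single response.
example = {
    "reasoning_presence": 3,
    "reasoning_depth": 2,
    "technical_specificity": 2,
    "response_coverage": 1,
}
print(total_score(example))                 # 8.0
print(round(relative_change(3.4, 8.0)))     # 135  (average score)
print(round(relative_change(341, 272)))     # -20  (average response length)
print(100 - 20)                             # 80   (think-block rate, percentage points)
```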

---

## Key Features

- **Offline-first** — No API calls, no data exfiltration risk. Safe for sensitive security environments.
- **Edge-deployable** — Runs on a 4 GB RAM laptop via Ollama or llama.cpp. No GPU required.
- **100% chain-of-thought** — Every response in the 10-prompt evaluation included a `<think>` reasoning chain. The model shows its work.
- **2026 threat coverage** — Trained on AI-native attack classes absent from standard model training: MCP, agentic lateral movement, Crescendo, LLM SSRF.
- **Financial fraud reasoning** — Covers transaction fraud, account takeover, payment interception, and deepfake-enabled identity fraud with detection logic and playbooks.
- **CVE/CWE + ATT&CK native** — Reasons from vulnerability root cause (CWE) through exploit chain to MITRE ATT&CK technique mapping and Sigma detection rule generation.
- **Ransomware IR** — Triage, containment, and recovery playbooks for LockBit, BlackCat/ALPHV, Cl0p, and Akira targeting financial and critical infrastructure.
- **Compliance-aware** — Reasons through NDPR, GDPR, PCI-DSS v4.0, and ISO 27001 breach notification and gap analysis scenarios.
- **Dual-use** — Blue team (detection, triage, policy) and Red team (simulation, adversarial testing).
- **Quantized & portable** — Q4_K_M GGUF, ~1.2 GB. Fits on a USB drive.

---

## Use Cases

### Blue Team / Defensive Security
- Analyse suspicious logs and network events for indicators of compromise
- Draft detection rules (Sigma, YARA, KQL) from attack descriptions
- Explain CVEs, map them to CWE root causes, and surface remediation paths
- Map incident evidence to MITRE ATT&CK tactics and techniques
- Assess security posture of AI/LLM deployments (RAG pipelines, agentic systems)
- Generate incident response playbooks for ransomware and financial fraud
- Detect AI-generated phishing, deepfake-enabled fraud, and BEC patterns
- Reason through NDPR, GDPR, and PCI-DSS breach notification obligations

### Red Team / Offensive Security
- Simulate adversarial prompts and injection chains for AI system testing
- Reason through attack paths against cloud-native AI infrastructure
- Generate phishing and social engineering scenario templates for awareness training
- Enumerate MCP and agentic attack surfaces
- Model financial fraud techniques (account takeover, payment interception) for red team exercises

### Financial Sector Security
- Detect and reason about transaction fraud patterns: fan-out transfers, velocity anomalies, mule account activation
- Triage payment fraud: card skimming, SIM swap, USSD interception, deepfake voice KYC bypass
- Ransomware containment and recovery sequencing for core banking and payment infrastructure
- Map financial sector breaches to MITRE ATT&CK and generate SIEM detection rules
- Compliance gap analysis against PCI-DSS v4.0, ISO 27001, and sector-specific frameworks

### AI Security Research
- Study how reasoning models behave on adversarial security inputs
- Benchmark SLM security knowledge against larger frontier models
- Prototype lightweight security copilots for air-gapped environments
- Explore AI-native threat modelling for LLM/agent pipelines

### Education & CTF
- Walk through security concepts with chain-of-thought explanations
- Assist with Capture the Flag challenge reasoning
- Train junior analysts on threat patterns with guided step-by-step analysis

---

## Limitations

- Trained on domain-specific samples — a focused specialist, not a general security encyclopedia
- CVE/CWE and MITRE ATT&CK coverage is curated, not exhaustive — verify against NVD and ATT&CK Navigator for production use
- Ransomware IR playbooks are general starting points; adjust containment steps to your specific infrastructure
- Regulatory compliance reasoning (NDPR, GDPR, PCI-DSS) is advisory — consult qualified legal/compliance professionals for binding decisions
- Not a substitute for professional penetration testing or incident response
- Intended for **authorised security testing, research, and education only**

---

## Responsible Use

This model is designed for **defensive security, authorised red team exercises, CTF competitions, and security education**. Do not use it to conduct unauthorised access, develop malware, or attack systems you do not own or have explicit permission to test.

---

## Citation

```bibtex
@misc{nguuma2026securityslm,
  title        = {security-slm-unsloth-1.5b: Edge-Deployable Reasoning Model for AI-Native Security Intelligence},
  author       = {Nguuma},
  year         = {2026},
  howpublished = {HuggingFace},
  url          = {https://huggingface.co/Nguuma/security-slm-unsloth-1.5b}
}
```

---

*Fine-tuned with [Unsloth](https://github.com/unslothai/unsloth) on Google Colab. Reasoning architecture based on DeepSeek-R1.*