# 🏗ïļ SCAM HONEYPOT - Complete Architecture Documentation ## 📁 Project Structure Overview ``` sentinel-scam-honeypot/ ├── app/ # Main application code │ ├── agents/ # ðŸĪ– AI Agents (brain of the system) │ ├── api/ # 🌐 REST API endpoints │ ├── core/ # 🧠 Core components (LLM, memory, prompts) │ ├── decoys/ # ðŸŠĪ Fake endpoints to trap scammers │ ├── enforcement/ # 🚔 Law enforcement simulation │ ├── intelligence/ # 📊 Threat intelligence modules │ ├── templates/ # ðŸ’ŧ HTML templates │ ├── utils/ # 🔧 Utility functions │ ├── main.py # FastAPI entry point │ └── config.py # Configuration settings ├── dashboard.py # 📈 Streamlit analytics dashboard ├── simulate_attack.py # ⚔ïļ Red vs Blue simulation ├── verify_honeypot.py # ✅ System verification script ├── Dockerfile # ðŸģ Docker deployment ├── requirements.txt # ðŸ“Ķ Python dependencies └── README.md # 📖 Project documentation ``` --- ## ðŸŽŊ System Architecture Diagram ```mermaid flowchart TB subgraph Input["ðŸ“Ĩ Input Layer"] A[Scammer Message] --> B[FastAPI Routes] B --> C{API Key Valid?} C -->|No| D[401 Unauthorized] C -->|Yes| E[Rate Limiter] E -->|Exceeded| F[429 Too Many Requests] E -->|OK| G[GUVI Handler] end subgraph Orchestrator["ðŸĪ– Orchestrator Layer"] G --> H[HoneypotOrchestrator] H --> I[Scam Detector] H --> J[Intel Extractor] H --> K[Emotional Analyzer] I --> L[LLM Client] L --> M[Groq/OpenAI/Anthropic] end subgraph Response["💎 Response Generation"] I --> N[Persona Engine] N --> O[Adaptive Strategy] O --> P[Engagement Delayer] P --> Q[Response Text] end subgraph Intelligence["📊 Intelligence Layer"] J --> R[Threat Engine] K --> R R --> S[Campaign Tracker] S --> T[Risk Scorer] end subgraph Storage["ðŸ’ū Persistence Layer"] H --> U[SQLite/PostgreSQL] H --> V[Audit Logger] V --> W[SIEM Export] end subgraph Output["ðŸ“Ī Output Layer"] Q --> X[API Response] T --> X X --> Y[GUVI Callback] X --> Z[Stakeholder Exports] Z --> AA[CERT-In STIX 2.1] Z --> AB[TRAI UCC Report] Z --> AC[NPCI Fraud Report] Z --> AD[NCRP 
Complaint] end style Input fill:#e3f2fd style Orchestrator fill:#fff3e0 style Response fill:#e8f5e9 style Intelligence fill:#fce4ec style Storage fill:#f3e5f5 style Output fill:#e0f7fa ``` --- ## 🔄 Agent Interaction Flow ```mermaid sequenceDiagram participant S as Scammer participant API as FastAPI participant O as Orchestrator participant SD as ScamDetector participant IE as IntelExtractor participant EA as EmotionalAnalyzer participant PE as PersonaEngine participant ED as EngagementDelayer participant DB as Database participant CB as Callback S->>API: POST /api/guvi/analyze API->>API: Verify API Key API->>API: Rate Limit Check API->>O: Process Message par Detection O->>SD: Detect Scam Type O->>IE: Extract Intelligence O->>EA: Analyze Emotions end SD-->>O: {is_scam, type, confidence} IE-->>O: {phones, upis, urls} EA-->>O: {urgency, fear, greed} O->>PE: Generate Response PE->>ED: Add Delays ED-->>PE: Delayed Response PE-->>O: Victim Response O->>DB: Store Conversation O-->>API: Response Payload API-->>S: JSON Response opt Scam Confirmed API->>CB: Send to GUVI end ``` --- ## ðŸĪ– AGENTS FOLDER (`app/agents/`) The **brain** of the honeypot system. Each agent has a specific role. ### 1. `orchestrator.py` - Main Controller | Aspect | Description | |--------|-------------| | **Purpose** | Coordinates all 6 agents to process scam messages | | **What it does** | Receives message → Runs detection → Selects persona → Generates response → Computes risk → Returns result | | **Connects to** | All other agents, LLM client, memory store | | **Key class** | `HoneypotOrchestrator` | | **Key method** | `process_message(message, conversation_id)` | ### 2. `scam_detector.py` - Scam Detection Agent | Aspect | Description | |--------|-------------| | **Purpose** | Detects if a message is a scam and classifies the type | | **What it does** | Hybrid detection using keywords + LLM classification | | **Contains** | `SCAM_DATABASE` with 10 scam types (lottery, job, banking, etc.) 
| | **Connects to** | LLM client, orchestrator | | **Key method** | `detect(message) → {is_scam, scam_type, confidence}` | ### 3. `persona_engine.py` - Persona Agent | Aspect | Description | |--------|-------------| | **Purpose** | Generates believable victim responses to engage scammers | | **What it does** | Selects persona based on scam type, generates Hinglish/Hindi responses | | **Contains** | `PERSONAS` dict with 10 personas (Sharma Uncle, Rahul Kumar, etc.) | | **Response phases** | hook → engage → extract → stall → self_correct | | **Key method** | `generate_response(scam_type, phase, history)` | ### 4. `adaptive_strategy.py` - Strategy Agent | Aspect | Description | |--------|-------------| | **Purpose** | Adapts honeypot behavior based on scammer actions | | **What it does** | Analyzes scammer behavior, determines phase, adjusts strategy | | **Behaviors detected** | pushing_payment, building_trust, aggressive, confused | | **Connects to** | Persona engine, orchestrator | | **Key method** | `adapt_strategy(scammer_message, history)` | ### 5. `intelligence_extractor.py` - Intel Agent | Aspect | Description | |--------|-------------| | **Purpose** | Extracts actionable intelligence from messages | | **What it does** | Regex-based extraction of phone, UPI, bank, URLs | | **Connects to** | Orchestrator, threat engine | | **Key method** | `extract(message) → {phone_numbers, upi_ids, ...}` | ### 6. `conversation_manager.py` - Memory Manager | Aspect | Description | |--------|-------------| | **Purpose** | Manages multi-turn conversation state | | **What it does** | Tracks history, phase progression, trust evolution | | **Connects to** | Memory store, orchestrator | | **Key method** | `get_conversation(id), update_conversation(...)` | --- ## 🌐 API FOLDER (`app/api/`) ### 1. 
`routes.py` - API Endpoints | Aspect | Description | |--------|-------------| | **Purpose** | Defines all REST API endpoints | | **Key endpoints** | `/api/v1/analyze`, `/api/guvi/analyze`, `/api/v1/scam-types` | | **Security** | `verify_api_key()` with x-api-key header | | **Connects to** | Orchestrator, GUVI handler, schemas | ### 2. `schemas.py` - Pydantic Models | Aspect | Description | |--------|-------------| | **Purpose** | Request/response validation models | | **Key models** | `AnalyzeRequest`, `AnalyzeResponse`, `GUVIInputRequest`, `GUVIOutputResponse` | | **Connects to** | Routes, GUVI handler | --- ## 🧠 CORE FOLDER (`app/core/`) ### 1. `llm_client.py` - LLM Client | Aspect | Description | |--------|-------------| | **Purpose** | Unified interface to multiple LLM providers | | **Supports** | OpenAI, Anthropic, Groq, OpenRouter | | **Fallback** | Uses mock responses if no API key | | **Key method** | `generate(prompt) → response` | ### 2. `memory.py` - Conversation Memory | Aspect | Description | |--------|-------------| | **Purpose** | In-memory conversation storage | | **Contains** | `ConversationMemory` class with TTL support | | **Stores** | History, phase, trust_score, aggregated_intelligence | | **Key method** | `get_or_create(conversation_id)` | ### 3. `prompts.py` - LLM Prompts | Aspect | Description | |--------|-------------| | **Purpose** | System prompts for LLM interactions | | **Contains** | `SCAM_DETECTION_PROMPT`, `RESPONSE_GENERATION_PROMPT`, `PHASE_GOALS` | --- ## ðŸŠĪ DECOYS FOLDER (`app/decoys/`) ### 1. `fake_endpoints.py` - Decoy Portals | Aspect | Description | |--------|-------------| | **Purpose** | Fake banking/UPI pages to trap scammers | | **Endpoints** | `/decoys/upi/status`, `/decoys/bank/kyc-portal`, `/decoys/secure/otp-generate` | | **Why** | Scammers click these links thinking they're real | ### 2. 
`victim_profiles.py` - Synthetic Victims | Aspect | Description | |--------|-------------| | **Purpose** | Fake victim data for honeypot responses | | **Contains** | Synthetic names, bank accounts, UPI IDs | | **Why** | No real PII is ever used | --- ## 📊 INTELLIGENCE FOLDER (`app/intelligence/`) ### 1. `threat_engine.py` - Threat Intelligence | Aspect | Description | |--------|-------------| | **Purpose** | Generates threat intelligence reports | | **Creates** | Campaign IDs, IOCs, TTPs (MITRE ATT&CK) | | **Key method** | `generate_threat_intel(scam_type, entities)` | ### 2. `risk_scorer.py` - Risk Scoring | Aspect | Description | |--------|-------------| | **Purpose** | Computes weighted risk score with explainability | | **Factors** | Keywords, payment requests, threat level, campaign match | | **Key method** | `compute_risk(detection_result) → {score, explanation}` | ### 3. `campaign_tracker.py` - Campaign Clustering | Aspect | Description | |--------|-------------| | **Purpose** | Groups scam messages into campaigns | | **Uses** | Entity similarity to cluster related attacks | | **Key method** | `get_or_create_campaign(entities)` | ### 4. `telemetry.py` - Request Telemetry | Aspect | Description | |--------|-------------| | **Purpose** | Captures IP, geo, device fingerprint | | **Uses** | ip-api.com for geolocation | | **Key method** | `capture_telemetry(request)` | ### 5. `scammer_profiler.py` - Behavioral Profiling | Aspect | Description | |--------|-------------| | **Purpose** | Builds behavioral profiles of scammers | | **Tracks** | Aggression, persistence, tactics used | ### 6. `engagement_metrics.py` - Metrics Tracking | Aspect | Description | |--------|-------------| | **Purpose** | Tracks honeypot engagement statistics | | **Metrics** | Duration, message count, intelligence extracted | ### 7. 
`honeytokens.py` - Honeytoken Generator | Aspect | Description | |--------|-------------| | **Purpose** | Generates fake credentials as bait | | **Creates** | Fake UPI IDs, bank accounts, phone numbers | --- ## 🚔 ENFORCEMENT FOLDER (`app/enforcement/`) ### 1. `police_api.py` - Cyber Police Simulation | Aspect | Description | |--------|-------------| | **Purpose** | Simulates NCRP (cybercrime.gov.in) integration | | **Creates** | Report IDs, priority levels, recommended actions | | **Classes** | `CyberPoliceAPI`, `ActionRecommendationAPI` | ### 2. `awareness.py` - Public Awareness | Aspect | Description | |--------|-------------| | **Purpose** | Generates scam awareness content | | **Creates** | Warning messages, educational tips | --- ## 🔧 UTILS FOLDER (`app/utils/`) ### 1. `guvi_handler.py` - GUVI Format Translator | Aspect | Description | |--------|-------------| | **Purpose** | Translates GUVI format ↔ internal format | | **Why** | GUVI uses different field names (sessionId vs conversation_id) | | **Key method** | `process_guvi_message(request) → GUVIOutputResponse` | ### 2. `callback_client.py` - GUVI Callback Sender | Aspect | Description | |--------|-------------| | **Purpose** | Sends final result to GUVI evaluation endpoint | | **Endpoint** | `POST https://hackathon.guvi.in/api/updateHoneyPotFinalResult` | | **Trigger** | Auto-sends when `scamDetected = true` | ### 3. `extractors.py` - Entity Extractors | Aspect | Description | |--------|-------------| | **Purpose** | Regex patterns for entity extraction | | **Extracts** | Phone, UPI, bank account, IFSC, email, URL | ### 4. 
`logger.py` - Structured Logging | Aspect | Description | |--------|-------------| | **Purpose** | Consistent logging across all agents | | **Class** | `AgentLogger` | --- ## 🔗 HOW COMPONENTS CONNECT ``` ┌─────────────────────────────────────────────────────────────────────┐ │ USER REQUEST │ │ POST /api/guvi/analyze │ └──────────────────────────────┮──────────────────────────────────────┘ ▾ ┌─────────────────────────────────────────────────────────────────────┐ │ routes.py → verify_api_key() → guvi_handler.py │ └──────────────────────────────┮──────────────────────────────────────┘ ▾ ┌─────────────────────────────────────────────────────────────────────┐ │ ORCHESTRATOR (orchestrator.py) │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ Scam │ │ Intel │ │ Persona │ │ Adaptive │ │ │ │ Detector │ │ Extractor │ │ Engine │ │ Strategy │ │ │ └──────┮──────┘ └──────┮──────┘ └──────┮──────┘ └──────┮──────┘ │ │ │ │ │ │ │ │ ▾ ▾ ▾ ▾ │ │ ┌─────────────────────────────────────────────────────────────┐ │ │ │ LLM CLIENT (llm_client.py) │ │ │ │ Groq / OpenAI / Anthropic / OpenRouter / Mock │ │ │ └─────────────────────────────────────────────────────────────┘ │ │ │ │ │ │ │ │ ▾ ▾ ▾ ▾ │ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │ │ Memory │ │ Threat │ │ Risk │ │ Campaign │ │ │ │ Store │ │ Engine │ │ Scorer │ │ Tracker │ │ │ └─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘ │ └──────────────────────────────┮──────────────────────────────────────┘ ▾ ┌─────────────────────────────────────────────────────────────────────┐ │ RESPONSE + CALLBACK │ │ GUVIOutputResponse → callback_client.py → GUVI Evaluation │ └─────────────────────────────────────────────────────────────────────┘ ``` --- ## 📊 ROOT FILES | File | Purpose | |------|---------| | `main.py` | FastAPI app entry point, startup/shutdown events | | `config.py` | Environment variables, feature flags | | `dashboard.py` | Streamlit analytics UI with live charts | | 
`simulate_attack.py` | Red Team vs Blue Team simulation script | | `verify_honeypot.py` | Quick verification of all endpoints | | `Dockerfile` | Container deployment for HF Spaces | | `requirements.txt` | Python dependencies | | `README.md` | Project documentation with API examples | --- ## 🔑 KEY DATA FLOWS ### 1. Message Analysis Flow ``` Message → ScamDetector → PersonaEngine → AdaptiveStrategy → Response ``` ### 2. Intelligence Flow ``` Message → IntelExtractor → ThreatEngine → CampaignTracker → Report ``` ### 3. Risk Scoring Flow ``` DetectionResult → RiskScorer → Explanation → AnalyzeResponse ``` ### 4. GUVI Callback Flow ``` ScamDetected=true → CallbackClient → hackathon.guvi.in → Evaluation ``` --- *Generated for GUVI India AI Impact Buildathon 2025*
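
The Intelligence Flow listed under Key Data Flows starts with regex-based entity extraction. Here is a minimal sketch of that first step, under stated assumptions: the three patterns are hypothetical (the actual regexes in `app/utils/extractors.py` are not shown in this document), phone numbers are taken to be Indian 10-digit mobiles, UPI IDs are taken to have the form `name@provider`, and the `urls` key is an illustrative name.

```python
import re

# Hypothetical patterns for illustration only; the project's real regexes may differ.
# Indian mobile number: optional +91 prefix, then 10 digits starting with 6-9.
PHONE_RE = re.compile(r"(?<!\d)(?:\+91[\s-]?)?([6-9]\d{9})(?!\d)")
# UPI-style handle: name@provider, rejected when a dotted domain follows (an email).
UPI_RE = re.compile(r"(?<![\w.-])[\w.-]{2,}@[a-zA-Z]{2,}(?![\w-]|\.\w)")
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def _unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def extract(message: str) -> dict:
    """Return phone numbers, UPI IDs, and URLs found in a message."""
    # Trim punctuation that commonly trails a URL at the end of a sentence.
    urls = [u.rstrip(".,;)") for u in URL_RE.findall(message)]
    # Blank out URLs first so their contents are not matched as other entities.
    text = URL_RE.sub(" ", message)
    return {
        "phone_numbers": _unique(PHONE_RE.findall(text)),
        "upi_ids": _unique(UPI_RE.findall(text)),
        "urls": _unique(urls),
    }


if __name__ == "__main__":
    sample = (
        "Pay to winner@ybl. Call +91 9876543210, "
        "or visit http://claim-prize.example/win."
    )
    print(extract(sample))
```

Removing URLs before the phone and UPI passes is a design choice in this sketch: it keeps digits or `@` characters inside a link from being reported as separate entities.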