| # SCAM HONEYPOT - Complete Architecture Documentation |
|
|
| ## Project Structure Overview |
|
|
| ``` |
| sentinel-scam-honeypot/ |
| ├── app/                    # Main application code |
| │   ├── agents/             # AI agents (brain of the system) |
| │   ├── api/                # REST API endpoints |
| │   ├── core/               # Core components (LLM, memory, prompts) |
| │   ├── decoys/             # Fake endpoints to trap scammers |
| │   ├── enforcement/        # Law enforcement simulation |
| │   ├── intelligence/       # Threat intelligence modules |
| │   ├── templates/          # HTML templates |
| │   ├── utils/              # Utility functions |
| │   ├── main.py             # FastAPI entry point |
| │   └── config.py           # Configuration settings |
| ├── dashboard.py            # Streamlit analytics dashboard |
| ├── simulate_attack.py      # Red vs Blue simulation |
| ├── verify_honeypot.py      # System verification script |
| ├── Dockerfile              # Docker deployment |
| ├── requirements.txt        # Python dependencies |
| └── README.md               # Project documentation |
| ``` |
|
|
| --- |
|
|
| ## System Architecture Diagram |
|
|
| ```mermaid |
| flowchart TB |
| subgraph Input["Input Layer"] |
| A[Scammer Message] --> B[FastAPI Routes] |
| B --> C{API Key Valid?} |
| C -->|No| D[401 Unauthorized] |
| C -->|Yes| E[Rate Limiter] |
| E -->|Exceeded| F[429 Too Many Requests] |
| E -->|OK| G[GUVI Handler] |
| end |
| |
| subgraph Orchestrator["Orchestrator Layer"] |
| G --> H[HoneypotOrchestrator] |
| H --> I[Scam Detector] |
| H --> J[Intel Extractor] |
| H --> K[Emotional Analyzer] |
| I --> L[LLM Client] |
| L --> M[Groq/OpenAI/Anthropic] |
| end |
| |
| subgraph Response["Response Generation"] |
| I --> N[Persona Engine] |
| N --> O[Adaptive Strategy] |
| O --> P[Engagement Delayer] |
| P --> Q[Response Text] |
| end |
| |
| subgraph Intelligence["Intelligence Layer"] |
| J --> R[Threat Engine] |
| K --> R |
| R --> S[Campaign Tracker] |
| S --> T[Risk Scorer] |
| end |
| |
| subgraph Storage["Persistence Layer"] |
| H --> U[SQLite/PostgreSQL] |
| H --> V[Audit Logger] |
| V --> W[SIEM Export] |
| end |
| |
| subgraph Output["Output Layer"] |
| Q --> X[API Response] |
| T --> X |
| X --> Y[GUVI Callback] |
| X --> Z[Stakeholder Exports] |
| Z --> AA[CERT-In STIX 2.1] |
| Z --> AB[TRAI UCC Report] |
| Z --> AC[NPCI Fraud Report] |
| Z --> AD[NCRP Complaint] |
| end |
| |
| style Input fill:#e3f2fd |
| style Orchestrator fill:#fff3e0 |
| style Response fill:#e8f5e9 |
| style Intelligence fill:#fce4ec |
| style Storage fill:#f3e5f5 |
| style Output fill:#e0f7fa |
| ``` |
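
The input layer in the diagram rejects clients that exceed a request budget with `429 Too Many Requests`. The document does not say which algorithm the rate limiter uses, so the following is only a minimal sketch of one common choice, a per-client sliding window. The class name `RateLimiter`, its parameters, and the injectable `clock` are assumptions made for illustration.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Sliding-window limiter (hypothetical; not taken from the project)."""

    def __init__(self, max_requests=10, window_seconds=60.0, clock=time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the behaviour can be tested
        self._hits = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def allow(self, client_id: str) -> bool:
        """Return True if the request fits the budget, False for a 429."""
        now = self._clock()
        hits = self._hits[client_id]
        # Forget requests that have left the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True


# Usage with a fake clock: two requests allowed per 60 seconds.
_now = [0.0]
limiter = RateLimiter(max_requests=2, window_seconds=60.0, clock=lambda: _now[0])
first, second, third = (limiter.allow("client-a") for _ in range(3))
_now[0] = 61.0
after_window = limiter.allow("client-a")
```

A route would translate `allow(...) == False` into the 429 response shown in the diagram.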
|
|
| --- |
|
|
| ## Agent Interaction Flow |
|
|
| ```mermaid |
| sequenceDiagram |
| participant S as Scammer |
| participant API as FastAPI |
| participant O as Orchestrator |
| participant SD as ScamDetector |
| participant IE as IntelExtractor |
| participant EA as EmotionalAnalyzer |
| participant PE as PersonaEngine |
| participant ED as EngagementDelayer |
| participant DB as Database |
| participant CB as Callback |
| |
| S->>API: POST /api/guvi/analyze |
| API->>API: Verify API Key |
| API->>API: Rate Limit Check |
| API->>O: Process Message |
| |
| par Detect scam type |
| O->>SD: Detect Scam Type |
| and Extract intelligence |
| O->>IE: Extract Intelligence |
| and Analyze emotions |
| O->>EA: Analyze Emotions |
| end |
| |
| SD-->>O: {is_scam, type, confidence} |
| IE-->>O: {phones, upis, urls} |
| EA-->>O: {urgency, fear, greed} |
| |
| O->>PE: Generate Response |
| PE->>ED: Add Delays |
| ED-->>PE: Delayed Response |
| PE-->>O: Victim Response |
| |
| O->>DB: Store Conversation |
| O-->>API: Response Payload |
| API-->>S: JSON Response |
| |
| opt Scam Confirmed |
| API->>CB: Send to GUVI |
| end |
| ``` |
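
The sequence above runs scam detection, intelligence extraction, and emotional analysis side by side before the persona engine is called. A minimal sketch of that fan-out with `asyncio.gather` follows. The three agent functions are hypothetical stand-ins that return canned values shaped like the payloads in the diagram; the real agents are assumed to do far more.

```python
import asyncio


# Hypothetical stand-ins for the three analysis agents.
async def detect_scam(message: str) -> dict:
    is_scam = "lottery" in message.lower()
    return {
        "is_scam": is_scam,
        "type": "lottery" if is_scam else None,
        "confidence": 0.9 if is_scam else 0.1,
    }


async def extract_intel(message: str) -> dict:
    return {"phones": [], "upis": [], "urls": []}


async def analyze_emotions(message: str) -> dict:
    return {"urgency": 0.0, "fear": 0.0, "greed": 0.0}


async def process_message(message: str) -> dict:
    # Run the three analyses concurrently; gather returns results
    # in the same order as the awaitables were passed in.
    detection, intel, emotions = await asyncio.gather(
        detect_scam(message),
        extract_intel(message),
        analyze_emotions(message),
    )
    return {"detection": detection, "intel": intel, "emotions": emotions}


result = asyncio.run(process_message("Congratulations, you won a lottery!"))
```

Concurrency mainly pays off when the agents wait on I/O, such as an LLM call; for pure regex work the gain is small.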
|
|
| --- |
|
|
| ## AGENTS FOLDER (`app/agents/`) |
|
|
| The **brain** of the honeypot system. Each agent has a specific role. |
|
|
| ### 1. `orchestrator.py` - Main Controller |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Coordinates all 6 agents to process scam messages | |
| | **What it does** | Receives message → Runs detection → Selects persona → Generates response → Computes risk → Returns result | |
| | **Connects to** | All other agents, LLM client, memory store | |
| | **Key class** | `HoneypotOrchestrator` | |
| | **Key method** | `process_message(message, conversation_id)` | |
|
|
| ### 2. `scam_detector.py` - Scam Detection Agent |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Detects if a message is a scam and classifies the type | |
| | **What it does** | Hybrid detection using keywords + LLM classification | |
| | **Contains** | `SCAM_DATABASE` with 10 scam types (lottery, job, banking, etc.) | |
| | **Connects to** | LLM client, orchestrator | |
| | **Key method** | `detect(message) → {is_scam, scam_type, confidence}` | |
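
The table describes detection as a hybrid of keyword matching and LLM classification. One plausible way to combine the two, sketched below, is to score keywords first and consult the LLM only when the keyword evidence is weak. The keyword lists, the 0.5 threshold, and the `llm_classify` callback are all assumptions; the real `SCAM_DATABASE` is not shown in this document.

```python
from typing import Callable, Optional

# Illustrative keyword lists only; not the project's SCAM_DATABASE.
KEYWORDS = {
    "lottery": ["lottery", "prize", "winner"],
    "banking": ["kyc", "account blocked", "otp"],
    "job": ["work from home", "registration fee"],
}


def keyword_score(message: str):
    """Return (best scam type, fraction of that type's keywords present)."""
    text = message.lower()
    best_type, best = None, 0.0
    for scam_type, words in KEYWORDS.items():
        score = sum(w in text for w in words) / len(words)
        if score > best:
            best_type, best = scam_type, score
    return best_type, best


def detect(message: str,
           llm_classify: Optional[Callable[[str], dict]] = None,
           threshold: float = 0.5) -> dict:
    scam_type, score = keyword_score(message)
    # Consult the LLM only when keywords alone are inconclusive.
    if score < threshold and llm_classify is not None:
        verdict = llm_classify(message)  # assumed shape: {"scam_type", "confidence"}
        if verdict["confidence"] > score:
            scam_type, score = verdict["scam_type"], verdict["confidence"]
    is_scam = score >= threshold
    return {
        "is_scam": is_scam,
        "scam_type": scam_type if is_scam else None,
        "confidence": round(score, 2),
    }


keyword_hit = detect("Your KYC is pending, share OTP")
llm_hit = detect("hello friend",
                 llm_classify=lambda m: {"scam_type": "romance", "confidence": 0.8})
```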
|
|
| ### 3. `persona_engine.py` - Persona Agent |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Generates believable victim responses to engage scammers | |
| | **What it does** | Selects persona based on scam type, generates Hinglish/Hindi responses | |
| | **Contains** | `PERSONAS` dict with 10 personas (Sharma Uncle, Rahul Kumar, etc.) | |
| | **Response phases** | hook → engage → extract → stall → self_correct | |
| | **Key method** | `generate_response(scam_type, phase, history)` | |
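
To make persona selection and phase progression concrete, here is a small sketch. The persona names and the phase order come from the table above. The mapping from scam type to persona, the default persona, and the assumption that phases advance strictly one step at a time are illustrative guesses.

```python
PHASES = ["hook", "engage", "extract", "stall", "self_correct"]

# Hypothetical mapping; the real PERSONAS dict is not shown in this document.
PERSONA_BY_SCAM = {"banking": "Sharma Uncle", "job": "Rahul Kumar"}
DEFAULT_PERSONA = "Sharma Uncle"


def select_persona(scam_type: str) -> str:
    """Pick a persona for the scam type, falling back to a default."""
    return PERSONA_BY_SCAM.get(scam_type, DEFAULT_PERSONA)


def next_phase(phase: str) -> str:
    """Advance one phase, staying on the final phase once it is reached."""
    index = PHASES.index(phase)
    return PHASES[min(index + 1, len(PHASES) - 1)]
```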
|
|
| ### 4. `adaptive_strategy.py` - Strategy Agent |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Adapts honeypot behavior based on scammer actions | |
| | **What it does** | Analyzes scammer behavior, determines phase, adjusts strategy | |
| | **Behaviors detected** | pushing_payment, building_trust, aggressive, confused | |
| | **Connects to** | Persona engine, orchestrator | |
| | **Key method** | `adapt_strategy(scammer_message, history)` | |
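
The four behaviors listed above could be detected in many ways. The sketch below uses simple cue counting, which is an assumption, as are the cue phrases and the behavior-to-phase mapping. It only shows the shape such a function might take.

```python
# Hypothetical cue phrases for each behavior named in the table.
BEHAVIOR_CUES = {
    "pushing_payment": ["pay", "send money", "transfer", "upi"],
    "building_trust": ["trust me", "official", "verified", "don't worry"],
    "aggressive": ["immediately", "arrest", "last warning", "legal action"],
    "confused": ["what do you mean", "didn't understand", "??"],
}

# Hypothetical mapping from detected behavior to the next response phase.
PHASE_FOR_BEHAVIOR = {
    "pushing_payment": "extract",
    "building_trust": "engage",
    "aggressive": "stall",
    "confused": "self_correct",
}


def adapt_strategy(scammer_message: str, history: list) -> dict:
    text = scammer_message.lower()
    # Count how many cues of each behavior occur in the message.
    hits = {b: sum(cue in text for cue in cues) for b, cues in BEHAVIOR_CUES.items()}
    behavior = max(hits, key=hits.get)
    if hits[behavior] == 0:
        # No cue matched: open with a hook, otherwise keep engaging.
        return {"behavior": None, "phase": "engage" if history else "hook"}
    return {"behavior": behavior, "phase": PHASE_FOR_BEHAVIOR[behavior]}
```

For example, `adapt_strategy("Pay now via UPI or face legal action", [])` matches two payment cues and one aggression cue, so it reports `pushing_payment` and moves to the `extract` phase.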
| |
| ### 5. `intelligence_extractor.py` - Intel Agent |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Extracts actionable intelligence from messages | |
| | **What it does** | Regex-based extraction of phone, UPI, bank, URLs | |
| | **Connects to** | Orchestrator, threat engine | |
| | **Key method** | `extract(message) → {phone_numbers, upi_ids, ...}` | |
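
As an illustration of regex-based extraction, the sketch below pulls Indian mobile numbers, UPI-style handles, and URLs out of a message. The patterns are assumptions written for this example and are deliberately simple; the project's actual patterns may differ and production patterns need more edge-case handling.

```python
import re

# 10-digit Indian mobile numbers (starting 6-9), optionally prefixed with +91.
PHONE_RE = re.compile(r"(?<!\d)(?:\+91[\s-]?)?[6-9]\d{9}(?!\d)")
# name@handle with no dot-suffix after the handle, so emails are not matched.
UPI_RE = re.compile(r"\b[\w.\-]{2,}@[a-zA-Z]{2,}\b(?!\.[a-zA-Z])")
# http(s) URLs up to the next whitespace, quote, or angle bracket.
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def extract(message: str) -> dict:
    return {
        "phone_numbers": PHONE_RE.findall(message),
        "upi_ids": UPI_RE.findall(message),
        "urls": URL_RE.findall(message),
    }


sample = ("Send Rs 500 to ramesh@okaxis or call +91 9876543210, "
          "verify at http://kyc-update.example/login")
intel = extract(sample)
```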
|
|
| ### 6. `conversation_manager.py` - Memory Manager |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Manages multi-turn conversation state | |
| | **What it does** | Tracks history, phase progression, trust evolution | |
| | **Connects to** | Memory store, orchestrator | |
| | **Key methods** | `get_conversation(id)`, `update_conversation(...)` | |
| |
| --- |
| |
| ## API FOLDER (`app/api/`) |
| |
| ### 1. `routes.py` - API Endpoints |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Defines all REST API endpoints | |
| | **Key endpoints** | `/api/v1/analyze`, `/api/guvi/analyze`, `/api/v1/scam-types` | |
| | **Security** | `verify_api_key()` with x-api-key header | |
| | **Connects to** | Orchestrator, GUVI handler, schemas | |
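
The check behind `verify_api_key()` can be sketched without any web framework. The version below is an assumption about its behavior: it reads the `x-api-key` header from a plain dict and returns a status code, whereas the real function is presumably wired into FastAPI. `EXPECTED_API_KEY` is a placeholder value.

```python
import hmac

EXPECTED_API_KEY = "change-me"  # placeholder; a real deployment would load this from config


def verify_api_key(headers: dict):
    """Return (status_code, detail) for the supplied x-api-key header."""
    supplied = headers.get("x-api-key", "")
    # compare_digest runs in constant time, so response timing does not
    # reveal how much of the key matched.
    if not hmac.compare_digest(supplied.encode(), EXPECTED_API_KEY.encode()):
        return 401, "Unauthorized"
    return 200, "OK"
```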
| |
| ### 2. `schemas.py` - Pydantic Models |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Request/response validation models | |
| | **Key models** | `AnalyzeRequest`, `AnalyzeResponse`, `GUVIInputRequest`, `GUVIOutputResponse` | |
| | **Connects to** | Routes, GUVI handler | |
| |
| --- |
| |
| ## CORE FOLDER (`app/core/`) |
| |
| ### 1. `llm_client.py` - LLM Client |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Unified interface to multiple LLM providers | |
| | **Supports** | OpenAI, Anthropic, Groq, OpenRouter | |
| | **Fallback** | Uses mock responses if no API key | |
| | **Key method** | `generate(prompt) → response` | |
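
The mock fallback can be pictured as a provider lookup that returns `"mock"` when no API key is configured. In this sketch the environment variable names, the provider order, and the `MockLLM` class are assumptions; only the provider list and the fallback behavior come from the table.

```python
import os

# Assumed environment variable names, checked in this (assumed) order.
PROVIDER_ENV_VARS = [
    ("groq", "GROQ_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]


class MockLLM:
    """Offline stand-in used when no provider key is configured."""

    def generate(self, prompt: str) -> str:
        return "[mock] " + prompt[:40]


def pick_provider(env=None) -> str:
    """Return the first provider with a non-empty key, else 'mock'."""
    env = os.environ if env is None else env
    for provider, var in PROVIDER_ENV_VARS:
        if env.get(var):
            return provider
    return "mock"
```

Passing `env` explicitly keeps the function testable without touching the real process environment.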
|
|
| ### 2. `memory.py` - Conversation Memory |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | In-memory conversation storage | |
| | **Contains** | `ConversationMemory` class with TTL support | |
| | **Stores** | History, phase, trust_score, aggregated_intelligence | |
| | **Key method** | `get_or_create(conversation_id)` | |
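
A minimal sketch of an in-memory store with TTL support follows. The stored fields match the table above. The initial values, the one-hour default TTL, the choice to refresh the TTL on every access, and the injectable `clock` are assumptions.

```python
import time


class ConversationMemory:
    """In-memory store whose entries expire after ttl_seconds of inactivity."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable for testing
        self._store = {}     # conversation_id -> (last_access, state)

    def get_or_create(self, conversation_id: str) -> dict:
        now = self._clock()
        entry = self._store.get(conversation_id)
        if entry is None or now - entry[0] > self._ttl:
            # Missing or expired: start a fresh conversation state.
            state = {
                "history": [],
                "phase": "hook",
                "trust_score": 0.0,
                "aggregated_intelligence": {},
            }
        else:
            state = entry[1]
        self._store[conversation_id] = (now, state)  # refresh last access
        return state
```

This version expires entries lazily on access; a long-running service would also need periodic cleanup so abandoned conversations do not accumulate.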
|
|
| ### 3. `prompts.py` - LLM Prompts |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | System prompts for LLM interactions | |
| | **Contains** | `SCAM_DETECTION_PROMPT`, `RESPONSE_GENERATION_PROMPT`, `PHASE_GOALS` | |
|
|
| --- |
|
|
| ## DECOYS FOLDER (`app/decoys/`) |
|
|
| ### 1. `fake_endpoints.py` - Decoy Portals |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Fake banking/UPI pages to trap scammers | |
| | **Endpoints** | `/decoys/upi/status`, `/decoys/bank/kyc-portal`, `/decoys/secure/otp-generate` | |
| | **Why** | Scammers click these links thinking they're real | |
| |
| ### 2. `victim_profiles.py` - Synthetic Victims |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Fake victim data for honeypot responses | |
| | **Contains** | Synthetic names, bank accounts, UPI IDs | |
| | **Why** | No real PII is ever used | |
|
|
| --- |
|
|
| ## INTELLIGENCE FOLDER (`app/intelligence/`) |
|
|
| ### 1. `threat_engine.py` - Threat Intelligence |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Generates threat intelligence reports | |
| | **Creates** | Campaign IDs, IOCs, TTPs (MITRE ATT&CK) | |
| | **Key method** | `generate_threat_intel(scam_type, entities)` | |
|
|
| ### 2. `risk_scorer.py` - Risk Scoring |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Computes weighted risk score with explainability | |
| | **Factors** | Keywords, payment requests, threat level, campaign match | |
| | **Key method** | `compute_risk(detection_result) → {score, explanation}` | |
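
A weighted score with a built-in explanation might look like the sketch below. The four factor names follow the table. The weights, the 0 to 100 scale, and the assumption that `detection_result` carries each factor as a value between 0 and 1 are illustrative.

```python
# Hypothetical weights in points; they sum to 100.
WEIGHTS = {
    "keywords": 30,
    "payment_request": 30,
    "threat_level": 20,
    "campaign_match": 20,
}


def compute_risk(detection_result: dict) -> dict:
    """Return a 0-100 score plus the per-factor contributions behind it."""
    contributions = {}
    for factor, points in WEIGHTS.items():
        # Clamp each factor to [0, 1]; missing factors contribute nothing.
        value = min(max(float(detection_result.get(factor, 0.0)), 0.0), 1.0)
        contributions[factor] = round(points * value, 1)
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "score": round(sum(contributions.values()), 1),
        "explanation": [f"{name}: +{pts}" for name, pts in ranked if pts > 0],
    }


risk = compute_risk({"keywords": 1.0, "payment_request": 0.5})
```

Listing the contributions largest first lets a reviewer see at a glance which factor drove the score.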
| |
| ### 3. `campaign_tracker.py` - Campaign Clustering |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Groups scam messages into campaigns | |
| | **Uses** | Entity similarity to cluster related attacks | |
| | **Key method** | `get_or_create_campaign(entities)` | |
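
The document does not say how entity similarity is measured. The sketch below assumes Jaccard similarity over sets of extracted entities, with a hypothetical threshold of 0.3 and a made-up `CAMP-0001` style ID format.

```python
import itertools


class CampaignTracker:
    """Cluster messages by overlap of their extracted entities (sketch)."""

    def __init__(self, threshold: float = 0.3):
        self.threshold = threshold
        self.campaigns = {}             # campaign_id -> set of known entities
        self._ids = itertools.count(1)

    @staticmethod
    def _jaccard(a: set, b: set) -> float:
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    def get_or_create_campaign(self, entities) -> str:
        new = set(entities)
        best_id, best = None, 0.0
        for campaign_id, known in self.campaigns.items():
            similarity = self._jaccard(new, known)
            if similarity > best:
                best_id, best = campaign_id, similarity
        if best_id is not None and best >= self.threshold:
            self.campaigns[best_id] |= new  # merge new entities into the match
            return best_id
        campaign_id = f"CAMP-{next(self._ids):04d}"
        self.campaigns[campaign_id] = new
        return campaign_id


tracker = CampaignTracker()
a = tracker.get_or_create_campaign({"9876543210", "ramesh@okaxis"})
b = tracker.get_or_create_campaign({"9876543210", "http://kyc-update.example"})
c = tracker.get_or_create_campaign({"someone@upi"})
```

Here the second message shares one of three distinct entities with the first (similarity 1/3), so it joins the same campaign, while the third starts a new one.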
|
|
| ### 4. `telemetry.py` - Request Telemetry |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Captures IP, geo, device fingerprint | |
| | **Uses** | ip-api.com for geolocation | |
| | **Key method** | `capture_telemetry(request)` | |
|
|
| ### 5. `scammer_profiler.py` - Behavioral Profiling |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Builds behavioral profiles of scammers | |
| | **Tracks** | Aggression, persistence, tactics used | |
| |
| ### 6. `engagement_metrics.py` - Metrics Tracking |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Tracks honeypot engagement statistics | |
| | **Metrics** | Duration, message count, intelligence extracted | |
|
|
| ### 7. `honeytokens.py` - Honeytoken Generator |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Generates fake credentials as bait | |
| | **Creates** | Fake UPI IDs, bank accounts, phone numbers | |
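
Bait credentials need to look plausible while being clearly synthetic. The sketch below generates a fake UPI ID and a fake account number. The function names, the `honeypot` handle, and the formats are assumptions, not the project's actual generator.

```python
import secrets


def fake_upi_id(handle: str = "honeypot") -> str:
    """Return a synthetic UPI-style ID such as user004217@honeypot."""
    return f"user{secrets.randbelow(10**6):06d}@{handle}"


def fake_bank_account(digits: int = 12) -> str:
    """Return a random digit string of the requested length."""
    return "".join(str(secrets.randbelow(10)) for _ in range(digits))


bait = {"upi_id": fake_upi_id(), "bank_account": fake_bank_account()}
```

Using a handle that no real payment provider issues keeps the bait from colliding with a genuine account.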
|
|
| --- |
|
|
| ## ENFORCEMENT FOLDER (`app/enforcement/`) |
|
|
| ### 1. `police_api.py` - Cyber Police Simulation |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Simulates NCRP (cybercrime.gov.in) integration | |
| | **Creates** | Report IDs, priority levels, recommended actions | |
| | **Classes** | `CyberPoliceAPI`, `ActionRecommendationAPI` | |
| |
| ### 2. `awareness.py` - Public Awareness |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Generates scam awareness content | |
| | **Creates** | Warning messages, educational tips | |
| |
| --- |
| |
| ## UTILS FOLDER (`app/utils/`) |
| |
| ### 1. `guvi_handler.py` - GUVI Format Translator |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Translates between GUVI format and internal format | |
| | **Why** | GUVI uses different field names (sessionId vs conversation_id) | |
| | **Key method** | `process_guvi_message(request) → GUVIOutputResponse` | |
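
The field-name translation can be sketched as a key-renaming step in each direction. Only the `sessionId` / `conversation_id` pair is taken from this document. The helper names and the `message` field in the example are assumptions.

```python
# Only this pair is documented; other fields pass through unchanged.
FIELD_MAP = {"sessionId": "conversation_id"}
REVERSE_MAP = {internal: guvi for guvi, internal in FIELD_MAP.items()}


def to_internal(guvi_payload: dict) -> dict:
    """Rename GUVI field names to the internal ones."""
    return {FIELD_MAP.get(key, key): value for key, value in guvi_payload.items()}


def to_guvi(internal_payload: dict) -> dict:
    """Rename internal field names back to the GUVI ones."""
    return {REVERSE_MAP.get(key, key): value for key, value in internal_payload.items()}


incoming = {"sessionId": "abc-123", "message": "Your account is blocked"}
internal = to_internal(incoming)
```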
| |
| ### 2. `callback_client.py` - GUVI Callback Sender |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Sends final result to GUVI evaluation endpoint | |
| | **Endpoint** | `POST https://hackathon.guvi.in/api/updateHoneyPotFinalResult` | |
| | **Trigger** | Auto-sends when `scamDetected = true` | |
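
The trigger rule can be illustrated by a function that prepares the callback request only when a scam was detected. The endpoint URL is the one given above. The payload fields other than `scamDetected`, and the function name `build_callback`, are assumptions. The sketch builds the request without sending it.

```python
import json
import urllib.request

CALLBACK_URL = "https://hackathon.guvi.in/api/updateHoneyPotFinalResult"


def build_callback(result: dict):
    """Return a prepared POST request, or None when no scam was detected."""
    if not result.get("scamDetected"):
        return None
    body = json.dumps(result).encode("utf-8")
    return urllib.request.Request(
        CALLBACK_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_callback({"scamDetected": True, "sessionId": "abc-123"})
skipped = build_callback({"scamDetected": False, "sessionId": "abc-123"})
```

Sending would then be a call such as `urllib.request.urlopen(request, timeout=5)`, ideally wrapped in error handling so a failed callback does not break the API response.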
|
|
| ### 3. `extractors.py` - Entity Extractors |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Regex patterns for entity extraction | |
| | **Extracts** | Phone, UPI, bank account, IFSC, email, URL | |
|
|
| ### 4. `logger.py` - Structured Logging |
| | Aspect | Description | |
| |--------|-------------| |
| | **Purpose** | Consistent logging across all agents | |
| | **Class** | `AgentLogger` | |
|
|
| --- |
|
|
| ## HOW COMPONENTS CONNECT |
|
|
| ``` |
| ┌─────────────────────────────────────────────────────────────────────┐ |
| │                            USER REQUEST                             │ |
| │                       POST /api/guvi/analyze                        │ |
| └──────────────────────────────┬──────────────────────────────────────┘ |
|                                ▼ |
| ┌─────────────────────────────────────────────────────────────────────┐ |
| │           routes.py → verify_api_key() → guvi_handler.py            │ |
| └──────────────────────────────┬──────────────────────────────────────┘ |
|                                ▼ |
| ┌─────────────────────────────────────────────────────────────────────┐ |
| │                   ORCHESTRATOR (orchestrator.py)                    │ |
| │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐ │ |
| │  │    Scam     │  │    Intel    │  │   Persona   │  │  Adaptive   │ │ |
| │  │  Detector   │  │  Extractor  │  │   Engine    │  │  Strategy   │ │ |
| │  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘ │ |
| │         │                │                │                │        │ |
| │         ▼                ▼                ▼                ▼        │ |
| │  ┌────────────────────────────────────────────────────────────────┐ │ |
| │  │                   LLM CLIENT (llm_client.py)                   │ │ |
| │  │         Groq / OpenAI / Anthropic / OpenRouter / Mock          │ │ |
| │  └────────────────────────────────────────────────────────────────┘ │ |
| │         │                │                │                │        │ |
| │         ▼                ▼                ▼                ▼        │ |
| │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐ │ |
| │  │   Memory    │  │   Threat    │  │    Risk     │  │  Campaign   │ │ |
| │  │    Store    │  │   Engine    │  │   Scorer    │  │   Tracker   │ │ |
| │  └─────────────┘  └─────────────┘  └─────────────┘  └─────────────┘ │ |
| └──────────────────────────────┬──────────────────────────────────────┘ |
|                                ▼ |
| ┌─────────────────────────────────────────────────────────────────────┐ |
| │                         RESPONSE + CALLBACK                         │ |
| │      GUVIOutputResponse → callback_client.py → GUVI Evaluation      │ |
| └─────────────────────────────────────────────────────────────────────┘ |
| ``` |
|
|
| --- |
|
|
| ## ROOT FILES |
|
|
| | File | Purpose | |
| |------|---------| |
| | `app/main.py` | FastAPI app entry point, startup/shutdown events | |
| | `app/config.py` | Environment variables, feature flags | |
| | `dashboard.py` | Streamlit analytics UI with live charts | |
| | `simulate_attack.py` | Red Team vs Blue Team simulation script | |
| | `verify_honeypot.py` | Quick verification of all endpoints | |
| | `Dockerfile` | Container deployment for HF Spaces | |
| | `requirements.txt` | Python dependencies | |
| | `README.md` | Project documentation with API examples | |
|
|
| --- |
|
|
| ## KEY DATA FLOWS |
|
|
| ### 1. Message Analysis Flow |
| ``` |
| Message → ScamDetector → PersonaEngine → AdaptiveStrategy → Response |
| ``` |
|
|
| ### 2. Intelligence Flow |
| ``` |
| Message → IntelExtractor → ThreatEngine → CampaignTracker → Report |
| ``` |
|
|
| ### 3. Risk Scoring Flow |
| ``` |
| DetectionResult → RiskScorer → Explanation → AnalyzeResponse |
| ``` |
|
|
| ### 4. GUVI Callback Flow |
| ``` |
| ScamDetected=true → CallbackClient → hackathon.guvi.in → Evaluation |
| ``` |
|
|
| --- |
|
|
| *Generated for GUVI India AI Impact Buildathon 2025* |
|
|