# Topic 15: Database Schema & Persistence Layer **Audit Date**: 2026-02-01 **Auditor**: Agent Antigravity **Scope**: Storage Architecture --- ## 1. Dual-Mode Architecture The system is built on **SQLAlchemy Async**, designed to run on: 1. **Development**: `SQLite` (Zero-config, local file `honeypot.db`). 2. **Production**: `PostgreSQL` (Supabase/Neon via `DATABASE_URL`). * **Switching Logic**: Automatic detection of `postgres://` prefix in `app/database/db.py`. --- ## 2. Schema Design (`models.py`) ### **A. Core Tables** | Table | Purpose | Key Columns | | :--- | :--- | :--- | | **`conversations`** | Session Registry | `id` (UUID), `scam_type`, `trust_score`, `phase`, `created_at` | | **`messages`** | Audit Log | `conversation_id` (FK), `turn`, `scammer_message`, `honeypot_response` | | **`intelligence`** | IOC Repository | `entity_type` (UPI, Phone), `entity_value`, `conversation_id` (FK) | | **`campaigns`** | Threat Clusters | `id` (Hash), `scam_type`, `message_count`, `entity_count` | | **`reports`** | Simulated Exports | `id`, `report_type` (Police/Bank), `status` | ### **B. Data Integrity** * **Foreign Keys**: `messages` and `intelligence` are strictly linked to `conversations`. Cascade delete is enabled (orphaned messages are wiped if session is deleted). * **Deduplication**: `UniqueConstraint` on `(conversation_id, entity_type, entity_value)` prevents storing the same Phone Number twice for the same chat. --- ## 3. Persistence Logic * **Session Management**: Uses `async_sessionmaker` with `expire_on_commit=False` for high-concurrency async performance. * **Connection Pooling**: * **SQLite**: `StaticPool` (Single threaded logic adapted for AsyncIO). * **Postgres**: `QueuePool` (5 cons default, max overflow 10). --- ## 4. Assessment The database layer is **Production-Ready**. It is not a JSON file or CSV hack; it is a full Relational Schema capable of handling millions of rows if deployed to Postgres.