# 🚀 HF Deployment Pre-Flight Checklist

**Target:** Hugging Face Spaces + GUVI Hackathon

---

## ✅ Required HF Secrets

Set these in HF Spaces → Settings → Secrets:

| Secret Name | Required | Description |
|-------------|----------|-------------|
| `GROQ_API_KEY` | ✅ YES | Groq API key for LLM calls |
| `GUVI_API_KEY` | ✅ YES | GUVI hackathon auth key |

**Optional (defaults work):**

- `ENV=production` (optional, defaults to production behavior)

---

## ✅ Pre-Deploy Verification Commands

Run these locally before pushing to HF:

```bash
# 1. All behavioral tests pass
py -m pytest scripts/fast_behavior_tests.py -v

# 2. Cache optimization tests pass
py -m pytest scripts/test_prompt_caching.py -v -s -k "not Live"

# 3. Main app imports cleanly
py -c "from app.main import app; print('✅ OK')"

# 4. Quick smoke test (start server)
py -m uvicorn app.main:app --port 8000 --host 127.0.0.1
# Then test: curl http://localhost:8000/health
```

---

## ✅ Model Mapping (Cache-Optimized)

| Agent | Model | Cache Support |
|-------|-------|---------------|
| **Persona Replies** | `llama-3.1-8b-instant` | ❌ No |
| **Intelligence Extraction** | `openai/gpt-oss-20b` | ✅ Yes |
| **Safety Guard** | `openai/gpt-oss-safeguard-20b` | ✅ Yes |
| **Smart Reasoning** | `moonshotai/kimi-k2-instruct-0905` | ✅ Yes |

**Note:** Fast chat uses an uncached model for speed. Heavy tasks use cached models for cost savings.

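
As an illustration only, the model mapping above could be kept in one lookup table so that each agent resolves to its model and cache flag in a single place. The names `AgentModel`, `AGENT_MODELS`, `model_for`, and the agent keys are hypothetical and are not taken from the project; only the model IDs and cache flags come from the table.

```python
from typing import NamedTuple


class AgentModel(NamedTuple):
    """Model ID plus whether the checklist marks it as cache-capable."""
    model: str
    cache_support: bool


# Hypothetical agent keys; model IDs and cache flags mirror the table above.
AGENT_MODELS = {
    "persona_replies": AgentModel("llama-3.1-8b-instant", False),
    "intelligence_extraction": AgentModel("openai/gpt-oss-20b", True),
    "safety_guard": AgentModel("openai/gpt-oss-safeguard-20b", True),
    "smart_reasoning": AgentModel("moonshotai/kimi-k2-instruct-0905", True),
}


def model_for(agent: str) -> AgentModel:
    """Return the configured model for an agent, failing loudly on typos."""
    try:
        return AGENT_MODELS[agent]
    except KeyError:
        raise ValueError(f"Unknown agent: {agent!r}") from None


if __name__ == "__main__":
    for name, entry in AGENT_MODELS.items():
        print(f"{name}: {entry.model} (cache: {entry.cache_support})")
```

Keeping the cache flag next to the model ID makes it harder for the two to drift apart when a model is swapped.
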

---

## ✅ Config Sanity Checklist

| Check | Status |
|-------|--------|
| `DEBUG = False` in config.py | ✅ |
| Mock callback URL commented out | ✅ |
| No hardcoded API keys | ✅ |
| No blocking `time.sleep()` | ✅ |
| All retries capped at 2-5 | ✅ |

---

## ✅ GUVI Callback Readiness

| Requirement | Status |
|-------------|--------|
| URL: `https://hackathon.guvi.in/api/updateHoneyPotFinalResult` | ✅ |
| Auth: `x-api-key` header | ✅ |
| Retry: 5x exponential backoff | ✅ |
| Dedup: `sys_callback_sent` flag | ✅ |
| Trigger: `scamDetected=True AND should_finalize=True` | ✅ |

---

## ✅ Budget Limits (Hardcoded)

| Limit | Value | Enforced |
|-------|-------|----------|
| Max LLM calls per turn | 4 | ✅ |
| Max LLM calls per session | 30 | ✅ |
| Max cascade retries | 2 | ✅ |

---

## 🧪 1-Command HF Sanity Test

After deploying to HF, run this:

```bash
curl -X POST "https://YOUR-SPACE.hf.space/api/v1/guvi/challenge" \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_GUVI_API_KEY" \
  -d '{
    "sessionId": "test-123",
    "message": {"text": "Hello, your bank account is blocked", "sender": "scammer"}
  }'
```

**Expected Response:**

```json
{
  "status": "success",
  "reply": "..."
}
```

---

## 🏆 Final Deployment Commands

```bash
# 1. Commit all changes
git add .
git commit -m "Production-ready for GUVI + HF"

# 2. Push to HF
git push hf main
```

---

**Last Verified:** 2026-02-03

**Score:** 53/53 (100%) Production Ready: All Critical Fixes Applied
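
For reference, the GUVI callback rules listed in this checklist (trigger condition, `sys_callback_sent` dedup, up to 5 attempts with exponential backoff, no blocking `time.sleep()`) could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: the function name `send_final_callback`, the injected `post` coroutine, the dict-shaped session, and the `base_delay` value are all hypothetical.

```python
import asyncio
from typing import Awaitable, Callable

MAX_ATTEMPTS = 5  # from the checklist: "Retry: 5x exponential backoff"


async def send_final_callback(
    session: dict,
    post: Callable[[dict], Awaitable[bool]],  # hypothetical sender, True on success
    base_delay: float = 0.5,  # assumption: the checklist gives no delay value
) -> bool:
    """Send the final result once per session; return True if sent by this call."""
    # Trigger: scamDetected=True AND should_finalize=True
    if not (session.get("scamDetected") and session.get("should_finalize")):
        return False
    # Dedup: skip if an earlier call already delivered the callback
    if session.get("sys_callback_sent"):
        return False

    for attempt in range(MAX_ATTEMPTS):
        if await post(session):
            session["sys_callback_sent"] = True
            return True
        if attempt < MAX_ATTEMPTS - 1:
            # Non-blocking wait that doubles each time: base, 2x, 4x, 8x
            await asyncio.sleep(base_delay * 2 ** attempt)
    return False
```

Injecting `post` keeps the retry logic testable without network access; in a real deployment it would wrap an HTTP POST to the callback URL with the `x-api-key` header.
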