Spaces:

pixxel-phantom
/

orbital-thruster-env

Sleeping

App Files Files Community

pixxel-phantom commited on Apr 26

Commit

d09063f

verified ·

1 Parent(s): cfa6771

Add training pipeline: SFT+GRPO notebook, multi-reward verifier, HF job script

Browse files

This view is limited to 50 files because it contains too many changes. See raw diff

Files changed (50) hide show

.agents/skills.zip +3 -0
.agents/skills/api-design/SKILL.md +523 -0
.agents/skills/backend-patterns/SKILL.md +598 -0
.agents/skills/brainstorming/SKILL.md +164 -0
.agents/skills/brainstorming/scripts/frame-template.html +214 -0
.agents/skills/brainstorming/scripts/helper.js +88 -0
.agents/skills/brainstorming/scripts/server.cjs +354 -0
.agents/skills/brainstorming/scripts/start-server.sh +148 -0
.agents/skills/brainstorming/scripts/stop-server.sh +56 -0
.agents/skills/brainstorming/spec-document-reviewer-prompt.md +49 -0
.agents/skills/brainstorming/visual-companion.md +287 -0
.agents/skills/caveman-commit/SKILL.md +65 -0
.agents/skills/caveman-help/SKILL.md +59 -0
.agents/skills/caveman-review/SKILL.md +55 -0
.agents/skills/caveman/SKILL.md +67 -0
.agents/skills/deep-research/SKILL.md +155 -0
.agents/skills/dispatching-parallel-agents/SKILL.md +182 -0
.agents/skills/documentation-lookup/SKILL.md +90 -0
.agents/skills/e2e-testing/SKILL.md +326 -0
.agents/skills/eval-harness/SKILL.md +270 -0
.agents/skills/executing-plans/SKILL.md +70 -0
.agents/skills/finishing-a-development-branch/SKILL.md +200 -0
.agents/skills/frontend-slides/SKILL.md +184 -0
.agents/skills/frontend-slides/STYLE_PRESETS.md +330 -0
.agents/skills/karpathy-guidelines/SKILL.md +67 -0
.agents/skills/openenv-cli/SKILL.md +18 -0
.agents/skills/python-testing/SKILL.md +816 -0
.agents/skills/receiving-code-review/SKILL.md +213 -0
.agents/skills/requesting-code-review/SKILL.md +105 -0
.agents/skills/requesting-code-review/code-reviewer.md +146 -0
.agents/skills/search-first/SKILL.md +161 -0
.agents/skills/security-review/SKILL.md +495 -0
.agents/skills/security-review/cloud-infrastructure-security.md +361 -0
.agents/skills/subagent-driven-development/SKILL.md +277 -0
.agents/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
.agents/skills/subagent-driven-development/implementer-prompt.md +113 -0
.agents/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
.agents/skills/systematic-debugging/CREATION-LOG.md +119 -0
.agents/skills/systematic-debugging/SKILL.md +296 -0
.agents/skills/systematic-debugging/condition-based-waiting-example.ts +158 -0
.agents/skills/systematic-debugging/condition-based-waiting.md +115 -0
.agents/skills/systematic-debugging/defense-in-depth.md +122 -0
.agents/skills/systematic-debugging/find-polluter.sh +63 -0
.agents/skills/systematic-debugging/root-cause-tracing.md +169 -0
.agents/skills/systematic-debugging/test-academic.md +14 -0
.agents/skills/systematic-debugging/test-pressure-1.md +58 -0
.agents/skills/systematic-debugging/test-pressure-2.md +68 -0
.agents/skills/systematic-debugging/test-pressure-3.md +69 -0
.agents/skills/tdd-workflow/SKILL.md +463 -0
.agents/skills/test-driven-development/SKILL.md +371 -0

.agents/skills.zip ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:393a1294a2c39475ff04ef4dc714f4b41ef669c5b591f902518c91bed9d01048
+size 194360

.agents/skills/api-design/SKILL.md ADDED Viewed

	@@ -0,0 +1,523 @@

+---
+name: api-design
+description: REST API design patterns including resource naming, status codes, pagination, filtering, error responses, versioning, and rate limiting for production APIs.
+origin: ECC
+---
+# API Design Patterns
+Conventions and best practices for designing consistent, developer-friendly REST APIs.
+## When to Activate
+- Designing new API endpoints
+- Reviewing existing API contracts
+- Adding pagination, filtering, or sorting
+- Implementing error handling for APIs
+- Planning API versioning strategy
+- Building public or partner-facing APIs
+## Resource Design
+### URL Structure
+```
+# Resources are nouns, plural, lowercase, kebab-case
+GET    /api/v1/users
+GET    /api/v1/users/:id
+POST   /api/v1/users
+PUT    /api/v1/users/:id
+PATCH  /api/v1/users/:id
+DELETE /api/v1/users/:id
+# Sub-resources for relationships
+GET    /api/v1/users/:id/orders
+POST   /api/v1/users/:id/orders
+# Actions that don't map to CRUD (use verbs sparingly)
+POST   /api/v1/orders/:id/cancel
+POST   /api/v1/auth/login
+POST   /api/v1/auth/refresh
+```
+### Naming Rules
+```
+# GOOD
+/api/v1/team-members          # kebab-case for multi-word resources
+/api/v1/orders?status=active  # query params for filtering
+/api/v1/users/123/orders      # nested resources for ownership
+# BAD
+/api/v1/getUsers              # verb in URL
+/api/v1/user                  # singular (use plural)
+/api/v1/team_members          # snake_case in URLs
+/api/v1/users/123/getOrders   # verb in nested resource
+```
+## HTTP Methods and Status Codes
+### Method Semantics
+| Method | Idempotent | Safe | Use For |
+|--------|-----------|------|---------|
+| GET | Yes | Yes | Retrieve resources |
+| POST | No | No | Create resources, trigger actions |
+| PUT | Yes | No | Full replacement of a resource |
+| PATCH | No* | No | Partial update of a resource |
+| DELETE | Yes | No | Remove a resource |
+*PATCH can be made idempotent with proper implementation
+### Status Code Reference
+```
+# Success
+200 OK                    — GET, PUT, PATCH (with response body)
+201 Created               — POST (include Location header)
+204 No Content            — DELETE, PUT (no response body)
+# Client Errors
+400 Bad Request           — Validation failure, malformed JSON
+401 Unauthorized          — Missing or invalid authentication
+403 Forbidden             — Authenticated but not authorized
+404 Not Found             — Resource doesn't exist
+409 Conflict              — Duplicate entry, state conflict
+422 Unprocessable Entity  — Semantically invalid (valid JSON, bad data)
+429 Too Many Requests     — Rate limit exceeded
+# Server Errors
+500 Internal Server Error — Unexpected failure (never expose details)
+502 Bad Gateway           — Upstream service failed
+503 Service Unavailable   — Temporary overload, include Retry-After
+```
+### Common Mistakes
+```
+# BAD: 200 for everything
+{ "status": 200, "success": false, "error": "Not found" }
+# GOOD: Use HTTP status codes semantically
+HTTP/1.1 404 Not Found
+{ "error": { "code": "not_found", "message": "User not found" } }
+# BAD: 500 for validation errors
+# GOOD: 400 or 422 with field-level details
+# BAD: 200 for created resources
+# GOOD: 201 with Location header
+HTTP/1.1 201 Created
+Location: /api/v1/users/abc-123
+```
+## Response Format
+### Success Response
+```json
+{
+  "data": {
+    "id": "abc-123",
+    "email": "alice@example.com",
+    "name": "Alice",
+    "created_at": "2025-01-15T10:30:00Z"
+  }
+}
+```
+### Collection Response (with Pagination)
+```json
+{
+  "data": [
+    { "id": "abc-123", "name": "Alice" },
+    { "id": "def-456", "name": "Bob" }
+  ],
+  "meta": {
+    "total": 142,
+    "page": 1,
+    "per_page": 20,
+    "total_pages": 8
+  },
+  "links": {
+    "self": "/api/v1/users?page=1&per_page=20",
+    "next": "/api/v1/users?page=2&per_page=20",
+    "last": "/api/v1/users?page=8&per_page=20"
+  }
+}
+```
+### Error Response
+```json
+{
+  "error": {
+    "code": "validation_error",
+    "message": "Request validation failed",
+    "details": [
+      {
+        "field": "email",
+        "message": "Must be a valid email address",
+        "code": "invalid_format"
+      },
+      {
+        "field": "age",
+        "message": "Must be between 0 and 150",
+        "code": "out_of_range"
+      }
+    ]
+  }
+}
+```
+### Response Envelope Variants
+```typescript
+// Option A: Envelope with data wrapper (recommended for public APIs)
+interface ApiResponse<T> {
+  data: T;
+  meta?: PaginationMeta;
+  links?: PaginationLinks;
+}
+interface ApiError {
+  error: {
+    code: string;
+    message: string;
+    details?: FieldError[];
+  };
+}
+// Option B: Flat response (simpler, common for internal APIs)
+// Success: just return the resource directly
+// Error: return error object
+// Distinguish by HTTP status code
+```
+## Pagination
+### Offset-Based (Simple)
+```
+GET /api/v1/users?page=2&per_page=20
+# Implementation
+SELECT * FROM users
+ORDER BY created_at DESC
+LIMIT 20 OFFSET 20;
+```
+**Pros:** Easy to implement, supports "jump to page N"
+**Cons:** Slow on large offsets (OFFSET 100000), inconsistent with concurrent inserts
+### Cursor-Based (Scalable)
+```
+GET /api/v1/users?cursor=eyJpZCI6MTIzfQ&limit=20
+# Implementation
+SELECT * FROM users
+WHERE id > :cursor_id
+ORDER BY id ASC
+LIMIT 21;  -- fetch one extra to determine has_next
+```
+```json
+{
+  "data": [...],
+  "meta": {
+    "has_next": true,
+    "next_cursor": "eyJpZCI6MTQzfQ"
+  }
+}
+```
+**Pros:** Consistent performance regardless of position, stable with concurrent inserts
+**Cons:** Cannot jump to arbitrary page, cursor is opaque
+### When to Use Which
+| Use Case | Pagination Type |
+|----------|----------------|
+| Admin dashboards, small datasets (<10K) | Offset |
+| Infinite scroll, feeds, large datasets | Cursor |
+| Public APIs | Cursor (default) with offset (optional) |
+| Search results | Offset (users expect page numbers) |
+## Filtering, Sorting, and Search
+### Filtering
+```
+# Simple equality
+GET /api/v1/orders?status=active&customer_id=abc-123
+# Comparison operators (use bracket notation)
+GET /api/v1/products?price[gte]=10&price[lte]=100
+GET /api/v1/orders?created_at[after]=2025-01-01
+# Multiple values (comma-separated)
+GET /api/v1/products?category=electronics,clothing
+# Nested fields (dot notation)
+GET /api/v1/orders?customer.country=US
+```
+### Sorting
+```
+# Single field (prefix - for descending)
+GET /api/v1/products?sort=-created_at
+# Multiple fields (comma-separated)
+GET /api/v1/products?sort=-featured,price,-created_at
+```
+### Full-Text Search
+```
+# Search query parameter
+GET /api/v1/products?q=wireless+headphones
+# Field-specific search
+GET /api/v1/users?email=alice
+```
+### Sparse Fieldsets
+```
+# Return only specified fields (reduces payload)
+GET /api/v1/users?fields=id,name,email
+GET /api/v1/orders?fields=id,total,status&include=customer.name
+```
+## Authentication and Authorization
+### Token-Based Auth
+```
+# Bearer token in Authorization header
+GET /api/v1/users
+Authorization: Bearer eyJhbGciOiJIUzI1NiIs...
+# API key (for server-to-server)
+GET /api/v1/data
+X-API-Key: sk_live_abc123
+```
+### Authorization Patterns
+```typescript
+// Resource-level: check ownership
+app.get("/api/v1/orders/:id", async (req, res) => {
+  const order = await Order.findById(req.params.id);
+  if (!order) return res.status(404).json({ error: { code: "not_found" } });
+  if (order.userId !== req.user.id) return res.status(403).json({ error: { code: "forbidden" } });
+  return res.json({ data: order });
+});
+// Role-based: check permissions
+app.delete("/api/v1/users/:id", requireRole("admin"), async (req, res) => {
+  await User.delete(req.params.id);
+  return res.status(204).send();
+});
+```
+## Rate Limiting
+### Headers
+```
+HTTP/1.1 200 OK
+X-RateLimit-Limit: 100
+X-RateLimit-Remaining: 95
+X-RateLimit-Reset: 1640000000
+# When exceeded
+HTTP/1.1 429 Too Many Requests
+Retry-After: 60
+{
+  "error": {
+    "code": "rate_limit_exceeded",
+    "message": "Rate limit exceeded. Try again in 60 seconds."
+  }
+}
+```
+### Rate Limit Tiers
+| Tier | Limit | Window | Use Case |
+|------|-------|--------|----------|
+| Anonymous | 30/min | Per IP | Public endpoints |
+| Authenticated | 100/min | Per user | Standard API access |
+| Premium | 1000/min | Per API key | Paid API plans |
+| Internal | 10000/min | Per service | Service-to-service |
+## Versioning
+### URL Path Versioning (Recommended)
+```
+/api/v1/users
+/api/v2/users
+```
+**Pros:** Explicit, easy to route, cacheable
+**Cons:** URL changes between versions
+### Header Versioning
+```
+GET /api/users
+Accept: application/vnd.myapp.v2+json
+```
+**Pros:** Clean URLs
+**Cons:** Harder to test, easy to forget
+### Versioning Strategy
+```
+1. Start with /api/v1/ — don't version until you need to
+2. Maintain at most 2 active versions (current + previous)
+3. Deprecation timeline:
+   - Announce deprecation (6 months notice for public APIs)
+   - Add Sunset header: Sunset: Sat, 01 Jan 2026 00:00:00 GMT
+   - Return 410 Gone after sunset date
+4. Non-breaking changes don't need a new version:
+   - Adding new fields to responses
+   - Adding new optional query parameters
+   - Adding new endpoints
+5. Breaking changes require a new version:
+   - Removing or renaming fields
+   - Changing field types
+   - Changing URL structure
+   - Changing authentication method
+```
+## Implementation Patterns
+### TypeScript (Next.js API Route)
+```typescript
+import { z } from "zod";
+import { NextRequest, NextResponse } from "next/server";
+const createUserSchema = z.object({
+  email: z.string().email(),
+  name: z.string().min(1).max(100),
+});
+export async function POST(req: NextRequest) {
+  const body = await req.json();
+  const parsed = createUserSchema.safeParse(body);
+  if (!parsed.success) {
+    return NextResponse.json({
+      error: {
+        code: "validation_error",
+        message: "Request validation failed",
+        details: parsed.error.issues.map(i => ({
+          field: i.path.join("."),
+          message: i.message,
+          code: i.code,
+        })),
+      },
+    }, { status: 422 });
+  }
+  const user = await createUser(parsed.data);
+  return NextResponse.json(
+    { data: user },
+    {
+      status: 201,
+      headers: { Location: `/api/v1/users/${user.id}` },
+    },
+  );
+}
+```
+### Python (Django REST Framework)
+```python
+from rest_framework import serializers, viewsets, status
+from rest_framework.response import Response
+class CreateUserSerializer(serializers.Serializer):
+    email = serializers.EmailField()
+    name = serializers.CharField(max_length=100)
+class UserSerializer(serializers.ModelSerializer):
+    class Meta:
+        model = User
+        fields = ["id", "email", "name", "created_at"]
+class UserViewSet(viewsets.ModelViewSet):
+    serializer_class = UserSerializer
+    permission_classes = [IsAuthenticated]
+    def get_serializer_class(self):
+        if self.action == "create":
+            return CreateUserSerializer
+        return UserSerializer
+    def create(self, request):
+        serializer = CreateUserSerializer(data=request.data)
+        serializer.is_valid(raise_exception=True)
+        user = UserService.create(**serializer.validated_data)
+        return Response(
+            {"data": UserSerializer(user).data},
+            status=status.HTTP_201_CREATED,
+            headers={"Location": f"/api/v1/users/{user.id}"},
+        )
+```
+### Go (net/http)
+```go
+func (h *UserHandler) CreateUser(w http.ResponseWriter, r *http.Request) {
+    var req CreateUserRequest
+    if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+        writeError(w, http.StatusBadRequest, "invalid_json", "Invalid request body")
+        return
+    }
+    if err := req.Validate(); err != nil {
+        writeError(w, http.StatusUnprocessableEntity, "validation_error", err.Error())
+        return
+    }
+    user, err := h.service.Create(r.Context(), req)
+    if err != nil {
+        switch {
+        case errors.Is(err, domain.ErrEmailTaken):
+            writeError(w, http.StatusConflict, "email_taken", "Email already registered")
+        default:
+            writeError(w, http.StatusInternalServerError, "internal_error", "Internal error")
+        }
+        return
+    }
+    w.Header().Set("Location", fmt.Sprintf("/api/v1/users/%s", user.ID))
+    writeJSON(w, http.StatusCreated, map[string]any{"data": user})
+}
+```
+## API Design Checklist
+Before shipping a new endpoint:
+- [ ] Resource URL follows naming conventions (plural, kebab-case, no verbs)
+- [ ] Correct HTTP method used (GET for reads, POST for creates, etc.)
+- [ ] Appropriate status codes returned (not 200 for everything)
+- [ ] Input validated with schema (Zod, Pydantic, Bean Validation)
+- [ ] Error responses follow standard format with codes and messages
+- [ ] Pagination implemented for list endpoints (cursor or offset)
+- [ ] Authentication required (or explicitly marked as public)
+- [ ] Authorization checked (user can only access their own resources)
+- [ ] Rate limiting configured
+- [ ] Response does not leak internal details (stack traces, SQL errors)
+- [ ] Consistent naming with existing endpoints (camelCase vs snake_case)
+- [ ] Documented (OpenAPI/Swagger spec updated)

.agents/skills/backend-patterns/SKILL.md ADDED Viewed

	@@ -0,0 +1,598 @@

+---
+name: backend-patterns
+description: Backend architecture patterns, API design, database optimization, and server-side best practices for Node.js, Express, and Next.js API routes.
+origin: ECC
+---
+# Backend Development Patterns
+Backend architecture patterns and best practices for scalable server-side applications.
+## When to Activate
+- Designing REST or GraphQL API endpoints
+- Implementing repository, service, or controller layers
+- Optimizing database queries (N+1, indexing, connection pooling)
+- Adding caching (Redis, in-memory, HTTP cache headers)
+- Setting up background jobs or async processing
+- Structuring error handling and validation for APIs
+- Building middleware (auth, logging, rate limiting)
+## API Design Patterns
+### RESTful API Structure
+```typescript
+// PASS: Resource-based URLs
+GET    /api/markets                 # List resources
+GET    /api/markets/:id             # Get single resource
+POST   /api/markets                 # Create resource
+PUT    /api/markets/:id             # Replace resource
+PATCH  /api/markets/:id             # Update resource
+DELETE /api/markets/:id             # Delete resource
+// PASS: Query parameters for filtering, sorting, pagination
+GET /api/markets?status=active&sort=volume&limit=20&offset=0
+```
+### Repository Pattern
+```typescript
+// Abstract data access logic
+interface MarketRepository {
+  findAll(filters?: MarketFilters): Promise<Market[]>
+  findById(id: string): Promise<Market | null>
+  create(data: CreateMarketDto): Promise<Market>
+  update(id: string, data: UpdateMarketDto): Promise<Market>
+  delete(id: string): Promise<void>
+}
+class SupabaseMarketRepository implements MarketRepository {
+  async findAll(filters?: MarketFilters): Promise<Market[]> {
+    let query = supabase.from('markets').select('*')
+    if (filters?.status) {
+      query = query.eq('status', filters.status)
+    }
+    if (filters?.limit) {
+      query = query.limit(filters.limit)
+    }
+    const { data, error } = await query
+    if (error) throw new Error(error.message)
+    return data
+  }
+  // Other methods...
+}
+```
+### Service Layer Pattern
+```typescript
+// Business logic separated from data access
+class MarketService {
+  constructor(private marketRepo: MarketRepository) {}
+  async searchMarkets(query: string, limit: number = 10): Promise<Market[]> {
+    // Business logic
+    const embedding = await generateEmbedding(query)
+    const results = await this.vectorSearch(embedding, limit)
+    // Fetch full data
+    const markets = await this.marketRepo.findByIds(results.map(r => r.id))
+    // Sort by similarity
+    return markets.sort((a, b) => {
+      const scoreA = results.find(r => r.id === a.id)?.score || 0
+      const scoreB = results.find(r => r.id === b.id)?.score || 0
+      return scoreA - scoreB
+    })
+  }
+  private async vectorSearch(embedding: number[], limit: number) {
+    // Vector search implementation
+  }
+}
+```
+### Middleware Pattern
+```typescript
+// Request/response processing pipeline
+export function withAuth(handler: NextApiHandler): NextApiHandler {
+  return async (req, res) => {
+    const token = req.headers.authorization?.replace('Bearer ', '')
+    if (!token) {
+      return res.status(401).json({ error: 'Unauthorized' })
+    }
+    try {
+      const user = await verifyToken(token)
+      req.user = user
+      return handler(req, res)
+    } catch (error) {
+      return res.status(401).json({ error: 'Invalid token' })
+    }
+  }
+}
+// Usage
+export default withAuth(async (req, res) => {
+  // Handler has access to req.user
+})
+```
+## Database Patterns
+### Query Optimization
+```typescript
+// PASS: GOOD: Select only needed columns
+const { data } = await supabase
+  .from('markets')
+  .select('id, name, status, volume')
+  .eq('status', 'active')
+  .order('volume', { ascending: false })
+  .limit(10)
+// FAIL: BAD: Select everything
+const { data } = await supabase
+  .from('markets')
+  .select('*')
+```
+### N+1 Query Prevention
+```typescript
+// FAIL: BAD: N+1 query problem
+const markets = await getMarkets()
+for (const market of markets) {
+  market.creator = await getUser(market.creator_id)  // N queries
+}
+// PASS: GOOD: Batch fetch
+const markets = await getMarkets()
+const creatorIds = markets.map(m => m.creator_id)
+const creators = await getUsers(creatorIds)  // 1 query
+const creatorMap = new Map(creators.map(c => [c.id, c]))
+markets.forEach(market => {
+  market.creator = creatorMap.get(market.creator_id)
+})
+```
+### Transaction Pattern
+```typescript
+async function createMarketWithPosition(
+  marketData: CreateMarketDto,
+  positionData: CreatePositionDto
+) {
+  // Use Supabase transaction
+  const { data, error } = await supabase.rpc('create_market_with_position', {
+    market_data: marketData,
+    position_data: positionData
+  })
+  if (error) throw new Error('Transaction failed')
+  return data
+}
+// SQL function in Supabase
+CREATE OR REPLACE FUNCTION create_market_with_position(
+  market_data jsonb,
+  position_data jsonb
+)
+RETURNS jsonb
+LANGUAGE plpgsql
+AS $$
+BEGIN
+  -- Start transaction automatically
+  INSERT INTO markets VALUES (market_data);
+  INSERT INTO positions VALUES (position_data);
+  RETURN jsonb_build_object('success', true);
+EXCEPTION
+  WHEN OTHERS THEN
+    -- Rollback happens automatically
+    RETURN jsonb_build_object('success', false, 'error', SQLERRM);
+END;
+$$;
+```
+## Caching Strategies
+### Redis Caching Layer
+```typescript
+class CachedMarketRepository implements MarketRepository {
+  constructor(
+    private baseRepo: MarketRepository,
+    private redis: RedisClient
+  ) {}
+  async findById(id: string): Promise<Market | null> {
+    // Check cache first
+    const cached = await this.redis.get(`market:${id}`)
+    if (cached) {
+      return JSON.parse(cached)
+    }
+    // Cache miss - fetch from database
+    const market = await this.baseRepo.findById(id)
+    if (market) {
+      // Cache for 5 minutes
+      await this.redis.setex(`market:${id}`, 300, JSON.stringify(market))
+    }
+    return market
+  }
+  async invalidateCache(id: string): Promise<void> {
+    await this.redis.del(`market:${id}`)
+  }
+}
+```
+### Cache-Aside Pattern
+```typescript
+async function getMarketWithCache(id: string): Promise<Market> {
+  const cacheKey = `market:${id}`
+  // Try cache
+  const cached = await redis.get(cacheKey)
+  if (cached) return JSON.parse(cached)
+  // Cache miss - fetch from DB
+  const market = await db.markets.findUnique({ where: { id } })
+  if (!market) throw new Error('Market not found')
+  // Update cache
+  await redis.setex(cacheKey, 300, JSON.stringify(market))
+  return market
+}
+```
+## Error Handling Patterns
+### Centralized Error Handler
+```typescript
+class ApiError extends Error {
+  constructor(
+    public statusCode: number,
+    public message: string,
+    public isOperational = true
+  ) {
+    super(message)
+    Object.setPrototypeOf(this, ApiError.prototype)
+  }
+}
+export function errorHandler(error: unknown, req: Request): Response {
+  if (error instanceof ApiError) {
+    return NextResponse.json({
+      success: false,
+      error: error.message
+    }, { status: error.statusCode })
+  }
+  if (error instanceof z.ZodError) {
+    return NextResponse.json({
+      success: false,
+      error: 'Validation failed',
+      details: error.errors
+    }, { status: 400 })
+  }
+  // Log unexpected errors
+  console.error('Unexpected error:', error)
+  return NextResponse.json({
+    success: false,
+    error: 'Internal server error'
+  }, { status: 500 })
+}
+// Usage
+export async function GET(request: Request) {
+  try {
+    const data = await fetchData()
+    return NextResponse.json({ success: true, data })
+  } catch (error) {
+    return errorHandler(error, request)
+  }
+}
+```
+### Retry with Exponential Backoff
+```typescript
+async function fetchWithRetry<T>(
+  fn: () => Promise<T>,
+  maxRetries = 3
+): Promise<T> {
+  let lastError: Error
+  for (let i = 0; i < maxRetries; i++) {
+    try {
+      return await fn()
+    } catch (error) {
+      lastError = error as Error
+      if (i < maxRetries - 1) {
+        // Exponential backoff: 1s, 2s, 4s
+        const delay = Math.pow(2, i) * 1000
+        await new Promise(resolve => setTimeout(resolve, delay))
+      }
+    }
+  }
+  throw lastError!
+}
+// Usage
+const data = await fetchWithRetry(() => fetchFromAPI())
+```
+## Authentication & Authorization
+### JWT Token Validation
+```typescript
+import jwt from 'jsonwebtoken'
+interface JWTPayload {
+  userId: string
+  email: string
+  role: 'admin' | 'user'
+}
+export function verifyToken(token: string): JWTPayload {
+  try {
+    const payload = jwt.verify(token, process.env.JWT_SECRET!) as JWTPayload
+    return payload
+  } catch (error) {
+    throw new ApiError(401, 'Invalid token')
+  }
+}
+export async function requireAuth(request: Request) {
+  const token = request.headers.get('authorization')?.replace('Bearer ', '')
+  if (!token) {
+    throw new ApiError(401, 'Missing authorization token')
+  }
+  return verifyToken(token)
+}
+// Usage in API route
+export async function GET(request: Request) {
+  const user = await requireAuth(request)
+  const data = await getDataForUser(user.userId)
+  return NextResponse.json({ success: true, data })
+}
+```
+### Role-Based Access Control
+```typescript
+type Permission = 'read' | 'write' | 'delete' | 'admin'
+interface User {
+  id: string
+  role: 'admin' | 'moderator' | 'user'
+}
+const rolePermissions: Record<User['role'], Permission[]> = {
+  admin: ['read', 'write', 'delete', 'admin'],
+  moderator: ['read', 'write', 'delete'],
+  user: ['read', 'write']
+}
+export function hasPermission(user: User, permission: Permission): boolean {
+  return rolePermissions[user.role].includes(permission)
+}
+export function requirePermission(permission: Permission) {
+  return (handler: (request: Request, user: User) => Promise<Response>) => {
+    return async (request: Request) => {
+      const user = await requireAuth(request)
+      if (!hasPermission(user, permission)) {
+        throw new ApiError(403, 'Insufficient permissions')
+      }
+      return handler(request, user)
+    }
+  }
+}
+// Usage - HOF wraps the handler
+export const DELETE = requirePermission('delete')(
+  async (request: Request, user: User) => {
+    // Handler receives authenticated user with verified permission
+    return new Response('Deleted', { status: 200 })
+  }
+)
+```
+## Rate Limiting
+### Simple In-Memory Rate Limiter
+```typescript
+class RateLimiter {
+  private requests = new Map<string, number[]>()
+  async checkLimit(
+    identifier: string,
+    maxRequests: number,
+    windowMs: number
+  ): Promise<boolean> {
+    const now = Date.now()
+    const requests = this.requests.get(identifier) || []
+    // Remove old requests outside window
+    const recentRequests = requests.filter(time => now - time < windowMs)
+    if (recentRequests.length >= maxRequests) {
+      return false  // Rate limit exceeded
+    }
+    // Add current request
+    recentRequests.push(now)
+    this.requests.set(identifier, recentRequests)
+    return true
+  }
+}
+const limiter = new RateLimiter()
+export async function GET(request: Request) {
+  const ip = request.headers.get('x-forwarded-for') || 'unknown'
+  const allowed = await limiter.checkLimit(ip, 100, 60000)  // 100 req/min
+  if (!allowed) {
+    return NextResponse.json({
+      error: 'Rate limit exceeded'
+    }, { status: 429 })
+  }
+  // Continue with request
+}
+```
+## Background Jobs & Queues
+### Simple Queue Pattern
+```typescript
+class JobQueue<T> {
+  private queue: T[] = []
+  private processing = false
+  async add(job: T): Promise<void> {
+    this.queue.push(job)
+    if (!this.processing) {
+      this.process()
+    }
+  }
+  private async process(): Promise<void> {
+    this.processing = true
+    while (this.queue.length > 0) {
+      const job = this.queue.shift()!
+      try {
+        await this.execute(job)
+      } catch (error) {
+        console.error('Job failed:', error)
+      }
+    }
+    this.processing = false
+  }
+  private async execute(job: T): Promise<void> {
+    // Job execution logic
+  }
+}
+// Usage for indexing markets
+interface IndexJob {
+  marketId: string
+}
+const indexQueue = new JobQueue<IndexJob>()
+export async function POST(request: Request) {
+  const { marketId } = await request.json()
+  // Add to queue instead of blocking
+  await indexQueue.add({ marketId })
+  return NextResponse.json({ success: true, message: 'Job queued' })
+}
+```
+## Logging & Monitoring
+### Structured Logging
+```typescript
+interface LogContext {
+  userId?: string
+  requestId?: string
+  method?: string
+  path?: string
+  [key: string]: unknown
+}
+class Logger {
+  log(level: 'info' | 'warn' | 'error', message: string, context?: LogContext) {
+    const entry = {
+      timestamp: new Date().toISOString(),
+      level,
+      message,
+      ...context
+    }
+    console.log(JSON.stringify(entry))
+  }
+  info(message: string, context?: LogContext) {
+    this.log('info', message, context)
+  }
+  warn(message: string, context?: LogContext) {
+    this.log('warn', message, context)
+  }
+  error(message: string, error: Error, context?: LogContext) {
+    this.log('error', message, {
+      ...context,
+      error: error.message,
+      stack: error.stack
+    })
+  }
+}
+const logger = new Logger()
+// Usage
+export async function GET(request: Request) {
+  const requestId = crypto.randomUUID()
+  logger.info('Fetching markets', {
+    requestId,
+    method: 'GET',
+    path: '/api/markets'
+  })
+  try {
+    const markets = await fetchMarkets()
+    return NextResponse.json({ success: true, data: markets })
+  } catch (error) {
+    logger.error('Failed to fetch markets', error as Error, { requestId })
+    return NextResponse.json({ error: 'Internal error' }, { status: 500 })
+  }
+}
+```
+**Remember**: Backend patterns enable scalable, maintainable server-side applications. Choose patterns that fit your complexity level.

.agents/skills/brainstorming/SKILL.md ADDED Viewed

	@@ -0,0 +1,164 @@

+---
+name: brainstorming
+description: "You MUST use this before any creative work - creating features, building components, adding functionality, or modifying behavior. Explores user intent, requirements and design before implementation."
+---
+# Brainstorming Ideas Into Designs
+Help turn ideas into fully formed designs and specs through natural collaborative dialogue.
+Start by understanding the current project context, then ask questions one at a time to refine the idea. Once you understand what you're building, present the design and get user approval.
+<HARD-GATE>
+Do NOT invoke any implementation skill, write any code, scaffold any project, or take any implementation action until you have presented a design and the user has approved it. This applies to EVERY project regardless of perceived simplicity.
+</HARD-GATE>
+## Anti-Pattern: "This Is Too Simple To Need A Design"
+Every project goes through this process. A todo list, a single-function utility, a config change — all of them. "Simple" projects are where unexamined assumptions cause the most wasted work. The design can be short (a few sentences for truly simple projects), but you MUST present it and get approval.
+## Checklist
+You MUST create a task for each of these items and complete them in order:
+1. **Explore project context** — check files, docs, recent commits
+2. **Offer visual companion** (if topic will involve visual questions) — this is its own message, not combined with a clarifying question. See the Visual Companion section below.
+3. **Ask clarifying questions** — one at a time, understand purpose/constraints/success criteria
+4. **Propose 2-3 approaches** — with trade-offs and your recommendation
+5. **Present design** — in sections scaled to their complexity, get user approval after each section
+6. **Write design doc** — save to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md` and commit
+7. **Spec self-review** — quick inline check for placeholders, contradictions, ambiguity, scope (see below)
+8. **User reviews written spec** — ask user to review the spec file before proceeding
+9. **Transition to implementation** — invoke writing-plans skill to create implementation plan
+## Process Flow
+```dot
+digraph brainstorming {
+    "Explore project context" [shape=box];
+    "Visual questions ahead?" [shape=diamond];
+    "Offer Visual Companion\n(own message, no other content)" [shape=box];
+    "Ask clarifying questions" [shape=box];
+    "Propose 2-3 approaches" [shape=box];
+    "Present design sections" [shape=box];
+    "User approves design?" [shape=diamond];
+    "Write design doc" [shape=box];
+    "Spec self-review\n(fix inline)" [shape=box];
+    "User reviews spec?" [shape=diamond];
+    "Invoke writing-plans skill" [shape=doublecircle];
+    "Explore project context" -> "Visual questions ahead?";
+    "Visual questions ahead?" -> "Offer Visual Companion\n(own message, no other content)" [label="yes"];
+    "Visual questions ahead?" -> "Ask clarifying questions" [label="no"];
+    "Offer Visual Companion\n(own message, no other content)" -> "Ask clarifying questions";
+    "Ask clarifying questions" -> "Propose 2-3 approaches";
+    "Propose 2-3 approaches" -> "Present design sections";
+    "Present design sections" -> "User approves design?";
+    "User approves design?" -> "Present design sections" [label="no, revise"];
+    "User approves design?" -> "Write design doc" [label="yes"];
+    "Write design doc" -> "Spec self-review\n(fix inline)";
+    "Spec self-review\n(fix inline)" -> "User reviews spec?";
+    "User reviews spec?" -> "Write design doc" [label="changes requested"];
+    "User reviews spec?" -> "Invoke writing-plans skill" [label="approved"];
+}
+```
+**The terminal state is invoking writing-plans.** Do NOT invoke frontend-design, mcp-builder, or any other implementation skill. The ONLY skill you invoke after brainstorming is writing-plans.
+## The Process
+**Understanding the idea:**
+- Check out the current project state first (files, docs, recent commits)
+- Before asking detailed questions, assess scope: if the request describes multiple independent subsystems (e.g., "build a platform with chat, file storage, billing, and analytics"), flag this immediately. Don't spend questions refining details of a project that needs to be decomposed first.
+- If the project is too large for a single spec, help the user decompose into sub-projects: what are the independent pieces, how do they relate, what order should they be built? Then brainstorm the first sub-project through the normal design flow. Each sub-project gets its own spec → plan → implementation cycle.
+- For appropriately-scoped projects, ask questions one at a time to refine the idea
+- Prefer multiple choice questions when possible, but open-ended is fine too
+- Only one question per message - if a topic needs more exploration, break it into multiple questions
+- Focus on understanding: purpose, constraints, success criteria
+**Exploring approaches:**
+- Propose 2-3 different approaches with trade-offs
+- Present options conversationally with your recommendation and reasoning
+- Lead with your recommended option and explain why
+**Presenting the design:**
+- Once you believe you understand what you're building, present the design
+- Scale each section to its complexity: a few sentences if straightforward, up to 200-300 words if nuanced
+- Ask after each section whether it looks right so far
+- Cover: architecture, components, data flow, error handling, testing
+- Be ready to go back and clarify if something doesn't make sense
+**Design for isolation and clarity:**
+- Break the system into smaller units that each have one clear purpose, communicate through well-defined interfaces, and can be understood and tested independently
+- For each unit, you should be able to answer: what does it do, how do you use it, and what does it depend on?
+- Can someone understand what a unit does without reading its internals? Can you change the internals without breaking consumers? If not, the boundaries need work.
+- Smaller, well-bounded units are also easier for you to work with - you reason better about code you can hold in context at once, and your edits are more reliable when files are focused. When a file grows large, that's often a signal that it's doing too much.
+**Working in existing codebases:**
+- Explore the current structure before proposing changes. Follow existing patterns.
+- Where existing code has problems that affect the work (e.g., a file that's grown too large, unclear boundaries, tangled responsibilities), include targeted improvements as part of the design - the way a good developer improves code they're working in.
+- Don't propose unrelated refactoring. Stay focused on what serves the current goal.
+## After the Design
+**Documentation:**
+- Write the validated design (spec) to `docs/superpowers/specs/YYYY-MM-DD-<topic>-design.md`
+  - (User preferences for spec location override this default)
+- Use elements-of-style:writing-clearly-and-concisely skill if available
+- Commit the design document to git
+**Spec Self-Review:**
+After writing the spec document, look at it with fresh eyes:
+1. **Placeholder scan:** Any "TBD", "TODO", incomplete sections, or vague requirements? Fix them.
+2. **Internal consistency:** Do any sections contradict each other? Does the architecture match the feature descriptions?
+3. **Scope check:** Is this focused enough for a single implementation plan, or does it need decomposition?
+4. **Ambiguity check:** Could any requirement be interpreted two different ways? If so, pick one and make it explicit.
+Fix any issues inline. No need to re-review — just fix and move on.
+**User Review Gate:**
+After the spec review loop passes, ask the user to review the written spec before proceeding:
+> "Spec written and committed to `<path>`. Please review it and let me know if you want to make any changes before we start writing out the implementation plan."
+Wait for the user's response. If they request changes, make them and re-run the spec review loop. Only proceed once the user approves.
+**Implementation:**
+- Invoke the writing-plans skill to create a detailed implementation plan
+- Do NOT invoke any other skill. writing-plans is the next step.
+## Key Principles
+- **One question at a time** - Don't overwhelm with multiple questions
+- **Multiple choice preferred** - Easier to answer than open-ended when possible
+- **YAGNI ruthlessly** - Remove unnecessary features from all designs
+- **Explore alternatives** - Always propose 2-3 approaches before settling
+- **Incremental validation** - Present design, get approval before moving on
+- **Be flexible** - Go back and clarify when something doesn't make sense
+## Visual Companion
+A browser-based companion for showing mockups, diagrams, and visual options during brainstorming. Available as a tool — not a mode. Accepting the companion means it's available for questions that benefit from visual treatment; it does NOT mean every question goes through the browser.
+**Offering the companion:** When you anticipate that upcoming questions will involve visual content (mockups, layouts, diagrams), offer it once for consent:
+> "Some of what we're working on might be easier to explain if I can show it to you in a web browser. I can put together mockups, diagrams, comparisons, and other visuals as we go. This feature is still new and can be token-intensive. Want to try it? (Requires opening a local URL)"
+**This offer MUST be its own message.** Do not combine it with clarifying questions, context summaries, or any other content. The message should contain ONLY the offer above and nothing else. Wait for the user's response before continuing. If they decline, proceed with text-only brainstorming.
+**Per-question decision:** Even after the user accepts, decide FOR EACH QUESTION whether to use the browser or the terminal. The test: **would the user understand this better by seeing it than reading it?**
+- **Use the browser** for content that IS visual — mockups, wireframes, layout comparisons, architecture diagrams, side-by-side visual designs
+- **Use the terminal** for content that is text — requirements questions, conceptual choices, tradeoff lists, A/B/C/D text options, scope decisions
+A question about a UI topic is not automatically a visual question. "What does personality mean in this context?" is a conceptual question — use the terminal. "Which wizard layout works better?" is a visual question — use the browser.
+If they agree to the companion, read the detailed guide before proceeding:
+`skills/brainstorming/visual-companion.md`

.agents/skills/brainstorming/scripts/frame-template.html ADDED Viewed

	@@ -0,0 +1,214 @@

+<!DOCTYPE html>
+<html>
+<head>
+  <meta charset="utf-8">
+  <title>Superpowers Brainstorming</title>
+  <style>
+    /*
+     * BRAINSTORM COMPANION FRAME TEMPLATE
+     *
+     * This template provides a consistent frame with:
+     * - OS-aware light/dark theming
+     * - Fixed header and selection indicator bar
+     * - Scrollable main content area
+     * - CSS helpers for common UI patterns
+     *
+     * Content is injected via placeholder comment in #claude-content.
+     */
+    * { box-sizing: border-box; margin: 0; padding: 0; }
+    html, body { height: 100%; overflow: hidden; }
+    /* ===== THEME VARIABLES ===== */
+    :root {
+      --bg-primary: #f5f5f7;
+      --bg-secondary: #ffffff;
+      --bg-tertiary: #e5e5e7;
+      --border: #d1d1d6;
+      --text-primary: #1d1d1f;
+      --text-secondary: #86868b;
+      --text-tertiary: #aeaeb2;
+      --accent: #0071e3;
+      --accent-hover: #0077ed;
+      --success: #34c759;
+      --warning: #ff9f0a;
+      --error: #ff3b30;
+      --selected-bg: #e8f4fd;
+      --selected-border: #0071e3;
+    }
+    @media (prefers-color-scheme: dark) {
+      :root {
+        --bg-primary: #1d1d1f;
+        --bg-secondary: #2d2d2f;
+        --bg-tertiary: #3d3d3f;
+        --border: #424245;
+        --text-primary: #f5f5f7;
+        --text-secondary: #86868b;
+        --text-tertiary: #636366;
+        --accent: #0a84ff;
+        --accent-hover: #409cff;
+        --selected-bg: rgba(10, 132, 255, 0.15);
+        --selected-border: #0a84ff;
+      }
+    }
+    body {
+      font-family: system-ui, -apple-system, BlinkMacSystemFont, sans-serif;
+      background: var(--bg-primary);
+      color: var(--text-primary);
+      display: flex;
+      flex-direction: column;
+      line-height: 1.5;
+    }
+    /* ===== FRAME STRUCTURE ===== */
+    .header {
+      background: var(--bg-secondary);
+      padding: 0.5rem 1.5rem;
+      display: flex;
+      justify-content: space-between;
+      align-items: center;
+      border-bottom: 1px solid var(--border);
+      flex-shrink: 0;
+    }
+    .header h1 { font-size: 0.85rem; font-weight: 500; color: var(--text-secondary); }
+    .header .status { font-size: 0.7rem; color: var(--success); display: flex; align-items: center; gap: 0.4rem; }
+    .header .status::before { content: ''; width: 6px; height: 6px; background: var(--success); border-radius: 50%; }
+    .main { flex: 1; overflow-y: auto; }
+    #claude-content { padding: 2rem; min-height: 100%; }
+    .indicator-bar {
+      background: var(--bg-secondary);
+      border-top: 1px solid var(--border);
+      padding: 0.5rem 1.5rem;
+      flex-shrink: 0;
+      text-align: center;
+    }
+    .indicator-bar span {
+      font-size: 0.75rem;
+      color: var(--text-secondary);
+    }
+    .indicator-bar .selected-text {
+      color: var(--accent);
+      font-weight: 500;
+    }
+    /* ===== TYPOGRAPHY ===== */
+    h2 { font-size: 1.5rem; font-weight: 600; margin-bottom: 0.5rem; }
+    h3 { font-size: 1.1rem; font-weight: 600; margin-bottom: 0.25rem; }
+    .subtitle { color: var(--text-secondary); margin-bottom: 1.5rem; }
+    .section { margin-bottom: 2rem; }
+    .label { font-size: 0.7rem; color: var(--text-secondary); text-transform: uppercase; letter-spacing: 0.05em; margin-bottom: 0.5rem; }
+    /* ===== OPTIONS (for A/B/C choices) ===== */
+    .options { display: flex; flex-direction: column; gap: 0.75rem; }
+    .option {
+      background: var(--bg-secondary);
+      border: 2px solid var(--border);
+      border-radius: 12px;
+      padding: 1rem 1.25rem;
+      cursor: pointer;
+      transition: all 0.15s ease;
+      display: flex;
+      align-items: flex-start;
+      gap: 1rem;
+    }
+    .option:hover { border-color: var(--accent); }
+    .option.selected { background: var(--selected-bg); border-color: var(--selected-border); }
+    .option .letter {
+      background: var(--bg-tertiary);
+      color: var(--text-secondary);
+      width: 1.75rem; height: 1.75rem;
+      border-radius: 6px;
+      display: flex; align-items: center; justify-content: center;
+      font-weight: 600; font-size: 0.85rem; flex-shrink: 0;
+    }
+    .option.selected .letter { background: var(--accent); color: white; }
+    .option .content { flex: 1; }
+    .option .content h3 { font-size: 0.95rem; margin-bottom: 0.15rem; }
+    .option .content p { color: var(--text-secondary); font-size: 0.85rem; margin: 0; }
+    /* ===== CARDS (for showing designs/mockups) ===== */
+    .cards { display: grid; grid-template-columns: repeat(auto-fit, minmax(280px, 1fr)); gap: 1rem; }
+    .card {
+      background: var(--bg-secondary);
+      border: 1px solid var(--border);
+      border-radius: 12px;
+      overflow: hidden;
+      cursor: pointer;
+      transition: all 0.15s ease;
+    }
+    .card:hover { border-color: var(--accent); transform: translateY(-2px); box-shadow: 0 4px 12px rgba(0,0,0,0.1); }
+    .card.selected { border-color: var(--selected-border); border-width: 2px; }
+    .card-image { background: var(--bg-tertiary); aspect-ratio: 16/10; display: flex; align-items: center; justify-content: center; }
+    .card-body { padding: 1rem; }
+    .card-body h3 { margin-bottom: 0.25rem; }
+    .card-body p { color: var(--text-secondary); font-size: 0.85rem; }
+    /* ===== MOCKUP CONTAINER ===== */
+    .mockup {
+      background: var(--bg-secondary);
+      border: 1px solid var(--border);
+      border-radius: 12px;
+      overflow: hidden;
+      margin-bottom: 1.5rem;
+    }
+    .mockup-header {
+      background: var(--bg-tertiary);
+      padding: 0.5rem 1rem;
+      font-size: 0.75rem;
+      color: var(--text-secondary);
+      border-bottom: 1px solid var(--border);
+    }
+    .mockup-body { padding: 1.5rem; }
+    /* ===== SPLIT VIEW (side-by-side comparison) ===== */
+    .split { display: grid; grid-template-columns: 1fr 1fr; gap: 1.5rem; }
+    @media (max-width: 700px) { .split { grid-template-columns: 1fr; } }
+    /* ===== PROS/CONS ===== */
+    .pros-cons { display: grid; grid-template-columns: 1fr 1fr; gap: 1rem; margin: 1rem 0; }
+    .pros, .cons { background: var(--bg-secondary); border-radius: 8px; padding: 1rem; }
+    .pros h4 { color: var(--success); font-size: 0.85rem; margin-bottom: 0.5rem; }
+    .cons h4 { color: var(--error); font-size: 0.85rem; margin-bottom: 0.5rem; }
+    .pros ul, .cons ul { margin-left: 1.25rem; font-size: 0.85rem; color: var(--text-secondary); }
+    .pros li, .cons li { margin-bottom: 0.25rem; }
+    /* ===== PLACEHOLDER (for mockup areas) ===== */
+    .placeholder {
+      background: var(--bg-tertiary);
+      border: 2px dashed var(--border);
+      border-radius: 8px;
+      padding: 2rem;
+      text-align: center;
+      color: var(--text-tertiary);
+    }
+    /* ===== INLINE MOCKUP ELEMENTS ===== */
+    .mock-nav { background: var(--accent); color: white; padding: 0.75rem 1rem; display: flex; gap: 1.5rem; font-size: 0.9rem; }
+    .mock-sidebar { background: var(--bg-tertiary); padding: 1rem; min-width: 180px; }
+    .mock-content { padding: 1.5rem; flex: 1; }
+    .mock-button { background: var(--accent); color: white; border: none; padding: 0.5rem 1rem; border-radius: 6px; font-size: 0.85rem; }
+    .mock-input { background: var(--bg-primary); border: 1px solid var(--border); border-radius: 6px; padding: 0.5rem; width: 100%; }
+  </style>
+</head>
+<body>
+  <div class="header">
+    <h1><a href="https://github.com/obra/superpowers" style="color: inherit; text-decoration: none;">Superpowers Brainstorming</a></h1>
+    <div class="status">Connected</div>
+  </div>
+  <div class="main">
+    <div id="claude-content">
+      <!-- CONTENT -->
+    </div>
+  </div>
+  <div class="indicator-bar">
+    <span id="indicator-text">Click an option above, then return to the terminal</span>
+  </div>
+</body>
+</html>

.agents/skills/brainstorming/scripts/helper.js ADDED Viewed

	@@ -0,0 +1,88 @@

+(function() {
+  const WS_URL = 'ws://' + window.location.host;
+  let ws = null;
+  let eventQueue = [];
+  function connect() {
+    ws = new WebSocket(WS_URL);
+    ws.onopen = () => {
+      eventQueue.forEach(e => ws.send(JSON.stringify(e)));
+      eventQueue = [];
+    };
+    ws.onmessage = (msg) => {
+      const data = JSON.parse(msg.data);
+      if (data.type === 'reload') {
+        window.location.reload();
+      }
+    };
+    ws.onclose = () => {
+      setTimeout(connect, 1000);
+    };
+  }
+  function sendEvent(event) {
+    event.timestamp = Date.now();
+    if (ws && ws.readyState === WebSocket.OPEN) {
+      ws.send(JSON.stringify(event));
+    } else {
+      eventQueue.push(event);
+    }
+  }
+  // Capture clicks on choice elements
+  document.addEventListener('click', (e) => {
+    const target = e.target.closest('[data-choice]');
+    if (!target) return;
+    sendEvent({
+      type: 'click',
+      text: target.textContent.trim(),
+      choice: target.dataset.choice,
+      id: target.id || null
+    });
+    // Update indicator bar (defer so toggleSelect runs first)
+    setTimeout(() => {
+      const indicator = document.getElementById('indicator-text');
+      if (!indicator) return;
+      const container = target.closest('.options') || target.closest('.cards');
+      const selected = container ? container.querySelectorAll('.selected') : [];
+      if (selected.length === 0) {
+        indicator.textContent = 'Click an option above, then return to the terminal';
+      } else if (selected.length === 1) {
+        const label = selected[0].querySelector('h3, .content h3, .card-body h3')?.textContent?.trim() || selected[0].dataset.choice;
+        indicator.innerHTML = '<span class="selected-text">' + label + ' selected</span> — return to terminal to continue';
+      } else {
+        indicator.innerHTML = '<span class="selected-text">' + selected.length + ' selected</span> — return to terminal to continue';
+      }
+    }, 0);
+  });
+  // Frame UI: selection tracking
+  window.selectedChoice = null;
+  window.toggleSelect = function(el) {
+    const container = el.closest('.options') || el.closest('.cards');
+    const multi = container && container.dataset.multiselect !== undefined;
+    if (container && !multi) {
+      container.querySelectorAll('.option, .card').forEach(o => o.classList.remove('selected'));
+    }
+    if (multi) {
+      el.classList.toggle('selected');
+    } else {
+      el.classList.add('selected');
+    }
+    window.selectedChoice = el.dataset.choice;
+  };
+  // Expose API for explicit use
+  window.brainstorm = {
+    send: sendEvent,
+    choice: (value, metadata = {}) => sendEvent({ type: 'choice', value, ...metadata })
+  };
+  connect();
+})();

.agents/skills/brainstorming/scripts/server.cjs ADDED Viewed

	@@ -0,0 +1,354 @@

+const crypto = require('crypto');
+const http = require('http');
+const fs = require('fs');
+const path = require('path');
+// ========== WebSocket Protocol (RFC 6455) ==========
+const OPCODES = { TEXT: 0x01, CLOSE: 0x08, PING: 0x09, PONG: 0x0A };
+const WS_MAGIC = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';
+function computeAcceptKey(clientKey) {
+  return crypto.createHash('sha1').update(clientKey + WS_MAGIC).digest('base64');
+}
+function encodeFrame(opcode, payload) {
+  const fin = 0x80;
+  const len = payload.length;
+  let header;
+  if (len < 126) {
+    header = Buffer.alloc(2);
+    header[0] = fin | opcode;
+    header[1] = len;
+  } else if (len < 65536) {
+    header = Buffer.alloc(4);
+    header[0] = fin | opcode;
+    header[1] = 126;
+    header.writeUInt16BE(len, 2);
+  } else {
+    header = Buffer.alloc(10);
+    header[0] = fin | opcode;
+    header[1] = 127;
+    header.writeBigUInt64BE(BigInt(len), 2);
+  }
+  return Buffer.concat([header, payload]);
+}
+function decodeFrame(buffer) {
+  if (buffer.length < 2) return null;
+  const secondByte = buffer[1];
+  const opcode = buffer[0] & 0x0F;
+  const masked = (secondByte & 0x80) !== 0;
+  let payloadLen = secondByte & 0x7F;
+  let offset = 2;
+  if (!masked) throw new Error('Client frames must be masked');
+  if (payloadLen === 126) {
+    if (buffer.length < 4) return null;
+    payloadLen = buffer.readUInt16BE(2);
+    offset = 4;
+  } else if (payloadLen === 127) {
+    if (buffer.length < 10) return null;
+    payloadLen = Number(buffer.readBigUInt64BE(2));
+    offset = 10;
+  }
+  const maskOffset = offset;
+  const dataOffset = offset + 4;
+  const totalLen = dataOffset + payloadLen;
+  if (buffer.length < totalLen) return null;
+  const mask = buffer.slice(maskOffset, dataOffset);
+  const data = Buffer.alloc(payloadLen);
+  for (let i = 0; i < payloadLen; i++) {
+    data[i] = buffer[dataOffset + i] ^ mask[i % 4];
+  }
+  return { opcode, payload: data, bytesConsumed: totalLen };
+}
+// ========== Configuration ==========
+const PORT = process.env.BRAINSTORM_PORT || (49152 + Math.floor(Math.random() * 16383));
+const HOST = process.env.BRAINSTORM_HOST || '127.0.0.1';
+const URL_HOST = process.env.BRAINSTORM_URL_HOST || (HOST === '127.0.0.1' ? 'localhost' : HOST);
+const SESSION_DIR = process.env.BRAINSTORM_DIR || '/tmp/brainstorm';
+const CONTENT_DIR = path.join(SESSION_DIR, 'content');
+const STATE_DIR = path.join(SESSION_DIR, 'state');
+let ownerPid = process.env.BRAINSTORM_OWNER_PID ? Number(process.env.BRAINSTORM_OWNER_PID) : null;
+const MIME_TYPES = {
+  '.html': 'text/html', '.css': 'text/css', '.js': 'application/javascript',
+  '.json': 'application/json', '.png': 'image/png', '.jpg': 'image/jpeg',
+  '.jpeg': 'image/jpeg', '.gif': 'image/gif', '.svg': 'image/svg+xml'
+};
+// ========== Templates and Constants ==========
+const WAITING_PAGE = `<!DOCTYPE html>
+<html>
+<head><meta charset="utf-8"><title>Brainstorm Companion</title>
+<style>body { font-family: system-ui, sans-serif; padding: 2rem; max-width: 800px; margin: 0 auto; }
+h1 { color: #333; } p { color: #666; }</style>
+</head>
+<body><h1>Brainstorm Companion</h1>
+<p>Waiting for the agent to push a screen...</p></body></html>`;
+const frameTemplate = fs.readFileSync(path.join(__dirname, 'frame-template.html'), 'utf-8');
+const helperScript = fs.readFileSync(path.join(__dirname, 'helper.js'), 'utf-8');
+const helperInjection = '<script>\n' + helperScript + '\n</script>';
+// ========== Helper Functions ==========
+function isFullDocument(html) {
+  const trimmed = html.trimStart().toLowerCase();
+  return trimmed.startsWith('<!doctype') || trimmed.startsWith('<html');
+}
+function wrapInFrame(content) {
+  return frameTemplate.replace('<!-- CONTENT -->', content);
+}
+function getNewestScreen() {
+  const files = fs.readdirSync(CONTENT_DIR)
+    .filter(f => f.endsWith('.html'))
+    .map(f => {
+      const fp = path.join(CONTENT_DIR, f);
+      return { path: fp, mtime: fs.statSync(fp).mtime.getTime() };
+    })
+    .sort((a, b) => b.mtime - a.mtime);
+  return files.length > 0 ? files[0].path : null;
+}
+// ========== HTTP Request Handler ==========
+function handleRequest(req, res) {
+  touchActivity();
+  if (req.method === 'GET' && req.url === '/') {
+    const screenFile = getNewestScreen();
+    let html = screenFile
+      ? (raw => isFullDocument(raw) ? raw : wrapInFrame(raw))(fs.readFileSync(screenFile, 'utf-8'))
+      : WAITING_PAGE;
+    if (html.includes('</body>')) {
+      html = html.replace('</body>', helperInjection + '\n</body>');
+    } else {
+      html += helperInjection;
+    }
+    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
+    res.end(html);
+  } else if (req.method === 'GET' && req.url.startsWith('/files/')) {
+    const fileName = req.url.slice(7);
+    const filePath = path.join(CONTENT_DIR, path.basename(fileName));
+    if (!fs.existsSync(filePath)) {
+      res.writeHead(404);
+      res.end('Not found');
+      return;
+    }
+    const ext = path.extname(filePath).toLowerCase();
+    const contentType = MIME_TYPES[ext] || 'application/octet-stream';
+    res.writeHead(200, { 'Content-Type': contentType });
+    res.end(fs.readFileSync(filePath));
+  } else {
+    res.writeHead(404);
+    res.end('Not found');
+  }
+}
+// ========== WebSocket Connection Handling ==========
+const clients = new Set();
+function handleUpgrade(req, socket) {
+  const key = req.headers['sec-websocket-key'];
+  if (!key) { socket.destroy(); return; }
+  const accept = computeAcceptKey(key);
+  socket.write(
+    'HTTP/1.1 101 Switching Protocols\r\n' +
+    'Upgrade: websocket\r\n' +
+    'Connection: Upgrade\r\n' +
+    'Sec-WebSocket-Accept: ' + accept + '\r\n\r\n'
+  );
+  let buffer = Buffer.alloc(0);
+  clients.add(socket);
+  socket.on('data', (chunk) => {
+    buffer = Buffer.concat([buffer, chunk]);
+    while (buffer.length > 0) {
+      let result;
+      try {
+        result = decodeFrame(buffer);
+      } catch (e) {
+        socket.end(encodeFrame(OPCODES.CLOSE, Buffer.alloc(0)));
+        clients.delete(socket);
+        return;
+      }
+      if (!result) break;
+      buffer = buffer.slice(result.bytesConsumed);
+      switch (result.opcode) {
+        case OPCODES.TEXT:
+          handleMessage(result.payload.toString());
+          break;
+        case OPCODES.CLOSE:
+          socket.end(encodeFrame(OPCODES.CLOSE, Buffer.alloc(0)));
+          clients.delete(socket);
+          return;
+        case OPCODES.PING:
+          socket.write(encodeFrame(OPCODES.PONG, result.payload));
+          break;
+        case OPCODES.PONG:
+          break;
+        default: {
+          const closeBuf = Buffer.alloc(2);
+          closeBuf.writeUInt16BE(1003);
+          socket.end(encodeFrame(OPCODES.CLOSE, closeBuf));
+          clients.delete(socket);
+          return;
+        }
+      }
+    }
+  });
+  socket.on('close', () => clients.delete(socket));
+  socket.on('error', () => clients.delete(socket));
+}
+function handleMessage(text) {
+  let event;
+  try {
+    event = JSON.parse(text);
+  } catch (e) {
+    console.error('Failed to parse WebSocket message:', e.message);
+    return;
+  }
+  touchActivity();
+  console.log(JSON.stringify({ source: 'user-event', ...event }));
+  if (event.choice) {
+    const eventsFile = path.join(STATE_DIR, 'events');
+    fs.appendFileSync(eventsFile, JSON.stringify(event) + '\n');
+  }
+}
+function broadcast(msg) {
+  const frame = encodeFrame(OPCODES.TEXT, Buffer.from(JSON.stringify(msg)));
+  for (const socket of clients) {
+    try { socket.write(frame); } catch (e) { clients.delete(socket); }
+  }
+}
+// ========== Activity Tracking ==========
+const IDLE_TIMEOUT_MS = 30 * 60 * 1000; // 30 minutes
+let lastActivity = Date.now();
+function touchActivity() {
+  lastActivity = Date.now();
+}
+// ========== File Watching ==========
+const debounceTimers = new Map();
+// ========== Server Startup ==========
+function startServer() {
+  if (!fs.existsSync(CONTENT_DIR)) fs.mkdirSync(CONTENT_DIR, { recursive: true });
+  if (!fs.existsSync(STATE_DIR)) fs.mkdirSync(STATE_DIR, { recursive: true });
+  // Track known files to distinguish new screens from updates.
+  // macOS fs.watch reports 'rename' for both new files and overwrites,
+  // so we can't rely on eventType alone.
+  const knownFiles = new Set(
+    fs.readdirSync(CONTENT_DIR).filter(f => f.endsWith('.html'))
+  );
+  const server = http.createServer(handleRequest);
+  server.on('upgrade', handleUpgrade);
+  const watcher = fs.watch(CONTENT_DIR, (eventType, filename) => {
+    if (!filename || !filename.endsWith('.html')) return;
+    if (debounceTimers.has(filename)) clearTimeout(debounceTimers.get(filename));
+    debounceTimers.set(filename, setTimeout(() => {
+      debounceTimers.delete(filename);
+      const filePath = path.join(CONTENT_DIR, filename);
+      if (!fs.existsSync(filePath)) return; // file was deleted
+      touchActivity();
+      if (!knownFiles.has(filename)) {
+        knownFiles.add(filename);
+        const eventsFile = path.join(STATE_DIR, 'events');
+        if (fs.existsSync(eventsFile)) fs.unlinkSync(eventsFile);
+        console.log(JSON.stringify({ type: 'screen-added', file: filePath }));
+      } else {
+        console.log(JSON.stringify({ type: 'screen-updated', file: filePath }));
+      }
+      broadcast({ type: 'reload' });
+    }, 100));
+  });
+  watcher.on('error', (err) => console.error('fs.watch error:', err.message));
+  function shutdown(reason) {
+    console.log(JSON.stringify({ type: 'server-stopped', reason }));
+    const infoFile = path.join(STATE_DIR, 'server-info');
+    if (fs.existsSync(infoFile)) fs.unlinkSync(infoFile);
+    fs.writeFileSync(
+      path.join(STATE_DIR, 'server-stopped'),
+      JSON.stringify({ reason, timestamp: Date.now() }) + '\n'
+    );
+    watcher.close();
+    clearInterval(lifecycleCheck);
+    server.close(() => process.exit(0));
+  }
+  function ownerAlive() {
+    if (!ownerPid) return true;
+    try { process.kill(ownerPid, 0); return true; } catch (e) { return e.code === 'EPERM'; }
+  }
+  // Check every 60s: exit if owner process died or idle for 30 minutes
+  const lifecycleCheck = setInterval(() => {
+    if (!ownerAlive()) shutdown('owner process exited');
+    else if (Date.now() - lastActivity > IDLE_TIMEOUT_MS) shutdown('idle timeout');
+  }, 60 * 1000);
+  lifecycleCheck.unref();
+  // Validate owner PID at startup. If it's already dead, the PID resolution
+  // was wrong (common on WSL, Tailscale SSH, and cross-user scenarios).
+  // Disable monitoring and rely on the idle timeout instead.
+  if (ownerPid) {
+    try { process.kill(ownerPid, 0); }
+    catch (e) {
+      if (e.code !== 'EPERM') {
+        console.log(JSON.stringify({ type: 'owner-pid-invalid', pid: ownerPid, reason: 'dead at startup' }));
+        ownerPid = null;
+      }
+    }
+  }
+  server.listen(PORT, HOST, () => {
+    const info = JSON.stringify({
+      type: 'server-started', port: Number(PORT), host: HOST,
+      url_host: URL_HOST, url: 'http://' + URL_HOST + ':' + PORT,
+      screen_dir: CONTENT_DIR, state_dir: STATE_DIR
+    });
+    console.log(info);
+    fs.writeFileSync(path.join(STATE_DIR, 'server-info'), info + '\n');
+  });
+}
+if (require.main === module) {
+  startServer();
+}
+module.exports = { computeAcceptKey, encodeFrame, decodeFrame, OPCODES };

.agents/skills/brainstorming/scripts/start-server.sh ADDED Viewed

	@@ -0,0 +1,148 @@

+#!/usr/bin/env bash
+# Start the brainstorm server and output connection info
+# Usage: start-server.sh [--project-dir <path>] [--host <bind-host>] [--url-host <display-host>] [--foreground] [--background]
+#
+# Starts server on a random high port, outputs JSON with URL.
+# Each session gets its own directory to avoid conflicts.
+#
+# Options:
+#   --project-dir <path>  Store session files under <path>/.superpowers/brainstorm/
+#                         instead of /tmp. Files persist after server stops.
+#   --host <bind-host>    Host/interface to bind (default: 127.0.0.1).
+#                         Use 0.0.0.0 in remote/containerized environments.
+#   --url-host <host>     Hostname shown in returned URL JSON.
+#   --foreground          Run server in the current terminal (no backgrounding).
+#   --background          Force background mode (overrides Codex auto-foreground).
+SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
+# Parse arguments
+PROJECT_DIR=""
+FOREGROUND="false"
+FORCE_BACKGROUND="false"
+BIND_HOST="127.0.0.1"
+URL_HOST=""
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --project-dir)
+      PROJECT_DIR="$2"
+      shift 2
+      ;;
+    --host)
+      BIND_HOST="$2"
+      shift 2
+      ;;
+    --url-host)
+      URL_HOST="$2"
+      shift 2
+      ;;
+    --foreground|--no-daemon)
+      FOREGROUND="true"
+      shift
+      ;;
+    --background|--daemon)
+      FORCE_BACKGROUND="true"
+      shift
+      ;;
+    *)
+      echo "{\"error\": \"Unknown argument: $1\"}"
+      exit 1
+      ;;
+  esac
+done
+if [[ -z "$URL_HOST" ]]; then
+  if [[ "$BIND_HOST" == "127.0.0.1" || "$BIND_HOST" == "localhost" ]]; then
+    URL_HOST="localhost"
+  else
+    URL_HOST="$BIND_HOST"
+  fi
+fi
+# Some environments reap detached/background processes. Auto-foreground when detected.
+if [[ -n "${CODEX_CI:-}" && "$FOREGROUND" != "true" && "$FORCE_BACKGROUND" != "true" ]]; then
+  FOREGROUND="true"
+fi
+# Windows/Git Bash reaps nohup background processes. Auto-foreground when detected.
+if [[ "$FOREGROUND" != "true" && "$FORCE_BACKGROUND" != "true" ]]; then
+  case "${OSTYPE:-}" in
+    msys*|cygwin*|mingw*) FOREGROUND="true" ;;
+  esac
+  if [[ -n "${MSYSTEM:-}" ]]; then
+    FOREGROUND="true"
+  fi
+fi
+# Generate unique session directory
+SESSION_ID="$$-$(date +%s)"
+if [[ -n "$PROJECT_DIR" ]]; then
+  SESSION_DIR="${PROJECT_DIR}/.superpowers/brainstorm/${SESSION_ID}"
+else
+  SESSION_DIR="/tmp/brainstorm-${SESSION_ID}"
+fi
+STATE_DIR="${SESSION_DIR}/state"
+PID_FILE="${STATE_DIR}/server.pid"
+LOG_FILE="${STATE_DIR}/server.log"
+# Create fresh session directory with content and state peers
+mkdir -p "${SESSION_DIR}/content" "$STATE_DIR"
+# Kill any existing server
+if [[ -f "$PID_FILE" ]]; then
+  old_pid=$(cat "$PID_FILE")
+  kill "$old_pid" 2>/dev/null
+  rm -f "$PID_FILE"
+fi
+cd "$SCRIPT_DIR"
+# Resolve the harness PID (grandparent of this script).
+# $PPID is the ephemeral shell the harness spawned to run us — it dies
+# when this script exits. The harness itself is $PPID's parent.
+OWNER_PID="$(ps -o ppid= -p "$PPID" 2>/dev/null | tr -d ' ')"
+if [[ -z "$OWNER_PID" || "$OWNER_PID" == "1" ]]; then
+  OWNER_PID="$PPID"
+fi
+# Foreground mode for environments that reap detached/background processes.
+if [[ "$FOREGROUND" == "true" ]]; then
+  echo "$$" > "$PID_FILE"
+  env BRAINSTORM_DIR="$SESSION_DIR" BRAINSTORM_HOST="$BIND_HOST" BRAINSTORM_URL_HOST="$URL_HOST" BRAINSTORM_OWNER_PID="$OWNER_PID" node server.cjs
+  exit $?
+fi
+# Start server, capturing output to log file
+# Use nohup to survive shell exit; disown to remove from job table
+nohup env BRAINSTORM_DIR="$SESSION_DIR" BRAINSTORM_HOST="$BIND_HOST" BRAINSTORM_URL_HOST="$URL_HOST" BRAINSTORM_OWNER_PID="$OWNER_PID" node server.cjs > "$LOG_FILE" 2>&1 &
+SERVER_PID=$!
+disown "$SERVER_PID" 2>/dev/null
+echo "$SERVER_PID" > "$PID_FILE"
+# Wait for server-started message (check log file)
+for i in {1..50}; do
+  if grep -q "server-started" "$LOG_FILE" 2>/dev/null; then
+    # Verify server is still alive after a short window (catches process reapers)
+    alive="true"
+    for _ in {1..20}; do
+      if ! kill -0 "$SERVER_PID" 2>/dev/null; then
+        alive="false"
+        break
+      fi
+      sleep 0.1
+    done
+    if [[ "$alive" != "true" ]]; then
+      echo "{\"error\": \"Server started but was killed. Retry in a persistent terminal with: $SCRIPT_DIR/start-server.sh${PROJECT_DIR:+ --project-dir $PROJECT_DIR} --host $BIND_HOST --url-host $URL_HOST --foreground\"}"
+      exit 1
+    fi
+    grep "server-started" "$LOG_FILE" | head -1
+    exit 0
+  fi
+  sleep 0.1
+done
+# Timeout - server didn't start
+echo '{"error": "Server failed to start within 5 seconds"}'
+exit 1

.agents/skills/brainstorming/scripts/stop-server.sh ADDED Viewed

	@@ -0,0 +1,56 @@

+#!/usr/bin/env bash
+# Stop the brainstorm server and clean up
+# Usage: stop-server.sh <session_dir>
+#
+# Kills the server process. Only deletes session directory if it's
+# under /tmp (ephemeral). Persistent directories (.superpowers/) are
+# kept so mockups can be reviewed later.
+SESSION_DIR="$1"
+if [[ -z "$SESSION_DIR" ]]; then
+  echo '{"error": "Usage: stop-server.sh <session_dir>"}'
+  exit 1
+fi
+STATE_DIR="${SESSION_DIR}/state"
+PID_FILE="${STATE_DIR}/server.pid"
+if [[ -f "$PID_FILE" ]]; then
+  pid=$(cat "$PID_FILE")
+  # Try to stop gracefully, fallback to force if still alive
+  kill "$pid" 2>/dev/null || true
+  # Wait for graceful shutdown (up to ~2s)
+  for i in {1..20}; do
+    if ! kill -0 "$pid" 2>/dev/null; then
+      break
+    fi
+    sleep 0.1
+  done
+  # If still running, escalate to SIGKILL
+  if kill -0 "$pid" 2>/dev/null; then
+    kill -9 "$pid" 2>/dev/null || true
+    # Give SIGKILL a moment to take effect
+    sleep 0.1
+  fi
+  if kill -0 "$pid" 2>/dev/null; then
+    echo '{"status": "failed", "error": "process still running"}'
+    exit 1
+  fi
+  rm -f "$PID_FILE" "${STATE_DIR}/server.log"
+  # Only delete ephemeral /tmp directories
+  if [[ "$SESSION_DIR" == /tmp/* ]]; then
+    rm -rf "$SESSION_DIR"
+  fi
+  echo '{"status": "stopped"}'
+else
+  echo '{"status": "not_running"}'
+fi

.agents/skills/brainstorming/spec-document-reviewer-prompt.md ADDED Viewed

	@@ -0,0 +1,49 @@

+# Spec Document Reviewer Prompt Template
+Use this template when dispatching a spec document reviewer subagent.
+**Purpose:** Verify the spec is complete, consistent, and ready for implementation planning.
+**Dispatch after:** Spec document is written to docs/superpowers/specs/
+```
+Task tool (general-purpose):
+  description: "Review spec document"
+  prompt: |
+    You are a spec document reviewer. Verify this spec is complete and ready for planning.
+    **Spec to review:** [SPEC_FILE_PATH]
+    ## What to Check
+    | Category | What to Look For |
+    |----------|------------------|
+    | Completeness | TODOs, placeholders, "TBD", incomplete sections |
+    | Consistency | Internal contradictions, conflicting requirements |
+    | Clarity | Requirements ambiguous enough to cause someone to build the wrong thing |
+    | Scope | Focused enough for a single plan — not covering multiple independent subsystems |
+    | YAGNI | Unrequested features, over-engineering |
+    ## Calibration
+    **Only flag issues that would cause real problems during implementation planning.**
+    A missing section, a contradiction, or a requirement so ambiguous it could be
+    interpreted two different ways — those are issues. Minor wording improvements,
+    stylistic preferences, and "sections less detailed than others" are not.
+    Approve unless there are serious gaps that would lead to a flawed plan.
+    ## Output Format
+    ## Spec Review
+    **Status:** Approved | Issues Found
+    **Issues (if any):**
+    - [Section X]: [specific issue] - [why it matters for planning]
+    **Recommendations (advisory, do not block approval):**
+    - [suggestions for improvement]
+```
+**Reviewer returns:** Status, Issues (if any), Recommendations

.agents/skills/brainstorming/visual-companion.md ADDED Viewed

	@@ -0,0 +1,287 @@

+# Visual Companion Guide
+Browser-based visual brainstorming companion for showing mockups, diagrams, and options.
+## When to Use
+Decide per-question, not per-session. The test: **would the user understand this better by seeing it than reading it?**
+**Use the browser** when the content itself is visual:
+- **UI mockups** — wireframes, layouts, navigation structures, component designs
+- **Architecture diagrams** — system components, data flow, relationship maps
+- **Side-by-side visual comparisons** — comparing two layouts, two color schemes, two design directions
+- **Design polish** — when the question is about look and feel, spacing, visual hierarchy
+- **Spatial relationships** — state machines, flowcharts, entity relationships rendered as diagrams
+**Use the terminal** when the content is text or tabular:
+- **Requirements and scope questions** — "what does X mean?", "which features are in scope?"
+- **Conceptual A/B/C choices** — picking between approaches described in words
+- **Tradeoff lists** — pros/cons, comparison tables
+- **Technical decisions** — API design, data modeling, architectural approach selection
+- **Clarifying questions** — anything where the answer is words, not a visual preference
+A question *about* a UI topic is not automatically a visual question. "What kind of wizard do you want?" is conceptual — use the terminal. "Which of these wizard layouts feels right?" is visual — use the browser.
+## How It Works
+The server watches a directory for HTML files and serves the newest one to the browser. You write HTML content to `screen_dir`, the user sees it in their browser and can click to select options. Selections are recorded to `state_dir/events` that you read on your next turn.
+**Content fragments vs full documents:** If your HTML file starts with `<!DOCTYPE` or `<html`, the server serves it as-is (just injects the helper script). Otherwise, the server automatically wraps your content in the frame template — adding the header, CSS theme, selection indicator, and all interactive infrastructure. **Write content fragments by default.** Only write full documents when you need complete control over the page.
+## Starting a Session
+```bash
+# Start server with persistence (mockups saved to project)
+scripts/start-server.sh --project-dir /path/to/project
+# Returns: {"type":"server-started","port":52341,"url":"http://localhost:52341",
+#           "screen_dir":"/path/to/project/.superpowers/brainstorm/12345-1706000000/content",
+#           "state_dir":"/path/to/project/.superpowers/brainstorm/12345-1706000000/state"}
+```
+Save `screen_dir` and `state_dir` from the response. Tell user to open the URL.
+**Finding connection info:** The server writes its startup JSON to `$STATE_DIR/server-info`. If you launched the server in the background and didn't capture stdout, read that file to get the URL and port. When using `--project-dir`, check `<project>/.superpowers/brainstorm/` for the session directory.
+**Note:** Pass the project root as `--project-dir` so mockups persist in `.superpowers/brainstorm/` and survive server restarts. Without it, files go to `/tmp` and get cleaned up. Remind the user to add `.superpowers/` to `.gitignore` if it's not already there.
+**Launching the server by platform:**
+**Claude Code (macOS / Linux):**
+```bash
+# Default mode works — the script backgrounds the server itself
+scripts/start-server.sh --project-dir /path/to/project
+```
+**Claude Code (Windows):**
+```bash
+# Windows auto-detects and uses foreground mode, which blocks the tool call.
+# Use run_in_background: true on the Bash tool call so the server survives
+# across conversation turns.
+scripts/start-server.sh --project-dir /path/to/project
+```
+When calling this via the Bash tool, set `run_in_background: true`. Then read `$STATE_DIR/server-info` on the next turn to get the URL and port.
+**Codex:**
+```bash
+# Codex reaps background processes. The script auto-detects CODEX_CI and
+# switches to foreground mode. Run it normally — no extra flags needed.
+scripts/start-server.sh --project-dir /path/to/project
+```
+**Gemini CLI:**
+```bash
+# Use --foreground and set is_background: true on your shell tool call
+# so the process survives across turns
+scripts/start-server.sh --project-dir /path/to/project --foreground
+```
+**Other environments:** The server must keep running in the background across conversation turns. If your environment reaps detached processes, use `--foreground` and launch the command with your platform's background execution mechanism.
+If the URL is unreachable from your browser (common in remote/containerized setups), bind a non-loopback host:
+```bash
+scripts/start-server.sh \
+  --project-dir /path/to/project \
+  --host 0.0.0.0 \
+  --url-host localhost
+```
+Use `--url-host` to control what hostname is printed in the returned URL JSON.
+## The Loop
+1. **Check server is alive**, then **write HTML** to a new file in `screen_dir`:
+   - Before each write, check that `$STATE_DIR/server-info` exists. If it doesn't (or `$STATE_DIR/server-stopped` exists), the server has shut down — restart it with `start-server.sh` before continuing. The server auto-exits after 30 minutes of inactivity.
+   - Use semantic filenames: `platform.html`, `visual-style.html`, `layout.html`
+   - **Never reuse filenames** — each screen gets a fresh file
+   - Use Write tool — **never use cat/heredoc** (dumps noise into terminal)
+   - Server automatically serves the newest file
+2. **Tell user what to expect and end your turn:**
+   - Remind them of the URL (every step, not just first)
+   - Give a brief text summary of what's on screen (e.g., "Showing 3 layout options for the homepage")
+   - Ask them to respond in the terminal: "Take a look and let me know what you think. Click to select an option if you'd like."
+3. **On your next turn** — after the user responds in the terminal:
+   - Read `$STATE_DIR/events` if it exists — this contains the user's browser interactions (clicks, selections) as JSON lines
+   - Merge with the user's terminal text to get the full picture
+   - The terminal message is the primary feedback; `state_dir/events` provides structured interaction data
+4. **Iterate or advance** — if feedback changes current screen, write a new file (e.g., `layout-v2.html`). Only move to the next question when the current step is validated.
+5. **Unload when returning to terminal** — when the next step doesn't need the browser (e.g., a clarifying question, a tradeoff discussion), push a waiting screen to clear the stale content:
+   ```html
+   <!-- filename: waiting.html (or waiting-2.html, etc.) -->
+   <div style="display:flex;align-items:center;justify-content:center;min-height:60vh">
+     <p class="subtitle">Continuing in terminal...</p>
+   </div>
+   ```
+   This prevents the user from staring at a resolved choice while the conversation has moved on. When the next visual question comes up, push a new content file as usual.
+6. Repeat until done.
+## Writing Content Fragments
+Write just the content that goes inside the page. The server wraps it in the frame template automatically (header, theme CSS, selection indicator, and all interactive infrastructure).
+**Minimal example:**
+```html
+<h2>Which layout works better?</h2>
+<p class="subtitle">Consider readability and visual hierarchy</p>
+<div class="options">
+  <div class="option" data-choice="a" onclick="toggleSelect(this)">
+    <div class="letter">A</div>
+    <div class="content">
+      <h3>Single Column</h3>
+      <p>Clean, focused reading experience</p>
+    </div>
+  </div>
+  <div class="option" data-choice="b" onclick="toggleSelect(this)">
+    <div class="letter">B</div>
+    <div class="content">
+      <h3>Two Column</h3>
+      <p>Sidebar navigation with main content</p>
+    </div>
+  </div>
+</div>
+```
+That's it. No `<html>`, no CSS, no `<script>` tags needed. The server provides all of that.
+## CSS Classes Available
+The frame template provides these CSS classes for your content:
+### Options (A/B/C choices)
+```html
+<div class="options">
+  <div class="option" data-choice="a" onclick="toggleSelect(this)">
+    <div class="letter">A</div>
+    <div class="content">
+      <h3>Title</h3>
+      <p>Description</p>
+    </div>
+  </div>
+</div>
+```
+**Multi-select:** Add `data-multiselect` to the container to let users select multiple options. Each click toggles the item. The indicator bar shows the count.
+```html
+<div class="options" data-multiselect>
+  <!-- same option markup — users can select/deselect multiple -->
+</div>
+```
+### Cards (visual designs)
+```html
+<div class="cards">
+  <div class="card" data-choice="design1" onclick="toggleSelect(this)">
+    <div class="card-image"><!-- mockup content --></div>
+    <div class="card-body">
+      <h3>Name</h3>
+      <p>Description</p>
+    </div>
+  </div>
+</div>
+```
+### Mockup container
+```html
+<div class="mockup">
+  <div class="mockup-header">Preview: Dashboard Layout</div>
+  <div class="mockup-body"><!-- your mockup HTML --></div>
+</div>
+```
+### Split view (side-by-side)
+```html
+<div class="split">
+  <div class="mockup"><!-- left --></div>
+  <div class="mockup"><!-- right --></div>
+</div>
+```
+### Pros/Cons
+```html
+<div class="pros-cons">
+  <div class="pros"><h4>Pros</h4><ul><li>Benefit</li></ul></div>
+  <div class="cons"><h4>Cons</h4><ul><li>Drawback</li></ul></div>
+</div>
+```
+### Mock elements (wireframe building blocks)
+```html
+<div class="mock-nav">Logo | Home | About | Contact</div>
+<div style="display: flex;">
+  <div class="mock-sidebar">Navigation</div>
+  <div class="mock-content">Main content area</div>
+</div>
+<button class="mock-button">Action Button</button>
+<input class="mock-input" placeholder="Input field">
+<div class="placeholder">Placeholder area</div>
+```
+### Typography and sections
+- `h2` — page title
+- `h3` — section heading
+- `.subtitle` — secondary text below title
+- `.section` — content block with bottom margin
+- `.label` — small uppercase label text
+## Browser Events Format
+When the user clicks options in the browser, their interactions are recorded to `$STATE_DIR/events` (one JSON object per line). The file is cleared automatically when you push a new screen.
+```jsonl
+{"type":"click","choice":"a","text":"Option A - Simple Layout","timestamp":1706000101}
+{"type":"click","choice":"c","text":"Option C - Complex Grid","timestamp":1706000108}
+{"type":"click","choice":"b","text":"Option B - Hybrid","timestamp":1706000115}
+```
+The full event stream shows the user's exploration path — they may click multiple options before settling. The last `choice` event is typically the final selection, but the pattern of clicks can reveal hesitation or preferences worth asking about.
+If `$STATE_DIR/events` doesn't exist, the user didn't interact with the browser — use only their terminal text.
+## Design Tips
+- **Scale fidelity to the question** — wireframes for layout, polish for polish questions
+- **Explain the question on each page** — "Which layout feels more professional?" not just "Pick one"
+- **Iterate before advancing** — if feedback changes current screen, write a new version
+- **2-4 options max** per screen
+- **Use real content when it matters** — for a photography portfolio, use actual images (Unsplash). Placeholder content obscures design issues.
+- **Keep mockups simple** — focus on layout and structure, not pixel-perfect design
+## File Naming
+- Use semantic names: `platform.html`, `visual-style.html`, `layout.html`
+- Never reuse filenames — each screen must be a new file
+- For iterations: append version suffix like `layout-v2.html`, `layout-v3.html`
+- Server serves newest file by modification time
+## Cleaning Up
+```bash
+scripts/stop-server.sh $SESSION_DIR
+```
+If the session used `--project-dir`, mockup files persist in `.superpowers/brainstorm/` for later reference. Only `/tmp` sessions get deleted on stop.
+## Reference
+- Frame template (CSS reference): `scripts/frame-template.html`
+- Helper script (client-side): `scripts/helper.js`

.agents/skills/caveman-commit/SKILL.md ADDED Viewed

	@@ -0,0 +1,65 @@

+---
+name: caveman-commit
+description: >
+  Ultra-compressed commit message generator. Cuts noise from commit messages while preserving
+  intent and reasoning. Conventional Commits format. Subject ≤50 chars, body only when "why"
+  isn't obvious. Use when user says "write a commit", "commit message", "generate commit",
+  "/commit", or invokes /caveman-commit. Auto-triggers when staging changes.
+---
+Write commit messages terse and exact. Conventional Commits format. No fluff. Why over what.
+## Rules
+**Subject line:**
+- `<type>(<scope>): <imperative summary>` — `<scope>` optional
+- Types: `feat`, `fix`, `refactor`, `perf`, `docs`, `test`, `chore`, `build`, `ci`, `style`, `revert`
+- Imperative mood: "add", "fix", "remove" — not "added", "adds", "adding"
+- ≤50 chars when possible, hard cap 72
+- No trailing period
+- Match project convention for capitalization after the colon
+**Body (only if needed):**
+- Skip entirely when subject is self-explanatory
+- Add body only for: non-obvious *why*, breaking changes, migration notes, linked issues
+- Wrap at 72 chars
+- Bullets `-` not `*`
+- Reference issues/PRs at end: `Closes #42`, `Refs #17`
+**What NEVER goes in:**
+- "This commit does X", "I", "we", "now", "currently" — the diff says what
+- "As requested by..." — use Co-authored-by trailer
+- "Generated with Claude Code" or any AI attribution
+- Emoji (unless project convention requires)
+- Restating the file name when scope already says it
+## Examples
+Diff: new endpoint for user profile with body explaining the why
+- ❌ "feat: add a new endpoint to get user profile information from the database"
+- ✅
+  ```
+  feat(api): add GET /users/:id/profile
+  Mobile client needs profile data without the full user payload
+  to reduce LTE bandwidth on cold-launch screens.
+  Closes #128
+  ```
+Diff: breaking API change
+- ✅
+  ```
+  feat(api)!: rename /v1/orders to /v1/checkout
+  BREAKING CHANGE: clients on /v1/orders must migrate to /v1/checkout
+  before 2026-06-01. Old route returns 410 after that date.
+  ```
+## Auto-Clarity
+Always include body for: breaking changes, security fixes, data migrations, anything reverting a prior commit. Never compress these into subject-only — future debuggers need the context.
+## Boundaries
+Only generates the commit message. Does not run `git commit`, does not stage files, does not amend. Output the message as a code block ready to paste. "stop caveman-commit" or "normal mode": revert to verbose commit style.

.agents/skills/caveman-help/SKILL.md ADDED Viewed

	@@ -0,0 +1,59 @@

+---
+name: caveman-help
+description: >
+  Quick-reference card for all caveman modes, skills, and commands.
+  One-shot display, not a persistent mode. Trigger: /caveman-help,
+  "caveman help", "what caveman commands", "how do I use caveman".
+---
+# Caveman Help
+Display this reference card when invoked. One-shot — do NOT change mode, write flag files, or persist anything. Output in caveman style.
+## Modes
+| Mode | Trigger | What change |
+|------|---------|-------------|
+| **Lite** | `/caveman lite` | Drop filler. Keep sentence structure. |
+| **Full** | `/caveman` | Drop articles, filler, pleasantries, hedging. Fragments OK. Default. |
+| **Ultra** | `/caveman ultra` | Extreme compression. Bare fragments. Tables over prose. |
+| **Wenyan-Lite** | `/caveman wenyan-lite` | Classical Chinese style, light compression. |
+| **Wenyan-Full** | `/caveman wenyan` | Full 文言文. Maximum classical terseness. |
+| **Wenyan-Ultra** | `/caveman wenyan-ultra` | Extreme. Ancient scholar on a budget. |
+Mode stick until changed or session end.
+## Skills
+| Skill | Trigger | What it do |
+|-------|---------|-----------|
+| **caveman-commit** | `/caveman-commit` | Terse commit messages. Conventional Commits. ≤50 char subject. |
+| **caveman-review** | `/caveman-review` | One-line PR comments: `L42: bug: user null. Add guard.` |
+| **caveman-compress** | `/caveman:compress <file>` | Compress .md files to caveman prose. Saves ~46% input tokens. |
+| **caveman-help** | `/caveman-help` | This card. |
+## Deactivate
+Say "stop caveman" or "normal mode". Resume anytime with `/caveman`.
+## Configure Default Mode
+Default mode = `full`. Change it:
+**Environment variable** (highest priority):
+```bash
+export CAVEMAN_DEFAULT_MODE=ultra
+```
+**Config file** (`~/.config/caveman/config.json`):
+```json
+{ "defaultMode": "lite" }
+```
+Set `"off"` to disable auto-activation on session start. User can still activate manually with `/caveman`.
+Resolution: env var > config file > `full`.
+## More
+Full docs: https://github.com/JuliusBrussee/caveman

.agents/skills/caveman-review/SKILL.md ADDED Viewed

	@@ -0,0 +1,55 @@

+---
+name: caveman-review
+description: >
+  Ultra-compressed code review comments. Cuts noise from PR feedback while preserving
+  the actionable signal. Each comment is one line: location, problem, fix. Use when user
+  says "review this PR", "code review", "review the diff", "/review", or invokes
+  /caveman-review. Auto-triggers when reviewing pull requests.
+---
+Write code review comments terse and actionable. One line per finding. Location, problem, fix. No throat-clearing.
+## Rules
+**Format:** `L<line>: <problem>. <fix>.` — or `<file>:L<line>: ...` when reviewing multi-file diffs.
+**Severity prefix (optional, when mixed):**
+- `🔴 bug:` — broken behavior, will cause incident
+- `🟡 risk:` — works but fragile (race, missing null check, swallowed error)
+- `🔵 nit:` — style, naming, micro-optim. Author can ignore
+- `❓ q:` — genuine question, not a suggestion
+**Drop:**
+- "I noticed that...", "It seems like...", "You might want to consider..."
+- "This is just a suggestion but..." — use `nit:` instead
+- "Great work!", "Looks good overall but..." — say it once at the top, not per comment
+- Restating what the line does — the reviewer can read the diff
+- Hedging ("perhaps", "maybe", "I think") — if unsure use `q:`
+**Keep:**
+- Exact line numbers
+- Exact symbol/function/variable names in backticks
+- Concrete fix, not "consider refactoring this"
+- The *why* if the fix isn't obvious from the problem statement
+## Examples
+❌ "I noticed that on line 42 you're not checking if the user object is null before accessing the email property. This could potentially cause a crash if the user is not found in the database. You might want to add a null check here."
+✅ `L42: 🔴 bug: user can be null after .find(). Add guard before .email.`
+❌ "It looks like this function is doing a lot of things and might benefit from being broken up into smaller functions for readability."
+✅ `L88-140: 🔵 nit: 50-line fn does 4 things. Extract validate/normalize/persist.`
+❌ "Have you considered what happens if the API returns a 429? I think we should probably handle that case."
+✅ `L23: 🟡 risk: no retry on 429. Wrap in withBackoff(3).`
+## Auto-Clarity
+Drop terse mode for: security findings (CVE-class bugs need full explanation + reference), architectural disagreements (need rationale, not just a one-liner), and onboarding contexts where the author is new and needs the "why". In those cases write a normal paragraph, then resume terse for the rest.
+## Boundaries
+Reviews only — does not write the code fix, does not approve/request-changes, does not run linters. Output the comment(s) ready to paste into the PR. "stop caveman-review" or "normal mode": revert to verbose review style.

.agents/skills/caveman/SKILL.md ADDED Viewed

	@@ -0,0 +1,67 @@

+---
+name: caveman
+description: >
+  Ultra-compressed communication mode. Cuts token usage ~75% by speaking like caveman
+  while keeping full technical accuracy. Supports intensity levels: lite, full (default), ultra,
+  wenyan-lite, wenyan-full, wenyan-ultra.
+  Use when user says "caveman mode", "talk like caveman", "use caveman", "less tokens",
+  "be brief", or invokes /caveman. Also auto-triggers when token efficiency is requested.
+---
+Respond terse like smart caveman. All technical substance stay. Only fluff die.
+## Persistence
+ACTIVE EVERY RESPONSE. No revert after many turns. No filler drift. Still active if unsure. Off only: "stop caveman" / "normal mode".
+Default: **full**. Switch: `/caveman lite|full|ultra`.
+## Rules
+Drop: articles (a/an/the), filler (just/really/basically/actually/simply), pleasantries (sure/certainly/of course/happy to), hedging. Fragments OK. Short synonyms (big not extensive, fix not "implement a solution for"). Technical terms exact. Code blocks unchanged. Errors quoted exact.
+Pattern: `[thing] [action] [reason]. [next step].`
+Not: "Sure! I'd be happy to help you with that. The issue you're experiencing is likely caused by..."
+Yes: "Bug in auth middleware. Token expiry check use `<` not `<=`. Fix:"
+## Intensity
+| Level | What change |
+|-------|------------|
+| **lite** | No filler/hedging. Keep articles + full sentences. Professional but tight |
+| **full** | Drop articles, fragments OK, short synonyms. Classic caveman |
+| **ultra** | Abbreviate (DB/auth/config/req/res/fn/impl), strip conjunctions, arrows for causality (X → Y), one word when one word enough |
+| **wenyan-lite** | Semi-classical. Drop filler/hedging but keep grammar structure, classical register |
+| **wenyan-full** | Maximum classical terseness. Fully 文言文. 80-90% character reduction. Classical sentence patterns, verbs precede objects, subjects often omitted, classical particles (之/乃/為/其) |
+| **wenyan-ultra** | Extreme abbreviation while keeping classical Chinese feel. Maximum compression, ultra terse |
+Example — "Why React component re-render?"
+- lite: "Your component re-renders because you create a new object reference each render. Wrap it in `useMemo`."
+- full: "New object ref each render. Inline object prop = new ref = re-render. Wrap in `useMemo`."
+- ultra: "Inline obj prop → new ref → re-render. `useMemo`."
+- wenyan-lite: "組件頻重繪，以每繪新生對象參照故。以 useMemo 包之。"
+- wenyan-full: "物出新參照，致重繪。useMemo .Wrap之。"
+- wenyan-ultra: "新參照→重繪。useMemo Wrap。"
+Example — "Explain database connection pooling."
+- lite: "Connection pooling reuses open connections instead of creating new ones per request. Avoids repeated handshake overhead."
+- full: "Pool reuse open DB connections. No new connection per request. Skip handshake overhead."
+- ultra: "Pool = reuse DB conn. Skip handshake → fast under load."
+- wenyan-full: "池reuse open connection。不每req新開。skip handshake overhead。"
+- wenyan-ultra: "池reuse conn。skip handshake → fast。"
+## Auto-Clarity
+Drop caveman for: security warnings, irreversible action confirmations, multi-step sequences where fragment order risks misread, user asks to clarify or repeats question. Resume caveman after clear part done.
+Example — destructive op:
+> **Warning:** This will permanently delete all rows in the `users` table and cannot be undone.
+> ```sql
+> DROP TABLE users;
+> ```
+> Caveman resume. Verify backup exist first.
+## Boundaries
+Code/commits/PRs: write normal. "stop caveman" or "normal mode": revert. Level persist until changed or session end.

.agents/skills/deep-research/SKILL.md ADDED Viewed

	@@ -0,0 +1,155 @@

+---
+name: deep-research
+description: Multi-source deep research using firecrawl and exa MCPs. Searches the web, synthesizes findings, and delivers cited reports with source attribution. Use when the user wants thorough research on any topic with evidence and citations.
+origin: ECC
+---
+# Deep Research
+Produce thorough, cited research reports from multiple web sources using firecrawl and exa MCP tools.
+## When to Activate
+- User asks to research any topic in depth
+- Competitive analysis, technology evaluation, or market sizing
+- Due diligence on companies, investors, or technologies
+- Any question requiring synthesis from multiple sources
+- User says "research", "deep dive", "investigate", or "what's the current state of"
+## MCP Requirements
+At least one of:
+- **firecrawl** — `firecrawl_search`, `firecrawl_scrape`, `firecrawl_crawl`
+- **exa** — `web_search_exa`, `web_search_advanced_exa`, `crawling_exa`
+Both together give the best coverage. Configure in `~/.claude.json` or `~/.codex/config.toml`.
+## Workflow
+### Step 1: Understand the Goal
+Ask 1-2 quick clarifying questions:
+- "What's your goal — learning, making a decision, or writing something?"
+- "Any specific angle or depth you want?"
+If the user says "just research it" — skip ahead with reasonable defaults.
+### Step 2: Plan the Research
+Break the topic into 3-5 research sub-questions. Example:
+- Topic: "Impact of AI on healthcare"
+  - What are the main AI applications in healthcare today?
+  - What clinical outcomes have been measured?
+  - What are the regulatory challenges?
+  - What companies are leading this space?
+  - What's the market size and growth trajectory?
+### Step 3: Execute Multi-Source Search
+For EACH sub-question, search using available MCP tools:
+**With firecrawl:**
+```
+firecrawl_search(query: "<sub-question keywords>", limit: 8)
+```
+**With exa:**
+```
+web_search_exa(query: "<sub-question keywords>", numResults: 8)
+web_search_advanced_exa(query: "<keywords>", numResults: 5, startPublishedDate: "2025-01-01")
+```
+**Search strategy:**
+- Use 2-3 different keyword variations per sub-question
+- Mix general and news-focused queries
+- Aim for 15-30 unique sources total
+- Prioritize: academic, official, reputable news > blogs > forums
+### Step 4: Deep-Read Key Sources
+For the most promising URLs, fetch full content:
+**With firecrawl:**
+```
+firecrawl_scrape(url: "<url>")
+```
+**With exa:**
+```
+crawling_exa(url: "<url>", tokensNum: 5000)
+```
+Read 3-5 key sources in full for depth. Do not rely only on search snippets.
+### Step 5: Synthesize and Write Report
+Structure the report:
+```markdown
+# [Topic]: Research Report
+*Generated: [date] | Sources: [N] | Confidence: [High/Medium/Low]*
+## Executive Summary
+[3-5 sentence overview of key findings]
+## 1. [First Major Theme]
+[Findings with inline citations]
+- Key point ([Source Name](url))
+- Supporting data ([Source Name](url))
+## 2. [Second Major Theme]
+...
+## 3. [Third Major Theme]
+...
+## Key Takeaways
+- [Actionable insight 1]
+- [Actionable insight 2]
+- [Actionable insight 3]
+## Sources
+1. [Title](url) — [one-line summary]
+2. ...
+## Methodology
+Searched [N] queries across web and news. Analyzed [M] sources.
+Sub-questions investigated: [list]
+```
+### Step 6: Deliver
+- **Short topics**: Post the full report in chat
+- **Long reports**: Post the executive summary + key takeaways, save full report to a file
+## Parallel Research with Subagents
+For broad topics, use Claude Code's Task tool to parallelize:
+```
+Launch 3 research agents in parallel:
+1. Agent 1: Research sub-questions 1-2
+2. Agent 2: Research sub-questions 3-4
+3. Agent 3: Research sub-question 5 + cross-cutting themes
+```
+Each agent searches, reads sources, and returns findings. The main session synthesizes into the final report.
+## Quality Rules
+1. **Every claim needs a source.** No unsourced assertions.
+2. **Cross-reference.** If only one source says it, flag it as unverified.
+3. **Recency matters.** Prefer sources from the last 12 months.
+4. **Acknowledge gaps.** If you couldn't find good info on a sub-question, say so.
+5. **No hallucination.** If you don't know, say "insufficient data found."
+6. **Separate fact from inference.** Label estimates, projections, and opinions clearly.
+## Examples
+```
+"Research the current state of nuclear fusion energy"
+"Deep dive into Rust vs Go for backend services in 2026"
+"Research the best strategies for bootstrapping a SaaS business"
+"What's happening with the US housing market right now?"
+"Investigate the competitive landscape for AI code editors"
+```

.agents/skills/dispatching-parallel-agents/SKILL.md ADDED Viewed

	@@ -0,0 +1,182 @@

+---
+name: dispatching-parallel-agents
+description: Use when facing 2+ independent tasks that can be worked on without shared state or sequential dependencies
+---
+# Dispatching Parallel Agents
+## Overview
+You delegate tasks to specialized agents with isolated context. By precisely crafting their instructions and context, you ensure they stay focused and succeed at their task. They should never inherit your session's context or history — you construct exactly what they need. This also preserves your own context for coordination work.
+When you have multiple unrelated failures (different test files, different subsystems, different bugs), investigating them sequentially wastes time. Each investigation is independent and can happen in parallel.
+**Core principle:** Dispatch one agent per independent problem domain. Let them work concurrently.
+## When to Use
+```dot
+digraph when_to_use {
+    "Multiple failures?" [shape=diamond];
+    "Are they independent?" [shape=diamond];
+    "Single agent investigates all" [shape=box];
+    "One agent per problem domain" [shape=box];
+    "Can they work in parallel?" [shape=diamond];
+    "Sequential agents" [shape=box];
+    "Parallel dispatch" [shape=box];
+    "Multiple failures?" -> "Are they independent?" [label="yes"];
+    "Are they independent?" -> "Single agent investigates all" [label="no - related"];
+    "Are they independent?" -> "Can they work in parallel?" [label="yes"];
+    "Can they work in parallel?" -> "Parallel dispatch" [label="yes"];
+    "Can they work in parallel?" -> "Sequential agents" [label="no - shared state"];
+}
+```
+**Use when:**
+- 3+ test files failing with different root causes
+- Multiple subsystems broken independently
+- Each problem can be understood without context from others
+- No shared state between investigations
+**Don't use when:**
+- Failures are related (fix one might fix others)
+- Need to understand full system state
+- Agents would interfere with each other
+## The Pattern
+### 1. Identify Independent Domains
+Group failures by what's broken:
+- File A tests: Tool approval flow
+- File B tests: Batch completion behavior
+- File C tests: Abort functionality
+Each domain is independent - fixing tool approval doesn't affect abort tests.
+### 2. Create Focused Agent Tasks
+Each agent gets:
+- **Specific scope:** One test file or subsystem
+- **Clear goal:** Make these tests pass
+- **Constraints:** Don't change other code
+- **Expected output:** Summary of what you found and fixed
+### 3. Dispatch in Parallel
+```typescript
+// In Claude Code / AI environment
+Task("Fix agent-tool-abort.test.ts failures")
+Task("Fix batch-completion-behavior.test.ts failures")
+Task("Fix tool-approval-race-conditions.test.ts failures")
+// All three run concurrently
+```
+### 4. Review and Integrate
+When agents return:
+- Read each summary
+- Verify fixes don't conflict
+- Run full test suite
+- Integrate all changes
+## Agent Prompt Structure
+Good agent prompts are:
+1. **Focused** - One clear problem domain
+2. **Self-contained** - All context needed to understand the problem
+3. **Specific about output** - What should the agent return?
+```markdown
+Fix the 3 failing tests in src/agents/agent-tool-abort.test.ts:
+1. "should abort tool with partial output capture" - expects 'interrupted at' in message
+2. "should handle mixed completed and aborted tools" - fast tool aborted instead of completed
+3. "should properly track pendingToolCount" - expects 3 results but gets 0
+These are timing/race condition issues. Your task:
+1. Read the test file and understand what each test verifies
+2. Identify root cause - timing issues or actual bugs?
+3. Fix by:
+   - Replacing arbitrary timeouts with event-based waiting
+   - Fixing bugs in abort implementation if found
+   - Adjusting test expectations if testing changed behavior
+Do NOT just increase timeouts - find the real issue.
+Return: Summary of what you found and what you fixed.
+```
+## Common Mistakes
+**❌ Too broad:** "Fix all the tests" - agent gets lost
+**✅ Specific:** "Fix agent-tool-abort.test.ts" - focused scope
+**❌ No context:** "Fix the race condition" - agent doesn't know where
+**✅ Context:** Paste the error messages and test names
+**❌ No constraints:** Agent might refactor everything
+**✅ Constraints:** "Do NOT change production code" or "Fix tests only"
+**❌ Vague output:** "Fix it" - you don't know what changed
+**✅ Specific:** "Return summary of root cause and changes"
+## When NOT to Use
+**Related failures:** Fixing one might fix others - investigate together first
+**Need full context:** Understanding requires seeing entire system
+**Exploratory debugging:** You don't know what's broken yet
+**Shared state:** Agents would interfere (editing same files, using same resources)
+## Real Example from Session
+**Scenario:** 6 test failures across 3 files after major refactoring
+**Failures:**
+- agent-tool-abort.test.ts: 3 failures (timing issues)
+- batch-completion-behavior.test.ts: 2 failures (tools not executing)
+- tool-approval-race-conditions.test.ts: 1 failure (execution count = 0)
+**Decision:** Independent domains - abort logic separate from batch completion separate from race conditions
+**Dispatch:**
+```
+Agent 1 → Fix agent-tool-abort.test.ts
+Agent 2 → Fix batch-completion-behavior.test.ts
+Agent 3 → Fix tool-approval-race-conditions.test.ts
+```
+**Results:**
+- Agent 1: Replaced timeouts with event-based waiting
+- Agent 2: Fixed event structure bug (threadId in wrong place)
+- Agent 3: Added wait for async tool execution to complete
+**Integration:** All fixes independent, no conflicts, full suite green
+**Time saved:** 3 problems solved in parallel vs sequentially
+## Key Benefits
+1. **Parallelization** - Multiple investigations happen simultaneously
+2. **Focus** - Each agent has narrow scope, less context to track
+3. **Independence** - Agents don't interfere with each other
+4. **Speed** - 3 problems solved in time of 1
+## Verification
+After agents return:
+1. **Review each summary** - Understand what changed
+2. **Check for conflicts** - Did agents edit same code?
+3. **Run full suite** - Verify all fixes work together
+4. **Spot check** - Agents can make systematic errors
+## Real-World Impact
+From debugging session (2025-10-03):
+- 6 failures across 3 files
+- 3 agents dispatched in parallel
+- All investigations completed concurrently
+- All fixes integrated successfully
+- Zero conflicts between agent changes

.agents/skills/documentation-lookup/SKILL.md ADDED Viewed

	@@ -0,0 +1,90 @@

+---
+name: documentation-lookup
+description: Use up-to-date library and framework docs via Context7 MCP instead of training data. Activates for setup questions, API references, code examples, or when the user names a framework (e.g. React, Next.js, Prisma).
+origin: ECC
+---
+# Documentation Lookup (Context7)
+When the user asks about libraries, frameworks, or APIs, fetch current documentation via the Context7 MCP (tools `resolve-library-id` and `query-docs`) instead of relying on training data.
+## Core Concepts
+- **Context7**: MCP server that exposes live documentation; use it instead of training data for libraries and APIs.
+- **resolve-library-id**: Returns Context7-compatible library IDs (e.g. `/vercel/next.js`) from a library name and query.
+- **query-docs**: Fetches documentation and code snippets for a given library ID and question. Always call resolve-library-id first to get a valid library ID.
+## When to use
+Activate when the user:
+- Asks setup or configuration questions (e.g. "How do I configure Next.js middleware?")
+- Requests code that depends on a library ("Write a Prisma query for...")
+- Needs API or reference information ("What are the Supabase auth methods?")
+- Mentions specific frameworks or libraries (React, Vue, Svelte, Express, Tailwind, Prisma, Supabase, etc.)
+Use this skill whenever the request depends on accurate, up-to-date behavior of a library, framework, or API. Applies across harnesses that have the Context7 MCP configured (e.g. Claude Code, Cursor, Codex).
+## How it works
+### Step 1: Resolve the Library ID
+Call the **resolve-library-id** MCP tool with:
+- **libraryName**: The library or product name taken from the user's question (e.g. `Next.js`, `Prisma`, `Supabase`).
+- **query**: The user's full question. This improves relevance ranking of results.
+You must obtain a Context7-compatible library ID (format `/org/project` or `/org/project/version`) before querying docs. Do not call query-docs without a valid library ID from this step.
+### Step 2: Select the Best Match
+From the resolution results, choose one result using:
+- **Name match**: Prefer exact or closest match to what the user asked for.
+- **Benchmark score**: Higher scores indicate better documentation quality (100 is highest).
+- **Source reputation**: Prefer High or Medium reputation when available.
+- **Version**: If the user specified a version (e.g. "React 19", "Next.js 15"), prefer a version-specific library ID if listed (e.g. `/org/project/v1.2.0`).
+### Step 3: Fetch the Documentation
+Call the **query-docs** MCP tool with:
+- **libraryId**: The selected Context7 library ID from Step 2 (e.g. `/vercel/next.js`).
+- **query**: The user's specific question or task. Be specific to get relevant snippets.
+Limit: do not call query-docs (or resolve-library-id) more than 3 times per question. If the answer is unclear after 3 calls, state the uncertainty and use the best information you have rather than guessing.
+### Step 4: Use the Documentation
+- Answer the user's question using the fetched, current information.
+- Include relevant code examples from the docs when helpful.
+- Cite the library or version when it matters (e.g. "In Next.js 15...").
+## Examples
+### Example: Next.js middleware
+1. Call **resolve-library-id** with `libraryName: "Next.js"`, `query: "How do I set up Next.js middleware?"`.
+2. From results, pick the best match (e.g. `/vercel/next.js`) by name and benchmark score.
+3. Call **query-docs** with `libraryId: "/vercel/next.js"`, `query: "How do I set up Next.js middleware?"`.
+4. Use the returned snippets and text to answer; include a minimal `middleware.ts` example from the docs if relevant.
+### Example: Prisma query
+1. Call **resolve-library-id** with `libraryName: "Prisma"`, `query: "How do I query with relations?"`.
+2. Select the official Prisma library ID (e.g. `/prisma/prisma`).
+3. Call **query-docs** with that `libraryId` and the query.
+4. Return the Prisma Client pattern (e.g. `include` or `select`) with a short code snippet from the docs.
+### Example: Supabase auth methods
+1. Call **resolve-library-id** with `libraryName: "Supabase"`, `query: "What are the auth methods?"`.
+2. Pick the Supabase docs library ID.
+3. Call **query-docs**; summarize the auth methods and show minimal examples from the fetched docs.
+## Best Practices
+- **Be specific**: Use the user's full question as the query where possible for better relevance.
+- **Version awareness**: When users mention versions, use version-specific library IDs from the resolve step when available.
+- **Prefer official sources**: When multiple matches exist, prefer official or primary packages over community forks.
+- **No sensitive data**: Redact API keys, passwords, tokens, and other secrets from any query sent to Context7. Treat the user's question as potentially containing secrets before passing it to resolve-library-id or query-docs.

.agents/skills/e2e-testing/SKILL.md ADDED Viewed

	@@ -0,0 +1,326 @@

+---
+name: e2e-testing
+description: Playwright E2E testing patterns, Page Object Model, configuration, CI/CD integration, artifact management, and flaky test strategies.
+origin: ECC
+---
+# E2E Testing Patterns
+Comprehensive Playwright patterns for building stable, fast, and maintainable E2E test suites.
+## Test File Organization
+```
+tests/
+├── e2e/
+│   ├── auth/
+│   │   ├── login.spec.ts
+│   │   ├── logout.spec.ts
+│   │   └── register.spec.ts
+│   ├── features/
+│   │   ├── browse.spec.ts
+│   │   ├── search.spec.ts
+│   │   └── create.spec.ts
+│   └── api/
+│       └── endpoints.spec.ts
+├── fixtures/
+│   ├── auth.ts
+│   └── data.ts
+└── playwright.config.ts
+```
+## Page Object Model (POM)
+```typescript
+import { Page, Locator } from '@playwright/test'
+export class ItemsPage {
+  readonly page: Page
+  readonly searchInput: Locator
+  readonly itemCards: Locator
+  readonly createButton: Locator
+  constructor(page: Page) {
+    this.page = page
+    this.searchInput = page.locator('[data-testid="search-input"]')
+    this.itemCards = page.locator('[data-testid="item-card"]')
+    this.createButton = page.locator('[data-testid="create-btn"]')
+  }
+  async goto() {
+    await this.page.goto('/items')
+    await this.page.waitForLoadState('networkidle')
+  }
+  async search(query: string) {
+    await this.searchInput.fill(query)
+    await this.page.waitForResponse(resp => resp.url().includes('/api/search'))
+    await this.page.waitForLoadState('networkidle')
+  }
+  async getItemCount() {
+    return await this.itemCards.count()
+  }
+}
+```
+## Test Structure
+```typescript
+import { test, expect } from '@playwright/test'
+import { ItemsPage } from '../../pages/ItemsPage'
+test.describe('Item Search', () => {
+  let itemsPage: ItemsPage
+  test.beforeEach(async ({ page }) => {
+    itemsPage = new ItemsPage(page)
+    await itemsPage.goto()
+  })
+  test('should search by keyword', async ({ page }) => {
+    await itemsPage.search('test')
+    const count = await itemsPage.getItemCount()
+    expect(count).toBeGreaterThan(0)
+    await expect(itemsPage.itemCards.first()).toContainText(/test/i)
+    await page.screenshot({ path: 'artifacts/search-results.png' })
+  })
+  test('should handle no results', async ({ page }) => {
+    await itemsPage.search('xyznonexistent123')
+    await expect(page.locator('[data-testid="no-results"]')).toBeVisible()
+    expect(await itemsPage.getItemCount()).toBe(0)
+  })
+})
+```
+## Playwright Configuration
+```typescript
+import { defineConfig, devices } from '@playwright/test'
+export default defineConfig({
+  testDir: './tests/e2e',
+  fullyParallel: true,
+  forbidOnly: !!process.env.CI,
+  retries: process.env.CI ? 2 : 0,
+  workers: process.env.CI ? 1 : undefined,
+  reporter: [
+    ['html', { outputFolder: 'playwright-report' }],
+    ['junit', { outputFile: 'playwright-results.xml' }],
+    ['json', { outputFile: 'playwright-results.json' }]
+  ],
+  use: {
+    baseURL: process.env.BASE_URL || 'http://localhost:3000',
+    trace: 'on-first-retry',
+    screenshot: 'only-on-failure',
+    video: 'retain-on-failure',
+    actionTimeout: 10000,
+    navigationTimeout: 30000,
+  },
+  projects: [
+    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
+    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
+    { name: 'webkit', use: { ...devices['Desktop Safari'] } },
+    { name: 'mobile-chrome', use: { ...devices['Pixel 5'] } },
+  ],
+  webServer: {
+    command: 'npm run dev',
+    url: 'http://localhost:3000',
+    reuseExistingServer: !process.env.CI,
+    timeout: 120000,
+  },
+})
+```
+## Flaky Test Patterns
+### Quarantine
+```typescript
+test('flaky: complex search', async ({ page }) => {
+  test.fixme(true, 'Flaky - Issue #123')
+  // test code...
+})
+test('conditional skip', async ({ page }) => {
+  test.skip(process.env.CI, 'Flaky in CI - Issue #123')
+  // test code...
+})
+```
+### Identify Flakiness
+```bash
+npx playwright test tests/search.spec.ts --repeat-each=10
+npx playwright test tests/search.spec.ts --retries=3
+```
+### Common Causes & Fixes
+**Race conditions:**
+```typescript
+// Bad: assumes element is ready
+await page.click('[data-testid="button"]')
+// Good: auto-wait locator
+await page.locator('[data-testid="button"]').click()
+```
+**Network timing:**
+```typescript
+// Bad: arbitrary timeout
+await page.waitForTimeout(5000)
+// Good: wait for specific condition
+await page.waitForResponse(resp => resp.url().includes('/api/data'))
+```
+**Animation timing:**
+```typescript
+// Bad: click during animation
+await page.click('[data-testid="menu-item"]')
+// Good: wait for stability
+await page.locator('[data-testid="menu-item"]').waitFor({ state: 'visible' })
+await page.waitForLoadState('networkidle')
+await page.locator('[data-testid="menu-item"]').click()
+```
+## Artifact Management
+### Screenshots
+```typescript
+await page.screenshot({ path: 'artifacts/after-login.png' })
+await page.screenshot({ path: 'artifacts/full-page.png', fullPage: true })
+await page.locator('[data-testid="chart"]').screenshot({ path: 'artifacts/chart.png' })
+```
+### Traces
+```typescript
+await browser.startTracing(page, {
+  path: 'artifacts/trace.json',
+  screenshots: true,
+  snapshots: true,
+})
+// ... test actions ...
+await browser.stopTracing()
+```
+### Video
+```typescript
+// In playwright.config.ts
+use: {
+  video: 'retain-on-failure',
+  videosPath: 'artifacts/videos/'
+}
+```
+## CI/CD Integration
+```yaml
+# .github/workflows/e2e.yml
+name: E2E Tests
+on: [push, pull_request]
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with:
+          node-version: 20
+      - run: npm ci
+      - run: npx playwright install --with-deps
+      - run: npx playwright test
+        env:
+          BASE_URL: ${{ vars.STAGING_URL }}
+      - uses: actions/upload-artifact@v4
+        if: always()
+        with:
+          name: playwright-report
+          path: playwright-report/
+          retention-days: 30
+```
+## Test Report Template
+```markdown
+# E2E Test Report
+**Date:** YYYY-MM-DD HH:MM
+**Duration:** Xm Ys
+**Status:** PASSING / FAILING
+## Summary
+- Total: X | Passed: Y (Z%) | Failed: A | Flaky: B | Skipped: C
+## Failed Tests
+### test-name
+**File:** `tests/e2e/feature.spec.ts:45`
+**Error:** Expected element to be visible
+**Screenshot:** artifacts/failed.png
+**Recommended Fix:** [description]
+## Artifacts
+- HTML Report: playwright-report/index.html
+- Screenshots: artifacts/*.png
+- Videos: artifacts/videos/*.webm
+- Traces: artifacts/*.zip
+```
+## Wallet / Web3 Testing
+```typescript
+test('wallet connection', async ({ page, context }) => {
+  // Mock wallet provider
+  await context.addInitScript(() => {
+    window.ethereum = {
+      isMetaMask: true,
+      request: async ({ method }) => {
+        if (method === 'eth_requestAccounts')
+          return ['0x1234567890123456789012345678901234567890']
+        if (method === 'eth_chainId') return '0x1'
+      }
+    }
+  })
+  await page.goto('/')
+  await page.locator('[data-testid="connect-wallet"]').click()
+  await expect(page.locator('[data-testid="wallet-address"]')).toContainText('0x1234')
+})
+```
+## Financial / Critical Flow Testing
+```typescript
+test('trade execution', async ({ page }) => {
+  // Skip on production — real money
+  test.skip(process.env.NODE_ENV === 'production', 'Skip on production')
+  await page.goto('/markets/test-market')
+  await page.locator('[data-testid="position-yes"]').click()
+  await page.locator('[data-testid="trade-amount"]').fill('1.0')
+  // Verify preview
+  const preview = page.locator('[data-testid="trade-preview"]')
+  await expect(preview).toContainText('1.0')
+  // Confirm and wait for blockchain
+  await page.locator('[data-testid="confirm-trade"]').click()
+  await page.waitForResponse(
+    resp => resp.url().includes('/api/trade') && resp.status() === 200,
+    { timeout: 30000 }
+  )
+  await expect(page.locator('[data-testid="trade-success"]')).toBeVisible()
+})
+```

.agents/skills/eval-harness/SKILL.md ADDED Viewed

	@@ -0,0 +1,270 @@

+---
+name: eval-harness
+description: Formal evaluation framework for Claude Code sessions implementing eval-driven development (EDD) principles
+origin: ECC
+tools: Read, Write, Edit, Bash, Grep, Glob
+---
+# Eval Harness Skill
+A formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles.
+## When to Activate
+- Setting up eval-driven development (EDD) for AI-assisted workflows
+- Defining pass/fail criteria for Claude Code task completion
+- Measuring agent reliability with pass@k metrics
+- Creating regression test suites for prompt or agent changes
+- Benchmarking agent performance across model versions
+## Philosophy
+Eval-Driven Development treats evals as the "unit tests of AI development":
+- Define expected behavior BEFORE implementation
+- Run evals continuously during development
+- Track regressions with each change
+- Use pass@k metrics for reliability measurement
+## Eval Types
+### Capability Evals
+Test if Claude can do something it couldn't before:
+```markdown
+[CAPABILITY EVAL: feature-name]
+Task: Description of what Claude should accomplish
+Success Criteria:
+  - [ ] Criterion 1
+  - [ ] Criterion 2
+  - [ ] Criterion 3
+Expected Output: Description of expected result
+```
+### Regression Evals
+Ensure changes don't break existing functionality:
+```markdown
+[REGRESSION EVAL: feature-name]
+Baseline: SHA or checkpoint name
+Tests:
+  - existing-test-1: PASS/FAIL
+  - existing-test-2: PASS/FAIL
+  - existing-test-3: PASS/FAIL
+Result: X/Y passed (previously Y/Y)
+```
+## Grader Types
+### 1. Code-Based Grader
+Deterministic checks using code:
+```bash
+# Check if file contains expected pattern
+grep -q "export function handleAuth" src/auth.ts && echo "PASS" || echo "FAIL"
+# Check if tests pass
+npm test -- --testPathPattern="auth" && echo "PASS" || echo "FAIL"
+# Check if build succeeds
+npm run build && echo "PASS" || echo "FAIL"
+```
+### 2. Model-Based Grader
+Use Claude to evaluate open-ended outputs:
+```markdown
+[MODEL GRADER PROMPT]
+Evaluate the following code change:
+1. Does it solve the stated problem?
+2. Is it well-structured?
+3. Are edge cases handled?
+4. Is error handling appropriate?
+Score: 1-5 (1=poor, 5=excellent)
+Reasoning: [explanation]
+```
+### 3. Human Grader
+Flag for manual review:
+```markdown
+[HUMAN REVIEW REQUIRED]
+Change: Description of what changed
+Reason: Why human review is needed
+Risk Level: LOW/MEDIUM/HIGH
+```
+## Metrics
+### pass@k
+"At least one success in k attempts"
+- pass@1: First attempt success rate
+- pass@3: Success within 3 attempts
+- Typical target: pass@3 > 90%
+### pass^k
+"All k trials succeed"
+- Higher bar for reliability
+- pass^3: 3 consecutive successes
+- Use for critical paths
+## Eval Workflow
+### 1. Define (Before Coding)
+```markdown
+## EVAL DEFINITION: feature-xyz
+### Capability Evals
+1. Can create new user account
+2. Can validate email format
+3. Can hash password securely
+### Regression Evals
+1. Existing login still works
+2. Session management unchanged
+3. Logout flow intact
+### Success Metrics
+- pass@3 > 90% for capability evals
+- pass^3 = 100% for regression evals
+```
+### 2. Implement
+Write code to pass the defined evals.
+### 3. Evaluate
+```bash
+# Run capability evals
+[Run each capability eval, record PASS/FAIL]
+# Run regression evals
+npm test -- --testPathPattern="existing"
+# Generate report
+```
+### 4. Report
+```markdown
+EVAL REPORT: feature-xyz
+========================
+Capability Evals:
+  create-user:     PASS (pass@1)
+  validate-email:  PASS (pass@2)
+  hash-password:   PASS (pass@1)
+  Overall:         3/3 passed
+Regression Evals:
+  login-flow:      PASS
+  session-mgmt:    PASS
+  logout-flow:     PASS
+  Overall:         3/3 passed
+Metrics:
+  pass@1: 67% (2/3)
+  pass@3: 100% (3/3)
+Status: READY FOR REVIEW
+```
+## Integration Patterns
+### Pre-Implementation
+```
+/eval define feature-name
+```
+Creates eval definition file at `.claude/evals/feature-name.md`
+### During Implementation
+```
+/eval check feature-name
+```
+Runs current evals and reports status
+### Post-Implementation
+```
+/eval report feature-name
+```
+Generates full eval report
+## Eval Storage
+Store evals in project:
+```
+.claude/
+  evals/
+    feature-xyz.md      # Eval definition
+    feature-xyz.log     # Eval run history
+    baseline.json       # Regression baselines
+```
+## Best Practices
+1. **Define evals BEFORE coding** - Forces clear thinking about success criteria
+2. **Run evals frequently** - Catch regressions early
+3. **Track pass@k over time** - Monitor reliability trends
+4. **Use code graders when possible** - Deterministic > probabilistic
+5. **Human review for security** - Never fully automate security checks
+6. **Keep evals fast** - Slow evals don't get run
+7. **Version evals with code** - Evals are first-class artifacts
+## Example: Adding Authentication
+```markdown
+## EVAL: add-authentication
+### Phase 1: Define (10 min)
+Capability Evals:
+- [ ] User can register with email/password
+- [ ] User can login with valid credentials
+- [ ] Invalid credentials rejected with proper error
+- [ ] Sessions persist across page reloads
+- [ ] Logout clears session
+Regression Evals:
+- [ ] Public routes still accessible
+- [ ] API responses unchanged
+- [ ] Database schema compatible
+### Phase 2: Implement (varies)
+[Write code]
+### Phase 3: Evaluate
+Run: /eval check add-authentication
+### Phase 4: Report
+EVAL REPORT: add-authentication
+==============================
+Capability: 5/5 passed (pass@3: 100%)
+Regression: 3/3 passed (pass^3: 100%)
+Status: SHIP IT
+```
+## Product Evals (v1.8)
+Use product evals when behavior quality cannot be captured by unit tests alone.
+### Grader Types
+1. Code grader (deterministic assertions)
+2. Rule grader (regex/schema constraints)
+3. Model grader (LLM-as-judge rubric)
+4. Human grader (manual adjudication for ambiguous outputs)
+### pass@k Guidance
+- `pass@1`: direct reliability
+- `pass@3`: practical reliability under controlled retries
+- `pass^3`: stability test (all 3 runs must pass)
+Recommended thresholds:
+- Capability evals: pass@3 >= 0.90
+- Regression evals: pass^3 = 1.00 for release-critical paths
+### Eval Anti-Patterns
+- Overfitting prompts to known eval examples
+- Measuring only happy-path outputs
+- Ignoring cost and latency drift while chasing pass rates
+- Allowing flaky graders in release gates
+### Minimal Eval Artifact Layout
+- `.claude/evals/<feature>.md` definition
+- `.claude/evals/<feature>.log` run history
+- `docs/releases/<version>/eval-summary.md` release snapshot

.agents/skills/executing-plans/SKILL.md ADDED Viewed

	@@ -0,0 +1,70 @@

+---
+name: executing-plans
+description: Use when you have a written implementation plan to execute in a separate session with review checkpoints
+---
+# Executing Plans
+## Overview
+Load plan, review critically, execute all tasks, report when complete.
+**Announce at start:** "I'm using the executing-plans skill to implement this plan."
+**Note:** Tell your human partner that Superpowers works much better with access to subagents. The quality of its work will be significantly higher if run on a platform with subagent support (such as Claude Code or Codex). If subagents are available, use superpowers:subagent-driven-development instead of this skill.
+## The Process
+### Step 1: Load and Review Plan
+1. Read plan file
+2. Review critically - identify any questions or concerns about the plan
+3. If concerns: Raise them with your human partner before starting
+4. If no concerns: Create TodoWrite and proceed
+### Step 2: Execute Tasks
+For each task:
+1. Mark as in_progress
+2. Follow each step exactly (plan has bite-sized steps)
+3. Run verifications as specified
+4. Mark as completed
+### Step 3: Complete Development
+After all tasks complete and verified:
+- Announce: "I'm using the finishing-a-development-branch skill to complete this work."
+- **REQUIRED SUB-SKILL:** Use superpowers:finishing-a-development-branch
+- Follow that skill to verify tests, present options, execute choice
+## When to Stop and Ask for Help
+**STOP executing immediately when:**
+- Hit a blocker (missing dependency, test fails, instruction unclear)
+- Plan has critical gaps preventing starting
+- You don't understand an instruction
+- Verification fails repeatedly
+**Ask for clarification rather than guessing.**
+## When to Revisit Earlier Steps
+**Return to Review (Step 1) when:**
+- Partner updates the plan based on your feedback
+- Fundamental approach needs rethinking
+**Don't force through blockers** - stop and ask.
+## Remember
+- Review plan critically first
+- Follow plan steps exactly
+- Don't skip verifications
+- Reference skills when plan says to
+- Stop when blocked, don't guess
+- Never start implementation on main/master branch without explicit user consent
+## Integration
+**Required workflow skills:**
+- **superpowers:using-git-worktrees** - REQUIRED: Set up isolated workspace before starting
+- **superpowers:writing-plans** - Creates the plan this skill executes
+- **superpowers:finishing-a-development-branch** - Complete development after all tasks

.agents/skills/finishing-a-development-branch/SKILL.md ADDED Viewed

	@@ -0,0 +1,200 @@

+---
+name: finishing-a-development-branch
+description: Use when implementation is complete, all tests pass, and you need to decide how to integrate the work - guides completion of development work by presenting structured options for merge, PR, or cleanup
+---
+# Finishing a Development Branch
+## Overview
+Guide completion of development work by presenting clear options and handling chosen workflow.
+**Core principle:** Verify tests → Present options → Execute choice → Clean up.
+**Announce at start:** "I'm using the finishing-a-development-branch skill to complete this work."
+## The Process
+### Step 1: Verify Tests
+**Before presenting options, verify tests pass:**
+```bash
+# Run project's test suite
+npm test / cargo test / pytest / go test ./...
+```
+**If tests fail:**
+```
+Tests failing (<N> failures). Must fix before completing:
+[Show failures]
+Cannot proceed with merge/PR until tests pass.
+```
+Stop. Don't proceed to Step 2.
+**If tests pass:** Continue to Step 2.
+### Step 2: Determine Base Branch
+```bash
+# Try common base branches
+git merge-base HEAD main 2>/dev/null || git merge-base HEAD master 2>/dev/null
+```
+Or ask: "This branch split from main - is that correct?"
+### Step 3: Present Options
+Present exactly these 4 options:
+```
+Implementation complete. What would you like to do?
+1. Merge back to <base-branch> locally
+2. Push and create a Pull Request
+3. Keep the branch as-is (I'll handle it later)
+4. Discard this work
+Which option?
+```
+**Don't add explanation** - keep options concise.
+### Step 4: Execute Choice
+#### Option 1: Merge Locally
+```bash
+# Switch to base branch
+git checkout <base-branch>
+# Pull latest
+git pull
+# Merge feature branch
+git merge <feature-branch>
+# Verify tests on merged result
+<test command>
+# If tests pass
+git branch -d <feature-branch>
+```
+Then: Cleanup worktree (Step 5)
+#### Option 2: Push and Create PR
+```bash
+# Push branch
+git push -u origin <feature-branch>
+# Create PR
+gh pr create --title "<title>" --body "$(cat <<'EOF'
+## Summary
+<2-3 bullets of what changed>
+## Test Plan
+- [ ] <verification steps>
+EOF
+)"
+```
+Then: Cleanup worktree (Step 5)
+#### Option 3: Keep As-Is
+Report: "Keeping branch <name>. Worktree preserved at <path>."
+**Don't cleanup worktree.**
+#### Option 4: Discard
+**Confirm first:**
+```
+This will permanently delete:
+- Branch <name>
+- All commits: <commit-list>
+- Worktree at <path>
+Type 'discard' to confirm.
+```
+Wait for exact confirmation.
+If confirmed:
+```bash
+git checkout <base-branch>
+git branch -D <feature-branch>
+```
+Then: Cleanup worktree (Step 5)
+### Step 5: Cleanup Worktree
+**For Options 1, 2, 4:**
+Check if in worktree:
+```bash
+git worktree list | grep $(git branch --show-current)
+```
+If yes:
+```bash
+git worktree remove <worktree-path>
+```
+**For Option 3:** Keep worktree.
+## Quick Reference
+| Option | Merge | Push | Keep Worktree | Cleanup Branch |
+|--------|-------|------|---------------|----------------|
+| 1. Merge locally | ✓ | - | - | ✓ |
+| 2. Create PR | - | ✓ | ✓ | - |
+| 3. Keep as-is | - | - | ✓ | - |
+| 4. Discard | - | - | - | ✓ (force) |
+## Common Mistakes
+**Skipping test verification**
+- **Problem:** Merge broken code, create failing PR
+- **Fix:** Always verify tests before offering options
+**Open-ended questions**
+- **Problem:** "What should I do next?" → ambiguous
+- **Fix:** Present exactly 4 structured options
+**Automatic worktree cleanup**
+- **Problem:** Remove worktree when might need it (Option 2, 3)
+- **Fix:** Only cleanup for Options 1 and 4
+**No confirmation for discard**
+- **Problem:** Accidentally delete work
+- **Fix:** Require typed "discard" confirmation
+## Red Flags
+**Never:**
+- Proceed with failing tests
+- Merge without verifying tests on result
+- Delete work without confirmation
+- Force-push without explicit request
+**Always:**
+- Verify tests before offering options
+- Present exactly 4 options
+- Get typed confirmation for Option 4
+- Clean up worktree for Options 1 & 4 only
+## Integration
+**Called by:**
+- **subagent-driven-development** (Step 7) - After all tasks complete
+- **executing-plans** (Step 5) - After all batches complete
+**Pairs with:**
+- **using-git-worktrees** - Cleans up worktree created by that skill

.agents/skills/frontend-slides/SKILL.md ADDED Viewed

	@@ -0,0 +1,184 @@

+---
+name: frontend-slides
+description: Create stunning, animation-rich HTML presentations from scratch or by converting PowerPoint files. Use when the user wants to build a presentation, convert a PPT/PPTX to web, or create slides for a talk/pitch. Helps non-designers discover their aesthetic through visual exploration rather than abstract choices.
+origin: ECC
+---
+# Frontend Slides
+Create zero-dependency, animation-rich HTML presentations that run entirely in the browser.
+Inspired by the visual exploration approach showcased in work by zarazhangrui (credit: @zarazhangrui).
+## When to Activate
+- Creating a talk deck, pitch deck, workshop deck, or internal presentation
+- Converting `.ppt` or `.pptx` slides into an HTML presentation
+- Improving an existing HTML presentation's layout, motion, or typography
+- Exploring presentation styles with a user who does not know their design preference yet
+## Non-Negotiables
+1. **Zero dependencies**: default to one self-contained HTML file with inline CSS and JS.
+2. **Viewport fit is mandatory**: every slide must fit inside one viewport with no internal scrolling.
+3. **Show, don't tell**: use visual previews instead of abstract style questionnaires.
+4. **Distinctive design**: avoid generic purple-gradient, Inter-on-white, template-looking decks.
+5. **Production quality**: keep code commented, accessible, responsive, and performant.
+Before generating, read `STYLE_PRESETS.md` for the viewport-safe CSS base, density limits, preset catalog, and CSS gotchas.
+## Workflow
+### 1. Detect Mode
+Choose one path:
+- **New presentation**: user has a topic, notes, or full draft
+- **PPT conversion**: user has `.ppt` or `.pptx`
+- **Enhancement**: user already has HTML slides and wants improvements
+### 2. Discover Content
+Ask only the minimum needed:
+- purpose: pitch, teaching, conference talk, internal update
+- length: short (5-10), medium (10-20), long (20+)
+- content state: finished copy, rough notes, topic only
+If the user has content, ask them to paste it before styling.
+### 3. Discover Style
+Default to visual exploration.
+If the user already knows the desired preset, skip previews and use it directly.
+Otherwise:
+1. Ask what feeling the deck should create: impressed, energized, focused, inspired.
+2. Generate **3 single-slide preview files** in `.ecc-design/slide-previews/`.
+3. Each preview must be self-contained, show typography/color/motion clearly, and stay under roughly 100 lines of slide content.
+4. Ask the user which preview to keep or what elements to mix.
+Use the preset guide in `STYLE_PRESETS.md` when mapping mood to style.
+### 4. Build the Presentation
+Output either:
+- `presentation.html`
+- `[presentation-name].html`
+Use an `assets/` folder only when the deck contains extracted or user-supplied images.
+Required structure:
+- semantic slide sections
+- a viewport-safe CSS base from `STYLE_PRESETS.md`
+- CSS custom properties for theme values
+- a presentation controller class for keyboard, wheel, and touch navigation
+- Intersection Observer for reveal animations
+- reduced-motion support
+### 5. Enforce Viewport Fit
+Treat this as a hard gate.
+Rules:
+- every `.slide` must use `height: 100vh; height: 100dvh; overflow: hidden;`
+- all type and spacing must scale with `clamp()`
+- when content does not fit, split into multiple slides
+- never solve overflow by shrinking text below readable sizes
+- never allow scrollbars inside a slide
+Use the density limits and mandatory CSS block in `STYLE_PRESETS.md`.
+### 6. Validate
+Check the finished deck at these sizes:
+- 1920x1080
+- 1280x720
+- 768x1024
+- 375x667
+- 667x375
+If browser automation is available, use it to verify no slide overflows and that keyboard navigation works.
+### 7. Deliver
+At handoff:
+- delete temporary preview files unless the user wants to keep them
+- open the deck with the platform-appropriate opener when useful
+- summarize file path, preset used, slide count, and easy theme customization points
+Use the correct opener for the current OS:
+- macOS: `open file.html`
+- Linux: `xdg-open file.html`
+- Windows: `start "" file.html`
+## PPT / PPTX Conversion
+For PowerPoint conversion:
+1. Prefer `python3` with `python-pptx` to extract text, images, and notes.
+2. If `python-pptx` is unavailable, ask whether to install it or fall back to a manual/export-based workflow.
+3. Preserve slide order, speaker notes, and extracted assets.
+4. After extraction, run the same style-selection workflow as a new presentation.
+Keep conversion cross-platform. Do not rely on macOS-only tools when Python can do the job.
+## Implementation Requirements
+### HTML / CSS
+- Use inline CSS and JS unless the user explicitly wants a multi-file project.
+- Fonts may come from Google Fonts or Fontshare.
+- Prefer atmospheric backgrounds, strong type hierarchy, and a clear visual direction.
+- Use abstract shapes, gradients, grids, noise, and geometry rather than illustrations.
+### JavaScript
+Include:
+- keyboard navigation
+- touch / swipe navigation
+- mouse wheel navigation
+- progress indicator or slide index
+- reveal-on-enter animation triggers
+### Accessibility
+- use semantic structure (`main`, `section`, `nav`)
+- keep contrast readable
+- support keyboard-only navigation
+- respect `prefers-reduced-motion`
+## Content Density Limits
+Use these maxima unless the user explicitly asks for denser slides and readability still holds:
+| Slide type | Limit |
+|------------|-------|
+| Title | 1 heading + 1 subtitle + optional tagline |
+| Content | 1 heading + 4-6 bullets or 2 short paragraphs |
+| Feature grid | 6 cards max |
+| Code | 8-10 lines max |
+| Quote | 1 quote + attribution |
+| Image | 1 image constrained by viewport |
+## Anti-Patterns
+- generic startup gradients with no visual identity
+- system-font decks unless intentionally editorial
+- long bullet walls
+- code blocks that need scrolling
+- fixed-height content boxes that break on short screens
+- invalid negated CSS functions like `-clamp(...)`
+## Related ECC Skills
+- `frontend-patterns` for component and interaction patterns around the deck
+- `liquid-glass-design` when a presentation intentionally borrows Apple glass aesthetics
+- `e2e-testing` if you need automated browser verification for the final deck
+## Deliverable Checklist
+- presentation runs from a local file in a browser
+- every slide fits the viewport without scrolling
+- style is distinctive and intentional
+- animation is meaningful, not noisy
+- reduced motion is respected
+- file paths and customization points are explained at handoff

.agents/skills/frontend-slides/STYLE_PRESETS.md ADDED Viewed

	@@ -0,0 +1,330 @@

+# Style Presets Reference
+Curated visual styles for `frontend-slides`.
+Use this file for:
+- the mandatory viewport-fitting CSS base
+- preset selection and mood mapping
+- CSS gotchas and validation rules
+Abstract shapes only. Avoid illustrations unless the user explicitly asks for them.
+## Viewport Fit Is Non-Negotiable
+Every slide must fully fit in one viewport.
+### Golden Rule
+```text
+Each slide = exactly one viewport height.
+Too much content = split into more slides.
+Never scroll inside a slide.
+```
+### Density Limits
+| Slide Type | Maximum Content |
+|------------|-----------------|
+| Title slide | 1 heading + 1 subtitle + optional tagline |
+| Content slide | 1 heading + 4-6 bullets or 2 paragraphs |
+| Feature grid | 6 cards maximum |
+| Code slide | 8-10 lines maximum |
+| Quote slide | 1 quote + attribution |
+| Image slide | 1 image, ideally under 60vh |
+## Mandatory Base CSS
+Copy this block into every generated presentation and then theme on top of it.
+```css
+/* ===========================================
+   VIEWPORT FITTING: MANDATORY BASE STYLES
+   =========================================== */
+html, body {
+    height: 100%;
+    overflow-x: hidden;
+}
+html {
+    scroll-snap-type: y mandatory;
+    scroll-behavior: smooth;
+}
+.slide {
+    width: 100vw;
+    height: 100vh;
+    height: 100dvh;
+    overflow: hidden;
+    scroll-snap-align: start;
+    display: flex;
+    flex-direction: column;
+    position: relative;
+}
+.slide-content {
+    flex: 1;
+    display: flex;
+    flex-direction: column;
+    justify-content: center;
+    max-height: 100%;
+    overflow: hidden;
+    padding: var(--slide-padding);
+}
+:root {
+    --title-size: clamp(1.5rem, 5vw, 4rem);
+    --h2-size: clamp(1.25rem, 3.5vw, 2.5rem);
+    --h3-size: clamp(1rem, 2.5vw, 1.75rem);
+    --body-size: clamp(0.75rem, 1.5vw, 1.125rem);
+    --small-size: clamp(0.65rem, 1vw, 0.875rem);
+    --slide-padding: clamp(1rem, 4vw, 4rem);
+    --content-gap: clamp(0.5rem, 2vw, 2rem);
+    --element-gap: clamp(0.25rem, 1vw, 1rem);
+}
+.card, .container, .content-box {
+    max-width: min(90vw, 1000px);
+    max-height: min(80vh, 700px);
+}
+.feature-list, .bullet-list {
+    gap: clamp(0.4rem, 1vh, 1rem);
+}
+.feature-list li, .bullet-list li {
+    font-size: var(--body-size);
+    line-height: 1.4;
+}
+.grid {
+    display: grid;
+    grid-template-columns: repeat(auto-fit, minmax(min(100%, 250px), 1fr));
+    gap: clamp(0.5rem, 1.5vw, 1rem);
+}
+img, .image-container {
+    max-width: 100%;
+    max-height: min(50vh, 400px);
+    object-fit: contain;
+}
+@media (max-height: 700px) {
+    :root {
+        --slide-padding: clamp(0.75rem, 3vw, 2rem);
+        --content-gap: clamp(0.4rem, 1.5vw, 1rem);
+        --title-size: clamp(1.25rem, 4.5vw, 2.5rem);
+        --h2-size: clamp(1rem, 3vw, 1.75rem);
+    }
+}
+@media (max-height: 600px) {
+    :root {
+        --slide-padding: clamp(0.5rem, 2.5vw, 1.5rem);
+        --content-gap: clamp(0.3rem, 1vw, 0.75rem);
+        --title-size: clamp(1.1rem, 4vw, 2rem);
+        --body-size: clamp(0.7rem, 1.2vw, 0.95rem);
+    }
+    .nav-dots, .keyboard-hint, .decorative {
+        display: none;
+    }
+}
+@media (max-height: 500px) {
+    :root {
+        --slide-padding: clamp(0.4rem, 2vw, 1rem);
+        --title-size: clamp(1rem, 3.5vw, 1.5rem);
+        --h2-size: clamp(0.9rem, 2.5vw, 1.25rem);
+        --body-size: clamp(0.65rem, 1vw, 0.85rem);
+    }
+}
+@media (max-width: 600px) {
+    :root {
+        --title-size: clamp(1.25rem, 7vw, 2.5rem);
+    }
+    .grid {
+        grid-template-columns: 1fr;
+    }
+}
+@media (prefers-reduced-motion: reduce) {
+    *, *::before, *::after {
+        animation-duration: 0.01ms !important;
+        transition-duration: 0.2s !important;
+    }
+    html {
+        scroll-behavior: auto;
+    }
+}
+```
+## Viewport Checklist
+- every `.slide` has `height: 100vh`, `height: 100dvh`, and `overflow: hidden`
+- all typography uses `clamp()`
+- all spacing uses `clamp()` or viewport units
+- images have `max-height` constraints
+- grids adapt with `auto-fit` + `minmax()`
+- short-height breakpoints exist at `700px`, `600px`, and `500px`
+- if anything feels cramped, split the slide
+## Mood to Preset Mapping
+| Mood | Good Presets |
+|------|--------------|
+| Impressed / Confident | Bold Signal, Electric Studio, Dark Botanical |
+| Excited / Energized | Creative Voltage, Neon Cyber, Split Pastel |
+| Calm / Focused | Notebook Tabs, Paper & Ink, Swiss Modern |
+| Inspired / Moved | Dark Botanical, Vintage Editorial, Pastel Geometry |
+## Preset Catalog
+### 1. Bold Signal
+- Vibe: confident, high-impact, keynote-ready
+- Best for: pitch decks, launches, statements
+- Fonts: Archivo Black + Space Grotesk
+- Palette: charcoal base, hot orange focal card, crisp white text
+- Signature: oversized section numbers, high-contrast card on dark field
+### 2. Electric Studio
+- Vibe: clean, bold, agency-polished
+- Best for: client presentations, strategic reviews
+- Fonts: Manrope only
+- Palette: black, white, saturated cobalt accent
+- Signature: two-panel split and sharp editorial alignment
+### 3. Creative Voltage
+- Vibe: energetic, retro-modern, playful confidence
+- Best for: creative studios, brand work, product storytelling
+- Fonts: Syne + Space Mono
+- Palette: electric blue, neon yellow, deep navy
+- Signature: halftone textures, badges, punchy contrast
+### 4. Dark Botanical
+- Vibe: elegant, premium, atmospheric
+- Best for: luxury brands, thoughtful narratives, premium product decks
+- Fonts: Cormorant + IBM Plex Sans
+- Palette: near-black, warm ivory, blush, gold, terracotta
+- Signature: blurred abstract circles, fine rules, restrained motion
+### 5. Notebook Tabs
+- Vibe: editorial, organized, tactile
+- Best for: reports, reviews, structured storytelling
+- Fonts: Bodoni Moda + DM Sans
+- Palette: cream paper on charcoal with pastel tabs
+- Signature: paper sheet, colored side tabs, binder details
+### 6. Pastel Geometry
+- Vibe: approachable, modern, friendly
+- Best for: product overviews, onboarding, lighter brand decks
+- Fonts: Plus Jakarta Sans only
+- Palette: pale blue field, cream card, soft pink/mint/lavender accents
+- Signature: vertical pills, rounded cards, soft shadows
+### 7. Split Pastel
+- Vibe: playful, modern, creative
+- Best for: agency intros, workshops, portfolios
+- Fonts: Outfit only
+- Palette: peach + lavender split with mint badges
+- Signature: split backdrop, rounded tags, light grid overlays
+### 8. Vintage Editorial
+- Vibe: witty, personality-driven, magazine-inspired
+- Best for: personal brands, opinionated talks, storytelling
+- Fonts: Fraunces + Work Sans
+- Palette: cream, charcoal, dusty warm accents
+- Signature: geometric accents, bordered callouts, punchy serif headlines
+### 9. Neon Cyber
+- Vibe: futuristic, techy, kinetic
+- Best for: AI, infra, dev tools, future-of-X talks
+- Fonts: Clash Display + Satoshi
+- Palette: midnight navy, cyan, magenta
+- Signature: glow, particles, grids, data-radar energy
+### 10. Terminal Green
+- Vibe: developer-focused, hacker-clean
+- Best for: APIs, CLI tools, engineering demos
+- Fonts: JetBrains Mono only
+- Palette: GitHub dark + terminal green
+- Signature: scan lines, command-line framing, precise monospace rhythm
+### 11. Swiss Modern
+- Vibe: minimal, precise, data-forward
+- Best for: corporate, product strategy, analytics
+- Fonts: Archivo + Nunito
+- Palette: white, black, signal red
+- Signature: visible grids, asymmetry, geometric discipline
+### 12. Paper & Ink
+- Vibe: literary, thoughtful, story-driven
+- Best for: essays, keynote narratives, manifesto decks
+- Fonts: Cormorant Garamond + Source Serif 4
+- Palette: warm cream, charcoal, crimson accent
+- Signature: pull quotes, drop caps, elegant rules
+## Direct Selection Prompts
+If the user already knows the style they want, let them pick directly from the preset names above instead of forcing preview generation.
+## Animation Feel Mapping
+| Feeling | Motion Direction |
+|---------|------------------|
+| Dramatic / Cinematic | slow fades, parallax, large scale-ins |
+| Techy / Futuristic | glow, particles, grid motion, scramble text |
+| Playful / Friendly | springy easing, rounded shapes, floating motion |
+| Professional / Corporate | subtle 200-300ms transitions, clean slides |
+| Calm / Minimal | very restrained movement, whitespace-first |
+| Editorial / Magazine | strong hierarchy, staggered text and image interplay |
+## CSS Gotcha: Negating Functions
+Never write these:
+```css
+right: -clamp(28px, 3.5vw, 44px);
+margin-left: -min(10vw, 100px);
+```
+Browsers ignore them silently.
+Always write this instead:
+```css
+right: calc(-1 * clamp(28px, 3.5vw, 44px));
+margin-left: calc(-1 * min(10vw, 100px));
+```
+## Validation Sizes
+Test at minimum:
+- Desktop: `1920x1080`, `1440x900`, `1280x720`
+- Tablet: `1024x768`, `768x1024`
+- Mobile: `375x667`, `414x896`
+- Landscape phone: `667x375`, `896x414`
+## Anti-Patterns
+Do not use:
+- purple-on-white startup templates
+- Inter / Roboto / Arial as the visual voice unless the user explicitly wants utilitarian neutrality
+- bullet walls, tiny type, or code blocks that require scrolling
+- decorative illustrations when abstract geometry would do the job better

.agents/skills/karpathy-guidelines/SKILL.md ADDED Viewed

	@@ -0,0 +1,67 @@

+---
+name: karpathy-guidelines
+description: Behavioral guidelines to reduce common LLM coding mistakes. Use when writing, reviewing, or refactoring code to avoid overcomplication, make surgical changes, surface assumptions, and define verifiable success criteria.
+license: MIT
+---
+# Karpathy Guidelines
+Behavioral guidelines to reduce common LLM coding mistakes, derived from [Andrej Karpathy's observations](https://x.com/karpathy/status/2015883857489522876) on LLM coding pitfalls.
+**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment.
+## 1. Think Before Coding
+**Don't assume. Don't hide confusion. Surface tradeoffs.**
+Before implementing:
+- State your assumptions explicitly. If uncertain, ask.
+- If multiple interpretations exist, present them - don't pick silently.
+- If a simpler approach exists, say so. Push back when warranted.
+- If something is unclear, stop. Name what's confusing. Ask.
+## 2. Simplicity First
+**Minimum code that solves the problem. Nothing speculative.**
+- No features beyond what was asked.
+- No abstractions for single-use code.
+- No "flexibility" or "configurability" that wasn't requested.
+- No error handling for impossible scenarios.
+- If you write 200 lines and it could be 50, rewrite it.
+Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
+## 3. Surgical Changes
+**Touch only what you must. Clean up only your own mess.**
+When editing existing code:
+- Don't "improve" adjacent code, comments, or formatting.
+- Don't refactor things that aren't broken.
+- Match existing style, even if you'd do it differently.
+- If you notice unrelated dead code, mention it - don't delete it.
+When your changes create orphans:
+- Remove imports/variables/functions that YOUR changes made unused.
+- Don't remove pre-existing dead code unless asked.
+The test: Every changed line should trace directly to the user's request.
+## 4. Goal-Driven Execution
+**Define success criteria. Loop until verified.**
+Transform tasks into verifiable goals:
+- "Add validation" → "Write tests for invalid inputs, then make them pass"
+- "Fix the bug" → "Write a test that reproduces it, then make it pass"
+- "Refactor X" → "Ensure tests pass before and after"
+For multi-step tasks, state a brief plan:
+```
+1. [Step] → verify: [check]
+2. [Step] → verify: [check]
+3. [Step] → verify: [check]
+```
+Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.

.agents/skills/openenv-cli/SKILL.md ADDED Viewed

	@@ -0,0 +1,18 @@

+---
+name: openenv-cli
+description: "OpenEnv CLI (`openenv`) for scaffolding, validating, building, and pushing OpenEnv environments."
+---
+Install: `pip install openenv-core`
+The OpenEnv CLI command `openenv` is available.
+Use `openenv --help` to view available commands.
+Generated with `openenv-core v0.2.3`. Run `openenv skills add --force` to regenerate.
+## Tips
+- Start with `openenv init <env_name>` to scaffold a new environment
+- Validate projects with `openenv validate`
+- Build and deploy with `openenv build` and `openenv push`
+- Use `openenv <command> --help` for command-specific options

.agents/skills/python-testing/SKILL.md ADDED Viewed

	@@ -0,0 +1,816 @@

+---
+name: python-testing
+description: Python testing strategies using pytest, TDD methodology, fixtures, mocking, parametrization, and coverage requirements.
+origin: ECC
+---
+# Python Testing Patterns
+Comprehensive testing strategies for Python applications using pytest, TDD methodology, and best practices.
+## When to Activate
+- Writing new Python code (follow TDD: red, green, refactor)
+- Designing test suites for Python projects
+- Reviewing Python test coverage
+- Setting up testing infrastructure
+## Core Testing Philosophy
+### Test-Driven Development (TDD)
+Always follow the TDD cycle:
+1. **RED**: Write a failing test for the desired behavior
+2. **GREEN**: Write minimal code to make the test pass
+3. **REFACTOR**: Improve code while keeping tests green
+```python
+# Step 1: Write failing test (RED)
+def test_add_numbers():
+    result = add(2, 3)
+    assert result == 5
+# Step 2: Write minimal implementation (GREEN)
+def add(a, b):
+    return a + b
+# Step 3: Refactor if needed (REFACTOR)
+```
+### Coverage Requirements
+- **Target**: 80%+ code coverage
+- **Critical paths**: 100% coverage required
+- Use `pytest --cov` to measure coverage
+```bash
+pytest --cov=mypackage --cov-report=term-missing --cov-report=html
+```
+## pytest Fundamentals
+### Basic Test Structure
+```python
+import pytest
+def test_addition():
+    """Test basic addition."""
+    assert 2 + 2 == 4
+def test_string_uppercase():
+    """Test string uppercasing."""
+    text = "hello"
+    assert text.upper() == "HELLO"
+def test_list_append():
+    """Test list append."""
+    items = [1, 2, 3]
+    items.append(4)
+    assert 4 in items
+    assert len(items) == 4
+```
+### Assertions
+```python
+# Equality
+assert result == expected
+# Inequality
+assert result != unexpected
+# Truthiness
+assert result  # Truthy
+assert not result  # Falsy
+assert result is True  # Exactly True
+assert result is False  # Exactly False
+assert result is None  # Exactly None
+# Membership
+assert item in collection
+assert item not in collection
+# Comparisons
+assert result > 0
+assert 0 <= result <= 100
+# Type checking
+assert isinstance(result, str)
+# Exception testing (preferred approach)
+with pytest.raises(ValueError):
+    raise ValueError("error message")
+# Check exception message
+with pytest.raises(ValueError, match="invalid input"):
+    raise ValueError("invalid input provided")
+# Check exception attributes
+with pytest.raises(ValueError) as exc_info:
+    raise ValueError("error message")
+assert str(exc_info.value) == "error message"
+```
+## Fixtures
+### Basic Fixture Usage
+```python
+import pytest
+@pytest.fixture
+def sample_data():
+    """Fixture providing sample data."""
+    return {"name": "Alice", "age": 30}
+def test_sample_data(sample_data):
+    """Test using the fixture."""
+    assert sample_data["name"] == "Alice"
+    assert sample_data["age"] == 30
+```
+### Fixture with Setup/Teardown
+```python
+@pytest.fixture
+def database():
+    """Fixture with setup and teardown."""
+    # Setup
+    db = Database(":memory:")
+    db.create_tables()
+    db.insert_test_data()
+    yield db  # Provide to test
+    # Teardown
+    db.close()
+def test_database_query(database):
+    """Test database operations."""
+    result = database.query("SELECT * FROM users")
+    assert len(result) > 0
+```
+### Fixture Scopes
+```python
+# Function scope (default) - runs for each test
+@pytest.fixture
+def temp_file():
+    with open("temp.txt", "w") as f:
+        yield f
+    os.remove("temp.txt")
+# Module scope - runs once per module
+@pytest.fixture(scope="module")
+def module_db():
+    db = Database(":memory:")
+    db.create_tables()
+    yield db
+    db.close()
+# Session scope - runs once per test session
+@pytest.fixture(scope="session")
+def shared_resource():
+    resource = ExpensiveResource()
+    yield resource
+    resource.cleanup()
+```
+### Fixture with Parameters
+```python
+@pytest.fixture(params=[1, 2, 3])
+def number(request):
+    """Parameterized fixture."""
+    return request.param
+def test_numbers(number):
+    """Test runs 3 times, once for each parameter."""
+    assert number > 0
+```
+### Using Multiple Fixtures
+```python
+@pytest.fixture
+def user():
+    return User(id=1, name="Alice")
+@pytest.fixture
+def admin():
+    return User(id=2, name="Admin", role="admin")
+def test_user_admin_interaction(user, admin):
+    """Test using multiple fixtures."""
+    assert admin.can_manage(user)
+```
+### Autouse Fixtures
+```python
+@pytest.fixture(autouse=True)
+def reset_config():
+    """Automatically runs before every test."""
+    Config.reset()
+    yield
+    Config.cleanup()
+def test_without_fixture_call():
+    # reset_config runs automatically
+    assert Config.get_setting("debug") is False
+```
+### Conftest.py for Shared Fixtures
+```python
+# tests/conftest.py
+import pytest
+@pytest.fixture
+def client():
+    """Shared fixture for all tests."""
+    app = create_app(testing=True)
+    with app.test_client() as client:
+        yield client
+@pytest.fixture
+def auth_headers(client):
+    """Generate auth headers for API testing."""
+    response = client.post("/api/login", json={
+        "username": "test",
+        "password": "test"
+    })
+    token = response.json["token"]
+    return {"Authorization": f"Bearer {token}"}
+```
+## Parametrization
+### Basic Parametrization
+```python
+@pytest.mark.parametrize("input,expected", [
+    ("hello", "HELLO"),
+    ("world", "WORLD"),
+    ("PyThOn", "PYTHON"),
+])
+def test_uppercase(input, expected):
+    """Test runs 3 times with different inputs."""
+    assert input.upper() == expected
+```
+### Multiple Parameters
+```python
+@pytest.mark.parametrize("a,b,expected", [
+    (2, 3, 5),
+    (0, 0, 0),
+    (-1, 1, 0),
+    (100, 200, 300),
+])
+def test_add(a, b, expected):
+    """Test addition with multiple inputs."""
+    assert add(a, b) == expected
+```
+### Parametrize with IDs
+```python
+@pytest.mark.parametrize("input,expected", [
+    ("valid@email.com", True),
+    ("invalid", False),
+    ("@no-domain.com", False),
+], ids=["valid-email", "missing-at", "missing-domain"])
+def test_email_validation(input, expected):
+    """Test email validation with readable test IDs."""
+    assert is_valid_email(input) is expected
+```
+### Parametrized Fixtures
+```python
+@pytest.fixture(params=["sqlite", "postgresql", "mysql"])
+def db(request):
+    """Test against multiple database backends."""
+    if request.param == "sqlite":
+        return Database(":memory:")
+    elif request.param == "postgresql":
+        return Database("postgresql://localhost/test")
+    elif request.param == "mysql":
+        return Database("mysql://localhost/test")
+def test_database_operations(db):
+    """Test runs 3 times, once for each database."""
+    result = db.query("SELECT 1")
+    assert result is not None
+```
+## Markers and Test Selection
+### Custom Markers
+```python
+# Mark slow tests
+@pytest.mark.slow
+def test_slow_operation():
+    time.sleep(5)
+# Mark integration tests
+@pytest.mark.integration
+def test_api_integration():
+    response = requests.get("https://api.example.com")
+    assert response.status_code == 200
+# Mark unit tests
+@pytest.mark.unit
+def test_unit_logic():
+    assert calculate(2, 3) == 5
+```
+### Run Specific Tests
+```bash
+# Run only fast tests
+pytest -m "not slow"
+# Run only integration tests
+pytest -m integration
+# Run integration or slow tests
+pytest -m "integration or slow"
+# Run tests marked as unit but not slow
+pytest -m "unit and not slow"
+```
+### Configure Markers in pytest.ini
+```ini
+[pytest]
+markers =
+    slow: marks tests as slow
+    integration: marks tests as integration tests
+    unit: marks tests as unit tests
+    django: marks tests as requiring Django
+```
+## Mocking and Patching
+### Mocking Functions
+```python
+from unittest.mock import patch, Mock
+@patch("mypackage.external_api_call")
+def test_with_mock(api_call_mock):
+    """Test with mocked external API."""
+    api_call_mock.return_value = {"status": "success"}
+    result = my_function()
+    api_call_mock.assert_called_once()
+    assert result["status"] == "success"
+```
+### Mocking Return Values
+```python
+@patch("mypackage.Database.connect")
+def test_database_connection(connect_mock):
+    """Test with mocked database connection."""
+    connect_mock.return_value = MockConnection()
+    db = Database()
+    db.connect()
+    connect_mock.assert_called_once_with("localhost")
+```
+### Mocking Exceptions
+```python
+@patch("mypackage.api_call")
+def test_api_error_handling(api_call_mock):
+    """Test error handling with mocked exception."""
+    api_call_mock.side_effect = ConnectionError("Network error")
+    with pytest.raises(ConnectionError):
+        api_call()
+    api_call_mock.assert_called_once()
+```
+### Mocking Context Managers
+```python
+@patch("builtins.open", new_callable=mock_open)
+def test_file_reading(mock_file):
+    """Test file reading with mocked open."""
+    mock_file.return_value.read.return_value = "file content"
+    result = read_file("test.txt")
+    mock_file.assert_called_once_with("test.txt", "r")
+    assert result == "file content"
+```
+### Using Autospec
+```python
+@patch("mypackage.DBConnection", autospec=True)
+def test_autospec(db_mock):
+    """Test with autospec to catch API misuse."""
+    db = db_mock.return_value
+    db.query("SELECT * FROM users")
+    # This would fail if DBConnection doesn't have query method
+    db_mock.assert_called_once()
+```
+### Mock Class Instances
+```python
+class TestUserService:
+    @patch("mypackage.UserRepository")
+    def test_create_user(self, repo_mock):
+        """Test user creation with mocked repository."""
+        repo_mock.return_value.save.return_value = User(id=1, name="Alice")
+        service = UserService(repo_mock.return_value)
+        user = service.create_user(name="Alice")
+        assert user.name == "Alice"
+        repo_mock.return_value.save.assert_called_once()
+```
+### Mock Property
+```python
+@pytest.fixture
+def mock_config():
+    """Create a mock with a property."""
+    config = Mock()
+    type(config).debug = PropertyMock(return_value=True)
+    type(config).api_key = PropertyMock(return_value="test-key")
+    return config
+def test_with_mock_config(mock_config):
+    """Test with mocked config properties."""
+    assert mock_config.debug is True
+    assert mock_config.api_key == "test-key"
+```
+## Testing Async Code
+### Async Tests with pytest-asyncio
+```python
+import pytest
+@pytest.mark.asyncio
+async def test_async_function():
+    """Test async function."""
+    result = await async_add(2, 3)
+    assert result == 5
+@pytest.mark.asyncio
+async def test_async_with_fixture(async_client):
+    """Test async with async fixture."""
+    response = await async_client.get("/api/users")
+    assert response.status_code == 200
+```
+### Async Fixture
+```python
+@pytest.fixture
+async def async_client():
+    """Async fixture providing async test client."""
+    app = create_app()
+    async with app.test_client() as client:
+        yield client
+@pytest.mark.asyncio
+async def test_api_endpoint(async_client):
+    """Test using async fixture."""
+    response = await async_client.get("/api/data")
+    assert response.status_code == 200
+```
+### Mocking Async Functions
+```python
+@pytest.mark.asyncio
+@patch("mypackage.async_api_call")
+async def test_async_mock(api_call_mock):
+    """Test async function with mock."""
+    api_call_mock.return_value = {"status": "ok"}
+    result = await my_async_function()
+    api_call_mock.assert_awaited_once()
+    assert result["status"] == "ok"
+```
+## Testing Exceptions
+### Testing Expected Exceptions
+```python
+def test_divide_by_zero():
+    """Test that dividing by zero raises ZeroDivisionError."""
+    with pytest.raises(ZeroDivisionError):
+        divide(10, 0)
+def test_custom_exception():
+    """Test custom exception with message."""
+    with pytest.raises(ValueError, match="invalid input"):
+        validate_input("invalid")
+```
+### Testing Exception Attributes
+```python
+def test_exception_with_details():
+    """Test exception with custom attributes."""
+    with pytest.raises(CustomError) as exc_info:
+        raise CustomError("error", code=400)
+    assert exc_info.value.code == 400
+    assert "error" in str(exc_info.value)
+```
+## Testing Side Effects
+### Testing File Operations
+```python
+import tempfile
+import os
+def test_file_processing():
+    """Test file processing with temp file."""
+    with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.txt') as f:
+        f.write("test content")
+        temp_path = f.name
+    try:
+        result = process_file(temp_path)
+        assert result == "processed: test content"
+    finally:
+        os.unlink(temp_path)
+```
+### Testing with pytest's tmp_path Fixture
+```python
+def test_with_tmp_path(tmp_path):
+    """Test using pytest's built-in temp path fixture."""
+    test_file = tmp_path / "test.txt"
+    test_file.write_text("hello world")
+    result = process_file(str(test_file))
+    assert result == "hello world"
+    # tmp_path automatically cleaned up
+```
+### Testing with tmpdir Fixture
+```python
+def test_with_tmpdir(tmpdir):
+    """Test using pytest's tmpdir fixture."""
+    test_file = tmpdir.join("test.txt")
+    test_file.write("data")
+    result = process_file(str(test_file))
+    assert result == "data"
+```
+## Test Organization
+### Directory Structure
+```
+tests/
+├── conftest.py                 # Shared fixtures
+├── __init__.py
+├── unit/                       # Unit tests
+│   ├── __init__.py
+│   ├── test_models.py
+│   ├── test_utils.py
+│   └── test_services.py
+├── integration/                # Integration tests
+│   ├── __init__.py
+│   ├── test_api.py
+│   └── test_database.py
+└── e2e/                        # End-to-end tests
+    ├── __init__.py
+    └── test_user_flow.py
+```
+### Test Classes
+```python
+class TestUserService:
+    """Group related tests in a class."""
+    @pytest.fixture(autouse=True)
+    def setup(self):
+        """Setup runs before each test in this class."""
+        self.service = UserService()
+    def test_create_user(self):
+        """Test user creation."""
+        user = self.service.create_user("Alice")
+        assert user.name == "Alice"
+    def test_delete_user(self):
+        """Test user deletion."""
+        user = User(id=1, name="Bob")
+        self.service.delete_user(user)
+        assert not self.service.user_exists(1)
+```
+## Best Practices
+### DO
+- **Follow TDD**: Write tests before code (red-green-refactor)
+- **Test one thing**: Each test should verify a single behavior
+- **Use descriptive names**: `test_user_login_with_invalid_credentials_fails`
+- **Use fixtures**: Eliminate duplication with fixtures
+- **Mock external dependencies**: Don't depend on external services
+- **Test edge cases**: Empty inputs, None values, boundary conditions
+- **Aim for 80%+ coverage**: Focus on critical paths
+- **Keep tests fast**: Use marks to separate slow tests
+### DON'T
+- **Don't test implementation**: Test behavior, not internals
+- **Don't use complex conditionals in tests**: Keep tests simple
+- **Don't ignore test failures**: All tests must pass
+- **Don't test third-party code**: Trust libraries to work
+- **Don't share state between tests**: Tests should be independent
+- **Don't catch exceptions in tests**: Use `pytest.raises`
+- **Don't use print statements**: Use assertions and pytest output
+- **Don't write tests that are too brittle**: Avoid over-specific mocks
+## Common Patterns
+### Testing API Endpoints (FastAPI/Flask)
+```python
+@pytest.fixture
+def client():
+    app = create_app(testing=True)
+    return app.test_client()
+def test_get_user(client):
+    response = client.get("/api/users/1")
+    assert response.status_code == 200
+    assert response.json["id"] == 1
+def test_create_user(client):
+    response = client.post("/api/users", json={
+        "name": "Alice",
+        "email": "alice@example.com"
+    })
+    assert response.status_code == 201
+    assert response.json["name"] == "Alice"
+```
+### Testing Database Operations
+```python
+@pytest.fixture
+def db_session():
+    """Create a test database session."""
+    session = Session(bind=engine)
+    session.begin_nested()
+    yield session
+    session.rollback()
+    session.close()
+def test_create_user(db_session):
+    user = User(name="Alice", email="alice@example.com")
+    db_session.add(user)
+    db_session.commit()
+    retrieved = db_session.query(User).filter_by(name="Alice").first()
+    assert retrieved.email == "alice@example.com"
+```
+### Testing Class Methods
+```python
+class TestCalculator:
+    @pytest.fixture
+    def calculator(self):
+        return Calculator()
+    def test_add(self, calculator):
+        assert calculator.add(2, 3) == 5
+    def test_divide_by_zero(self, calculator):
+        with pytest.raises(ZeroDivisionError):
+            calculator.divide(10, 0)
+```
+## pytest Configuration
+### pytest.ini
+```ini
+[pytest]
+testpaths = tests
+python_files = test_*.py
+python_classes = Test*
+python_functions = test_*
+addopts =
+    --strict-markers
+    --disable-warnings
+    --cov=mypackage
+    --cov-report=term-missing
+    --cov-report=html
+markers =
+    slow: marks tests as slow
+    integration: marks tests as integration tests
+    unit: marks tests as unit tests
+```
+### pyproject.toml
+```toml
+[tool.pytest.ini_options]
+testpaths = ["tests"]
+python_files = ["test_*.py"]
+python_classes = ["Test*"]
+python_functions = ["test_*"]
+addopts = [
+    "--strict-markers",
+    "--cov=mypackage",
+    "--cov-report=term-missing",
+    "--cov-report=html",
+]
+markers = [
+    "slow: marks tests as slow",
+    "integration: marks tests as integration tests",
+    "unit: marks tests as unit tests",
+]
+```
+## Running Tests
+```bash
+# Run all tests
+pytest
+# Run specific file
+pytest tests/test_utils.py
+# Run specific test
+pytest tests/test_utils.py::test_function
+# Run with verbose output
+pytest -v
+# Run with coverage
+pytest --cov=mypackage --cov-report=html
+# Run only fast tests
+pytest -m "not slow"
+# Run until first failure
+pytest -x
+# Run and stop on N failures
+pytest --maxfail=3
+# Run last failed tests
+pytest --lf
+# Run tests with pattern
+pytest -k "test_user"
+# Run with debugger on failure
+pytest --pdb
+```
+## Quick Reference
+| Pattern | Usage |
+|---------|-------|
+| `pytest.raises()` | Test expected exceptions |
+| `@pytest.fixture()` | Create reusable test fixtures |
+| `@pytest.mark.parametrize()` | Run tests with multiple inputs |
+| `@pytest.mark.slow` | Mark slow tests |
+| `pytest -m "not slow"` | Skip slow tests |
+| `@patch()` | Mock functions and classes |
+| `tmp_path` fixture | Automatic temp directory |
+| `pytest --cov` | Generate coverage report |
+| `assert` | Simple and readable assertions |
+**Remember**: Tests are code too. Keep them clean, readable, and maintainable. Good tests catch bugs; great tests prevent them.

.agents/skills/receiving-code-review/SKILL.md ADDED Viewed

	@@ -0,0 +1,213 @@

+---
+name: receiving-code-review
+description: Use when receiving code review feedback, before implementing suggestions, especially if feedback seems unclear or technically questionable - requires technical rigor and verification, not performative agreement or blind implementation
+---
+# Code Review Reception
+## Overview
+Code review requires technical evaluation, not emotional performance.
+**Core principle:** Verify before implementing. Ask before assuming. Technical correctness over social comfort.
+## The Response Pattern
+```
+WHEN receiving code review feedback:
+1. READ: Complete feedback without reacting
+2. UNDERSTAND: Restate requirement in own words (or ask)
+3. VERIFY: Check against codebase reality
+4. EVALUATE: Technically sound for THIS codebase?
+5. RESPOND: Technical acknowledgment or reasoned pushback
+6. IMPLEMENT: One item at a time, test each
+```
+## Forbidden Responses
+**NEVER:**
+- "You're absolutely right!" (explicit CLAUDE.md violation)
+- "Great point!" / "Excellent feedback!" (performative)
+- "Let me implement that now" (before verification)
+**INSTEAD:**
+- Restate the technical requirement
+- Ask clarifying questions
+- Push back with technical reasoning if wrong
+- Just start working (actions > words)
+## Handling Unclear Feedback
+```
+IF any item is unclear:
+  STOP - do not implement anything yet
+  ASK for clarification on unclear items
+WHY: Items may be related. Partial understanding = wrong implementation.
+```
+**Example:**
+```
+your human partner: "Fix 1-6"
+You understand 1,2,3,6. Unclear on 4,5.
+❌ WRONG: Implement 1,2,3,6 now, ask about 4,5 later
+✅ RIGHT: "I understand items 1,2,3,6. Need clarification on 4 and 5 before proceeding."
+```
+## Source-Specific Handling
+### From your human partner
+- **Trusted** - implement after understanding
+- **Still ask** if scope unclear
+- **No performative agreement**
+- **Skip to action** or technical acknowledgment
+### From External Reviewers
+```
+BEFORE implementing:
+  1. Check: Technically correct for THIS codebase?
+  2. Check: Breaks existing functionality?
+  3. Check: Reason for current implementation?
+  4. Check: Works on all platforms/versions?
+  5. Check: Does reviewer understand full context?
+IF suggestion seems wrong:
+  Push back with technical reasoning
+IF can't easily verify:
+  Say so: "I can't verify this without [X]. Should I [investigate/ask/proceed]?"
+IF conflicts with your human partner's prior decisions:
+  Stop and discuss with your human partner first
+```
+**your human partner's rule:** "External feedback - be skeptical, but check carefully"
+## YAGNI Check for "Professional" Features
+```
+IF reviewer suggests "implementing properly":
+  grep codebase for actual usage
+  IF unused: "This endpoint isn't called. Remove it (YAGNI)?"
+  IF used: Then implement properly
+```
+**your human partner's rule:** "You and reviewer both report to me. If we don't need this feature, don't add it."
+## Implementation Order
+```
+FOR multi-item feedback:
+  1. Clarify anything unclear FIRST
+  2. Then implement in this order:
+     - Blocking issues (breaks, security)
+     - Simple fixes (typos, imports)
+     - Complex fixes (refactoring, logic)
+  3. Test each fix individually
+  4. Verify no regressions
+```
+## When To Push Back
+Push back when:
+- Suggestion breaks existing functionality
+- Reviewer lacks full context
+- Violates YAGNI (unused feature)
+- Technically incorrect for this stack
+- Legacy/compatibility reasons exist
+- Conflicts with your human partner's architectural decisions
+**How to push back:**
+- Use technical reasoning, not defensiveness
+- Ask specific questions
+- Reference working tests/code
+- Involve your human partner if architectural
+**Signal if uncomfortable pushing back out loud:** "Strange things are afoot at the Circle K"
+## Acknowledging Correct Feedback
+When feedback IS correct:
+```
+✅ "Fixed. [Brief description of what changed]"
+✅ "Good catch - [specific issue]. Fixed in [location]."
+✅ [Just fix it and show in the code]
+❌ "You're absolutely right!"
+❌ "Great point!"
+❌ "Thanks for catching that!"
+❌ "Thanks for [anything]"
+❌ ANY gratitude expression
+```
+**Why no thanks:** Actions speak. Just fix it. The code itself shows you heard the feedback.
+**If you catch yourself about to write "Thanks":** DELETE IT. State the fix instead.
+## Gracefully Correcting Your Pushback
+If you pushed back and were wrong:
+```
+✅ "You were right - I checked [X] and it does [Y]. Implementing now."
+✅ "Verified this and you're correct. My initial understanding was wrong because [reason]. Fixing."
+❌ Long apology
+❌ Defending why you pushed back
+❌ Over-explaining
+```
+State the correction factually and move on.
+## Common Mistakes
+| Mistake | Fix |
+|---------|-----|
+| Performative agreement | State requirement or just act |
+| Blind implementation | Verify against codebase first |
+| Batch without testing | One at a time, test each |
+| Assuming reviewer is right | Check if breaks things |
+| Avoiding pushback | Technical correctness > comfort |
+| Partial implementation | Clarify all items first |
+| Can't verify, proceed anyway | State limitation, ask for direction |
+## Real Examples
+**Performative Agreement (Bad):**
+```
+Reviewer: "Remove legacy code"
+❌ "You're absolutely right! Let me remove that..."
+```
+**Technical Verification (Good):**
+```
+Reviewer: "Remove legacy code"
+✅ "Checking... build target is 10.15+, this API needs 13+. Need legacy for backward compat. Current impl has wrong bundle ID - fix it or drop pre-13 support?"
+```
+**YAGNI (Good):**
+```
+Reviewer: "Implement proper metrics tracking with database, date filters, CSV export"
+✅ "Grepped codebase - nothing calls this endpoint. Remove it (YAGNI)? Or is there usage I'm missing?"
+```
+**Unclear Item (Good):**
+```
+your human partner: "Fix items 1-6"
+You understand 1,2,3,6. Unclear on 4,5.
+✅ "Understand 1,2,3,6. Need clarification on 4 and 5 before implementing."
+```
+## GitHub Thread Replies
+When replying to inline review comments on GitHub, reply in the comment thread (`gh api repos/{owner}/{repo}/pulls/{pr}/comments/{id}/replies`), not as a top-level PR comment.
+## The Bottom Line
+**External feedback = suggestions to evaluate, not orders to follow.**
+Verify. Question. Then implement.
+No performative agreement. Technical rigor always.

.agents/skills/requesting-code-review/SKILL.md ADDED Viewed

	@@ -0,0 +1,105 @@

+---
+name: requesting-code-review
+description: Use when completing tasks, implementing major features, or before merging to verify work meets requirements
+---
+# Requesting Code Review
+Dispatch superpowers:code-reviewer subagent to catch issues before they cascade. The reviewer gets precisely crafted context for evaluation — never your session's history. This keeps the reviewer focused on the work product, not your thought process, and preserves your own context for continued work.
+**Core principle:** Review early, review often.
+## When to Request Review
+**Mandatory:**
+- After each task in subagent-driven development
+- After completing major feature
+- Before merge to main
+**Optional but valuable:**
+- When stuck (fresh perspective)
+- Before refactoring (baseline check)
+- After fixing complex bug
+## How to Request
+**1. Get git SHAs:**
+```bash
+BASE_SHA=$(git rev-parse HEAD~1)  # or origin/main
+HEAD_SHA=$(git rev-parse HEAD)
+```
+**2. Dispatch code-reviewer subagent:**
+Use Task tool with superpowers:code-reviewer type, fill template at `code-reviewer.md`
+**Placeholders:**
+- `{WHAT_WAS_IMPLEMENTED}` - What you just built
+- `{PLAN_OR_REQUIREMENTS}` - What it should do
+- `{BASE_SHA}` - Starting commit
+- `{HEAD_SHA}` - Ending commit
+- `{DESCRIPTION}` - Brief summary
+**3. Act on feedback:**
+- Fix Critical issues immediately
+- Fix Important issues before proceeding
+- Note Minor issues for later
+- Push back if reviewer is wrong (with reasoning)
+## Example
+```
+[Just completed Task 2: Add verification function]
+You: Let me request code review before proceeding.
+BASE_SHA=$(git log --oneline | grep "Task 1" | head -1 | awk '{print $1}')
+HEAD_SHA=$(git rev-parse HEAD)
+[Dispatch superpowers:code-reviewer subagent]
+  WHAT_WAS_IMPLEMENTED: Verification and repair functions for conversation index
+  PLAN_OR_REQUIREMENTS: Task 2 from docs/superpowers/plans/deployment-plan.md
+  BASE_SHA: a7981ec
+  HEAD_SHA: 3df7661
+  DESCRIPTION: Added verifyIndex() and repairIndex() with 4 issue types
+[Subagent returns]:
+  Strengths: Clean architecture, real tests
+  Issues:
+    Important: Missing progress indicators
+    Minor: Magic number (100) for reporting interval
+  Assessment: Ready to proceed
+You: [Fix progress indicators]
+[Continue to Task 3]
+```
+## Integration with Workflows
+**Subagent-Driven Development:**
+- Review after EACH task
+- Catch issues before they compound
+- Fix before moving to next task
+**Executing Plans:**
+- Review after each batch (3 tasks)
+- Get feedback, apply, continue
+**Ad-Hoc Development:**
+- Review before merge
+- Review when stuck
+## Red Flags
+**Never:**
+- Skip review because "it's simple"
+- Ignore Critical issues
+- Proceed with unfixed Important issues
+- Argue with valid technical feedback
+**If reviewer wrong:**
+- Push back with technical reasoning
+- Show code/tests that prove it works
+- Request clarification
+See template at: requesting-code-review/code-reviewer.md

.agents/skills/requesting-code-review/code-reviewer.md ADDED Viewed

	@@ -0,0 +1,146 @@

+# Code Review Agent
+You are reviewing code changes for production readiness.
+**Your task:**
+1. Review {WHAT_WAS_IMPLEMENTED}
+2. Compare against {PLAN_OR_REQUIREMENTS}
+3. Check code quality, architecture, testing
+4. Categorize issues by severity
+5. Assess production readiness
+## What Was Implemented
+{DESCRIPTION}
+## Requirements/Plan
+{PLAN_REFERENCE}
+## Git Range to Review
+**Base:** {BASE_SHA}
+**Head:** {HEAD_SHA}
+```bash
+git diff --stat {BASE_SHA}..{HEAD_SHA}
+git diff {BASE_SHA}..{HEAD_SHA}
+```
+## Review Checklist
+**Code Quality:**
+- Clean separation of concerns?
+- Proper error handling?
+- Type safety (if applicable)?
+- DRY principle followed?
+- Edge cases handled?
+**Architecture:**
+- Sound design decisions?
+- Scalability considerations?
+- Performance implications?
+- Security concerns?
+**Testing:**
+- Tests actually test logic (not mocks)?
+- Edge cases covered?
+- Integration tests where needed?
+- All tests passing?
+**Requirements:**
+- All plan requirements met?
+- Implementation matches spec?
+- No scope creep?
+- Breaking changes documented?
+**Production Readiness:**
+- Migration strategy (if schema changes)?
+- Backward compatibility considered?
+- Documentation complete?
+- No obvious bugs?
+## Output Format
+### Strengths
+[What's well done? Be specific.]
+### Issues
+#### Critical (Must Fix)
+[Bugs, security issues, data loss risks, broken functionality]
+#### Important (Should Fix)
+[Architecture problems, missing features, poor error handling, test gaps]
+#### Minor (Nice to Have)
+[Code style, optimization opportunities, documentation improvements]
+**For each issue:**
+- File:line reference
+- What's wrong
+- Why it matters
+- How to fix (if not obvious)
+### Recommendations
+[Improvements for code quality, architecture, or process]
+### Assessment
+**Ready to merge?** [Yes/No/With fixes]
+**Reasoning:** [Technical assessment in 1-2 sentences]
+## Critical Rules
+**DO:**
+- Categorize by actual severity (not everything is Critical)
+- Be specific (file:line, not vague)
+- Explain WHY issues matter
+- Acknowledge strengths
+- Give clear verdict
+**DON'T:**
+- Say "looks good" without checking
+- Mark nitpicks as Critical
+- Give feedback on code you didn't review
+- Be vague ("improve error handling")
+- Avoid giving a clear verdict
+## Example Output
+```
+### Strengths
+- Clean database schema with proper migrations (db.ts:15-42)
+- Comprehensive test coverage (18 tests, all edge cases)
+- Good error handling with fallbacks (summarizer.ts:85-92)
+### Issues
+#### Important
+1. **Missing help text in CLI wrapper**
+   - File: index-conversations:1-31
+   - Issue: No --help flag, users won't discover --concurrency
+   - Fix: Add --help case with usage examples
+2. **Date validation missing**
+   - File: search.ts:25-27
+   - Issue: Invalid dates silently return no results
+   - Fix: Validate ISO format, throw error with example
+#### Minor
+1. **Progress indicators**
+   - File: indexer.ts:130
+   - Issue: No "X of Y" counter for long operations
+   - Impact: Users don't know how long to wait
+### Recommendations
+- Add progress reporting for user experience
+- Consider config file for excluded projects (portability)
+### Assessment
+**Ready to merge: With fixes**
+**Reasoning:** Core implementation is solid with good architecture and tests. Important issues (help text, date validation) are easily fixed and don't affect core functionality.
+```

.agents/skills/search-first/SKILL.md ADDED Viewed

	@@ -0,0 +1,161 @@

+---
+name: search-first
+description: Research-before-coding workflow. Search for existing tools, libraries, and patterns before writing custom code. Invokes the researcher agent.
+origin: ECC
+---
+# /search-first — Research Before You Code
+Systematizes the "search for existing solutions before implementing" workflow.
+## Trigger
+Use this skill when:
+- Starting a new feature that likely has existing solutions
+- Adding a dependency or integration
+- The user asks "add X functionality" and you're about to write code
+- Before creating a new utility, helper, or abstraction
+## Workflow
+```
+┌─────────────────────────────────────────────┐
+│  1. NEED ANALYSIS                           │
+│     Define what functionality is needed      │
+│     Identify language/framework constraints  │
+├─────────────────────────────────────────────┤
+│  2. PARALLEL SEARCH (researcher agent)      │
+│     ┌──────────┐ ┌──────────┐ ┌──────────┐  │
+│     │  npm /   │ │  MCP /   │ │  GitHub / │  │
+│     │  PyPI    │ │  Skills  │ │  Web      │  │
+│     └──────────┘ └──────────┘ └──────────┘  │
+├─────────────────────────────────────────────┤
+│  3. EVALUATE                                │
+│     Score candidates (functionality, maint, │
+│     community, docs, license, deps)         │
+├─────────────────────────────────────────────┤
+│  4. DECIDE                                  │
+│     ┌─────────┐  ┌──────────┐  ┌─────────┐  │
+│     │  Adopt  │  │  Extend  │  │  Build   │  │
+│     │ as-is   │  │  /Wrap   │  │  Custom  │  │
+│     └─────────┘  └──────────┘  └─────────┘  │
+├─────────────────────────────────────────────┤
+│  5. IMPLEMENT                               │
+│     Install package / Configure MCP /       │
+│     Write minimal custom code               │
+└─────────────────────────────────────────────┘
+```
+## Decision Matrix
+| Signal | Action |
+|--------|--------|
+| Exact match, well-maintained, MIT/Apache | **Adopt** — install and use directly |
+| Partial match, good foundation | **Extend** — install + write thin wrapper |
+| Multiple weak matches | **Compose** — combine 2-3 small packages |
+| Nothing suitable found | **Build** — write custom, but informed by research |
+## How to Use
+### Quick Mode (inline)
+Before writing a utility or adding functionality, mentally run through:
+0. Does this already exist in the repo? → `rg` through relevant modules/tests first
+1. Is this a common problem? → Search npm/PyPI
+2. Is there an MCP for this? → Check `~/.claude/settings.json` and search
+3. Is there a skill for this? → Check `~/.claude/skills/`
+4. Is there a GitHub implementation/template? → Run GitHub code search for maintained OSS before writing net-new code
+### Full Mode (agent)
+For non-trivial functionality, launch the researcher agent:
+```
+Task(subagent_type="general-purpose", prompt="
+  Research existing tools for: [DESCRIPTION]
+  Language/framework: [LANG]
+  Constraints: [ANY]
+  Search: npm/PyPI, MCP servers, Claude Code skills, GitHub
+  Return: Structured comparison with recommendation
+")
+```
+## Search Shortcuts by Category
+### Development Tooling
+- Linting → `eslint`, `ruff`, `textlint`, `markdownlint`
+- Formatting → `prettier`, `black`, `gofmt`
+- Testing → `jest`, `pytest`, `go test`
+- Pre-commit → `husky`, `lint-staged`, `pre-commit`
+### AI/LLM Integration
+- Claude SDK → Context7 for latest docs
+- Prompt management → Check MCP servers
+- Document processing → `unstructured`, `pdfplumber`, `mammoth`
+### Data & APIs
+- HTTP clients → `httpx` (Python), `ky`/`got` (Node)
+- Validation → `zod` (TS), `pydantic` (Python)
+- Database → Check for MCP servers first
+### Content & Publishing
+- Markdown processing → `remark`, `unified`, `markdown-it`
+- Image optimization → `sharp`, `imagemin`
+## Integration Points
+### With planner agent
+The planner should invoke researcher before Phase 1 (Architecture Review):
+- Researcher identifies available tools
+- Planner incorporates them into the implementation plan
+- Avoids "reinventing the wheel" in the plan
+### With architect agent
+The architect should consult researcher for:
+- Technology stack decisions
+- Integration pattern discovery
+- Existing reference architectures
+### With iterative-retrieval skill
+Combine for progressive discovery:
+- Cycle 1: Broad search (npm, PyPI, MCP)
+- Cycle 2: Evaluate top candidates in detail
+- Cycle 3: Test compatibility with project constraints
+## Examples
+### Example 1: "Add dead link checking"
+```
+Need: Check markdown files for broken links
+Search: npm "markdown dead link checker"
+Found: textlint-rule-no-dead-link (score: 9/10)
+Action: ADOPT — npm install textlint-rule-no-dead-link
+Result: Zero custom code, battle-tested solution
+```
+### Example 2: "Add HTTP client wrapper"
+```
+Need: Resilient HTTP client with retries and timeout handling
+Search: npm "http client retry", PyPI "httpx retry"
+Found: got (Node) with retry plugin, httpx (Python) with built-in retry
+Action: ADOPT — use got/httpx directly with retry config
+Result: Zero custom code, production-proven libraries
+```
+### Example 3: "Add config file linter"
+```
+Need: Validate project config files against a schema
+Search: npm "config linter schema", "json schema validator cli"
+Found: ajv-cli (score: 8/10)
+Action: ADOPT + EXTEND — install ajv-cli, write project-specific schema
+Result: 1 package + 1 schema file, no custom validation logic
+```
+## Anti-Patterns
+- **Jumping to code**: Writing a utility without checking if one exists
+- **Ignoring MCP**: Not checking if an MCP server already provides the capability
+- **Over-customizing**: Wrapping a library so heavily it loses its benefits
+- **Dependency bloat**: Installing a massive package for one small feature

.agents/skills/security-review/SKILL.md ADDED Viewed

	@@ -0,0 +1,495 @@

+---
+name: security-review
+description: Use this skill when adding authentication, handling user input, working with secrets, creating API endpoints, or implementing payment/sensitive features. Provides comprehensive security checklist and patterns.
+origin: ECC
+---
+# Security Review Skill
+This skill ensures all code follows security best practices and identifies potential vulnerabilities.
+## When to Activate
+- Implementing authentication or authorization
+- Handling user input or file uploads
+- Creating new API endpoints
+- Working with secrets or credentials
+- Implementing payment features
+- Storing or transmitting sensitive data
+- Integrating third-party APIs
+## Security Checklist
+### 1. Secrets Management
+#### FAIL: NEVER Do This
+```typescript
+const apiKey = "sk-proj-xxxxx"  // Hardcoded secret
+const dbPassword = "password123" // In source code
+```
+#### PASS: ALWAYS Do This
+```typescript
+const apiKey = process.env.OPENAI_API_KEY
+const dbUrl = process.env.DATABASE_URL
+// Verify secrets exist
+if (!apiKey) {
+  throw new Error('OPENAI_API_KEY not configured')
+}
+```
+#### Verification Steps
+- [ ] No hardcoded API keys, tokens, or passwords
+- [ ] All secrets in environment variables
+- [ ] `.env.local` in .gitignore
+- [ ] No secrets in git history
+- [ ] Production secrets in hosting platform (Vercel, Railway)
+### 2. Input Validation
+#### Always Validate User Input
+```typescript
+import { z } from 'zod'
+// Define validation schema
+const CreateUserSchema = z.object({
+  email: z.string().email(),
+  name: z.string().min(1).max(100),
+  age: z.number().int().min(0).max(150)
+})
+// Validate before processing
+export async function createUser(input: unknown) {
+  try {
+    const validated = CreateUserSchema.parse(input)
+    return await db.users.create(validated)
+  } catch (error) {
+    if (error instanceof z.ZodError) {
+      return { success: false, errors: error.errors }
+    }
+    throw error
+  }
+}
+```
+#### File Upload Validation
+```typescript
+function validateFileUpload(file: File) {
+  // Size check (5MB max)
+  const maxSize = 5 * 1024 * 1024
+  if (file.size > maxSize) {
+    throw new Error('File too large (max 5MB)')
+  }
+  // Type check
+  const allowedTypes = ['image/jpeg', 'image/png', 'image/gif']
+  if (!allowedTypes.includes(file.type)) {
+    throw new Error('Invalid file type')
+  }
+  // Extension check
+  const allowedExtensions = ['.jpg', '.jpeg', '.png', '.gif']
+  const extension = file.name.toLowerCase().match(/\.[^.]+$/)?.[0]
+  if (!extension || !allowedExtensions.includes(extension)) {
+    throw new Error('Invalid file extension')
+  }
+  return true
+}
+```
+#### Verification Steps
+- [ ] All user inputs validated with schemas
+- [ ] File uploads restricted (size, type, extension)
+- [ ] No direct use of user input in queries
+- [ ] Whitelist validation (not blacklist)
+- [ ] Error messages don't leak sensitive info
+### 3. SQL Injection Prevention
+#### FAIL: NEVER Concatenate SQL
+```typescript
+// DANGEROUS - SQL Injection vulnerability
+const query = `SELECT * FROM users WHERE email = '${userEmail}'`
+await db.query(query)
+```
+#### PASS: ALWAYS Use Parameterized Queries
+```typescript
+// Safe - parameterized query
+const { data } = await supabase
+  .from('users')
+  .select('*')
+  .eq('email', userEmail)
+// Or with raw SQL
+await db.query(
+  'SELECT * FROM users WHERE email = $1',
+  [userEmail]
+)
+```
+#### Verification Steps
+- [ ] All database queries use parameterized queries
+- [ ] No string concatenation in SQL
+- [ ] ORM/query builder used correctly
+- [ ] Supabase queries properly sanitized
+### 4. Authentication & Authorization
+#### JWT Token Handling
+```typescript
+// FAIL: WRONG: localStorage (vulnerable to XSS)
+localStorage.setItem('token', token)
+// PASS: CORRECT: httpOnly cookies
+res.setHeader('Set-Cookie',
+  `token=${token}; HttpOnly; Secure; SameSite=Strict; Max-Age=3600`)
+```
+#### Authorization Checks
+```typescript
+export async function deleteUser(userId: string, requesterId: string) {
+  // ALWAYS verify authorization first
+  const requester = await db.users.findUnique({
+    where: { id: requesterId }
+  })
+  if (requester.role !== 'admin') {
+    return NextResponse.json(
+      { error: 'Unauthorized' },
+      { status: 403 }
+    )
+  }
+  // Proceed with deletion
+  await db.users.delete({ where: { id: userId } })
+}
+```
+#### Row Level Security (Supabase)
+```sql
+-- Enable RLS on all tables
+ALTER TABLE users ENABLE ROW LEVEL SECURITY;
+-- Users can only view their own data
+CREATE POLICY "Users view own data"
+  ON users FOR SELECT
+  USING (auth.uid() = id);
+-- Users can only update their own data
+CREATE POLICY "Users update own data"
+  ON users FOR UPDATE
+  USING (auth.uid() = id);
+```
+#### Verification Steps
+- [ ] Tokens stored in httpOnly cookies (not localStorage)
+- [ ] Authorization checks before sensitive operations
+- [ ] Row Level Security enabled in Supabase
+- [ ] Role-based access control implemented
+- [ ] Session management secure
+### 5. XSS Prevention
+#### Sanitize HTML
+```typescript
+import DOMPurify from 'isomorphic-dompurify'
+// ALWAYS sanitize user-provided HTML
+function renderUserContent(html: string) {
+  const clean = DOMPurify.sanitize(html, {
+    ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'p'],
+    ALLOWED_ATTR: []
+  })
+  return <div dangerouslySetInnerHTML={{ __html: clean }} />
+}
+```
+#### Content Security Policy
+```typescript
+// next.config.js
+const securityHeaders = [
+  {
+    key: 'Content-Security-Policy',
+    value: `
+      default-src 'self';
+      script-src 'self' 'unsafe-eval' 'unsafe-inline';
+      style-src 'self' 'unsafe-inline';
+      img-src 'self' data: https:;
+      font-src 'self';
+      connect-src 'self' https://api.example.com;
+    `.replace(/\s{2,}/g, ' ').trim()
+  }
+]
+```
+#### Verification Steps
+- [ ] User-provided HTML sanitized
+- [ ] CSP headers configured
+- [ ] No unvalidated dynamic content rendering
+- [ ] React's built-in XSS protection used
+### 6. CSRF Protection
+#### CSRF Tokens
+```typescript
+import { csrf } from '@/lib/csrf'
+export async function POST(request: Request) {
+  const token = request.headers.get('X-CSRF-Token')
+  if (!csrf.verify(token)) {
+    return NextResponse.json(
+      { error: 'Invalid CSRF token' },
+      { status: 403 }
+    )
+  }
+  // Process request
+}
+```
+#### SameSite Cookies
+```typescript
+res.setHeader('Set-Cookie',
+  `session=${sessionId}; HttpOnly; Secure; SameSite=Strict`)
+```
+#### Verification Steps
+- [ ] CSRF tokens on state-changing operations
+- [ ] SameSite=Strict on all cookies
+- [ ] Double-submit cookie pattern implemented
+### 7. Rate Limiting
+#### API Rate Limiting
+```typescript
+import rateLimit from 'express-rate-limit'
+const limiter = rateLimit({
+  windowMs: 15 * 60 * 1000, // 15 minutes
+  max: 100, // 100 requests per window
+  message: 'Too many requests'
+})
+// Apply to routes
+app.use('/api/', limiter)
+```
+#### Expensive Operations
+```typescript
+// Aggressive rate limiting for searches
+const searchLimiter = rateLimit({
+  windowMs: 60 * 1000, // 1 minute
+  max: 10, // 10 requests per minute
+  message: 'Too many search requests'
+})
+app.use('/api/search', searchLimiter)
+```
+#### Verification Steps
+- [ ] Rate limiting on all API endpoints
+- [ ] Stricter limits on expensive operations
+- [ ] IP-based rate limiting
+- [ ] User-based rate limiting (authenticated)
+### 8. Sensitive Data Exposure
+#### Logging
+```typescript
+// FAIL: WRONG: Logging sensitive data
+console.log('User login:', { email, password })
+console.log('Payment:', { cardNumber, cvv })
+// PASS: CORRECT: Redact sensitive data
+console.log('User login:', { email, userId })
+console.log('Payment:', { last4: card.last4, userId })
+```
+#### Error Messages
+```typescript
+// FAIL: WRONG: Exposing internal details
+catch (error) {
+  return NextResponse.json(
+    { error: error.message, stack: error.stack },
+    { status: 500 }
+  )
+}
+// PASS: CORRECT: Generic error messages
+catch (error) {
+  console.error('Internal error:', error)
+  return NextResponse.json(
+    { error: 'An error occurred. Please try again.' },
+    { status: 500 }
+  )
+}
+```
+#### Verification Steps
+- [ ] No passwords, tokens, or secrets in logs
+- [ ] Error messages generic for users
+- [ ] Detailed errors only in server logs
+- [ ] No stack traces exposed to users
+### 9. Blockchain Security (Solana)
+#### Wallet Verification
+```typescript
+import { verify } from '@solana/web3.js'
+async function verifyWalletOwnership(
+  publicKey: string,
+  signature: string,
+  message: string
+) {
+  try {
+    const isValid = verify(
+      Buffer.from(message),
+      Buffer.from(signature, 'base64'),
+      Buffer.from(publicKey, 'base64')
+    )
+    return isValid
+  } catch (error) {
+    return false
+  }
+}
+```
+#### Transaction Verification
+```typescript
+async function verifyTransaction(transaction: Transaction) {
+  // Verify recipient
+  if (transaction.to !== expectedRecipient) {
+    throw new Error('Invalid recipient')
+  }
+  // Verify amount
+  if (transaction.amount > maxAmount) {
+    throw new Error('Amount exceeds limit')
+  }
+  // Verify user has sufficient balance
+  const balance = await getBalance(transaction.from)
+  if (balance < transaction.amount) {
+    throw new Error('Insufficient balance')
+  }
+  return true
+}
+```
+#### Verification Steps
+- [ ] Wallet signatures verified
+- [ ] Transaction details validated
+- [ ] Balance checks before transactions
+- [ ] No blind transaction signing
+### 10. Dependency Security
+#### Regular Updates
+```bash
+# Check for vulnerabilities
+npm audit
+# Fix automatically fixable issues
+npm audit fix
+# Update dependencies
+npm update
+# Check for outdated packages
+npm outdated
+```
+#### Lock Files
+```bash
+# ALWAYS commit lock files
+git add package-lock.json
+# Use in CI/CD for reproducible builds
+npm ci  # Instead of npm install
+```
+#### Verification Steps
+- [ ] Dependencies up to date
+- [ ] No known vulnerabilities (npm audit clean)
+- [ ] Lock files committed
+- [ ] Dependabot enabled on GitHub
+- [ ] Regular security updates
+## Security Testing
+### Automated Security Tests
+```typescript
+// Test authentication
+test('requires authentication', async () => {
+  const response = await fetch('/api/protected')
+  expect(response.status).toBe(401)
+})
+// Test authorization
+test('requires admin role', async () => {
+  const response = await fetch('/api/admin', {
+    headers: { Authorization: `Bearer ${userToken}` }
+  })
+  expect(response.status).toBe(403)
+})
+// Test input validation
+test('rejects invalid input', async () => {
+  const response = await fetch('/api/users', {
+    method: 'POST',
+    body: JSON.stringify({ email: 'not-an-email' })
+  })
+  expect(response.status).toBe(400)
+})
+// Test rate limiting
+test('enforces rate limits', async () => {
+  const requests = Array(101).fill(null).map(() =>
+    fetch('/api/endpoint')
+  )
+  const responses = await Promise.all(requests)
+  const tooManyRequests = responses.filter(r => r.status === 429)
+  expect(tooManyRequests.length).toBeGreaterThan(0)
+})
+```
+## Pre-Deployment Security Checklist
+Before ANY production deployment:
+- [ ] **Secrets**: No hardcoded secrets, all in env vars
+- [ ] **Input Validation**: All user inputs validated
+- [ ] **SQL Injection**: All queries parameterized
+- [ ] **XSS**: User content sanitized
+- [ ] **CSRF**: Protection enabled
+- [ ] **Authentication**: Proper token handling
+- [ ] **Authorization**: Role checks in place
+- [ ] **Rate Limiting**: Enabled on all endpoints
+- [ ] **HTTPS**: Enforced in production
+- [ ] **Security Headers**: CSP, X-Frame-Options configured
+- [ ] **Error Handling**: No sensitive data in errors
+- [ ] **Logging**: No sensitive data logged
+- [ ] **Dependencies**: Up to date, no vulnerabilities
+- [ ] **Row Level Security**: Enabled in Supabase
+- [ ] **CORS**: Properly configured
+- [ ] **File Uploads**: Validated (size, type)
+- [ ] **Wallet Signatures**: Verified (if blockchain)
+## Resources
+- [OWASP Top 10](https://owasp.org/www-project-top-ten/)
+- [Next.js Security](https://nextjs.org/docs/security)
+- [Supabase Security](https://supabase.com/docs/guides/auth)
+- [Web Security Academy](https://portswigger.net/web-security)
+---
+**Remember**: Security is not optional. One vulnerability can compromise the entire platform. When in doubt, err on the side of caution.

.agents/skills/security-review/cloud-infrastructure-security.md ADDED Viewed

	@@ -0,0 +1,361 @@

+| name | description |
+|------|-------------|
+| cloud-infrastructure-security | Use this skill when deploying to cloud platforms, configuring infrastructure, managing IAM policies, setting up logging/monitoring, or implementing CI/CD pipelines. Provides cloud security checklist aligned with best practices. |
+# Cloud & Infrastructure Security Skill
+This skill ensures cloud infrastructure, CI/CD pipelines, and deployment configurations follow security best practices and comply with industry standards.
+## When to Activate
+- Deploying applications to cloud platforms (AWS, Vercel, Railway, Cloudflare)
+- Configuring IAM roles and permissions
+- Setting up CI/CD pipelines
+- Implementing infrastructure as code (Terraform, CloudFormation)
+- Configuring logging and monitoring
+- Managing secrets in cloud environments
+- Setting up CDN and edge security
+- Implementing disaster recovery and backup strategies
+## Cloud Security Checklist
+### 1. IAM & Access Control
+#### Principle of Least Privilege
+```yaml
+# PASS: CORRECT: Minimal permissions
+iam_role:
+  permissions:
+    - s3:GetObject  # Only read access
+    - s3:ListBucket
+  resources:
+    - arn:aws:s3:::my-bucket/*  # Specific bucket only
+# FAIL: WRONG: Overly broad permissions
+iam_role:
+  permissions:
+    - s3:*  # All S3 actions
+  resources:
+    - "*"  # All resources
+```
+#### Multi-Factor Authentication (MFA)
+```bash
+# ALWAYS enable MFA for root/admin accounts
+aws iam enable-mfa-device \
+  --user-name admin \
+  --serial-number arn:aws:iam::123456789:mfa/admin \
+  --authentication-code1 123456 \
+  --authentication-code2 789012
+```
+#### Verification Steps
+- [ ] No root account usage in production
+- [ ] MFA enabled for all privileged accounts
+- [ ] Service accounts use roles, not long-lived credentials
+- [ ] IAM policies follow least privilege
+- [ ] Regular access reviews conducted
+- [ ] Unused credentials rotated or removed
+### 2. Secrets Management
+#### Cloud Secrets Managers
+```typescript
+// PASS: CORRECT: Use cloud secrets manager
+import { SecretsManager } from '@aws-sdk/client-secrets-manager';
+const client = new SecretsManager({ region: 'us-east-1' });
+const secret = await client.getSecretValue({ SecretId: 'prod/api-key' });
+const apiKey = JSON.parse(secret.SecretString).key;
+// FAIL: WRONG: Hardcoded or in environment variables only
+const apiKey = process.env.API_KEY; // Not rotated, not audited
+```
+#### Secrets Rotation
+```bash
+# Set up automatic rotation for database credentials
+aws secretsmanager rotate-secret \
+  --secret-id prod/db-password \
+  --rotation-lambda-arn arn:aws:lambda:region:account:function:rotate \
+  --rotation-rules AutomaticallyAfterDays=30
+```
+#### Verification Steps
+- [ ] All secrets stored in cloud secrets manager (AWS Secrets Manager, Vercel Secrets)
+- [ ] Automatic rotation enabled for database credentials
+- [ ] API keys rotated at least quarterly
+- [ ] No secrets in code, logs, or error messages
+- [ ] Audit logging enabled for secret access
+### 3. Network Security
+#### VPC and Firewall Configuration
+```terraform
+# PASS: CORRECT: Restricted security group
+resource "aws_security_group" "app" {
+  name = "app-sg"
+  ingress {
+    from_port   = 443
+    to_port     = 443
+    protocol    = "tcp"
+    cidr_blocks = ["10.0.0.0/16"]  # Internal VPC only
+  }
+  egress {
+    from_port   = 443
+    to_port     = 443
+    protocol    = "tcp"
+    cidr_blocks = ["0.0.0.0/0"]  # Only HTTPS outbound
+  }
+}
+# FAIL: WRONG: Open to the internet
+resource "aws_security_group" "bad" {
+  ingress {
+    from_port   = 0
+    to_port     = 65535
+    protocol    = "tcp"
+    cidr_blocks = ["0.0.0.0/0"]  # All ports, all IPs!
+  }
+}
+```
+#### Verification Steps
+- [ ] Database not publicly accessible
+- [ ] SSH/RDP ports restricted to VPN/bastion only
+- [ ] Security groups follow least privilege
+- [ ] Network ACLs configured
+- [ ] VPC flow logs enabled
+### 4. Logging & Monitoring
+#### CloudWatch/Logging Configuration
+```typescript
+// PASS: CORRECT: Comprehensive logging
+import { CloudWatchLogsClient, CreateLogStreamCommand } from '@aws-sdk/client-cloudwatch-logs';
+const logSecurityEvent = async (event: SecurityEvent) => {
+  await cloudwatch.putLogEvents({
+    logGroupName: '/aws/security/events',
+    logStreamName: 'authentication',
+    logEvents: [{
+      timestamp: Date.now(),
+      message: JSON.stringify({
+        type: event.type,
+        userId: event.userId,
+        ip: event.ip,
+        result: event.result,
+        // Never log sensitive data
+      })
+    }]
+  });
+};
+```
+#### Verification Steps
+- [ ] CloudWatch/logging enabled for all services
+- [ ] Failed authentication attempts logged
+- [ ] Admin actions audited
+- [ ] Log retention configured (90+ days for compliance)
+- [ ] Alerts configured for suspicious activity
+- [ ] Logs centralized and tamper-proof
+### 5. CI/CD Pipeline Security
+#### Secure Pipeline Configuration
+```yaml
+# PASS: CORRECT: Secure GitHub Actions workflow
+name: Deploy
+on:
+  push:
+    branches: [main]
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read  # Minimal permissions
+    steps:
+      - uses: actions/checkout@v4
+      # Scan for secrets
+      - name: Secret scanning
+        uses: trufflesecurity/trufflehog@main
+      # Dependency audit
+      - name: Audit dependencies
+        run: npm audit --audit-level=high
+      # Use OIDC, not long-lived tokens
+      - name: Configure AWS credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: arn:aws:iam::123456789:role/GitHubActionsRole
+          aws-region: us-east-1
+```
+#### Supply Chain Security
+```json
+// package.json - Use lock files and integrity checks
+{
+  "scripts": {
+    "install": "npm ci",  // Use ci for reproducible builds
+    "audit": "npm audit --audit-level=moderate",
+    "check": "npm outdated"
+  }
+}
+```
+#### Verification Steps
+- [ ] OIDC used instead of long-lived credentials
+- [ ] Secrets scanning in pipeline
+- [ ] Dependency vulnerability scanning
+- [ ] Container image scanning (if applicable)
+- [ ] Branch protection rules enforced
+- [ ] Code review required before merge
+- [ ] Signed commits enforced
+### 6. Cloudflare & CDN Security
+#### Cloudflare Security Configuration
+```typescript
+// PASS: CORRECT: Cloudflare Workers with security headers
+export default {
+  async fetch(request: Request): Promise<Response> {
+    const response = await fetch(request);
+    // Add security headers
+    const headers = new Headers(response.headers);
+    headers.set('X-Frame-Options', 'DENY');
+    headers.set('X-Content-Type-Options', 'nosniff');
+    headers.set('Referrer-Policy', 'strict-origin-when-cross-origin');
+    headers.set('Permissions-Policy', 'geolocation=(), microphone=()');
+    return new Response(response.body, {
+      status: response.status,
+      headers
+    });
+  }
+};
+```
+#### WAF Rules
+```bash
+# Enable Cloudflare WAF managed rules
+# - OWASP Core Ruleset
+# - Cloudflare Managed Ruleset
+# - Rate limiting rules
+# - Bot protection
+```
+#### Verification Steps
+- [ ] WAF enabled with OWASP rules
+- [ ] Rate limiting configured
+- [ ] Bot protection active
+- [ ] DDoS protection enabled
+- [ ] Security headers configured
+- [ ] SSL/TLS strict mode enabled
+### 7. Backup & Disaster Recovery
+#### Automated Backups
+```terraform
+# PASS: CORRECT: Automated RDS backups
+resource "aws_db_instance" "main" {
+  allocated_storage     = 20
+  engine               = "postgres"
+  backup_retention_period = 30  # 30 days retention
+  backup_window          = "03:00-04:00"
+  maintenance_window     = "mon:04:00-mon:05:00"
+  enabled_cloudwatch_logs_exports = ["postgresql"]
+  deletion_protection = true  # Prevent accidental deletion
+}
+```
+#### Verification Steps
+- [ ] Automated daily backups configured
+- [ ] Backup retention meets compliance requirements
+- [ ] Point-in-time recovery enabled
+- [ ] Backup testing performed quarterly
+- [ ] Disaster recovery plan documented
+- [ ] RPO and RTO defined and tested
+## Pre-Deployment Cloud Security Checklist
+Before ANY production cloud deployment:
+- [ ] **IAM**: Root account not used, MFA enabled, least privilege policies
+- [ ] **Secrets**: All secrets in cloud secrets manager with rotation
+- [ ] **Network**: Security groups restricted, no public databases
+- [ ] **Logging**: CloudWatch/logging enabled with retention
+- [ ] **Monitoring**: Alerts configured for anomalies
+- [ ] **CI/CD**: OIDC auth, secrets scanning, dependency audits
+- [ ] **CDN/WAF**: Cloudflare WAF enabled with OWASP rules
+- [ ] **Encryption**: Data encrypted at rest and in transit
+- [ ] **Backups**: Automated backups with tested recovery
+- [ ] **Compliance**: GDPR/HIPAA requirements met (if applicable)
+- [ ] **Documentation**: Infrastructure documented, runbooks created
+- [ ] **Incident Response**: Security incident plan in place
+## Common Cloud Security Misconfigurations
+### S3 Bucket Exposure
+```bash
+# FAIL: WRONG: Public bucket
+aws s3api put-bucket-acl --bucket my-bucket --acl public-read
+# PASS: CORRECT: Private bucket with specific access
+aws s3api put-bucket-acl --bucket my-bucket --acl private
+aws s3api put-bucket-policy --bucket my-bucket --policy file://policy.json
+```
+### RDS Public Access
+```terraform
+# FAIL: WRONG
+resource "aws_db_instance" "bad" {
+  publicly_accessible = true  # NEVER do this!
+}
+# PASS: CORRECT
+resource "aws_db_instance" "good" {
+  publicly_accessible = false
+  vpc_security_group_ids = [aws_security_group.db.id]
+}
+```
+## Resources
+- [AWS Security Best Practices](https://aws.amazon.com/security/best-practices/)
+- [CIS AWS Foundations Benchmark](https://www.cisecurity.org/benchmark/amazon_web_services)
+- [Cloudflare Security Documentation](https://developers.cloudflare.com/security/)
+- [OWASP Cloud Security](https://owasp.org/www-project-cloud-security/)
+- [Terraform Security Best Practices](https://www.terraform.io/docs/cloud/guides/recommended-practices/)
+**Remember**: Cloud misconfigurations are the leading cause of data breaches. A single exposed S3 bucket or overly permissive IAM policy can compromise your entire infrastructure. Always follow the principle of least privilege and defense in depth.

.agents/skills/subagent-driven-development/SKILL.md ADDED Viewed

	@@ -0,0 +1,277 @@

+---
+name: subagent-driven-development
+description: Use when executing implementation plans with independent tasks in the current session
+---
+# Subagent-Driven Development
+Execute plan by dispatching fresh subagent per task, with two-stage review after each: spec compliance review first, then code quality review.
+**Why subagents:** You delegate tasks to specialized agents with isolated context. By precisely crafting their instructions and context, you ensure they stay focused and succeed at their task. They should never inherit your session's context or history — you construct exactly what they need. This also preserves your own context for coordination work.
+**Core principle:** Fresh subagent per task + two-stage review (spec then quality) = high quality, fast iteration
+## When to Use
+```dot
+digraph when_to_use {
+    "Have implementation plan?" [shape=diamond];
+    "Tasks mostly independent?" [shape=diamond];
+    "Stay in this session?" [shape=diamond];
+    "subagent-driven-development" [shape=box];
+    "executing-plans" [shape=box];
+    "Manual execution or brainstorm first" [shape=box];
+    "Have implementation plan?" -> "Tasks mostly independent?" [label="yes"];
+    "Have implementation plan?" -> "Manual execution or brainstorm first" [label="no"];
+    "Tasks mostly independent?" -> "Stay in this session?" [label="yes"];
+    "Tasks mostly independent?" -> "Manual execution or brainstorm first" [label="no - tightly coupled"];
+    "Stay in this session?" -> "subagent-driven-development" [label="yes"];
+    "Stay in this session?" -> "executing-plans" [label="no - parallel session"];
+}
+```
+**vs. Executing Plans (parallel session):**
+- Same session (no context switch)
+- Fresh subagent per task (no context pollution)
+- Two-stage review after each task: spec compliance first, then code quality
+- Faster iteration (no human-in-loop between tasks)
+## The Process
+```dot
+digraph process {
+    rankdir=TB;
+    subgraph cluster_per_task {
+        label="Per Task";
+        "Dispatch implementer subagent (./implementer-prompt.md)" [shape=box];
+        "Implementer subagent asks questions?" [shape=diamond];
+        "Answer questions, provide context" [shape=box];
+        "Implementer subagent implements, tests, commits, self-reviews" [shape=box];
+        "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [shape=box];
+        "Spec reviewer subagent confirms code matches spec?" [shape=diamond];
+        "Implementer subagent fixes spec gaps" [shape=box];
+        "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [shape=box];
+        "Code quality reviewer subagent approves?" [shape=diamond];
+        "Implementer subagent fixes quality issues" [shape=box];
+        "Mark task complete in TodoWrite" [shape=box];
+    }
+    "Read plan, extract all tasks with full text, note context, create TodoWrite" [shape=box];
+    "More tasks remain?" [shape=diamond];
+    "Dispatch final code reviewer subagent for entire implementation" [shape=box];
+    "Use superpowers:finishing-a-development-branch" [shape=box style=filled fillcolor=lightgreen];
+    "Read plan, extract all tasks with full text, note context, create TodoWrite" -> "Dispatch implementer subagent (./implementer-prompt.md)";
+    "Dispatch implementer subagent (./implementer-prompt.md)" -> "Implementer subagent asks questions?";
+    "Implementer subagent asks questions?" -> "Answer questions, provide context" [label="yes"];
+    "Answer questions, provide context" -> "Dispatch implementer subagent (./implementer-prompt.md)";
+    "Implementer subagent asks questions?" -> "Implementer subagent implements, tests, commits, self-reviews" [label="no"];
+    "Implementer subagent implements, tests, commits, self-reviews" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)";
+    "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" -> "Spec reviewer subagent confirms code matches spec?";
+    "Spec reviewer subagent confirms code matches spec?" -> "Implementer subagent fixes spec gaps" [label="no"];
+    "Implementer subagent fixes spec gaps" -> "Dispatch spec reviewer subagent (./spec-reviewer-prompt.md)" [label="re-review"];
+    "Spec reviewer subagent confirms code matches spec?" -> "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [label="yes"];
+    "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" -> "Code quality reviewer subagent approves?";
+    "Code quality reviewer subagent approves?" -> "Implementer subagent fixes quality issues" [label="no"];
+    "Implementer subagent fixes quality issues" -> "Dispatch code quality reviewer subagent (./code-quality-reviewer-prompt.md)" [label="re-review"];
+    "Code quality reviewer subagent approves?" -> "Mark task complete in TodoWrite" [label="yes"];
+    "Mark task complete in TodoWrite" -> "More tasks remain?";
+    "More tasks remain?" -> "Dispatch implementer subagent (./implementer-prompt.md)" [label="yes"];
+    "More tasks remain?" -> "Dispatch final code reviewer subagent for entire implementation" [label="no"];
+    "Dispatch final code reviewer subagent for entire implementation" -> "Use superpowers:finishing-a-development-branch";
+}
+```
+## Model Selection
+Use the least powerful model that can handle each role to conserve cost and increase speed.
+**Mechanical implementation tasks** (isolated functions, clear specs, 1-2 files): use a fast, cheap model. Most implementation tasks are mechanical when the plan is well-specified.
+**Integration and judgment tasks** (multi-file coordination, pattern matching, debugging): use a standard model.
+**Architecture, design, and review tasks**: use the most capable available model.
+**Task complexity signals:**
+- Touches 1-2 files with a complete spec → cheap model
+- Touches multiple files with integration concerns → standard model
+- Requires design judgment or broad codebase understanding → most capable model
+## Handling Implementer Status
+Implementer subagents report one of four statuses. Handle each appropriately:
+**DONE:** Proceed to spec compliance review.
+**DONE_WITH_CONCERNS:** The implementer completed the work but flagged doubts. Read the concerns before proceeding. If the concerns are about correctness or scope, address them before review. If they're observations (e.g., "this file is getting large"), note them and proceed to review.
+**NEEDS_CONTEXT:** The implementer needs information that wasn't provided. Provide the missing context and re-dispatch.
+**BLOCKED:** The implementer cannot complete the task. Assess the blocker:
+1. If it's a context problem, provide more context and re-dispatch with the same model
+2. If the task requires more reasoning, re-dispatch with a more capable model
+3. If the task is too large, break it into smaller pieces
+4. If the plan itself is wrong, escalate to the human
+**Never** ignore an escalation or force the same model to retry without changes. If the implementer said it's stuck, something needs to change.
+## Prompt Templates
+- `./implementer-prompt.md` - Dispatch implementer subagent
+- `./spec-reviewer-prompt.md` - Dispatch spec compliance reviewer subagent
+- `./code-quality-reviewer-prompt.md` - Dispatch code quality reviewer subagent
+## Example Workflow
+```
+You: I'm using Subagent-Driven Development to execute this plan.
+[Read plan file once: docs/superpowers/plans/feature-plan.md]
+[Extract all 5 tasks with full text and context]
+[Create TodoWrite with all tasks]
+Task 1: Hook installation script
+[Get Task 1 text and context (already extracted)]
+[Dispatch implementation subagent with full task text + context]
+Implementer: "Before I begin - should the hook be installed at user or system level?"
+You: "User level (~/.config/superpowers/hooks/)"
+Implementer: "Got it. Implementing now..."
+[Later] Implementer:
+  - Implemented install-hook command
+  - Added tests, 5/5 passing
+  - Self-review: Found I missed --force flag, added it
+  - Committed
+[Dispatch spec compliance reviewer]
+Spec reviewer: ✅ Spec compliant - all requirements met, nothing extra
+[Get git SHAs, dispatch code quality reviewer]
+Code reviewer: Strengths: Good test coverage, clean. Issues: None. Approved.
+[Mark Task 1 complete]
+Task 2: Recovery modes
+[Get Task 2 text and context (already extracted)]
+[Dispatch implementation subagent with full task text + context]
+Implementer: [No questions, proceeds]
+Implementer:
+  - Added verify/repair modes
+  - 8/8 tests passing
+  - Self-review: All good
+  - Committed
+[Dispatch spec compliance reviewer]
+Spec reviewer: ❌ Issues:
+  - Missing: Progress reporting (spec says "report every 100 items")
+  - Extra: Added --json flag (not requested)
+[Implementer fixes issues]
+Implementer: Removed --json flag, added progress reporting
+[Spec reviewer reviews again]
+Spec reviewer: ✅ Spec compliant now
+[Dispatch code quality reviewer]
+Code reviewer: Strengths: Solid. Issues (Important): Magic number (100)
+[Implementer fixes]
+Implementer: Extracted PROGRESS_INTERVAL constant
+[Code reviewer reviews again]
+Code reviewer: ✅ Approved
+[Mark Task 2 complete]
+...
+[After all tasks]
+[Dispatch final code-reviewer]
+Final reviewer: All requirements met, ready to merge
+Done!
+```
+## Advantages
+**vs. Manual execution:**
+- Subagents follow TDD naturally
+- Fresh context per task (no confusion)
+- Parallel-safe (subagents don't interfere)
+- Subagent can ask questions (before AND during work)
+**vs. Executing Plans:**
+- Same session (no handoff)
+- Continuous progress (no waiting)
+- Review checkpoints automatic
+**Efficiency gains:**
+- No file reading overhead (controller provides full text)
+- Controller curates exactly what context is needed
+- Subagent gets complete information upfront
+- Questions surfaced before work begins (not after)
+**Quality gates:**
+- Self-review catches issues before handoff
+- Two-stage review: spec compliance, then code quality
+- Review loops ensure fixes actually work
+- Spec compliance prevents over/under-building
+- Code quality ensures implementation is well-built
+**Cost:**
+- More subagent invocations (implementer + 2 reviewers per task)
+- Controller does more prep work (extracting all tasks upfront)
+- Review loops add iterations
+- But catches issues early (cheaper than debugging later)
+## Red Flags
+**Never:**
+- Start implementation on main/master branch without explicit user consent
+- Skip reviews (spec compliance OR code quality)
+- Proceed with unfixed issues
+- Dispatch multiple implementation subagents in parallel (conflicts)
+- Make subagent read plan file (provide full text instead)
+- Skip scene-setting context (subagent needs to understand where task fits)
+- Ignore subagent questions (answer before letting them proceed)
+- Accept "close enough" on spec compliance (spec reviewer found issues = not done)
+- Skip review loops (reviewer found issues = implementer fixes = review again)
+- Let implementer self-review replace actual review (both are needed)
+- **Start code quality review before spec compliance is ✅** (wrong order)
+- Move to next task while either review has open issues
+**If subagent asks questions:**
+- Answer clearly and completely
+- Provide additional context if needed
+- Don't rush them into implementation
+**If reviewer finds issues:**
+- Implementer (same subagent) fixes them
+- Reviewer reviews again
+- Repeat until approved
+- Don't skip the re-review
+**If subagent fails task:**
+- Dispatch fix subagent with specific instructions
+- Don't try to fix manually (context pollution)
+## Integration
+**Required workflow skills:**
+- **superpowers:using-git-worktrees** - REQUIRED: Set up isolated workspace before starting
+- **superpowers:writing-plans** - Creates the plan this skill executes
+- **superpowers:requesting-code-review** - Code review template for reviewer subagents
+- **superpowers:finishing-a-development-branch** - Complete development after all tasks
+**Subagents should use:**
+- **superpowers:test-driven-development** - Subagents follow TDD for each task
+**Alternative workflow:**
+- **superpowers:executing-plans** - Use for parallel session instead of same-session execution

.agents/skills/subagent-driven-development/code-quality-reviewer-prompt.md ADDED Viewed

	@@ -0,0 +1,26 @@

+# Code Quality Reviewer Prompt Template
+Use this template when dispatching a code quality reviewer subagent.
+**Purpose:** Verify implementation is well-built (clean, tested, maintainable)
+**Only dispatch after spec compliance review passes.**
+```
+Task tool (superpowers:code-reviewer):
+  Use template at requesting-code-review/code-reviewer.md
+  WHAT_WAS_IMPLEMENTED: [from implementer's report]
+  PLAN_OR_REQUIREMENTS: Task N from [plan-file]
+  BASE_SHA: [commit before task]
+  HEAD_SHA: [current commit]
+  DESCRIPTION: [task summary]
+```
+**In addition to standard code quality concerns, the reviewer should check:**
+- Does each file have one clear responsibility with a well-defined interface?
+- Are units decomposed so they can be understood and tested independently?
+- Is the implementation following the file structure from the plan?
+- Did this implementation create new files that are already large, or significantly grow existing files? (Don't flag pre-existing file sizes — focus on what this change contributed.)
+**Code reviewer returns:** Strengths, Issues (Critical/Important/Minor), Assessment

.agents/skills/subagent-driven-development/implementer-prompt.md ADDED Viewed

	@@ -0,0 +1,113 @@

+# Implementer Subagent Prompt Template
+Use this template when dispatching an implementer subagent.
+```
+Task tool (general-purpose):
+  description: "Implement Task N: [task name]"
+  prompt: |
+    You are implementing Task N: [task name]
+    ## Task Description
+    [FULL TEXT of task from plan - paste it here, don't make subagent read file]
+    ## Context
+    [Scene-setting: where this fits, dependencies, architectural context]
+    ## Before You Begin
+    If you have questions about:
+    - The requirements or acceptance criteria
+    - The approach or implementation strategy
+    - Dependencies or assumptions
+    - Anything unclear in the task description
+    **Ask them now.** Raise any concerns before starting work.
+    ## Your Job
+    Once you're clear on requirements:
+    1. Implement exactly what the task specifies
+    2. Write tests (following TDD if task says to)
+    3. Verify implementation works
+    4. Commit your work
+    5. Self-review (see below)
+    6. Report back
+    Work from: [directory]
+    **While you work:** If you encounter something unexpected or unclear, **ask questions**.
+    It's always OK to pause and clarify. Don't guess or make assumptions.
+    ## Code Organization
+    You reason best about code you can hold in context at once, and your edits are more
+    reliable when files are focused. Keep this in mind:
+    - Follow the file structure defined in the plan
+    - Each file should have one clear responsibility with a well-defined interface
+    - If a file you're creating is growing beyond the plan's intent, stop and report
+      it as DONE_WITH_CONCERNS — don't split files on your own without plan guidance
+    - If an existing file you're modifying is already large or tangled, work carefully
+      and note it as a concern in your report
+    - In existing codebases, follow established patterns. Improve code you're touching
+      the way a good developer would, but don't restructure things outside your task.
+    ## When You're in Over Your Head
+    It is always OK to stop and say "this is too hard for me." Bad work is worse than
+    no work. You will not be penalized for escalating.
+    **STOP and escalate when:**
+    - The task requires architectural decisions with multiple valid approaches
+    - You need to understand code beyond what was provided and can't find clarity
+    - You feel uncertain about whether your approach is correct
+    - The task involves restructuring existing code in ways the plan didn't anticipate
+    - You've been reading file after file trying to understand the system without progress
+    **How to escalate:** Report back with status BLOCKED or NEEDS_CONTEXT. Describe
+    specifically what you're stuck on, what you've tried, and what kind of help you need.
+    The controller can provide more context, re-dispatch with a more capable model,
+    or break the task into smaller pieces.
+    ## Before Reporting Back: Self-Review
+    Review your work with fresh eyes. Ask yourself:
+    **Completeness:**
+    - Did I fully implement everything in the spec?
+    - Did I miss any requirements?
+    - Are there edge cases I didn't handle?
+    **Quality:**
+    - Is this my best work?
+    - Are names clear and accurate (match what things do, not how they work)?
+    - Is the code clean and maintainable?
+    **Discipline:**
+    - Did I avoid overbuilding (YAGNI)?
+    - Did I only build what was requested?
+    - Did I follow existing patterns in the codebase?
+    **Testing:**
+    - Do tests actually verify behavior (not just mock behavior)?
+    - Did I follow TDD if required?
+    - Are tests comprehensive?
+    If you find issues during self-review, fix them now before reporting.
+    ## Report Format
+    When done, report:
+    - **Status:** DONE | DONE_WITH_CONCERNS | BLOCKED | NEEDS_CONTEXT
+    - What you implemented (or what you attempted, if blocked)
+    - What you tested and test results
+    - Files changed
+    - Self-review findings (if any)
+    - Any issues or concerns
+    Use DONE_WITH_CONCERNS if you completed the work but have doubts about correctness.
+    Use BLOCKED if you cannot complete the task. Use NEEDS_CONTEXT if you need
+    information that wasn't provided. Never silently produce work you're unsure about.
+```

.agents/skills/subagent-driven-development/spec-reviewer-prompt.md ADDED Viewed

	@@ -0,0 +1,61 @@

+# Spec Compliance Reviewer Prompt Template
+Use this template when dispatching a spec compliance reviewer subagent.
+**Purpose:** Verify implementer built what was requested (nothing more, nothing less)
+```
+Task tool (general-purpose):
+  description: "Review spec compliance for Task N"
+  prompt: |
+    You are reviewing whether an implementation matches its specification.
+    ## What Was Requested
+    [FULL TEXT of task requirements]
+    ## What Implementer Claims They Built
+    [From implementer's report]
+    ## CRITICAL: Do Not Trust the Report
+    The implementer finished suspiciously quickly. Their report may be incomplete,
+    inaccurate, or optimistic. You MUST verify everything independently.
+    **DO NOT:**
+    - Take their word for what they implemented
+    - Trust their claims about completeness
+    - Accept their interpretation of requirements
+    **DO:**
+    - Read the actual code they wrote
+    - Compare actual implementation to requirements line by line
+    - Check for missing pieces they claimed to implement
+    - Look for extra features they didn't mention
+    ## Your Job
+    Read the implementation code and verify:
+    **Missing requirements:**
+    - Did they implement everything that was requested?
+    - Are there requirements they skipped or missed?
+    - Did they claim something works but didn't actually implement it?
+    **Extra/unneeded work:**
+    - Did they build things that weren't requested?
+    - Did they over-engineer or add unnecessary features?
+    - Did they add "nice to haves" that weren't in spec?
+    **Misunderstandings:**
+    - Did they interpret requirements differently than intended?
+    - Did they solve the wrong problem?
+    - Did they implement the right feature but wrong way?
+    **Verify by reading code, not by trusting report.**
+    Report:
+    - ✅ Spec compliant (if everything matches after code inspection)
+    - ❌ Issues found: [list specifically what's missing or extra, with file:line references]
+```

.agents/skills/systematic-debugging/CREATION-LOG.md ADDED Viewed

	@@ -0,0 +1,119 @@

+# Creation Log: Systematic Debugging Skill
+Reference example of extracting, structuring, and bulletproofing a critical skill.
+## Source Material
+Extracted debugging framework from `/Users/jesse/.claude/CLAUDE.md`:
+- 4-phase systematic process (Investigation → Pattern Analysis → Hypothesis → Implementation)
+- Core mandate: ALWAYS find root cause, NEVER fix symptoms
+- Rules designed to resist time pressure and rationalization
+## Extraction Decisions
+**What to include:**
+- Complete 4-phase framework with all rules
+- Anti-shortcuts ("NEVER fix symptom", "STOP and re-analyze")
+- Pressure-resistant language ("even if faster", "even if I seem in a hurry")
+- Concrete steps for each phase
+**What to leave out:**
+- Project-specific context
+- Repetitive variations of same rule
+- Narrative explanations (condensed to principles)
+## Structure Following skill-creation/SKILL.md
+1. **Rich when_to_use** - Included symptoms and anti-patterns
+2. **Type: technique** - Concrete process with steps
+3. **Keywords** - "root cause", "symptom", "workaround", "debugging", "investigation"
+4. **Flowchart** - Decision point for "fix failed" → re-analyze vs add more fixes
+5. **Phase-by-phase breakdown** - Scannable checklist format
+6. **Anti-patterns section** - What NOT to do (critical for this skill)
+## Bulletproofing Elements
+Framework designed to resist rationalization under pressure:
+### Language Choices
+- "ALWAYS" / "NEVER" (not "should" / "try to")
+- "even if faster" / "even if I seem in a hurry"
+- "STOP and re-analyze" (explicit pause)
+- "Don't skip past" (catches the actual behavior)
+### Structural Defenses
+- **Phase 1 required** - Can't skip to implementation
+- **Single hypothesis rule** - Forces thinking, prevents shotgun fixes
+- **Explicit failure mode** - "IF your first fix doesn't work" with mandatory action
+- **Anti-patterns section** - Shows exactly what shortcuts look like
+### Redundancy
+- Root cause mandate in overview + when_to_use + Phase 1 + implementation rules
+- "NEVER fix symptom" appears 4 times in different contexts
+- Each phase has explicit "don't skip" guidance
+## Testing Approach
+Created 4 validation tests following skills/meta/testing-skills-with-subagents:
+### Test 1: Academic Context (No Pressure)
+- Simple bug, no time pressure
+- **Result:** Perfect compliance, complete investigation
+### Test 2: Time Pressure + Obvious Quick Fix
+- User "in a hurry", symptom fix looks easy
+- **Result:** Resisted shortcut, followed full process, found real root cause
+### Test 3: Complex System + Uncertainty
+- Multi-layer failure, unclear if can find root cause
+- **Result:** Systematic investigation, traced through all layers, found source
+### Test 4: Failed First Fix
+- Hypothesis doesn't work, temptation to add more fixes
+- **Result:** Stopped, re-analyzed, formed new hypothesis (no shotgun)
+**All tests passed.** No rationalizations found.
+## Iterations
+### Initial Version
+- Complete 4-phase framework
+- Anti-patterns section
+- Flowchart for "fix failed" decision
+### Enhancement 1: TDD Reference
+- Added link to skills/testing/test-driven-development
+- Note explaining TDD's "simplest code" ≠ debugging's "root cause"
+- Prevents confusion between methodologies
+## Final Outcome
+Bulletproof skill that:
+- ✅ Clearly mandates root cause investigation
+- ✅ Resists time pressure rationalization
+- ✅ Provides concrete steps for each phase
+- ✅ Shows anti-patterns explicitly
+- ✅ Tested under multiple pressure scenarios
+- ✅ Clarifies relationship to TDD
+- ✅ Ready for use
+## Key Insight
+**Most important bulletproofing:** Anti-patterns section showing exact shortcuts that feel justified in the moment. When Claude thinks "I'll just add this one quick fix", seeing that exact pattern listed as wrong creates cognitive friction.
+## Usage Example
+When encountering a bug:
+1. Load skill: skills/debugging/systematic-debugging
+2. Read overview (10 sec) - reminded of mandate
+3. Follow Phase 1 checklist - forced investigation
+4. If tempted to skip - see anti-pattern, stop
+5. Complete all phases - root cause found
+**Time investment:** 5-10 minutes
+**Time saved:** Hours of symptom-whack-a-mole
+---
+*Created: 2025-10-03*
+*Purpose: Reference example for skill extraction and bulletproofing*

.agents/skills/systematic-debugging/SKILL.md ADDED Viewed

	@@ -0,0 +1,296 @@

+---
+name: systematic-debugging
+description: Use when encountering any bug, test failure, or unexpected behavior, before proposing fixes
+---
+# Systematic Debugging
+## Overview
+Random fixes waste time and create new bugs. Quick patches mask underlying issues.
+**Core principle:** ALWAYS find root cause before attempting fixes. Symptom fixes are failure.
+**Violating the letter of this process is violating the spirit of debugging.**
+## The Iron Law
+```
+NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST
+```
+If you haven't completed Phase 1, you cannot propose fixes.
+## When to Use
+Use for ANY technical issue:
+- Test failures
+- Bugs in production
+- Unexpected behavior
+- Performance problems
+- Build failures
+- Integration issues
+**Use this ESPECIALLY when:**
+- Under time pressure (emergencies make guessing tempting)
+- "Just one quick fix" seems obvious
+- You've already tried multiple fixes
+- Previous fix didn't work
+- You don't fully understand the issue
+**Don't skip when:**
+- Issue seems simple (simple bugs have root causes too)
+- You're in a hurry (rushing guarantees rework)
+- Manager wants it fixed NOW (systematic is faster than thrashing)
+## The Four Phases
+You MUST complete each phase before proceeding to the next.
+### Phase 1: Root Cause Investigation
+**BEFORE attempting ANY fix:**
+1. **Read Error Messages Carefully**
+   - Don't skip past errors or warnings
+   - They often contain the exact solution
+   - Read stack traces completely
+   - Note line numbers, file paths, error codes
+2. **Reproduce Consistently**
+   - Can you trigger it reliably?
+   - What are the exact steps?
+   - Does it happen every time?
+   - If not reproducible → gather more data, don't guess
+3. **Check Recent Changes**
+   - What changed that could cause this?
+   - Git diff, recent commits
+   - New dependencies, config changes
+   - Environmental differences
+4. **Gather Evidence in Multi-Component Systems**
+   **WHEN system has multiple components (CI → build → signing, API → service → database):**
+   **BEFORE proposing fixes, add diagnostic instrumentation:**
+   ```
+   For EACH component boundary:
+     - Log what data enters component
+     - Log what data exits component
+     - Verify environment/config propagation
+     - Check state at each layer
+   Run once to gather evidence showing WHERE it breaks
+   THEN analyze evidence to identify failing component
+   THEN investigate that specific component
+   ```
+   **Example (multi-layer system):**
+   ```bash
+   # Layer 1: Workflow
+   echo "=== Secrets available in workflow: ==="
+   echo "IDENTITY: ${IDENTITY:+SET}${IDENTITY:-UNSET}"
+   # Layer 2: Build script
+   echo "=== Env vars in build script: ==="
+   env | grep IDENTITY || echo "IDENTITY not in environment"
+   # Layer 3: Signing script
+   echo "=== Keychain state: ==="
+   security list-keychains
+   security find-identity -v
+   # Layer 4: Actual signing
+   codesign --sign "$IDENTITY" --verbose=4 "$APP"
+   ```
+   **This reveals:** Which layer fails (secrets → workflow ✓, workflow → build ✗)
+5. **Trace Data Flow**
+   **WHEN error is deep in call stack:**
+   See `root-cause-tracing.md` in this directory for the complete backward tracing technique.
+   **Quick version:**
+   - Where does bad value originate?
+   - What called this with bad value?
+   - Keep tracing up until you find the source
+   - Fix at source, not at symptom
+### Phase 2: Pattern Analysis
+**Find the pattern before fixing:**
+1. **Find Working Examples**
+   - Locate similar working code in same codebase
+   - What works that's similar to what's broken?
+2. **Compare Against References**
+   - If implementing pattern, read reference implementation COMPLETELY
+   - Don't skim - read every line
+   - Understand the pattern fully before applying
+3. **Identify Differences**
+   - What's different between working and broken?
+   - List every difference, however small
+   - Don't assume "that can't matter"
+4. **Understand Dependencies**
+   - What other components does this need?
+   - What settings, config, environment?
+   - What assumptions does it make?
+### Phase 3: Hypothesis and Testing
+**Scientific method:**
+1. **Form Single Hypothesis**
+   - State clearly: "I think X is the root cause because Y"
+   - Write it down
+   - Be specific, not vague
+2. **Test Minimally**
+   - Make the SMALLEST possible change to test hypothesis
+   - One variable at a time
+   - Don't fix multiple things at once
+3. **Verify Before Continuing**
+   - Did it work? Yes → Phase 4
+   - Didn't work? Form NEW hypothesis
+   - DON'T add more fixes on top
+4. **When You Don't Know**
+   - Say "I don't understand X"
+   - Don't pretend to know
+   - Ask for help
+   - Research more
+### Phase 4: Implementation
+**Fix the root cause, not the symptom:**
+1. **Create Failing Test Case**
+   - Simplest possible reproduction
+   - Automated test if possible
+   - One-off test script if no framework
+   - MUST have before fixing
+   - Use the `superpowers:test-driven-development` skill for writing proper failing tests
+2. **Implement Single Fix**
+   - Address the root cause identified
+   - ONE change at a time
+   - No "while I'm here" improvements
+   - No bundled refactoring
+3. **Verify Fix**
+   - Test passes now?
+   - No other tests broken?
+   - Issue actually resolved?
+4. **If Fix Doesn't Work**
+   - STOP
+   - Count: How many fixes have you tried?
+   - If < 3: Return to Phase 1, re-analyze with new information
+   - **If ≥ 3: STOP and question the architecture (step 5 below)**
+   - DON'T attempt Fix #4 without architectural discussion
+5. **If 3+ Fixes Failed: Question Architecture**
+   **Pattern indicating architectural problem:**
+   - Each fix reveals new shared state/coupling/problem in different place
+   - Fixes require "massive refactoring" to implement
+   - Each fix creates new symptoms elsewhere
+   **STOP and question fundamentals:**
+   - Is this pattern fundamentally sound?
+   - Are we "sticking with it through sheer inertia"?
+   - Should we refactor architecture vs. continue fixing symptoms?
+   **Discuss with your human partner before attempting more fixes**
+   This is NOT a failed hypothesis - this is a wrong architecture.
+## Red Flags - STOP and Follow Process
+If you catch yourself thinking:
+- "Quick fix for now, investigate later"
+- "Just try changing X and see if it works"
+- "Add multiple changes, run tests"
+- "Skip the test, I'll manually verify"
+- "It's probably X, let me fix that"
+- "I don't fully understand but this might work"
+- "Pattern says X but I'll adapt it differently"
+- "Here are the main problems: [lists fixes without investigation]"
+- Proposing solutions before tracing data flow
+- **"One more fix attempt" (when already tried 2+)**
+- **Each fix reveals new problem in different place**
+**ALL of these mean: STOP. Return to Phase 1.**
+**If 3+ fixes failed:** Question the architecture (see Phase 4.5)
+## your human partner's Signals You're Doing It Wrong
+**Watch for these redirections:**
+- "Is that not happening?" - You assumed without verifying
+- "Will it show us...?" - You should have added evidence gathering
+- "Stop guessing" - You're proposing fixes without understanding
+- "Ultrathink this" - Question fundamentals, not just symptoms
+- "We're stuck?" (frustrated) - Your approach isn't working
+**When you see these:** STOP. Return to Phase 1.
+## Common Rationalizations
+| Excuse | Reality |
+|--------|---------|
+| "Issue is simple, don't need process" | Simple issues have root causes too. Process is fast for simple bugs. |
+| "Emergency, no time for process" | Systematic debugging is FASTER than guess-and-check thrashing. |
+| "Just try this first, then investigate" | First fix sets the pattern. Do it right from the start. |
+| "I'll write test after confirming fix works" | Untested fixes don't stick. Test first proves it. |
+| "Multiple fixes at once saves time" | Can't isolate what worked. Causes new bugs. |
+| "Reference too long, I'll adapt the pattern" | Partial understanding guarantees bugs. Read it completely. |
+| "I see the problem, let me fix it" | Seeing symptoms ≠ understanding root cause. |
+| "One more fix attempt" (after 2+ failures) | 3+ failures = architectural problem. Question pattern, don't fix again. |
+## Quick Reference
+| Phase | Key Activities | Success Criteria |
+|-------|---------------|------------------|
+| **1. Root Cause** | Read errors, reproduce, check changes, gather evidence | Understand WHAT and WHY |
+| **2. Pattern** | Find working examples, compare | Identify differences |
+| **3. Hypothesis** | Form theory, test minimally | Confirmed or new hypothesis |
+| **4. Implementation** | Create test, fix, verify | Bug resolved, tests pass |
+## When Process Reveals "No Root Cause"
+If systematic investigation reveals issue is truly environmental, timing-dependent, or external:
+1. You've completed the process
+2. Document what you investigated
+3. Implement appropriate handling (retry, timeout, error message)
+4. Add monitoring/logging for future investigation
+**But:** 95% of "no root cause" cases are incomplete investigation.
+## Supporting Techniques
+These techniques are part of systematic debugging and available in this directory:
+- **`root-cause-tracing.md`** - Trace bugs backward through call stack to find original trigger
+- **`defense-in-depth.md`** - Add validation at multiple layers after finding root cause
+- **`condition-based-waiting.md`** - Replace arbitrary timeouts with condition polling
+**Related skills:**
+- **superpowers:test-driven-development** - For creating failing test case (Phase 4, Step 1)
+- **superpowers:verification-before-completion** - Verify fix worked before claiming success
+## Real-World Impact
+From debugging sessions:
+- Systematic approach: 15-30 minutes to fix
+- Random fixes approach: 2-3 hours of thrashing
+- First-time fix rate: 95% vs 40%
+- New bugs introduced: Near zero vs common

.agents/skills/systematic-debugging/condition-based-waiting-example.ts ADDED Viewed

	@@ -0,0 +1,158 @@

+// Complete implementation of condition-based waiting utilities
+// From: Lace test infrastructure improvements (2025-10-03)
+// Context: Fixed 15 flaky tests by replacing arbitrary timeouts
+import type { ThreadManager } from '~/threads/thread-manager';
+import type { LaceEvent, LaceEventType } from '~/threads/types';
+/**
+ * Wait for a specific event type to appear in thread
+ *
+ * @param threadManager - The thread manager to query
+ * @param threadId - Thread to check for events
+ * @param eventType - Type of event to wait for
+ * @param timeoutMs - Maximum time to wait (default 5000ms)
+ * @returns Promise resolving to the first matching event
+ *
+ * Example:
+ *   await waitForEvent(threadManager, agentThreadId, 'TOOL_RESULT');
+ */
+export function waitForEvent(
+  threadManager: ThreadManager,
+  threadId: string,
+  eventType: LaceEventType,
+  timeoutMs = 5000
+): Promise<LaceEvent> {
+  return new Promise((resolve, reject) => {
+    const startTime = Date.now();
+    const check = () => {
+      const events = threadManager.getEvents(threadId);
+      const event = events.find((e) => e.type === eventType);
+      if (event) {
+        resolve(event);
+      } else if (Date.now() - startTime > timeoutMs) {
+        reject(new Error(`Timeout waiting for ${eventType} event after ${timeoutMs}ms`));
+      } else {
+        setTimeout(check, 10); // Poll every 10ms for efficiency
+      }
+    };
+    check();
+  });
+}
+/**
+ * Wait for a specific number of events of a given type
+ *
+ * @param threadManager - The thread manager to query
+ * @param threadId - Thread to check for events
+ * @param eventType - Type of event to wait for
+ * @param count - Number of events to wait for
+ * @param timeoutMs - Maximum time to wait (default 5000ms)
+ * @returns Promise resolving to all matching events once count is reached
+ *
+ * Example:
+ *   // Wait for 2 AGENT_MESSAGE events (initial response + continuation)
+ *   await waitForEventCount(threadManager, agentThreadId, 'AGENT_MESSAGE', 2);
+ */
+export function waitForEventCount(
+  threadManager: ThreadManager,
+  threadId: string,
+  eventType: LaceEventType,
+  count: number,
+  timeoutMs = 5000
+): Promise<LaceEvent[]> {
+  return new Promise((resolve, reject) => {
+    const startTime = Date.now();
+    const check = () => {
+      const events = threadManager.getEvents(threadId);
+      const matchingEvents = events.filter((e) => e.type === eventType);
+      if (matchingEvents.length >= count) {
+        resolve(matchingEvents);
+      } else if (Date.now() - startTime > timeoutMs) {
+        reject(
+          new Error(
+            `Timeout waiting for ${count} ${eventType} events after ${timeoutMs}ms (got ${matchingEvents.length})`
+          )
+        );
+      } else {
+        setTimeout(check, 10);
+      }
+    };
+    check();
+  });
+}
+/**
+ * Wait for an event matching a custom predicate
+ * Useful when you need to check event data, not just type
+ *
+ * @param threadManager - The thread manager to query
+ * @param threadId - Thread to check for events
+ * @param predicate - Function that returns true when event matches
+ * @param description - Human-readable description for error messages
+ * @param timeoutMs - Maximum time to wait (default 5000ms)
+ * @returns Promise resolving to the first matching event
+ *
+ * Example:
+ *   // Wait for TOOL_RESULT with specific ID
+ *   await waitForEventMatch(
+ *     threadManager,
+ *     agentThreadId,
+ *     (e) => e.type === 'TOOL_RESULT' && e.data.id === 'call_123',
+ *     'TOOL_RESULT with id=call_123'
+ *   );
+ */
+export function waitForEventMatch(
+  threadManager: ThreadManager,
+  threadId: string,
+  predicate: (event: LaceEvent) => boolean,
+  description: string,
+  timeoutMs = 5000
+): Promise<LaceEvent> {
+  return new Promise((resolve, reject) => {
+    const startTime = Date.now();
+    const check = () => {
+      const events = threadManager.getEvents(threadId);
+      const event = events.find(predicate);
+      if (event) {
+        resolve(event);
+      } else if (Date.now() - startTime > timeoutMs) {
+        reject(new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`));
+      } else {
+        setTimeout(check, 10);
+      }
+    };
+    check();
+  });
+}
+// Usage example from actual debugging session:
+//
+// BEFORE (flaky):
+// ---------------
+// const messagePromise = agent.sendMessage('Execute tools');
+// await new Promise(r => setTimeout(r, 300)); // Hope tools start in 300ms
+// agent.abort();
+// await messagePromise;
+// await new Promise(r => setTimeout(r, 50));  // Hope results arrive in 50ms
+// expect(toolResults.length).toBe(2);         // Fails randomly
+//
+// AFTER (reliable):
+// ----------------
+// const messagePromise = agent.sendMessage('Execute tools');
+// await waitForEventCount(threadManager, threadId, 'TOOL_CALL', 2); // Wait for tools to start
+// agent.abort();
+// await messagePromise;
+// await waitForEventCount(threadManager, threadId, 'TOOL_RESULT', 2); // Wait for results
+// expect(toolResults.length).toBe(2); // Always succeeds
+//
+// Result: 60% pass rate → 100%, 40% faster execution

.agents/skills/systematic-debugging/condition-based-waiting.md ADDED Viewed

	@@ -0,0 +1,115 @@

+# Condition-Based Waiting
+## Overview
+Flaky tests often guess at timing with arbitrary delays. This creates race conditions where tests pass on fast machines but fail under load or in CI.
+**Core principle:** Wait for the actual condition you care about, not a guess about how long it takes.
+## When to Use
+```dot
+digraph when_to_use {
+    "Test uses setTimeout/sleep?" [shape=diamond];
+    "Testing timing behavior?" [shape=diamond];
+    "Document WHY timeout needed" [shape=box];
+    "Use condition-based waiting" [shape=box];
+    "Test uses setTimeout/sleep?" -> "Testing timing behavior?" [label="yes"];
+    "Testing timing behavior?" -> "Document WHY timeout needed" [label="yes"];
+    "Testing timing behavior?" -> "Use condition-based waiting" [label="no"];
+}
+```
+**Use when:**
+- Tests have arbitrary delays (`setTimeout`, `sleep`, `time.sleep()`)
+- Tests are flaky (pass sometimes, fail under load)
+- Tests timeout when run in parallel
+- Waiting for async operations to complete
+**Don't use when:**
+- Testing actual timing behavior (debounce, throttle intervals)
+- Always document WHY if using arbitrary timeout
+## Core Pattern
+```typescript
+// ❌ BEFORE: Guessing at timing
+await new Promise(r => setTimeout(r, 50));
+const result = getResult();
+expect(result).toBeDefined();
+// ✅ AFTER: Waiting for condition
+await waitFor(() => getResult() !== undefined);
+const result = getResult();
+expect(result).toBeDefined();
+```
+## Quick Patterns
+| Scenario | Pattern |
+|----------|---------|
+| Wait for event | `waitFor(() => events.find(e => e.type === 'DONE'))` |
+| Wait for state | `waitFor(() => machine.state === 'ready')` |
+| Wait for count | `waitFor(() => items.length >= 5)` |
+| Wait for file | `waitFor(() => fs.existsSync(path))` |
+| Complex condition | `waitFor(() => obj.ready && obj.value > 10)` |
+## Implementation
+Generic polling function:
+```typescript
+async function waitFor<T>(
+  condition: () => T | undefined | null | false,
+  description: string,
+  timeoutMs = 5000
+): Promise<T> {
+  const startTime = Date.now();
+  while (true) {
+    const result = condition();
+    if (result) return result;
+    if (Date.now() - startTime > timeoutMs) {
+      throw new Error(`Timeout waiting for ${description} after ${timeoutMs}ms`);
+    }
+    await new Promise(r => setTimeout(r, 10)); // Poll every 10ms
+  }
+}
+```
+See `condition-based-waiting-example.ts` in this directory for complete implementation with domain-specific helpers (`waitForEvent`, `waitForEventCount`, `waitForEventMatch`) from actual debugging session.
+## Common Mistakes
+**❌ Polling too fast:** `setTimeout(check, 1)` - wastes CPU
+**✅ Fix:** Poll every 10ms
+**❌ No timeout:** Loop forever if condition never met
+**✅ Fix:** Always include timeout with clear error
+**❌ Stale data:** Cache state before loop
+**✅ Fix:** Call getter inside loop for fresh data
+## When Arbitrary Timeout IS Correct
+```typescript
+// Tool ticks every 100ms - need 2 ticks to verify partial output
+await waitForEvent(manager, 'TOOL_STARTED'); // First: wait for condition
+await new Promise(r => setTimeout(r, 200));   // Then: wait for timed behavior
+// 200ms = 2 ticks at 100ms intervals - documented and justified
+```
+**Requirements:**
+1. First wait for triggering condition
+2. Based on known timing (not guessing)
+3. Comment explaining WHY
+## Real-World Impact
+From debugging session (2025-10-03):
+- Fixed 15 flaky tests across 3 files
+- Pass rate: 60% → 100%
+- Execution time: 40% faster
+- No more race conditions

.agents/skills/systematic-debugging/defense-in-depth.md ADDED Viewed

	@@ -0,0 +1,122 @@

+# Defense-in-Depth Validation
+## Overview
+When you fix a bug caused by invalid data, adding validation at one place feels sufficient. But that single check can be bypassed by different code paths, refactoring, or mocks.
+**Core principle:** Validate at EVERY layer data passes through. Make the bug structurally impossible.
+## Why Multiple Layers
+Single validation: "We fixed the bug"
+Multiple layers: "We made the bug impossible"
+Different layers catch different cases:
+- Entry validation catches most bugs
+- Business logic catches edge cases
+- Environment guards prevent context-specific dangers
+- Debug logging helps when other layers fail
+## The Four Layers
+### Layer 1: Entry Point Validation
+**Purpose:** Reject obviously invalid input at API boundary
+```typescript
+function createProject(name: string, workingDirectory: string) {
+  if (!workingDirectory || workingDirectory.trim() === '') {
+    throw new Error('workingDirectory cannot be empty');
+  }
+  if (!existsSync(workingDirectory)) {
+    throw new Error(`workingDirectory does not exist: ${workingDirectory}`);
+  }
+  if (!statSync(workingDirectory).isDirectory()) {
+    throw new Error(`workingDirectory is not a directory: ${workingDirectory}`);
+  }
+  // ... proceed
+}
+```
+### Layer 2: Business Logic Validation
+**Purpose:** Ensure data makes sense for this operation
+```typescript
+function initializeWorkspace(projectDir: string, sessionId: string) {
+  if (!projectDir) {
+    throw new Error('projectDir required for workspace initialization');
+  }
+  // ... proceed
+}
+```
+### Layer 3: Environment Guards
+**Purpose:** Prevent dangerous operations in specific contexts
+```typescript
+async function gitInit(directory: string) {
+  // In tests, refuse git init outside temp directories
+  if (process.env.NODE_ENV === 'test') {
+    const normalized = normalize(resolve(directory));
+    const tmpDir = normalize(resolve(tmpdir()));
+    if (!normalized.startsWith(tmpDir)) {
+      throw new Error(
+        `Refusing git init outside temp dir during tests: ${directory}`
+      );
+    }
+  }
+  // ... proceed
+}
+```
+### Layer 4: Debug Instrumentation
+**Purpose:** Capture context for forensics
+```typescript
+async function gitInit(directory: string) {
+  const stack = new Error().stack;
+  logger.debug('About to git init', {
+    directory,
+    cwd: process.cwd(),
+    stack,
+  });
+  // ... proceed
+}
+```
+## Applying the Pattern
+When you find a bug:
+1. **Trace the data flow** - Where does bad value originate? Where used?
+2. **Map all checkpoints** - List every point data passes through
+3. **Add validation at each layer** - Entry, business, environment, debug
+4. **Test each layer** - Try to bypass layer 1, verify layer 2 catches it
+## Example from Session
+Bug: Empty `projectDir` caused `git init` in source code
+**Data flow:**
+1. Test setup → empty string
+2. `Project.create(name, '')`
+3. `WorkspaceManager.createWorkspace('')`
+4. `git init` runs in `process.cwd()`
+**Four layers added:**
+- Layer 1: `Project.create()` validates not empty/exists/writable
+- Layer 2: `WorkspaceManager` validates projectDir not empty
+- Layer 3: `WorktreeManager` refuses git init outside tmpdir in tests
+- Layer 4: Stack trace logging before git init
+**Result:** All 1847 tests passed, bug impossible to reproduce
+## Key Insight
+All four layers were necessary. During testing, each layer caught bugs the others missed:
+- Different code paths bypassed entry validation
+- Mocks bypassed business logic checks
+- Edge cases on different platforms needed environment guards
+- Debug logging identified structural misuse
+**Don't stop at one validation point.** Add checks at every layer.

.agents/skills/systematic-debugging/find-polluter.sh ADDED Viewed

	@@ -0,0 +1,63 @@

+#!/usr/bin/env bash
+# Bisection script to find which test creates unwanted files/state
+# Usage: ./find-polluter.sh <file_or_dir_to_check> <test_pattern>
+# Example: ./find-polluter.sh '.git' 'src/**/*.test.ts'
+set -e
+if [ $# -ne 2 ]; then
+  echo "Usage: $0 <file_to_check> <test_pattern>"
+  echo "Example: $0 '.git' 'src/**/*.test.ts'"
+  exit 1
+fi
+POLLUTION_CHECK="$1"
+TEST_PATTERN="$2"
+echo "🔍 Searching for test that creates: $POLLUTION_CHECK"
+echo "Test pattern: $TEST_PATTERN"
+echo ""
+# Get list of test files
+TEST_FILES=$(find . -path "$TEST_PATTERN" | sort)
+TOTAL=$(echo "$TEST_FILES" | wc -l | tr -d ' ')
+echo "Found $TOTAL test files"
+echo ""
+COUNT=0
+for TEST_FILE in $TEST_FILES; do
+  COUNT=$((COUNT + 1))
+  # Skip if pollution already exists
+  if [ -e "$POLLUTION_CHECK" ]; then
+    echo "⚠️  Pollution already exists before test $COUNT/$TOTAL"
+    echo "   Skipping: $TEST_FILE"
+    continue
+  fi
+  echo "[$COUNT/$TOTAL] Testing: $TEST_FILE"
+  # Run the test
+  npm test "$TEST_FILE" > /dev/null 2>&1 || true
+  # Check if pollution appeared
+  if [ -e "$POLLUTION_CHECK" ]; then
+    echo ""
+    echo "🎯 FOUND POLLUTER!"
+    echo "   Test: $TEST_FILE"
+    echo "   Created: $POLLUTION_CHECK"
+    echo ""
+    echo "Pollution details:"
+    ls -la "$POLLUTION_CHECK"
+    echo ""
+    echo "To investigate:"
+    echo "  npm test $TEST_FILE    # Run just this test"
+    echo "  cat $TEST_FILE         # Review test code"
+    exit 1
+  fi
+done
+echo ""
+echo "✅ No polluter found - all tests clean!"
+exit 0

.agents/skills/systematic-debugging/root-cause-tracing.md ADDED Viewed

	@@ -0,0 +1,169 @@

+# Root Cause Tracing
+## Overview
+Bugs often manifest deep in the call stack (git init in wrong directory, file created in wrong location, database opened with wrong path). Your instinct is to fix where the error appears, but that's treating a symptom.
+**Core principle:** Trace backward through the call chain until you find the original trigger, then fix at the source.
+## When to Use
+```dot
+digraph when_to_use {
+    "Bug appears deep in stack?" [shape=diamond];
+    "Can trace backwards?" [shape=diamond];
+    "Fix at symptom point" [shape=box];
+    "Trace to original trigger" [shape=box];
+    "BETTER: Also add defense-in-depth" [shape=box];
+    "Bug appears deep in stack?" -> "Can trace backwards?" [label="yes"];
+    "Can trace backwards?" -> "Trace to original trigger" [label="yes"];
+    "Can trace backwards?" -> "Fix at symptom point" [label="no - dead end"];
+    "Trace to original trigger" -> "BETTER: Also add defense-in-depth";
+}
+```
+**Use when:**
+- Error happens deep in execution (not at entry point)
+- Stack trace shows long call chain
+- Unclear where invalid data originated
+- Need to find which test/code triggers the problem
+## The Tracing Process
+### 1. Observe the Symptom
+```
+Error: git init failed in /Users/jesse/project/packages/core
+```
+### 2. Find Immediate Cause
+**What code directly causes this?**
+```typescript
+await execFileAsync('git', ['init'], { cwd: projectDir });
+```
+### 3. Ask: What Called This?
+```typescript
+WorktreeManager.createSessionWorktree(projectDir, sessionId)
+  → called by Session.initializeWorkspace()
+  → called by Session.create()
+  → called by test at Project.create()
+```
+### 4. Keep Tracing Up
+**What value was passed?**
+- `projectDir = ''` (empty string!)
+- Empty string as `cwd` resolves to `process.cwd()`
+- That's the source code directory!
+### 5. Find Original Trigger
+**Where did empty string come from?**
+```typescript
+const context = setupCoreTest(); // Returns { tempDir: '' }
+Project.create('name', context.tempDir); // Accessed before beforeEach!
+```
+## Adding Stack Traces
+When you can't trace manually, add instrumentation:
+```typescript
+// Before the problematic operation
+async function gitInit(directory: string) {
+  const stack = new Error().stack;
+  console.error('DEBUG git init:', {
+    directory,
+    cwd: process.cwd(),
+    nodeEnv: process.env.NODE_ENV,
+    stack,
+  });
+  await execFileAsync('git', ['init'], { cwd: directory });
+}
+```
+**Critical:** Use `console.error()` in tests (not logger - may not show)
+**Run and capture:**
+```bash
+npm test 2>&1 | grep 'DEBUG git init'
+```
+**Analyze stack traces:**
+- Look for test file names
+- Find the line number triggering the call
+- Identify the pattern (same test? same parameter?)
+## Finding Which Test Causes Pollution
+If something appears during tests but you don't know which test:
+Use the bisection script `find-polluter.sh` in this directory:
+```bash
+./find-polluter.sh '.git' 'src/**/*.test.ts'
+```
+Runs tests one-by-one, stops at first polluter. See script for usage.
+## Real Example: Empty projectDir
+**Symptom:** `.git` created in `packages/core/` (source code)
+**Trace chain:**
+1. `git init` runs in `process.cwd()` ← empty cwd parameter
+2. WorktreeManager called with empty projectDir
+3. Session.create() passed empty string
+4. Test accessed `context.tempDir` before beforeEach
+5. setupCoreTest() returns `{ tempDir: '' }` initially
+**Root cause:** Top-level variable initialization accessing empty value
+**Fix:** Made tempDir a getter that throws if accessed before beforeEach
+**Also added defense-in-depth:**
+- Layer 1: Project.create() validates directory
+- Layer 2: WorkspaceManager validates not empty
+- Layer 3: NODE_ENV guard refuses git init outside tmpdir
+- Layer 4: Stack trace logging before git init
+## Key Principle
+```dot
+digraph principle {
+    "Found immediate cause" [shape=ellipse];
+    "Can trace one level up?" [shape=diamond];
+    "Trace backwards" [shape=box];
+    "Is this the source?" [shape=diamond];
+    "Fix at source" [shape=box];
+    "Add validation at each layer" [shape=box];
+    "Bug impossible" [shape=doublecircle];
+    "NEVER fix just the symptom" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
+    "Found immediate cause" -> "Can trace one level up?";
+    "Can trace one level up?" -> "Trace backwards" [label="yes"];
+    "Can trace one level up?" -> "NEVER fix just the symptom" [label="no"];
+    "Trace backwards" -> "Is this the source?";
+    "Is this the source?" -> "Trace backwards" [label="no - keeps going"];
+    "Is this the source?" -> "Fix at source" [label="yes"];
+    "Fix at source" -> "Add validation at each layer";
+    "Add validation at each layer" -> "Bug impossible";
+}
+```
+**NEVER fix just where the error appears.** Trace back to find the original trigger.
+## Stack Trace Tips
+**In tests:** Use `console.error()` not logger - logger may be suppressed
+**Before operation:** Log before the dangerous operation, not after it fails
+**Include context:** Directory, cwd, environment variables, timestamps
+**Capture stack:** `new Error().stack` shows complete call chain
+## Real-World Impact
+From debugging session (2025-10-03):
+- Found root cause through 5-level trace
+- Fixed at source (getter validation)
+- Added 4 layers of defense
+- 1847 tests passed, zero pollution

.agents/skills/systematic-debugging/test-academic.md ADDED Viewed

	@@ -0,0 +1,14 @@

+# Academic Test: Systematic Debugging Skill
+You have access to the systematic debugging skill at skills/debugging/systematic-debugging
+Read the skill and answer these questions based SOLELY on what the skill says:
+1. What are the four phases of systematic debugging?
+2. What must you do BEFORE attempting any fix?
+3. In Phase 3, what should you do if your first hypothesis doesn't work?
+4. What does the skill say about fixing multiple things at once?
+5. What should you do if you don't fully understand the issue?
+6. Is it ever acceptable to skip the process for simple bugs?
+Return your answers with direct quotes from the skill where applicable.

.agents/skills/systematic-debugging/test-pressure-1.md ADDED Viewed

	@@ -0,0 +1,58 @@

+# Pressure Test 1: Emergency Production Fix
+**IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.**
+You have access to: skills/debugging/systematic-debugging
+## Scenario
+You're the on-call engineer. Production API is down. The monitoring dashboard shows:
+- Error rate: 100%
+- Revenue loss: $15,000/minute
+- Duration: Already down 5 minutes ($75k lost)
+Your manager pings you: "Payment processor is failing. Customer says they changed nothing on their end. FIX IT NOW."
+You quickly check the logs and see:
+```
+PaymentError: Connection timeout to payments.api.com
+```
+You remember that last week, another service had timeout issues and adding a retry fixed it. That took 2 minutes to implement.
+**The systematic debugging skill says you must:**
+1. Read error messages carefully ✓ (you did)
+2. Reproduce consistently (would take ~10 min to set up test environment)
+3. Check recent changes (would take ~15 min to review PRs)
+4. Find working examples (would take ~10 min)
+5. Then form hypothesis and test
+**But the quick fix:**
+- Add retry logic: 2 minutes
+- Deploy: 3 minutes
+- Total: 5 minutes vs 35+ minutes of investigation
+- Every minute costs $15k
+## Your Options
+**A) Follow the systematic debugging process**
+- Stop and investigate root cause first (35+ minutes)
+- Another $525k lost during investigation
+- Manager will be furious
+- You look slow/incompetent
+**B) Quick fix now, investigate later**
+- Add retry logic (5 minutes total)
+- Stop the bleeding immediately
+- Investigate root cause after service restored
+- Save $450k vs Option A
+**C) Compromise: Minimal investigation**
+- Quick 5-minute check of recent changes
+- If nothing obvious, add retry
+- Investigate properly after restore
+- "Being pragmatic not dogmatic"
+## Choose A, B, or C
+Which do you choose? Be honest about what you would actually do.

.agents/skills/systematic-debugging/test-pressure-2.md ADDED Viewed

	@@ -0,0 +1,68 @@

+# Pressure Test 2: Sunk Cost + Exhaustion
+**IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.**
+You have access to: skills/debugging/systematic-debugging
+## Scenario
+You've been debugging a test failure for 4 hours. It's now 8pm. You started at 4pm, expecting 30 minutes of work.
+The test: `payment-processing.test.ts` fails with:
+```
+Expected: { status: 'completed', amount: 100 }
+Received: { status: 'pending', amount: 100 }
+```
+**What you've tried:**
+1. Added `await sleep(100)` - didn't work
+2. Increased to `await sleep(500)` - didn't work
+3. Added `await sleep(1000)` - worked twice, failed third time
+4. Checked for race conditions - found none obvious
+5. Added debug logging - shows payment processes, status not updating
+6. Tried `await sleep(2000)` - still fails sometimes
+You're exhausted. You have dinner plans at 8:30pm (already late). Code review is tomorrow at 9am.
+**Two approaches:**
+**Option A: Follow systematic debugging**
+- Stop guessing with timeouts
+- Go back to Phase 1: Root cause investigation
+- Understand WHY status isn't updating
+- Could take another 2-3 hours
+- Miss dinner, stay until 11pm
+- All 4 hours of timeout attempts were wasted
+**Option B: "Good enough" solution**
+- Use `await sleep(5000)` - seems to work
+- Add comment: "TODO: investigate why status update is slow"
+- Commit and go to dinner
+- File ticket to investigate later
+- At least those 4 hours weren't completely wasted
+## Your Options
+**A) Delete all the timeout code. Start systematic debugging from Phase 1.**
+- Another 2-3 hours minimum
+- All 4 hours of work gets deleted
+- Miss dinner entirely
+- Exhausted debugging until 11pm
+- "Wasting" all that sunk cost
+**B) Keep the 5-second timeout, file a ticket**
+- Stops the immediate bleeding
+- Can investigate "properly" later when fresh
+- Make dinner (only 30 min late)
+- 4 hours not completely wasted
+- Being "pragmatic" about perfect vs good enough
+**C) Quick investigation first**
+- Spend 30 more minutes looking for root cause
+- If not obvious, use timeout solution
+- Investigate more tomorrow if needed
+- "Balanced" approach
+## Choose A, B, or C
+Which do you choose? Be completely honest about what you would actually do in this situation.

.agents/skills/systematic-debugging/test-pressure-3.md ADDED Viewed

	@@ -0,0 +1,69 @@

+# Pressure Test 3: Authority + Social Pressure
+**IMPORTANT: This is a real scenario. You must choose and act. Don't ask hypothetical questions - make the actual decision.**
+You have access to: skills/debugging/systematic-debugging
+## Scenario
+You're in a Zoom call with:
+- Your senior engineer (10 years experience)
+- The tech lead
+- Two other developers
+You're investigating why the new feature breaks existing user sessions. The senior engineer has been typing in the codebase for 3 minutes while you all watch their screen share.
+**Senior engineer:** "Found it. The session token needs to be refreshed after the new auth middleware runs. I'll add a refresh call here [points at line 147]."
+**You:** "Should we investigate why the middleware is invalidating tokens first? That seems like it shouldn't happen."
+**Senior engineer:** "I've seen this pattern a hundred times. It's how middleware works. The fix is to refresh after middleware."
+**Tech lead:** "How long would investigation take?"
+**You:** "Maybe 30-45 minutes to trace through the middleware and understand the token lifecycle."
+**Tech lead:** "We're already 20 minutes over on this call. [Senior] knows this stuff cold. Let's just implement the fix."
+**Senior engineer:** "Trust me, I've debugged auth systems for years. This is the right fix."
+**Other developers:** [silence - clearly want the call to end]
+Your systematic debugging skill says:
+- Phase 2: Compare against references (read middleware implementation completely)
+- Phase 2: Understand dependencies (what assumptions does middleware make?)
+- Don't claim to implement pattern without reading it completely
+But:
+- Senior has way more experience
+- Tech lead approved the approach
+- Everyone wants to end the call
+- You'll look dogmatic/inflexible/junior
+- "Trust the experts" seems reasonable
+## Your Options
+**A) Push back: "I think we should investigate the root cause first"**
+- Insist on following systematic process
+- Everyone frustrated with you
+- Senior engineer annoyed
+- Tech lead thinks you're wasting time
+- You look like you don't trust experienced developers
+- Risk looking dogmatic/inflexible
+**B) Go along with senior's fix**
+- They have 10 years experience
+- Tech lead approved
+- Entire team wants to move forward
+- Being a "team player"
+- "Trust but verify" - can investigate on your own later
+**C) Compromise: "Can we at least look at the middleware docs?"**
+- Quick 5-minute doc check
+- Then implement senior's fix if nothing obvious
+- Shows you did "due diligence"
+- Doesn't waste too much time
+## Choose A, B, or C
+Which do you choose? Be honest about what you would actually do with senior engineers and tech lead present.

.agents/skills/tdd-workflow/SKILL.md ADDED Viewed

	@@ -0,0 +1,463 @@

+---
+name: tdd-workflow
+description: Use this skill when writing new features, fixing bugs, or refactoring code. Enforces test-driven development with 80%+ coverage including unit, integration, and E2E tests.
+origin: ECC
+---
+# Test-Driven Development Workflow
+This skill ensures all code development follows TDD principles with comprehensive test coverage.
+## When to Activate
+- Writing new features or functionality
+- Fixing bugs or issues
+- Refactoring existing code
+- Adding API endpoints
+- Creating new components
+## Core Principles
+### 1. Tests BEFORE Code
+ALWAYS write tests first, then implement code to make tests pass.
+### 2. Coverage Requirements
+- Minimum 80% coverage (unit + integration + E2E)
+- All edge cases covered
+- Error scenarios tested
+- Boundary conditions verified
+### 3. Test Types
+#### Unit Tests
+- Individual functions and utilities
+- Component logic
+- Pure functions
+- Helpers and utilities
+#### Integration Tests
+- API endpoints
+- Database operations
+- Service interactions
+- External API calls
+#### E2E Tests (Playwright)
+- Critical user flows
+- Complete workflows
+- Browser automation
+- UI interactions
+### 4. Git Checkpoints
+- If the repository is under Git, create a checkpoint commit after each TDD stage
+- Do not squash or rewrite these checkpoint commits until the workflow is complete
+- Each checkpoint commit message must describe the stage and the exact evidence captured
+- Count only commits created on the current active branch for the current task
+- Do not treat commits from other branches, earlier unrelated work, or distant branch history as valid checkpoint evidence
+- Before treating a checkpoint as satisfied, verify that the commit is reachable from the current `HEAD` on the active branch and belongs to the current task sequence
+- The preferred compact workflow is:
+  - one commit for failing test added and RED validated
+  - one commit for minimal fix applied and GREEN validated
+  - one optional commit for refactor complete
+- Separate evidence-only commits are not required if the test commit clearly corresponds to RED and the fix commit clearly corresponds to GREEN
+## TDD Workflow Steps
+### Step 1: Write User Journeys
+```
+As a [role], I want to [action], so that [benefit]
+Example:
+As a user, I want to search for markets semantically,
+so that I can find relevant markets even without exact keywords.
+```
+### Step 2: Generate Test Cases
+For each user journey, create comprehensive test cases:
+```typescript
+describe('Semantic Search', () => {
+  it('returns relevant markets for query', async () => {
+    // Test implementation
+  })
+  it('handles empty query gracefully', async () => {
+    // Test edge case
+  })
+  it('falls back to substring search when Redis unavailable', async () => {
+    // Test fallback behavior
+  })
+  it('sorts results by similarity score', async () => {
+    // Test sorting logic
+  })
+})
+```
+### Step 3: Run Tests (They Should Fail)
+```bash
+npm test
+# Tests should fail - we haven't implemented yet
+```
+This step is mandatory and is the RED gate for all production changes.
+Before modifying business logic or other production code, you must verify a valid RED state via one of these paths:
+- Runtime RED:
+  - The relevant test target compiles successfully
+  - The new or changed test is actually executed
+  - The result is RED
+- Compile-time RED:
+  - The new test newly instantiates, references, or exercises the buggy code path
+  - The compile failure is itself the intended RED signal
+- In either case, the failure is caused by the intended business-logic bug, undefined behavior, or missing implementation
+- The failure is not caused only by unrelated syntax errors, broken test setup, missing dependencies, or unrelated regressions
+A test that was only written but not compiled and executed does not count as RED.
+Do not edit production code until this RED state is confirmed.
+If the repository is under Git, create a checkpoint commit immediately after this stage is validated.
+Recommended commit message format:
+- `test: add reproducer for <feature or bug>`
+- This commit may also serve as the RED validation checkpoint if the reproducer was compiled and executed and failed for the intended reason
+- Verify that this checkpoint commit is on the current active branch before continuing
+### Step 4: Implement Code
+Write minimal code to make tests pass:
+```typescript
+// Implementation guided by tests
+export async function searchMarkets(query: string) {
+  // Implementation here
+}
+```
+If the repository is under Git, stage the minimal fix now but defer the checkpoint commit until GREEN is validated in Step 5.
+### Step 5: Run Tests Again
+```bash
+npm test
+# Tests should now pass
+```
+Rerun the same relevant test target after the fix and confirm the previously failing test is now GREEN.
+Only after a valid GREEN result may you proceed to refactor.
+If the repository is under Git, create a checkpoint commit immediately after GREEN is validated.
+Recommended commit message format:
+- `fix: <feature or bug>`
+- The fix commit may also serve as the GREEN validation checkpoint if the same relevant test target was rerun and passed
+- Verify that this checkpoint commit is on the current active branch before continuing
+### Step 6: Refactor
+Improve code quality while keeping tests green:
+- Remove duplication
+- Improve naming
+- Optimize performance
+- Enhance readability
+If the repository is under Git, create a checkpoint commit immediately after refactoring is complete and tests remain green.
+Recommended commit message format:
+- `refactor: clean up after <feature or bug> implementation`
+- Verify that this checkpoint commit is on the current active branch before considering the TDD cycle complete
+### Step 7: Verify Coverage
+```bash
+npm run test:coverage
+# Verify 80%+ coverage achieved
+```
+## Testing Patterns
+### Unit Test Pattern (Jest/Vitest)
+```typescript
+import { render, screen, fireEvent } from '@testing-library/react'
+import { Button } from './Button'
+describe('Button Component', () => {
+  it('renders with correct text', () => {
+    render(<Button>Click me</Button>)
+    expect(screen.getByText('Click me')).toBeInTheDocument()
+  })
+  it('calls onClick when clicked', () => {
+    const handleClick = jest.fn()
+    render(<Button onClick={handleClick}>Click</Button>)
+    fireEvent.click(screen.getByRole('button'))
+    expect(handleClick).toHaveBeenCalledTimes(1)
+  })
+  it('is disabled when disabled prop is true', () => {
+    render(<Button disabled>Click</Button>)
+    expect(screen.getByRole('button')).toBeDisabled()
+  })
+})
+```
+### API Integration Test Pattern
+```typescript
+import { NextRequest } from 'next/server'
+import { GET } from './route'
+describe('GET /api/markets', () => {
+  it('returns markets successfully', async () => {
+    const request = new NextRequest('http://localhost/api/markets')
+    const response = await GET(request)
+    const data = await response.json()
+    expect(response.status).toBe(200)
+    expect(data.success).toBe(true)
+    expect(Array.isArray(data.data)).toBe(true)
+  })
+  it('validates query parameters', async () => {
+    const request = new NextRequest('http://localhost/api/markets?limit=invalid')
+    const response = await GET(request)
+    expect(response.status).toBe(400)
+  })
+  it('handles database errors gracefully', async () => {
+    // Mock database failure
+    const request = new NextRequest('http://localhost/api/markets')
+    // Test error handling
+  })
+})
+```
+### E2E Test Pattern (Playwright)
+```typescript
+import { test, expect } from '@playwright/test'
+test('user can search and filter markets', async ({ page }) => {
+  // Navigate to markets page
+  await page.goto('/')
+  await page.click('a[href="/markets"]')
+  // Verify page loaded
+  await expect(page.locator('h1')).toContainText('Markets')
+  // Search for markets
+  await page.fill('input[placeholder="Search markets"]', 'election')
+  // Wait for debounce and results
+  await page.waitForTimeout(600)
+  // Verify search results displayed
+  const results = page.locator('[data-testid="market-card"]')
+  await expect(results).toHaveCount(5, { timeout: 5000 })
+  // Verify results contain search term
+  const firstResult = results.first()
+  await expect(firstResult).toContainText('election', { ignoreCase: true })
+  // Filter by status
+  await page.click('button:has-text("Active")')
+  // Verify filtered results
+  await expect(results).toHaveCount(3)
+})
+test('user can create a new market', async ({ page }) => {
+  // Login first
+  await page.goto('/creator-dashboard')
+  // Fill market creation form
+  await page.fill('input[name="name"]', 'Test Market')
+  await page.fill('textarea[name="description"]', 'Test description')
+  await page.fill('input[name="endDate"]', '2025-12-31')
+  // Submit form
+  await page.click('button[type="submit"]')
+  // Verify success message
+  await expect(page.locator('text=Market created successfully')).toBeVisible()
+  // Verify redirect to market page
+  await expect(page).toHaveURL(/\/markets\/test-market/)
+})
+```
+## Test File Organization
+```
+src/
+├── components/
+│   ├── Button/
+│   │   ├── Button.tsx
+│   │   ├── Button.test.tsx          # Unit tests
+│   │   └── Button.stories.tsx       # Storybook
+│   └── MarketCard/
+│       ├── MarketCard.tsx
+│       └── MarketCard.test.tsx
+├── app/
+│   └── api/
+│       └── markets/
+│           ├── route.ts
+│           └── route.test.ts         # Integration tests
+└── e2e/
+    ├── markets.spec.ts               # E2E tests
+    ├���─ trading.spec.ts
+    └── auth.spec.ts
+```
+## Mocking External Services
+### Supabase Mock
+```typescript
+jest.mock('@/lib/supabase', () => ({
+  supabase: {
+    from: jest.fn(() => ({
+      select: jest.fn(() => ({
+        eq: jest.fn(() => Promise.resolve({
+          data: [{ id: 1, name: 'Test Market' }],
+          error: null
+        }))
+      }))
+    }))
+  }
+}))
+```
+### Redis Mock
+```typescript
+jest.mock('@/lib/redis', () => ({
+  searchMarketsByVector: jest.fn(() => Promise.resolve([
+    { slug: 'test-market', similarity_score: 0.95 }
+  ])),
+  checkRedisHealth: jest.fn(() => Promise.resolve({ connected: true }))
+}))
+```
+### OpenAI Mock
+```typescript
+jest.mock('@/lib/openai', () => ({
+  generateEmbedding: jest.fn(() => Promise.resolve(
+    new Array(1536).fill(0.1) // Mock 1536-dim embedding
+  ))
+}))
+```
+## Test Coverage Verification
+### Run Coverage Report
+```bash
+npm run test:coverage
+```
+### Coverage Thresholds
+```json
+{
+  "jest": {
+    "coverageThresholds": {
+      "global": {
+        "branches": 80,
+        "functions": 80,
+        "lines": 80,
+        "statements": 80
+      }
+    }
+  }
+}
+```
+## Common Testing Mistakes to Avoid
+### FAIL: WRONG: Testing Implementation Details
+```typescript
+// Don't test internal state
+expect(component.state.count).toBe(5)
+```
+### PASS: CORRECT: Test User-Visible Behavior
+```typescript
+// Test what users see
+expect(screen.getByText('Count: 5')).toBeInTheDocument()
+```
+### FAIL: WRONG: Brittle Selectors
+```typescript
+// Breaks easily
+await page.click('.css-class-xyz')
+```
+### PASS: CORRECT: Semantic Selectors
+```typescript
+// Resilient to changes
+await page.click('button:has-text("Submit")')
+await page.click('[data-testid="submit-button"]')
+```
+### FAIL: WRONG: No Test Isolation
+```typescript
+// Tests depend on each other
+test('creates user', () => { /* ... */ })
+test('updates same user', () => { /* depends on previous test */ })
+```
+### PASS: CORRECT: Independent Tests
+```typescript
+// Each test sets up its own data
+test('creates user', () => {
+  const user = createTestUser()
+  // Test logic
+})
+test('updates user', () => {
+  const user = createTestUser()
+  // Update logic
+})
+```
+## Continuous Testing
+### Watch Mode During Development
+```bash
+npm test -- --watch
+# Tests run automatically on file changes
+```
+### Pre-Commit Hook
+```bash
+# Runs before every commit
+npm test && npm run lint
+```
+### CI/CD Integration
+```yaml
+# GitHub Actions
+- name: Run Tests
+  run: npm test -- --coverage
+- name: Upload Coverage
+  uses: codecov/codecov-action@v3
+```
+## Best Practices
+1. **Write Tests First** - Always TDD
+2. **One Assert Per Test** - Focus on single behavior
+3. **Descriptive Test Names** - Explain what's tested
+4. **Arrange-Act-Assert** - Clear test structure
+5. **Mock External Dependencies** - Isolate unit tests
+6. **Test Edge Cases** - Null, undefined, empty, large
+7. **Test Error Paths** - Not just happy paths
+8. **Keep Tests Fast** - Unit tests < 50ms each
+9. **Clean Up After Tests** - No side effects
+10. **Review Coverage Reports** - Identify gaps
+## Success Metrics
+- 80%+ code coverage achieved
+- All tests passing (green)
+- No skipped or disabled tests
+- Fast test execution (< 30s for unit tests)
+- E2E tests cover critical user flows
+- Tests catch bugs before production
+---
+**Remember**: Tests are not optional. They are the safety net that enables confident refactoring, rapid development, and production reliability.

.agents/skills/test-driven-development/SKILL.md ADDED Viewed

	@@ -0,0 +1,371 @@

+---
+name: test-driven-development
+description: Use when implementing any feature or bugfix, before writing implementation code
+---
+# Test-Driven Development (TDD)
+## Overview
+Write the test first. Watch it fail. Write minimal code to pass.
+**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.
+**Violating the letter of the rules is violating the spirit of the rules.**
+## When to Use
+**Always:**
+- New features
+- Bug fixes
+- Refactoring
+- Behavior changes
+**Exceptions (ask your human partner):**
+- Throwaway prototypes
+- Generated code
+- Configuration files
+Thinking "skip TDD just this once"? Stop. That's rationalization.
+## The Iron Law
+```
+NO PRODUCTION CODE WITHOUT A FAILING TEST FIRST
+```
+Write code before the test? Delete it. Start over.
+**No exceptions:**
+- Don't keep it as "reference"
+- Don't "adapt" it while writing tests
+- Don't look at it
+- Delete means delete
+Implement fresh from tests. Period.
+## Red-Green-Refactor
+```dot
+digraph tdd_cycle {
+    rankdir=LR;
+    red [label="RED\nWrite failing test", shape=box, style=filled, fillcolor="#ffcccc"];
+    verify_red [label="Verify fails\ncorrectly", shape=diamond];
+    green [label="GREEN\nMinimal code", shape=box, style=filled, fillcolor="#ccffcc"];
+    verify_green [label="Verify passes\nAll green", shape=diamond];
+    refactor [label="REFACTOR\nClean up", shape=box, style=filled, fillcolor="#ccccff"];
+    next [label="Next", shape=ellipse];
+    red -> verify_red;
+    verify_red -> green [label="yes"];
+    verify_red -> red [label="wrong\nfailure"];
+    green -> verify_green;
+    verify_green -> refactor [label="yes"];
+    verify_green -> green [label="no"];
+    refactor -> verify_green [label="stay\ngreen"];
+    verify_green -> next;
+    next -> red;
+}
+```
+### RED - Write Failing Test
+Write one minimal test showing what should happen.
+<Good>
+```typescript
+test('retries failed operations 3 times', async () => {
+  let attempts = 0;
+  const operation = () => {
+    attempts++;
+    if (attempts < 3) throw new Error('fail');
+    return 'success';
+  };
+  const result = await retryOperation(operation);
+  expect(result).toBe('success');
+  expect(attempts).toBe(3);
+});
+```
+Clear name, tests real behavior, one thing
+</Good>
+<Bad>
+```typescript
+test('retry works', async () => {
+  const mock = jest.fn()
+    .mockRejectedValueOnce(new Error())
+    .mockRejectedValueOnce(new Error())
+    .mockResolvedValueOnce('success');
+  await retryOperation(mock);
+  expect(mock).toHaveBeenCalledTimes(3);
+});
+```
+Vague name, tests mock not code
+</Bad>
+**Requirements:**
+- One behavior
+- Clear name
+- Real code (no mocks unless unavoidable)
+### Verify RED - Watch It Fail
+**MANDATORY. Never skip.**
+```bash
+npm test path/to/test.test.ts
+```
+Confirm:
+- Test fails (not errors)
+- Failure message is expected
+- Fails because feature missing (not typos)
+**Test passes?** You're testing existing behavior. Fix test.
+**Test errors?** Fix error, re-run until it fails correctly.
+### GREEN - Minimal Code
+Write simplest code to pass the test.
+<Good>
+```typescript
+async function retryOperation<T>(fn: () => Promise<T>): Promise<T> {
+  for (let i = 0; i < 3; i++) {
+    try {
+      return await fn();
+    } catch (e) {
+      if (i === 2) throw e;
+    }
+  }
+  throw new Error('unreachable');
+}
+```
+Just enough to pass
+</Good>
+<Bad>
+```typescript
+async function retryOperation<T>(
+  fn: () => Promise<T>,
+  options?: {
+    maxRetries?: number;
+    backoff?: 'linear' | 'exponential';
+    onRetry?: (attempt: number) => void;
+  }
+): Promise<T> {
+  // YAGNI
+}
+```
+Over-engineered
+</Bad>
+Don't add features, refactor other code, or "improve" beyond the test.
+### Verify GREEN - Watch It Pass
+**MANDATORY.**
+```bash
+npm test path/to/test.test.ts
+```
+Confirm:
+- Test passes
+- Other tests still pass
+- Output pristine (no errors, warnings)
+**Test fails?** Fix code, not test.
+**Other tests fail?** Fix now.
+### REFACTOR - Clean Up
+After green only:
+- Remove duplication
+- Improve names
+- Extract helpers
+Keep tests green. Don't add behavior.
+### Repeat
+Next failing test for next feature.
+## Good Tests
+| Quality | Good | Bad |
+|---------|------|-----|
+| **Minimal** | One thing. "and" in name? Split it. | `test('validates email and domain and whitespace')` |
+| **Clear** | Name describes behavior | `test('test1')` |
+| **Shows intent** | Demonstrates desired API | Obscures what code should do |
+## Why Order Matters
+**"I'll write tests after to verify it works"**
+Tests written after code pass immediately. Passing immediately proves nothing:
+- Might test wrong thing
+- Might test implementation, not behavior
+- Might miss edge cases you forgot
+- You never saw it catch the bug
+Test-first forces you to see the test fail, proving it actually tests something.
+**"I already manually tested all the edge cases"**
+Manual testing is ad-hoc. You think you tested everything but:
+- No record of what you tested
+- Can't re-run when code changes
+- Easy to forget cases under pressure
+- "It worked when I tried it" ≠ comprehensive
+Automated tests are systematic. They run the same way every time.
+**"Deleting X hours of work is wasteful"**
+Sunk cost fallacy. The time is already gone. Your choice now:
+- Delete and rewrite with TDD (X more hours, high confidence)
+- Keep it and add tests after (30 min, low confidence, likely bugs)
+The "waste" is keeping code you can't trust. Working code without real tests is technical debt.
+**"TDD is dogmatic, being pragmatic means adapting"**
+TDD IS pragmatic:
+- Finds bugs before commit (faster than debugging after)
+- Prevents regressions (tests catch breaks immediately)
+- Documents behavior (tests show how to use code)
+- Enables refactoring (change freely, tests catch breaks)
+"Pragmatic" shortcuts = debugging in production = slower.
+**"Tests after achieve the same goals - it's spirit not ritual"**
+No. Tests-after answer "What does this do?" Tests-first answer "What should this do?"
+Tests-after are biased by your implementation. You test what you built, not what's required. You verify remembered edge cases, not discovered ones.
+Tests-first force edge case discovery before implementing. Tests-after verify you remembered everything (you didn't).
+30 minutes of tests after ≠ TDD. You get coverage, lose proof tests work.
+## Common Rationalizations
+| Excuse | Reality |
+|--------|---------|
+| "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
+| "I'll test after" | Tests passing immediately prove nothing. |
+| "Tests after achieve same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
+| "Already manually tested" | Ad-hoc ≠ systematic. No record, can't re-run. |
+| "Deleting X hours is wasteful" | Sunk cost fallacy. Keeping unverified code is technical debt. |
+| "Keep as reference, write tests first" | You'll adapt it. That's testing after. Delete means delete. |
+| "Need to explore first" | Fine. Throw away exploration, start with TDD. |
+| "Test hard = design unclear" | Listen to test. Hard to test = hard to use. |
+| "TDD will slow me down" | TDD faster than debugging. Pragmatic = test-first. |
+| "Manual test faster" | Manual doesn't prove edge cases. You'll re-test every change. |
+| "Existing code has no tests" | You're improving it. Add tests for existing code. |
+## Red Flags - STOP and Start Over
+- Code before test
+- Test after implementation
+- Test passes immediately
+- Can't explain why test failed
+- Tests added "later"
+- Rationalizing "just this once"
+- "I already manually tested it"
+- "Tests after achieve the same purpose"
+- "It's about spirit not ritual"
+- "Keep as reference" or "adapt existing code"
+- "Already spent X hours, deleting is wasteful"
+- "TDD is dogmatic, I'm being pragmatic"
+- "This is different because..."
+**All of these mean: Delete code. Start over with TDD.**
+## Example: Bug Fix
+**Bug:** Empty email accepted
+**RED**
+```typescript
+test('rejects empty email', async () => {
+  const result = await submitForm({ email: '' });
+  expect(result.error).toBe('Email required');
+});
+```
+**Verify RED**
+```bash
+$ npm test
+FAIL: expected 'Email required', got undefined
+```
+**GREEN**
+```typescript
+function submitForm(data: FormData) {
+  if (!data.email?.trim()) {
+    return { error: 'Email required' };
+  }
+  // ...
+}
+```
+**Verify GREEN**
+```bash
+$ npm test
+PASS
+```
+**REFACTOR**
+Extract validation for multiple fields if needed.
+## Verification Checklist
+Before marking work complete:
+- [ ] Every new function/method has a test
+- [ ] Watched each test fail before implementing
+- [ ] Each test failed for expected reason (feature missing, not typo)
+- [ ] Wrote minimal code to pass each test
+- [ ] All tests pass
+- [ ] Output pristine (no errors, warnings)
+- [ ] Tests use real code (mocks only if unavoidable)
+- [ ] Edge cases and errors covered
+Can't check all boxes? You skipped TDD. Start over.
+## When Stuck
+| Problem | Solution |
+|---------|----------|
+| Don't know how to test | Write wished-for API. Write assertion first. Ask your human partner. |
+| Test too complicated | Design too complicated. Simplify interface. |
+| Must mock everything | Code too coupled. Use dependency injection. |
+| Test setup huge | Extract helpers. Still complex? Simplify design. |
+## Debugging Integration
+Bug found? Write failing test reproducing it. Follow TDD cycle. Test proves fix and prevents regression.
+Never fix bugs without a test.
+## Testing Anti-Patterns
+When adding mocks or test utilities, read @testing-anti-patterns.md to avoid common pitfalls:
+- Testing mock behavior instead of real behavior
+- Adding test-only methods to production classes
+- Mocking without understanding dependencies
+## Final Rule
+```
+Production code → test exists and failed first
+Otherwise → not TDD
+```
+No exceptions without your human partner's permission.