Pre-cleanup snapshot - all current files

🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-05 02:32:45 +10:00
parent 26079aa8da
commit 4511f4c801
32 changed files with 5072 additions and 0 deletions
--- a/modules/shhh/README.md
+++ b/modules/shhh/README.md
@@ -0,0 +1,561 @@
+🔥 Excellent — let’s push this all the way into a **production-grade spec**.
+
+---
+
+## 📂 **1️⃣ Feedback Ingestion Spec**
+
+This defines how curators/humans give feedback to the Sentinel so it can **update its detection rules (patterns.yaml)** safely.
+
+---
+
+### 🔄 **Feedback Flow**
+
+1. **Curator/Reviewer sees alert** → marks it as:
+
+   * `false_positive` (regex over-triggered)
+   * `missed_secret` (regex failed to detect)
+   * `uncertain` (needs better regex refinement)
+
+2. **Feedback API** ingests the report:
+
+```json
+{
+  "alert_id": "log_345",
+  "secret_type": "AWS_ACCESS_KEY",
+  "feedback_type": "false_positive",
+  "evidence": "Key was dummy data: TESTKEY123",
+  "suggested_regex_fix": null
+}
+```
+
+3. **Meta-Learner** updates rules:
+
+* `false_positive` → adds **exceptions** (e.g., allowlist prefixes like `TESTKEY`).
+* `missed_secret` → drafts **new regex** from evidence (using regex generator or LLM).
+* Writes changes to **patterns.yaml** under `pending_review`.
+
+4. **Security admin approves** before the new regex is marked `active: true`.
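+
+A minimal sketch of steps 2 and 3, assuming the feedback payload shown above. `feedback_to_pending_update` and the `action` names it emits are hypothetical, not part of any existing Sentinel code; the point is only that a report becomes a `pending_review` entry that stays inactive until an admin approves it.
+
+```python
+from datetime import datetime, timezone
+
+VALID_FEEDBACK_TYPES = {"false_positive", "missed_secret", "uncertain"}
+
+def feedback_to_pending_update(feedback: dict) -> dict:
+    """Turn one feedback report into a pending rule update (never auto-activated)."""
+    feedback_type = feedback.get("feedback_type")
+    if feedback_type not in VALID_FEEDBACK_TYPES:
+        raise ValueError(f"unknown feedback_type: {feedback_type!r}")
+    # Hypothetical mapping: false positives propose an exception, missed secrets a new regex.
+    action = {
+        "false_positive": "add_exception",
+        "missed_secret": "draft_regex",
+        "uncertain": "needs_refinement",
+    }[feedback_type]
+    return {
+        "regex_name": feedback["secret_type"],
+        "action": action,
+        "evidence": feedback.get("evidence"),
+        "suggested_regex_fix": feedback.get("suggested_regex_fix"),
+        "status": "pending_review",
+        "active": False,  # step 4: only a security admin flips this to True
+        "source_alert": feedback["alert_id"],
+        "timestamp": datetime.now(timezone.utc).isoformat(),
+    }
+
+update = feedback_to_pending_update({
+    "alert_id": "log_345",
+    "secret_type": "AWS_ACCESS_KEY",
+    "feedback_type": "false_positive",
+    "evidence": "Key was dummy data: TESTKEY123",
+    "suggested_regex_fix": None,
+})
+print(update["action"], update["status"], update["active"])
+```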
+
+---
+
+### 🧠 **Feedback Schema in YAML**
+
+```yaml
+pending_updates:
+  - regex_name: AWS_ACCESS_KEY
+    action: modify
+    new_regex: "AKIA(?!TESTKEY)[0-9A-Z]{16}"
+    confidence: 0.82
+    status: "pending human review"
+    submitted_by: curator_2
+    timestamp: 2025-08-02T12:40:00Z
+```
+
+✅ This keeps **audit trails** & allows **safe hot updates**.
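+
+Before approving a pending pattern, a reviewer could run it against known-good and known-dummy samples. The sketch below assumes dummy keys put `TESTKEY` right after the `AKIA` prefix; the `ALLOWLISTED` pattern and the sample keys are made up for illustration, and `fullmatch` is used so partial hits do not count.
+
+```python
+import re
+
+# Assumed allowlist rule: reject candidates whose body starts with TESTKEY.
+ALLOWLISTED = re.compile(r"AKIA(?!TESTKEY)[0-9A-Z]{16}")
+
+# Sample key -> whether the rule is expected to flag it.
+samples = {
+    "AKIAIOSFODNN7EXAMPLE": True,    # well-formed 20-character key
+    "AKIATESTKEY123456789": False,   # dummy key: body starts with TESTKEY
+    "AKIA123": False,                # too short to be a key
+}
+
+results = {key: bool(ALLOWLISTED.fullmatch(key)) for key in samples}
+for key, matched in results.items():
+    print(f"{key}: {'match' if matched else 'no match'}")
+```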
+
+---
+
+## ⚙️ **2️⃣ Real AWS/GitHub Webhook Payload Templates**
+
+These are **example POST payloads** your Sentinel would send when it detects a leaked secret.
+
+---
+
+### 🔐 **AWS Access Key Revocation**
+
+**Endpoint:**
+`POST https://security.example.com/hooks/aws-revoke`
+
+**Payload:**
+
+```json
+{
+  "event": "secret_leak_detected",
+  "secret_type": "AWS_ACCESS_KEY",
+  "redacted_key": "AKIA****XYZ",
+  "log_reference": "hyperlog:58321",
+  "recommended_action": "Revoke IAM access key immediately",
+  "severity": "HIGH",
+  "timestamp": "2025-08-02T12:45:00Z"
+}
+```
+
+➡ Your security automation would call AWS CLI or IAM API:
+
+```bash
+aws iam update-access-key --access-key-id <redacted> --status Inactive
+aws iam delete-access-key --access-key-id <redacted>
+```
+
+---
+
+### 🐙 **GitHub Token Revocation**
+
+**Endpoint:**
+`POST https://security.example.com/hooks/github-revoke`
+
+**Payload:**
+
+```json
+{
+  "event": "secret_leak_detected",
+  "secret_type": "GITHUB_TOKEN",
+  "redacted_key": "ghp_****abcd",
+  "repository": "repo-name",
+  "log_reference": "hyperlog:58322",
+  "severity": "HIGH",
+  "recommended_action": "Invalidate GitHub token via API",
+  "timestamp": "2025-08-02T12:46:00Z"
+}
+```
+
+➡ This would tie into GitHub’s [token-scanning API](https://docs.github.com/en/developers/overview/secret-scanning) or use PAT revocation.
+
+---
+
+### 💬 **Slack Token Revocation**
+
+**Endpoint:**
+`POST https://security.example.com/hooks/slack-revoke`
+
+**Payload:**
+
+```json
+{
+  "event": "secret_leak_detected",
+  "secret_type": "SLACK_TOKEN",
+  "redacted_key": "xoxb****hjk",
+  "workspace": "company-slack",
+  "log_reference": "hyperlog:58323",
+  "severity": "HIGH",
+  "recommended_action": "Revoke Slack bot/user token",
+  "timestamp": "2025-08-02T12:47:00Z"
+}
+```
+
+➡ Slack's `auth.revoke` Web API method can **revoke** the leaked token, after which a replacement can be issued (**rotation**).
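+
+A sketch of how the payloads above could be assembled by one helper. `redact` and `build_leak_payload` are hypothetical names; keeping the first four and last three characters is an assumption modelled on the `AKIA****XYZ` example, and a real redaction policy may differ.
+
+```python
+import json
+from datetime import datetime, timezone
+
+def redact(secret: str, keep_start: int = 4, keep_end: int = 3) -> str:
+    """Keep a short prefix/suffix for identification and mask the rest."""
+    if len(secret) <= keep_start + keep_end:
+        return "****"  # too short to reveal anything safely
+    return f"{secret[:keep_start]}****{secret[-keep_end:]}"
+
+def build_leak_payload(secret_type, secret, log_reference, recommended_action, **extra):
+    """Build the webhook body; the raw secret never enters the payload."""
+    payload = {
+        "event": "secret_leak_detected",
+        "secret_type": secret_type,
+        "redacted_key": redact(secret),
+        "log_reference": log_reference,
+        "recommended_action": recommended_action,
+        "severity": "HIGH",
+        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
+    }
+    payload.update(extra)  # service-specific fields such as repository or workspace
+    return payload
+
+payload = build_leak_payload(
+    "AWS_ACCESS_KEY",
+    "AKIAIOSFODNN7EXAMPLE",  # sample key, not a real credential
+    "hyperlog:58321",
+    "Revoke IAM access key immediately",
+)
+print(json.dumps(payload, indent=2))
+```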
+
+---
+
+## 📡 **3️⃣ Redis or PostgreSQL Quarantine Store**
+
+Switching from memory to **persistent storage** means quarantined logs survive restarts.
+
+---
+
+### ✅ **Redis Option (Fast, Volatile)**
+
+```python
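+import json
+from datetime import datetime, timezone
+
+import redis
+
+r = redis.Redis(host='localhost', port=6379, decode_responses=True)
+
+def quarantine_log(log_line, reason):
+    # UTC timestamp in ISO 8601 form with a trailing "Z"
+    timestamp = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
+    entry = {"timestamp": timestamp, "reason": reason, "log_line": log_line}
+    r.lpush("quarantine", json.dumps(entry))  # newest entry goes to the head of the list
+    print(f"[QUARANTINE] Stored in Redis: {reason}")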
+```
+
+* 🏎 **Pros:** Fast, easy to scale.
+* ⚠️ **Cons:** Volatile unless persisted (RDB/AOF).
+
+---
+
+### ✅ **PostgreSQL Option (Auditable, Durable)**
+
+**Schema:**
+
+```sql
+CREATE TABLE quarantine (
+    id SERIAL PRIMARY KEY,
+    timestamp TIMESTAMPTZ NOT NULL,
+    reason TEXT NOT NULL,
+    log_line TEXT NOT NULL,
+    reviewed BOOLEAN DEFAULT FALSE
+);
+```
+
+**Python Insert:**
+
+```python
+from datetime import datetime, timezone
+
+import psycopg2
+
+# Demo credentials only; load real ones from the environment or a secrets manager.
+conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
+cursor = conn.cursor()
+
+def quarantine_log(log_line, reason):
+    # psycopg2 adapts a timezone-aware datetime to TIMESTAMPTZ directly
+    entry_time = datetime.now(timezone.utc)
+    cursor.execute(
+        "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
+        (entry_time, reason, log_line)
+    )
+    conn.commit()
+    print(f"[QUARANTINE] Stored in PostgreSQL: {reason}")
+```
+
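+✅ **Postgres is better for long-term auditing.** Once the table also records `secret_type`, `source_agent`, and `severity` (the schema above only has a free-text `reason`), you can run reports like: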
+
+* “How many AWS keys leaked this month?”
+* “Which agents generated the most HIGH-severity quarantines?”
+
+---
+
+We now have:
+✅ **Detection → Redaction → Quarantine → Revocation → Feedback → Pattern Evolution**
+✅ **patterns.yaml** for versioned regex
+✅ **Webhooks** for real-time secret revocation
+✅ **Persistent quarantine store** (Redis or Postgres)
+
+---
+
+## 📜 **1️⃣ Migration Script: Redis → PostgreSQL**
+
+This script will migrate existing quarantined log entries from **Redis** to **Postgres**.
+
+```python
+import json
+
+import psycopg2
+import redis
+
+# Redis config
+r = redis.Redis(host='localhost', port=6379, decode_responses=True)
+
+# Postgres config
+conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
+cursor = conn.cursor()
+
+def migrate_quarantine():
+    count = 0
+    while True:
+        entry_json = r.rpop("quarantine")  # pop oldest entry (writers use LPUSH)
+        if entry_json is None:
+            break
+        try:
+            entry = json.loads(entry_json)
+            cursor.execute(
+                "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
+                (entry["timestamp"], entry["reason"], entry["log_line"])
+            )
+            conn.commit()  # commit per entry so a later failure cannot lose earlier rows
+        except Exception:
+            conn.rollback()
+            r.rpush("quarantine", entry_json)  # put the entry back at the tail before failing
+            raise
+        count += 1
+    print(f"[MIGRATION] Moved {count} quarantined entries from Redis → PostgreSQL")
+
+if __name__ == "__main__":
+    migrate_quarantine()
+```
+
+✅ **Run once** after Postgres is set up — empties Redis queue into the durable DB.
+
+---
+
+## 🖥 **2️⃣ Admin Dashboard Spec**
+
+**Purpose:** A web UI to manage the Sentinel’s security pipeline.
+
+---
+
+### 🎯 **Core Features**
+
+✅ **Quarantine Browser**
+
+* Paginated view of all quarantined logs
+* Search/filter by `secret_type`, `source_agent`, `date`, `status`
+* Mark quarantined logs as **reviewed** or **false alarm**
+
+✅ **Regex Rules Manager**
+
+* Lists all regexes from `patterns.yaml`
+* Add / update / deactivate rules via UI
+* Shows `pending_updates` flagged by the Meta-Learner for human approval
+
+✅ **Revocation Status Board**
+
+* See which secrets triggered revocations
+* Status of revocation hooks (success/fail)
+
+✅ **Metrics Dashboard**
+
+* Charts: “Secrets Detected Over Time”, “Top Sources of Leaks”
+* KPIs: # HIGH severity secrets this month, # rules updated, # false positives
+
+---
+
+### 🏗 **Tech Stack Suggestion**
+
+* **Backend:** FastAPI (Python)
+* **Frontend:** React + Tailwind
+* **DB:** PostgreSQL for quarantine + rules history
+* **Auth:** OAuth (GitHub/Google) + RBAC (only security admins can approve regex changes)
+
+---
+
+### 🔌 **Endpoints**
+
+```
+GET  /api/quarantine         → list quarantined entries
+POST /api/quarantine/review  → mark entry as reviewed
+GET  /api/rules              → list regex patterns
+POST /api/rules/update       → update or add a regex
+GET  /api/revocations        → list revocation events
+```
+
+---
+
+### 🖥 **Mock Dashboard Layout**
+
+* **Left Nav:** Quarantine | Rules | Revocations | Metrics
+* **Main Panel:**
+
+  * Data tables with sorting/filtering
+  * Inline editors for regex rules
+  * Approve/Reject buttons for pending regex updates
+
+✅ Basically a **security control room** for Sentinel.
+
+---
+
+## 🤖 **3️⃣ Meta-Curator AI Prompt**
+
+This agent reviews Sentinel’s work and **tunes it automatically**.
+
+---
+
+### **Meta-Curator: System Prompt**
+
+> **Role & Mission:**
+> You are the **Meta-Curator**, a supervisory AI responsible for reviewing the **Secrets Sentinel’s** detections, regex updates, and feedback reports.
+>
+> **Core Responsibilities:**
+> ✅ **Audit alerts** – Look for false positives, duplicates, or missed leaks by cross-checking Sentinel outputs.
+> ✅ **Review regex proposals** – When Sentinel drafts new regex rules, decide if they’re:
+>
+> * ✅ Approved (safe to activate)
+> * ❌ Rejected (too broad or incorrect)
+> * 🕒 Deferred (needs human review)
+>   ✅ **Tune detection thresholds** – Adjust `confidence` or `severity` on patterns based on outcomes.
+>   ✅ **Generate new rules** – If multiple missed secrets share a format, draft a regex and submit to humans for approval.
+>   ✅ **Report upstream** – Summarize changes to security admins weekly.
+
+---
+
+### **Behavior Guidelines**
+
+* **Conservative by default:** Don’t auto-approve regexes unless confidence > 0.95.
+* **Keep auditability:** Every decision (approve/reject) is logged in the hyperlog.
+* **Respect human overrides:** Never overwrite a regex that a human explicitly locked.
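+
+The three guidelines can be read as a small decision function. This is a sketch under assumptions: the `Proposal` fields, the `locked_rules` set, the choice to reject proposals against locked rules, and the in-memory `audit_log` list (standing in for the hyperlog) are all hypothetical.
+
+```python
+from dataclasses import dataclass
+
+AUTO_APPROVE_THRESHOLD = 0.95  # from the "conservative by default" guideline
+
+@dataclass
+class Proposal:
+    regex_name: str
+    regex_pattern: str
+    confidence: float
+
+def decide(proposal: Proposal, locked_rules: set, audit_log: list) -> str:
+    """Return 'approve', 'reject', or 'defer' and record the decision."""
+    if proposal.regex_name in locked_rules:
+        decision, reason = "reject", "rule is locked by a human"
+    elif proposal.confidence > AUTO_APPROVE_THRESHOLD:
+        decision, reason = "approve", "confidence above threshold"
+    else:
+        decision, reason = "defer", "needs human review"
+    # Every decision is logged, whatever the outcome.
+    audit_log.append({
+        "regex_name": proposal.regex_name,
+        "decision": decision,
+        "reason": reason,
+    })
+    return decision
+
+audit_log = []
+locked = {"AWS_ACCESS_KEY"}
+decisions = [
+    decide(Proposal("GITLAB_TOKEN", r"glpat-[0-9A-Za-z\-_]{20}", 0.97), locked, audit_log),
+    decide(Proposal("JWT", r"eyJ[A-Za-z0-9_-]+", 0.80), locked, audit_log),
+    decide(Proposal("AWS_ACCESS_KEY", r"AKIA[0-9A-Z]{16}", 0.99), locked, audit_log),
+]
+print(decisions)
+```
+
+Note that a confidence of exactly 0.95 is deferred, since the guideline asks for strictly greater than 0.95.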
+
+---
+
+### **Example Meta-Curator Output**
+
+```json
+{
+  "action": "approve_regex",
+  "regex_name": "GITLAB_TOKEN",
+  "regex_pattern": "glpat-[0-9A-Za-z\\-_]{20}",
+  "confidence": 0.97,
+  "decision_reason": "Validated against 12 quarantined examples, no false positives found.",
+  "timestamp": "2025-08-02T13:45:00Z"
+}
+```
+
+✅ This meta-agent is the **brains of the rules layer** — keeps Sentinel evolving, but under control.
+
+---
+
+## 🚀 **Now You Have:**
+
+✅ **Migration Path** → Redis → PostgreSQL
+✅ **Admin Dashboard Spec** → complete with endpoints & layout
+✅ **Meta-Curator Prompt** → the agent that “manages the manager”
+
+Alright — here’s the next batch to lock this into a **real, buildable system**.
+
+---
+
+## 📂 **1️⃣ `patterns_history` Table Schema**
+
+This tracks **every regex change** ever made — who/what made it, why, and when.
+
+```sql
+CREATE TABLE patterns_history (
+    id SERIAL PRIMARY KEY,
+    regex_name TEXT NOT NULL,
+    old_regex TEXT,
+    new_regex TEXT,
+    action TEXT CHECK (action IN ('add', 'update', 'remove')),
+    confidence NUMERIC(3,2),
+    status TEXT CHECK (status IN ('approved', 'pending', 'rejected')),
+    submitted_by TEXT NOT NULL,
+    approved_by TEXT,
+    decision_reason TEXT,
+    timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
+);
+```
+
+### ✅ What this gives you:
+
+* **Full audit trail** (critical for security compliance).
+* You can run queries like:
+
+  * *“Show all regex changes made by Meta-Curator vs. humans.”*
+  * *“List all rules rejected in the last 90 days.”*
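+
+The two example queries could look like the sketch below. It uses Python's built-in `sqlite3` with a trimmed copy of the table purely so the example runs without a Postgres server; the sample rows and the `meta_curator` submitter name are invented, and on Postgres the date filter would be written as `timestamp >= NOW() - INTERVAL '90 days'` instead.
+
+```python
+import sqlite3
+
+db = sqlite3.connect(":memory:")
+db.execute("""
+    CREATE TABLE patterns_history (
+        id INTEGER PRIMARY KEY,
+        regex_name TEXT NOT NULL,
+        action TEXT,
+        status TEXT,
+        submitted_by TEXT NOT NULL,
+        timestamp TEXT DEFAULT CURRENT_TIMESTAMP
+    )
+""")
+db.executemany(
+    "INSERT INTO patterns_history (regex_name, action, status, submitted_by) VALUES (?, ?, ?, ?)",
+    [
+        ("GITLAB_TOKEN", "add", "approved", "meta_curator"),
+        ("AWS_ACCESS_KEY", "update", "rejected", "meta_curator"),
+        ("SLACK_TOKEN", "update", "approved", "curator_2"),
+    ],
+)
+
+# "Show all regex changes made by Meta-Curator vs. humans."
+by_submitter = dict(db.execute(
+    "SELECT CASE WHEN submitted_by = 'meta_curator' THEN 'meta_curator' ELSE 'human' END AS who, "
+    "COUNT(*) FROM patterns_history GROUP BY who"
+).fetchall())
+
+# "List all rules rejected in the last 90 days."
+rejected = [row[0] for row in db.execute(
+    "SELECT regex_name FROM patterns_history "
+    "WHERE status = 'rejected' AND timestamp >= datetime('now', '-90 days')"
+)]
+
+print(by_submitter, rejected)
+```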
+
+---
+
+## 🖼 **2️⃣ Admin Dashboard Wireframes**
+
+**Goal:** show your devs exactly what to build — no ambiguity.
+
+---
+
+### **🔒 Dashboard Home**
+
+```
+------------------------------------------------------
+|  [Sentinel Logo]  Secrets Sentinel Dashboard       |
+------------------------------------------------------
+| Quarantine | Rules | Revocations | Metrics | Admin |
+------------------------------------------------------
+|  Welcome back, Security Admin!                     |
+|                                                    |
+|   ▢  32 Quarantined logs waiting review            |
+|   ▢  4 Pending regex updates                       |
+|   ▢  2 Failed revocation hooks                     |
+------------------------------------------------------
+```
+
+---
+
+### **🗄 Quarantine View**
+
+```
+------------------------------------------------------
+| Quarantine Logs                                     |
+------------------------------------------------------
+| Search: [______________] [Filter ▼]                 |
+------------------------------------------------------
+| Log ID     | Secret Type   | Severity | Status  |
+------------------------------------------------------
+| log_4287   | AWS_ACCESS_KEY| HIGH     | PENDING |
+| log_4288   | JWT           | MEDIUM   | REVIEWED|
+| log_4289   | SSH_KEY       | HIGH     | PENDING |
+------------------------------------------------------
+[ View Details ] [ Mark as Reviewed ] [ Delete ]
+```
+
+**Clicking “View Details” →** shows full log snippet (with redacted secret).
+
+---
+
+### **📜 Regex Manager**
+
+```
+------------------------------------------------------
+| Regex Rules                                         |
+------------------------------------------------------
+| Name            | Regex Pattern                   | Active |
+------------------------------------------------------
+| AWS_ACCESS_KEY  | AKIA[0-9A-Z]{16}                | ✔      |
+| JWT             | eyJ[A-Za-z0-9_-]+?\.[…]         | ✔      |
+| SLACK_TOKEN     | xox[baprs]-[0-9A-Za-z-]{10,48}  | ✔      |
+------------------------------------------------------
+[ Add New Regex ] [ View History ]
+```
+
+Clicking **View History** → pulls from `patterns_history`.
+
+---
+
+### **📊 Metrics View**
+
+* **Line Chart:** “Secrets Detected Over Time”
+* **Bar Chart:** “Secrets by Type” (AWS, GitHub, JWT, etc.)
+* **KPIs:**
+
+  * 🔴 High Severity Leaks: 12 this week
+  * 🟢 Regex Accuracy: 94%
+
+---
+
+## ⚙️ **3️⃣ FastAPI Skeleton**
+
+Here’s the **starter code** for your dev team to run with.
+
+```python
+from typing import List
+
+import psycopg2
+import yaml  # PyYAML, used to parse patterns.yaml
+from fastapi import FastAPI
+from pydantic import BaseModel
+
+app = FastAPI(title="Secrets Sentinel Dashboard API")
+
+# --- Database Setup ---
+# A single module-level connection keeps the skeleton short; use a connection pool in production.
+conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
+cursor = conn.cursor()
+
+# --- Models ---
+class QuarantineEntry(BaseModel):
+    id: int
+    timestamp: str
+    reason: str
+    log_line: str
+    reviewed: bool
+
+class RegexRule(BaseModel):
+    regex_name: str
+    regex_pattern: str
+    severity: str
+    confidence: float
+    active: bool
+
+# --- Endpoints ---
+@app.get("/quarantine", response_model=List[QuarantineEntry])
+def get_quarantine():
+    cursor.execute("SELECT id, timestamp, reason, log_line, reviewed FROM quarantine")
+    rows = cursor.fetchall()
+    return [QuarantineEntry(id=r[0], timestamp=str(r[1]), reason=r[2], log_line=r[3], reviewed=r[4]) for r in rows]
+
+@app.post("/quarantine/review/{entry_id}")
+def review_quarantine(entry_id: int):
+    cursor.execute("UPDATE quarantine SET reviewed=true WHERE id=%s", (entry_id,))
+    conn.commit()
+    return {"status": "ok", "message": f"Quarantine entry {entry_id} marked reviewed"}
+
+@app.get("/rules", response_model=List[RegexRule])
+def get_rules():
+    # Load from patterns.yaml (an empty file parses to None, hence the fallback)
+    with open("patterns.yaml", "r") as f:
+        patterns = yaml.safe_load(f) or {}
+    rules = []
+    for name, rule in patterns.get("patterns", {}).items():
+        rules.append(RegexRule(
+            regex_name=name,
+            regex_pattern=rule["regex"],
+            severity=rule["severity"],
+            confidence=rule["confidence"],
+            active=rule["active"]
+        ))
+    return rules
+
+@app.post("/rules/update")
+def update_rule(rule: RegexRule):
+    # Append to patterns_history table
+    cursor.execute("""
+        INSERT INTO patterns_history (regex_name, old_regex, new_regex, action, confidence, status, submitted_by)
+        VALUES (%s, %s, %s, 'update', %s, 'pending', 'admin')
+    """, (rule.regex_name, None, rule.regex_pattern, rule.confidence))
+    conn.commit()
+    return {"status": "ok", "message": f"Regex {rule.regex_name} queued for update"}
+```
+
+✅ **Why this skeleton works:**
+
+* REST endpoints for **Quarantine** and **Rules**; rule updates are recorded in `patterns_history`.
+* Uses **Postgres for persistence**.
+* Reads from `patterns.yaml` for active rules.
+
+---
+
+## 🚀 **Now You Have:**
+
+✅ A **Postgres schema** for regex change history.
+✅ **Wireframes** for the admin dashboard.
+✅ A **FastAPI skeleton** your team can expand into a full API/UI stack.