chorus-services/modules/shhh/README.md


---

## 📂 **1️⃣ Feedback Ingestion Spec**

This defines how curators/humans give feedback to the Sentinel so it can **update its detection rules (patterns.yaml)** safely.

---

### 🔄 **Feedback Flow**

1. **Curator/Reviewer sees alert** → marks it as:

   * `false_positive` (regex over-triggered)
   * `missed_secret` (regex failed to detect)
   * `uncertain` (the regex needs refinement)

2. **Feedback API** ingests the report:

```json
{
  "alert_id": "log_345",
  "secret_type": "AWS_ACCESS_KEY",
  "feedback_type": "false_positive",
  "evidence": "Key was dummy data: TESTKEY123",
  "suggested_regex_fix": null
}
```

3. **Meta-Learner** updates rules:

* `false_positive` → adds **exceptions** (e.g., allowlist prefixes like `TESTKEY`).
* `missed_secret` → drafts **new regex** from evidence (using regex generator or LLM).
* Writes changes to **patterns.yaml** under `pending_review`.

4. **Security admin approves** before the new regex is marked `active: true`.
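
The flow above can be sketched as a small function that turns one feedback report into a draft entry awaiting review. This is a minimal sketch, not the Sentinel's actual implementation: the field names come from the JSON payload, while `Feedback`, `feedback_to_pending_update`, and the `action` labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

VALID_FEEDBACK_TYPES = {"false_positive", "missed_secret", "uncertain"}


@dataclass
class Feedback:
    alert_id: str
    secret_type: str
    feedback_type: str
    evidence: str
    suggested_regex_fix: Optional[str] = None


def feedback_to_pending_update(fb: Feedback, submitted_by: str) -> dict:
    """Draft a `pending_review` entry from one feedback report.

    Hypothetical helper: it never activates a rule, it only drafts an
    entry that a security admin still has to approve.
    """
    if fb.feedback_type not in VALID_FEEDBACK_TYPES:
        raise ValueError(f"unknown feedback_type: {fb.feedback_type!r}")

    # Assumed action labels, one per feedback type.
    if fb.feedback_type == "false_positive":
        action = "add_exception"
    elif fb.feedback_type == "missed_secret":
        action = "draft_regex"
    else:
        action = "needs_refinement"

    return {
        "regex_name": fb.secret_type,
        "action": action,
        "new_regex": fb.suggested_regex_fix,
        "status": "pending_review",
        "active": False,  # stays inactive until an admin approves
        "submitted_by": submitted_by,
        "source_alert": fb.alert_id,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

Keeping the draft step free of side effects means the approval gate in step 4 remains the only place where a rule can become active.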

---

### 🧠 **Feedback Schema in YAML**

```yaml
pending_updates:
  - regex_name: AWS_ACCESS_KEY
    action: update
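    # The lookahead sits right after the AKIA prefix so that keys whose body
    # starts with TESTKEY are skipped; placed after the match it would exclude nothing useful.
    new_regex: "AKIA(?!TESTKEY)[0-9A-Z]{16}"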
    confidence: 0.82
    status: "pending human review"
    submitted_by: curator_2
    timestamp: 2025-08-02T12:40:00Z
```

✅ This keeps **audit trails** & allows **safe hot updates**.
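
As an illustration of the approval step, the sketch below promotes one pending update into the active rule set. It assumes the parsed `patterns.yaml` has a `patterns` mapping (with `regex` and `active` keys, as the dashboard skeleton later in this document reads them) next to the `pending_updates` list; `approve_pending_update` is a hypothetical helper name.

```python
import re
from copy import deepcopy


def approve_pending_update(config: dict, regex_name: str, approved_by: str) -> dict:
    """Return a copy of `config` with one pending update promoted to active.

    Assumed layout:
      config["patterns"][NAME] = {"regex": ..., "active": ...}
      config["pending_updates"] = [{"regex_name": ..., "new_regex": ...}, ...]
    """
    new_config = deepcopy(config)  # leave the caller's config untouched
    pending = new_config.get("pending_updates", [])
    match = next((u for u in pending if u["regex_name"] == regex_name), None)
    if match is None:
        raise KeyError(f"no pending update for {regex_name}")

    # Refuse to activate a pattern that does not even compile.
    re.compile(match["new_regex"])

    rule = new_config.setdefault("patterns", {}).setdefault(regex_name, {})
    rule["regex"] = match["new_regex"]
    rule["active"] = True
    rule["approved_by"] = approved_by
    pending.remove(match)
    return new_config
```

Working on a copy lets the caller write the new file only after the whole update has succeeded, which fits the "safe hot update" goal.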

---

## ⚙️ **2️⃣ Real AWS/GitHub Webhook Payload Templates**

These are **example POST payloads** your Sentinel would send when it detects a leaked secret.

---

### 🔐 **AWS Access Key Revocation**

**Endpoint:**
`POST https://security.example.com/hooks/aws-revoke`

**Payload:**

```json
{
  "event": "secret_leak_detected",
  "secret_type": "AWS_ACCESS_KEY",
  "redacted_key": "AKIA****XYZ",
  "log_reference": "hyperlog:58321",
  "recommended_action": "Revoke IAM access key immediately",
  "severity": "HIGH",
  "timestamp": "2025-08-02T12:45:00Z"
}
```

➡ Your security automation would call AWS CLI or IAM API:

```bash
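# The webhook carries only the redacted key, so look up the full key ID and its owner first.
aws iam update-access-key --user-name <user-name> --access-key-id <access-key-id> --status Inactive
aws iam delete-access-key --user-name <user-name> --access-key-id <access-key-id>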
```
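
For illustration, a payload like the one above could be assembled as follows. This is a sketch under assumptions: `redact` and `build_leak_payload` are hypothetical helpers, and the masking rule (keep four leading and three trailing characters) is inferred from the `AKIA****XYZ` example rather than taken from a spec.

```python
import json
from datetime import datetime, timezone


def redact(secret: str, keep_head: int = 4, keep_tail: int = 3) -> str:
    """Keep a few leading/trailing characters and mask the rest."""
    if len(secret) <= keep_head + keep_tail:
        return "****"  # too short to reveal anything safely
    tail = secret[-keep_tail:] if keep_tail else ""
    return f"{secret[:keep_head]}****{tail}"


def build_leak_payload(secret_type: str, secret: str, log_reference: str,
                       recommended_action: str, severity: str = "HIGH") -> str:
    """Serialize a `secret_leak_detected` event; the raw secret is never included."""
    payload = {
        "event": "secret_leak_detected",
        "secret_type": secret_type,
        "redacted_key": redact(secret),
        "log_reference": log_reference,
        "recommended_action": recommended_action,
        "severity": severity,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    return json.dumps(payload)
```

Because only the redacted form leaves the Sentinel, the receiving automation needs its own way to resolve `log_reference` back to the full key before it can revoke anything.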

---

### 🐙 **GitHub Token Revocation**

**Endpoint:**
`POST https://security.example.com/hooks/github-revoke`

**Payload:**

```json
{
  "event": "secret_leak_detected",
  "secret_type": "GITHUB_TOKEN",
  "redacted_key": "ghp_****abcd",
  "repository": "repo-name",
  "log_reference": "hyperlog:58322",
  "severity": "HIGH",
  "recommended_action": "Invalidate GitHub token via API",
  "timestamp": "2025-08-02T12:46:00Z"
}
```

➡ This would tie into GitHub’s [token-scanning API](https://docs.github.com/en/developers/overview/secret-scanning) or use PAT revocation.

---

### 💬 **Slack Token Revocation**

**Endpoint:**
`POST https://security.example.com/hooks/slack-revoke`

**Payload:**

```json
{
  "event": "secret_leak_detected",
  "secret_type": "SLACK_TOKEN",
  "redacted_key": "xoxb****hjk",
  "workspace": "company-slack",
  "log_reference": "hyperlog:58323",
  "severity": "HIGH",
  "recommended_action": "Revoke Slack bot/user token",
  "timestamp": "2025-08-02T12:47:00Z"
}
```

➡ Slack's `auth.revoke` Web API method can **revoke** the leaked token; issuing a replacement afterwards completes the **rotation**.
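
On the receiving side, the three payloads share the same `event` and `secret_type` fields, so one handler can route them. The sketch below shows that dispatch only; the `revoke_*` functions are placeholders (hypothetical names) that stand in for real provider calls.

```python
import json
from typing import Callable, Dict


def revoke_aws(payload: dict) -> str:
    # Placeholder: a real handler would resolve the full key ID and call IAM.
    return f"aws-revoke:{payload['log_reference']}"


def revoke_github(payload: dict) -> str:
    # Placeholder for a GitHub token invalidation call.
    return f"github-revoke:{payload['log_reference']}"


def revoke_slack(payload: dict) -> str:
    # Placeholder for a Slack token revocation call.
    return f"slack-revoke:{payload['log_reference']}"


HANDLERS: Dict[str, Callable[[dict], str]] = {
    "AWS_ACCESS_KEY": revoke_aws,
    "GITHUB_TOKEN": revoke_github,
    "SLACK_TOKEN": revoke_slack,
}


def handle_webhook(body: str) -> dict:
    """Route a `secret_leak_detected` payload to the matching handler."""
    payload = json.loads(body)
    if payload.get("event") != "secret_leak_detected":
        return {"status": "ignored"}
    handler = HANDLERS.get(payload.get("secret_type"))
    if handler is None:
        # Unknown secret types are reported rather than silently dropped.
        return {"status": "unhandled", "secret_type": payload.get("secret_type")}
    return {"status": "dispatched", "ticket": handler(payload)}
```

A lookup table keeps adding a new secret type to a one-line change, and the `unhandled` status gives the revocation board something to display for types without automation.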

---

## 📡 **3️⃣ Redis or PostgreSQL Quarantine Store**

Switching from memory to **persistent storage** means quarantined logs survive restarts.

---

### ✅ **Redis Option (Fast, Volatile)**

```python
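import json
from datetime import datetime, timezone

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def quarantine_log(log_line, reason):
    # Timezone-aware UTC timestamp in ISO 8601 form with a trailing "Z"
    timestamp = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
    entry = {"timestamp": timestamp, "reason": reason, "log_line": log_line}
    r.lpush("quarantine", json.dumps(entry))  # newest entries sit at the head of the list
    print(f"[QUARANTINE] Stored in Redis: {reason}")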
```

* 🏎 **Pros:** Fast, easy to scale.
* ⚠️ **Cons:** Volatile unless persisted (RDB/AOF).
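
Reading entries back for review might look like the sketch below. It assumes a redis-py client created with `decode_responses=True` and the same `quarantine` list key as above; `list_quarantine` is a hypothetical helper, and the client is passed in so the function can be exercised without a running server.

```python
import json
from typing import Any, List


def list_quarantine(client: Any, limit: int = 50) -> List[dict]:
    """Return up to `limit` of the newest entries without removing them.

    LPUSH puts new items at index 0, so LRANGE 0..limit-1 yields the
    most recent entries first.
    """
    raw_entries = client.lrange("quarantine", 0, limit - 1)
    return [json.loads(raw) for raw in raw_entries]
```

With a live server this would be called as `list_quarantine(r, limit=20)`, where `r` is the client from the snippet above.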

---

### ✅ **PostgreSQL Option (Auditable, Durable)**

**Schema:**

```sql
CREATE TABLE quarantine (
    id SERIAL PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    reason TEXT NOT NULL,
    log_line TEXT NOT NULL,
    reviewed BOOLEAN DEFAULT FALSE
);
```

**Python Insert:**

```python
from datetime import datetime, timezone

import psycopg2

# Example credentials only; load real ones from configuration or the environment.
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def quarantine_log(log_line, reason):
    entry_time = datetime.now(timezone.utc)  # aware datetime maps onto TIMESTAMPTZ
    cursor.execute(
        "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
        (entry_time, reason, log_line)
    )
    conn.commit()
    print(f"[QUARANTINE] Stored in PostgreSQL: {reason}")
```

✅ **Postgres is better for long-term auditing** — you can run reports like:

* “How many AWS keys leaked this month?”
* “Which agents generated the most HIGH-severity quarantines?”
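
The first report could be built roughly as shown below. This is a sketch under a loud assumption: the `quarantine` schema above has no `secret_type` column, so the query matches on the free-text `reason` field, which only works if `reason` contains the secret type name. `monthly_count_query` is a hypothetical helper that returns a psycopg2-style query and its parameters. The second report would need agent and severity columns that the schema above does not define, so it is not sketched here.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def monthly_count_query(secret_type: str,
                        now: Optional[datetime] = None) -> Tuple[str, tuple]:
    """Build a parameterized count of this month's quarantines for one secret type."""
    now = now or datetime.now(timezone.utc)
    # First instant of the current month, in the same timezone as `now`.
    month_start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    sql = (
        "SELECT COUNT(*) FROM quarantine "
        "WHERE reason LIKE %s AND timestamp >= %s"
    )
    # Values travel as parameters, never through string formatting.
    return sql, (f"%{secret_type}%", month_start)
```

With the connection from the insert example, usage would be `cursor.execute(*monthly_count_query("AWS_ACCESS_KEY"))` followed by `cursor.fetchone()[0]`.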

---

We now have:

* ✅ **Detection → Redaction → Quarantine → Revocation → Feedback → Pattern Evolution**
* ✅ **patterns.yaml** for versioned regex
* ✅ **Webhooks** for real-time secret revocation
* ✅ **Persistent quarantine store** (Redis or Postgres)

---

## 📜 **1️⃣ Migration Script: Redis → PostgreSQL**

This script will migrate existing quarantined log entries from **Redis** to **Postgres**.

```python
import json

import psycopg2
import redis

# Redis config
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Postgres config
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def migrate_quarantine():
    # LPUSH stores newest first, so reverse the list to insert oldest first.
    entries = r.lrange("quarantine", 0, -1)[::-1]
    for entry_json in entries:
        entry = json.loads(entry_json)
        cursor.execute(
            "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
            (entry["timestamp"], entry["reason"], entry["log_line"])
        )
    conn.commit()
    # Drop the migrated entries only after the commit succeeds, so a failed
    # insert cannot lose data. They sit at the tail; newer pushes stay at the head.
    if entries:
        r.ltrim("quarantine", 0, -(len(entries) + 1))
    print(f"[MIGRATION] Moved {len(entries)} quarantined entries from Redis → PostgreSQL")

if __name__ == "__main__":
    migrate_quarantine()
```

✅ **Run once** after Postgres is set up — empties Redis queue into the durable DB.

---

## 🖥 **2️⃣ Admin Dashboard Spec**

**Purpose:** A web UI to manage the Sentinel’s security pipeline.

---

### 🎯 **Core Features**

✅ **Quarantine Browser**

* Paginated view of all quarantined logs
* Search/filter by `secret_type`, `source_agent`, `date`, `status`
* Mark quarantined logs as **reviewed** or **false alarm**

✅ **Regex Rules Manager**

* Lists all regexes from `patterns.yaml`
* Add / update / deactivate rules via UI
* Shows `pending_updates` flagged by the Meta-Learner for human approval

✅ **Revocation Status Board**

* See which secrets triggered revocations
* Status of revocation hooks (success/fail)

✅ **Metrics Dashboard**

* Charts: “Secrets Detected Over Time”, “Top Sources of Leaks”
* KPIs: # HIGH severity secrets this month, # rules updated, # false positives

---

### 🏗 **Tech Stack Suggestion**

* **Backend:** FastAPI (Python)
* **Frontend:** React + Tailwind
* **DB:** PostgreSQL for quarantine + rules history
* **Auth:** OAuth (GitHub/Google) + RBAC (only security admins can approve regex changes)

---

### 🔌 **Endpoints**

```
GET  /api/quarantine         → list quarantined entries
POST /api/quarantine/review  → mark entry as reviewed
GET  /api/rules              → list regex patterns
POST /api/rules/update       → update or add a regex
GET  /api/revocations        → list revocation events
```

---

### 🖥 **Mock Dashboard Layout**

* **Left Nav:** Quarantine | Rules | Revocations | Metrics
* **Main Panel:**

  * Data tables with sorting/filtering
  * Inline editors for regex rules
  * Approve/Reject buttons for pending regex updates

✅ Basically a **security control room** for Sentinel.

---

## 🤖 **3️⃣ Meta-Curator AI Prompt**

This agent reviews Sentinel’s work and **tunes it automatically**.

---

### **Meta-Curator: System Prompt**

> **Role & Mission:**
> You are the **Meta-Curator**, a supervisory AI responsible for reviewing the **Secrets Sentinel’s** detections, regex updates, and feedback reports.
>
> **Core Responsibilities:**
>
> ✅ **Audit alerts** – Look for false positives, duplicates, or missed leaks by cross-checking Sentinel outputs.
>
> ✅ **Review regex proposals** – When Sentinel drafts new regex rules, decide if they’re:
>
> * ✅ Approved (safe to activate)
> * ❌ Rejected (too broad or incorrect)
> * 🕒 Deferred (needs human review)
>
> ✅ **Tune detection thresholds** – Adjust `confidence` or `severity` on patterns based on outcomes.
>
> ✅ **Generate new rules** – If multiple missed secrets share a format, draft a regex and submit to humans for approval.
>
> ✅ **Report upstream** – Summarize changes to security admins weekly.

---

### **Behavior Guidelines**

* **Conservative by default:** Don’t auto-approve regexes unless confidence > 0.95.
* **Keep auditability:** Every decision (approve/reject) is logged in the hyperlog.
* **Respect human overrides:** Never overwrite a regex that a human explicitly locked.
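
These guidelines can be read as a small decision rule. The sketch below is one possible encoding, not the Meta-Curator's actual logic: `review_proposal` and the `defer_regex` action label are hypothetical, while `approve_regex` and the 0.95 threshold come from this document.

```python
from datetime import datetime, timezone


def review_proposal(regex_name: str, confidence: float,
                    locked_by_human: bool = False,
                    auto_approve_threshold: float = 0.95) -> dict:
    """Apply the conservative-by-default policy to one regex proposal."""
    if locked_by_human:
        # Human locks always win, regardless of confidence.
        action = "defer_regex"
        reason = "Rule is locked by a human; leaving it untouched."
    elif confidence > auto_approve_threshold:
        action = "approve_regex"
        reason = f"Confidence {confidence:.2f} is above {auto_approve_threshold:.2f}."
    else:
        action = "defer_regex"
        reason = f"Confidence {confidence:.2f} is not above {auto_approve_threshold:.2f}; needs human review."

    # Every decision carries a reason and timestamp so it can be logged.
    return {
        "action": action,
        "regex_name": regex_name,
        "confidence": confidence,
        "decision_reason": reason,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```

The comparison is strict, so a proposal at exactly 0.95 is deferred rather than approved.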

---

### **Example Meta-Curator Output**

```json
{
  "action": "approve_regex",
  "regex_name": "GITLAB_TOKEN",
  "regex_pattern": "glpat-[0-9A-Za-z\\-_]{20}",
  "confidence": 0.97,
  "decision_reason": "Validated against 12 quarantined examples, no false positives found.",
  "timestamp": "2025-08-02T13:45:00Z"
}
```

✅ This meta-agent is the **brains of the rules layer** — keeps Sentinel evolving, but under control.

---

## 🚀 **Now You Have:**

* ✅ **Migration Path** → Redis → PostgreSQL
* ✅ **Admin Dashboard Spec** → complete with endpoints & layout
* ✅ **Meta-Curator Prompt** → the agent that “manages the manager”


---

## 📂 **1️⃣ `patterns_history` Table Schema**

This tracks **every regex change** ever made — who/what made it, why, and when.

```sql
CREATE TABLE patterns_history (
    id SERIAL PRIMARY KEY,
    regex_name TEXT NOT NULL,
    old_regex TEXT,
    new_regex TEXT,
    action TEXT CHECK (action IN ('add', 'update', 'remove')),
    confidence NUMERIC(3,2),
    status TEXT CHECK (status IN ('approved', 'pending', 'rejected')),
    submitted_by TEXT NOT NULL,
    approved_by TEXT,
    decision_reason TEXT,
    timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
```

### ✅ What this gives you:

* **Full audit trail** (critical for security compliance).
* You can run queries like:

  * *“Show all regex changes made by Meta-Curator vs. humans.”*
  * *“List all rules rejected in the last 90 days.”*
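
The two example queries can be tried out with the sketch below. To stay runnable without a database server it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, with a trimmed-down table and made-up sample rows; the `SELECT` statements themselves use only portable SQL, and with psycopg2 the `?` placeholder would become `%s`. The helper names are hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# In-memory stand-in for the Postgres table (subset of columns, sample data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE patterns_history (
           id INTEGER PRIMARY KEY,
           regex_name TEXT NOT NULL,
           status TEXT CHECK (status IN ('approved', 'pending', 'rejected')),
           submitted_by TEXT NOT NULL,
           timestamp TEXT NOT NULL
       )"""
)
now = datetime(2025, 8, 2, tzinfo=timezone.utc)
sample_rows = [
    ("GITLAB_TOKEN", "approved", "meta_curator", (now - timedelta(days=1)).isoformat()),
    ("JWT", "rejected", "curator_2", (now - timedelta(days=10)).isoformat()),
    ("SSH_KEY", "rejected", "meta_curator", (now - timedelta(days=120)).isoformat()),
]
conn.executemany(
    "INSERT INTO patterns_history (regex_name, status, submitted_by, timestamp) "
    "VALUES (?, ?, ?, ?)",
    sample_rows,
)


def changes_by_submitter(db) -> dict:
    """Count changes per submitter (Meta-Curator vs. individual humans)."""
    rows = db.execute(
        "SELECT submitted_by, COUNT(*) FROM patterns_history GROUP BY submitted_by"
    ).fetchall()
    return dict(rows)


def rejected_since(db, cutoff: datetime) -> list:
    """List rule names rejected at or after `cutoff`, oldest first."""
    rows = db.execute(
        "SELECT regex_name FROM patterns_history "
        "WHERE status = 'rejected' AND timestamp >= ? ORDER BY timestamp",
        (cutoff.isoformat(),),
    )
    return [row[0] for row in rows]
```

Passing the cutoff as a parameter computed in Python avoids dialect-specific date arithmetic, so the same query text carries over to PostgreSQL.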

---

## 🖼 **2️⃣ Admin Dashboard Wireframes**

**Goal:** show your devs exactly what to build — no ambiguity.

---

### **🔒 Dashboard Home**

```
------------------------------------------------------
|  [Sentinel Logo]  Secrets Sentinel Dashboard       |
------------------------------------------------------
| Quarantine | Rules | Revocations | Metrics | Admin |
------------------------------------------------------
|  Welcome back, Security Admin!                     |
|                                                    |
|   ▢  32 Quarantined logs waiting review            |
|   ▢  4 Pending regex updates                       |
|   ▢  2 Failed revocation hooks                     |
------------------------------------------------------
```

---

### **🗄 Quarantine View**

```
------------------------------------------------------
| Quarantine Logs                                     |
------------------------------------------------------
| Search: [______________] [Filter ▼]                 |
------------------------------------------------------
| Log ID     | Secret Type   | Severity | Status  |
------------------------------------------------------
| log_4287   | AWS_ACCESS_KEY| HIGH     | PENDING |
| log_4288   | JWT           | MEDIUM   | REVIEWED|
| log_4289   | SSH_KEY       | HIGH     | PENDING |
------------------------------------------------------
[ View Details ] [ Mark as Reviewed ] [ Delete ]
```

**Clicking “View Details” →** shows full log snippet (with redacted secret).

---

### **📜 Regex Manager**

```
------------------------------------------------------
| Regex Rules                                         |
------------------------------------------------------
| Name            | Regex Pattern                   | Active |
------------------------------------------------------
| AWS_ACCESS_KEY  | AKIA[0-9A-Z]{16}                | ✔      |
| JWT             | eyJ[A-Za-z0-9_-]+?\.[…]         | ✔      |
| SLACK_TOKEN     | xox[baprs]-[0-9A-Za-z-]{10,48}  | ✔      |
------------------------------------------------------
[ Add New Regex ] [ View History ]
```

Clicking **View History** → pulls from `patterns_history`.

---

### **📊 Metrics View**

* **Line Chart:** “Secrets Detected Over Time”
* **Bar Chart:** “Secrets by Type” (AWS, GitHub, JWT, etc.)
* **KPIs:**

  * 🔴 High Severity Leaks: 12 this week
  * 🟢 Regex Accuracy: 94%

---

## ⚙️ **3️⃣ FastAPI Skeleton**

Here’s the **starter code** for your dev team to run with.

```python
from typing import List

import psycopg2
import yaml  # PyYAML: patterns.yaml is YAML, not JSON
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Secrets Sentinel Dashboard API")

# --- Database Setup ---
# Example credentials only. One shared connection is enough for a skeleton;
# a real service would use a connection pool.
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")

# --- Models ---
class QuarantineEntry(BaseModel):
    id: int
    timestamp: str
    reason: str
    log_line: str
    reviewed: bool

class RegexRule(BaseModel):
    regex_name: str
    regex_pattern: str
    severity: str
    confidence: float
    active: bool

# --- Endpoints (paths use the /api prefix from the endpoint list above) ---
@app.get("/api/quarantine", response_model=List[QuarantineEntry])
def get_quarantine():
    # A fresh cursor per request avoids sharing cursor state between requests
    with conn.cursor() as cursor:
        cursor.execute("SELECT id, timestamp, reason, log_line, reviewed FROM quarantine")
        rows = cursor.fetchall()
    return [QuarantineEntry(id=r[0], timestamp=str(r[1]), reason=r[2], log_line=r[3], reviewed=r[4]) for r in rows]

@app.post("/api/quarantine/review/{entry_id}")
def review_quarantine(entry_id: int):
    with conn.cursor() as cursor:
        cursor.execute("UPDATE quarantine SET reviewed=true WHERE id=%s", (entry_id,))
    conn.commit()
    return {"status": "ok", "message": f"Quarantine entry {entry_id} marked reviewed"}

@app.get("/api/rules", response_model=List[RegexRule])
def get_rules():
    # Load from patterns.yaml; assumed layout:
    # patterns: {NAME: {regex, severity, confidence, active}}
    with open("patterns.yaml", "r") as f:
        patterns = yaml.safe_load(f) or {}
    rules = []
    for name, rule in patterns.get("patterns", {}).items():
        rules.append(RegexRule(
            regex_name=name,
            regex_pattern=rule["regex"],
            severity=rule["severity"],
            confidence=rule["confidence"],
            active=rule["active"]
        ))
    return rules

@app.post("/api/rules/update")
def update_rule(rule: RegexRule):
    # Append to patterns_history table; the change stays 'pending' until approved
    with conn.cursor() as cursor:
        cursor.execute("""
            INSERT INTO patterns_history (regex_name, old_regex, new_regex, action, confidence, status, submitted_by)
            VALUES (%s, %s, %s, 'update', %s, 'pending', 'admin')
        """, (rule.regex_name, None, rule.regex_pattern, rule.confidence))
    conn.commit()
    return {"status": "ok", "message": f"Regex {rule.regex_name} queued for update"}
```

✅ **Why this skeleton works:**

* REST endpoints for **Quarantine** and **Rules**, with every rule change recorded in `patterns_history`.
* Uses **Postgres for persistence**.
* Reads from `patterns.yaml` for active rules.

---

## 🚀 **Now You Have:**

* ✅ A **Postgres schema** for regex change history.
* ✅ **Wireframes** for the admin dashboard.
* ✅ A **FastAPI skeleton** your team can expand into a full API/UI stack.