chorus-services/modules/shhh/README.md
tony 4511f4c801 Pre-cleanup snapshot - all current files
🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-05 02:32:45 +10:00

🔥 Excellent, let's push this all the way into a **production-grade spec**.
---
## 📂 **1️⃣ Feedback Ingestion Spec**
This defines how curators/humans give feedback to the Sentinel so it can **update its detection rules (patterns.yaml)** safely.
---
### 🔄 **Feedback Flow**
1. **Curator/Reviewer sees alert** → marks it as:
   * `false_positive` (regex over-triggered)
   * `missed_secret` (regex failed to detect)
   * `uncertain` (the regex needs refinement)
2. **Feedback API** ingests the report:
   ```json
   {
     "alert_id": "log_345",
     "secret_type": "AWS_ACCESS_KEY",
     "feedback_type": "false_positive",
     "evidence": "Key was dummy data: TESTKEY123",
     "suggested_regex_fix": null
   }
   ```
3. **Meta-Learner** updates rules:
   * `false_positive` → adds **exceptions** (e.g., allowlist prefixes like `TESTKEY`).
   * `missed_secret` → drafts a **new regex** from the evidence (using a regex generator or an LLM).
   * Writes changes to **patterns.yaml** under `pending_updates`.
4. **Security admin approves** before the new regex is marked `active: true`.
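
The flow above could be sketched as a single validation step that turns a feedback report into a pending entry. This is a minimal sketch, not the Sentinel's actual code: the function name, the `action` mapping, and the `flag` action are assumptions made for illustration. The one property it tries to preserve from the spec is that nothing it produces is ever active without human review.

```python
import re
from datetime import datetime, timezone

ALLOWED_FEEDBACK = {"false_positive", "missed_secret", "uncertain"}

def feedback_to_pending_update(report: dict, submitted_by: str) -> dict:
    """Turn one feedback report into a pending entry (never an active rule)."""
    kind = report.get("feedback_type")
    if kind not in ALLOWED_FEEDBACK:
        raise ValueError(f"unknown feedback_type: {kind!r}")
    for field in ("alert_id", "secret_type"):
        if not report.get(field):
            raise ValueError(f"missing field: {field}")

    suggestion = report.get("suggested_regex_fix")
    if suggestion is not None:
        re.compile(suggestion)  # raises re.error if the suggested regex is invalid

    # Hypothetical mapping: false positives modify a rule, missed secrets add
    # one, and "uncertain" reports are only flagged for a human.
    action = {"false_positive": "modify", "missed_secret": "add", "uncertain": "flag"}[kind]
    return {
        "regex_name": report["secret_type"],
        "action": action,
        "new_regex": suggestion,
        "evidence": report.get("evidence"),
        "status": "pending human review",
        "submitted_by": submitted_by,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```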
---
### 🧠 **Feedback Schema in YAML**
```yaml
pending_updates:
  - regex_name: AWS_ACCESS_KEY
    action: modify
    # The lookahead sits before the key body so IDs starting with TESTKEY are excluded
    new_regex: "AKIA(?!TESTKEY)[0-9A-Z]{16}"
    confidence: 0.82
    status: "pending human review"
    submitted_by: curator_2
    timestamp: 2025-08-02T12:40:00Z
```
✅ This keeps **audit trails** & allows **safe hot updates**.
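
Where a negative lookahead sits in an allowlist regex matters. Placed after the key body, as in `AKIA[0-9A-Z]{16}(?!TESTKEY)`, it only checks what follows the key. Placed right after `AKIA`, it vetoes keys whose body starts with the allowlisted prefix. A small check of both placements, using a made-up dummy key and AWS's documented example key ID:

```python
import re

# Lookahead after the 16-character body: only inspects text following the key.
trailing = re.compile(r"AKIA[0-9A-Z]{16}(?!TESTKEY)")
# Lookahead right after the AKIA prefix: inspects the start of the key body.
leading = re.compile(r"AKIA(?!TESTKEY)[0-9A-Z]{16}")

dummy_key = "AKIATESTKEY123456789"      # made-up: AKIA + 16 chars starting with TESTKEY
example_key = "AKIAIOSFODNN7EXAMPLE"    # AWS's documented example access key ID

def flags(pattern: re.Pattern, text: str) -> bool:
    """True if the pattern would raise an alert for this text."""
    return pattern.search(text) is not None

results = {
    "trailing/dummy": flags(trailing, dummy_key),    # still flagged
    "leading/dummy": flags(leading, dummy_key),      # suppressed
    "trailing/example": flags(trailing, example_key),
    "leading/example": flags(leading, example_key),
}
```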
---
## ⚙️ **2️⃣ Real AWS/GitHub Webhook Payload Templates**
These are **example POST payloads** your Sentinel would send when it detects a leaked secret.
---
### 🔐 **AWS Access Key Revocation**
**Endpoint:**
`POST https://security.example.com/hooks/aws-revoke`
**Payload:**
```json
{
  "event": "secret_leak_detected",
  "secret_type": "AWS_ACCESS_KEY",
  "redacted_key": "AKIA****XYZ",
  "log_reference": "hyperlog:58321",
  "recommended_action": "Revoke IAM access key immediately",
  "severity": "HIGH",
  "timestamp": "2025-08-02T12:45:00Z"
}
```
➡ Your security automation would then call the AWS CLI or the IAM API:
```bash
aws iam update-access-key --access-key-id <redacted> --status Inactive
aws iam delete-access-key --access-key-id <redacted>
```
---
### 🐙 **GitHub Token Revocation**
**Endpoint:**
`POST https://security.example.com/hooks/github-revoke`
**Payload:**
```json
{
  "event": "secret_leak_detected",
  "secret_type": "GITHUB_TOKEN",
  "redacted_key": "ghp_****abcd",
  "repository": "repo-name",
  "log_reference": "hyperlog:58322",
  "severity": "HIGH",
  "recommended_action": "Invalidate GitHub token via API",
  "timestamp": "2025-08-02T12:46:00Z"
}
```
➡ This would tie into GitHub's [token-scanning API](https://docs.github.com/en/developers/overview/secret-scanning) or revoke the personal access token (PAT) directly.
---
### 💬 **Slack Token Revocation**
**Endpoint:**
`POST https://security.example.com/hooks/slack-revoke`
**Payload:**
```json
{
  "event": "secret_leak_detected",
  "secret_type": "SLACK_TOKEN",
  "redacted_key": "xoxb****hjk",
  "workspace": "company-slack",
  "log_reference": "hyperlog:58323",
  "severity": "HIGH",
  "recommended_action": "Revoke Slack bot/user token",
  "timestamp": "2025-08-02T12:47:00Z"
}
```
➡ The Slack Admin API can be used to **revoke** or **rotate** the token.
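
The three payloads share a common shape, so they could be produced by one builder. The sketch below is an assumption about how that might look: `redact` and `build_leak_event` are hypothetical names, and the "keep 4 leading and 3 trailing characters" rule is inferred from the `AKIA****XYZ` example rather than specified anywhere. The point is that the raw secret never enters the event body.

```python
import json
from datetime import datetime, timezone

def redact(secret: str, keep_start: int = 4, keep_end: int = 3) -> str:
    """Keep a short prefix and suffix so the key is identifiable but unusable."""
    if len(secret) <= keep_start + keep_end:
        return "****"  # too short to reveal any part safely
    return f"{secret[:keep_start]}****{secret[-keep_end:]}"

def build_leak_event(secret_type: str, secret: str, log_reference: str,
                     recommended_action: str, **extra) -> dict:
    """Assemble a webhook body; only the redacted form of the secret is stored."""
    event = {
        "event": "secret_leak_detected",
        "secret_type": secret_type,
        "redacted_key": redact(secret),
        "log_reference": log_reference,
        "recommended_action": recommended_action,
        "severity": "HIGH",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    event.update(extra)  # e.g. repository=... or workspace=...
    return event

# AWS's documented example key ID stands in for a detected secret.
event = build_leak_event(
    "AWS_ACCESS_KEY", "AKIAIOSFODNN7EXAMPLE",
    "hyperlog:58321", "Revoke IAM access key immediately",
)
body = json.dumps(event)  # this string is what would be POSTed to the hook
```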
---
## 📡 **3️⃣ Redis or PostgreSQL Quarantine Store**
Switching from memory to **persistent storage** means quarantined logs survive restarts.
---
### ✅ **Redis Option (Fast, Volatile)**
```python
import json
from datetime import datetime, timezone

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def quarantine_log(log_line, reason):
    # LPUSH puts the newest entry at the head of the "quarantine" list
    entry = {"timestamp": datetime.now(timezone.utc).isoformat(), "reason": reason, "log_line": log_line}
    r.lpush("quarantine", json.dumps(entry))
    print(f"[QUARANTINE] Stored in Redis: {reason}")
```
* 🏎 **Pros:** Fast, easy to scale.
* ⚠️ **Cons:** Volatile unless persisted (RDB/AOF).
---
### ✅ **PostgreSQL Option (Auditable, Durable)**
**Schema:**
```sql
CREATE TABLE quarantine (
id SERIAL PRIMARY KEY,
timestamp TIMESTAMPTZ NOT NULL,
reason TEXT NOT NULL,
log_line TEXT NOT NULL,
reviewed BOOLEAN DEFAULT FALSE
);
```
**Python Insert:**
```python
from datetime import datetime, timezone

import psycopg2

conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def quarantine_log(log_line, reason):
    """Insert one quarantined log line; `reviewed` defaults to FALSE."""
    entry_time = datetime.now(timezone.utc)  # timezone-aware, stored as TIMESTAMPTZ
    cursor.execute(
        "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
        (entry_time, reason, log_line)
    )
    conn.commit()
    print(f"[QUARANTINE] Stored in PostgreSQL: {reason}")
```
**Postgres is better for long-term auditing** — you can run reports like:
* “How many AWS keys leaked this month?”
* “Which agents generated the most HIGH-severity quarantines?”
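
The first report could look roughly like the query below. To keep the sketch runnable without a database server it uses SQLite as a stand-in for PostgreSQL, and it assumes the `reason` column holds the secret type, which the schema above does not guarantee. The second report would also need a source-agent column that the schema does not yet have.

```python
import sqlite3

# SQLite stand-in for the PostgreSQL quarantine table (same column names).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE quarantine (
        id INTEGER PRIMARY KEY,
        timestamp TEXT NOT NULL,
        reason TEXT NOT NULL,
        log_line TEXT NOT NULL,
        reviewed INTEGER DEFAULT 0
    )
""")
conn.executemany(
    "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (?, ?, ?)",
    [
        ("2025-08-02T12:45:00Z", "AWS_ACCESS_KEY", "[REDACTED]"),
        ("2025-08-03T09:10:00Z", "AWS_ACCESS_KEY", "[REDACTED]"),
        ("2025-08-03T09:15:00Z", "GITHUB_TOKEN", "[REDACTED]"),
        ("2025-07-30T23:59:00Z", "AWS_ACCESS_KEY", "[REDACTED]"),
    ],
)

def leaks_in_range(conn, reason: str, start: str, end: str) -> int:
    """Count quarantined entries of one type with start <= timestamp < end."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM quarantine "
        "WHERE reason = ? AND timestamp >= ? AND timestamp < ?",
        (reason, start, end),
    ).fetchone()
    return count

# ISO-8601 strings sort chronologically, so plain comparison works here.
aws_august = leaks_in_range(conn, "AWS_ACCESS_KEY", "2025-08-01", "2025-09-01")  # 2 for the sample rows
```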
---
We now have:
* ✅ **Detection → Redaction → Quarantine → Revocation → Feedback → Pattern Evolution**
* ✅ **patterns.yaml** for versioned regex
* ✅ **Webhooks** for real-time secret revocation
* ✅ **Persistent quarantine store** (Redis or Postgres)
---
## 📜 **1️⃣ Migration Script: Redis → PostgreSQL**
This script will migrate existing quarantined log entries from **Redis** to **Postgres**.
```python
import json

import psycopg2
import redis

# Redis config
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
# Postgres config
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def migrate_quarantine():
    count = 0
    while True:
        entry_json = r.rpop("quarantine")  # entries were LPUSHed, so RPOP yields the oldest
        if not entry_json:
            break
        entry = json.loads(entry_json)
        cursor.execute(
            "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
            (entry["timestamp"], entry["reason"], entry["log_line"])
        )
        count += 1
    conn.commit()  # single commit once the list is drained
    print(f"[MIGRATION] Moved {count} quarantined entries from Redis → PostgreSQL")

if __name__ == "__main__":
    migrate_quarantine()
```
**Run once** after Postgres is set up; it drains the Redis list into the durable DB.
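
The script above pops every entry before its single commit, so a failure partway through would lose whatever had already been popped. One possible hardening, sketched under the assumption that only one migrator runs at a time, is to peek at the oldest entry, insert and commit it, and only then pop it. A crash then causes at worst a duplicate row, not a lost one. The function name and parameters are illustrative; `redis_client` and `pg_conn` are expected to behave like a redis-py client and a psycopg2 connection.

```python
import json

def migrate_quarantine_safely(redis_client, pg_conn, key: str = "quarantine") -> int:
    """Peek, insert, commit, then pop, so a crash never drops an entry."""
    moved = 0
    cursor = pg_conn.cursor()
    while True:
        entry_json = redis_client.lindex(key, -1)  # peek at the oldest entry
        if entry_json is None:
            break
        entry = json.loads(entry_json)
        cursor.execute(
            "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
            (entry["timestamp"], entry["reason"], entry["log_line"]),
        )
        pg_conn.commit()         # make the row durable first
        redis_client.rpop(key)   # only then remove it from Redis
        moved += 1
    return moved
```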
---
## 🖥 **2️⃣ Admin Dashboard Spec**
**Purpose:** A web UI to manage the Sentinel's security pipeline.
---
### 🎯 **Core Features**
**Quarantine Browser**
* Paginated view of all quarantined logs
* Search/filter by `secret_type`, `source_agent`, `date`, `status`
* Mark quarantined logs as **reviewed** or **false alarm**
**Regex Rules Manager**
* Lists all regexes from `patterns.yaml`
* Add / update / deactivate rules via UI
* Shows `pending_updates` flagged by the Meta-Learner for human approval
**Revocation Status Board**
* See which secrets triggered revocations
* Status of revocation hooks (success/fail)
**Metrics Dashboard**
* Charts: “Secrets Detected Over Time”, “Top Sources of Leaks”
* KPIs: # HIGH severity secrets this month, # rules updated, # false positives
---
### 🏗 **Tech Stack Suggestion**
* **Backend:** FastAPI (Python)
* **Frontend:** React + Tailwind
* **DB:** PostgreSQL for quarantine + rules history
* **Auth:** OAuth (GitHub/Google) + RBAC (only security admins can approve regex changes)
---
### 🔌 **Endpoints**
```
GET /api/quarantine → list quarantined entries
POST /api/quarantine/review → mark entry as reviewed
GET /api/rules → list regex patterns
POST /api/rules/update → update or add a regex
GET /api/revocations → list revocation events
```
---
### 🖥 **Mock Dashboard Layout**
* **Left Nav:** Quarantine | Rules | Revocations | Metrics
* **Main Panel:**
* Data tables with sorting/filtering
* Inline editors for regex rules
* Approve/Reject buttons for pending regex updates
✅ Basically a **security control room** for Sentinel.
---
## 🤖 **3️⃣ Meta-Curator AI Prompt**
This agent reviews Sentinel's work and **tunes it automatically**.
---
### **Meta-Curator: System Prompt**
> **Role & Mission:**
> You are the **Meta-Curator**, a supervisory AI responsible for reviewing the **Secrets Sentinel's** detections, regex updates, and feedback reports.
>
> **Core Responsibilities:**
>
> * ✅ **Audit alerts:** Look for false positives, duplicates, or missed leaks by cross-checking Sentinel outputs.
> * ✅ **Review regex proposals:** When Sentinel drafts new regex rules, decide whether they're:
>   * ✅ Approved (safe to activate)
>   * ❌ Rejected (too broad or incorrect)
>   * 🕒 Deferred (needs human review)
> * ✅ **Tune detection thresholds:** Adjust `confidence` or `severity` on patterns based on outcomes.
> * ✅ **Generate new rules:** If multiple missed secrets share a format, draft a regex and submit it to humans for approval.
> * ✅ **Report upstream:** Summarize changes to security admins weekly.
---
### **Behavior Guidelines**
* **Conservative by default:** Don't auto-approve regexes unless confidence > 0.95.
* **Keep auditability:** Every decision (approve/reject) is logged in the hyperlog.
* **Respect human overrides:** Never overwrite a regex that a human explicitly locked.
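
These guidelines could be expressed as a small decision function. This is one possible reading, not a definitive policy: `decide`, `locked_rules`, and the `reject_regex` and `defer_regex` action names are assumptions (only `approve_regex` appears in this spec), and treating a locked rule as "defer to a human" is a design choice. The caller would still be responsible for writing each returned decision to the hyperlog.

```python
import re

AUTO_APPROVE_THRESHOLD = 0.95  # from the "conservative by default" guideline

def decide(proposal: dict, locked_rules: set) -> dict:
    """Return an approve/reject/defer decision for one regex proposal."""
    name = proposal["regex_name"]

    def result(action: str, reason: str) -> dict:
        return {"action": action, "regex_name": name, "decision_reason": reason}

    try:
        re.compile(proposal["regex_pattern"])
    except re.error as exc:
        return result("reject_regex", f"pattern does not compile: {exc}")
    if name in locked_rules:
        # Never overwrite a human-locked rule; hand it back to a person.
        return result("defer_regex", "rule is locked by a human")
    if proposal["confidence"] > AUTO_APPROVE_THRESHOLD:
        return result("approve_regex", "confidence above auto-approve threshold")
    return result("defer_regex", "confidence too low for auto-approval")
```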
---
### **Example Meta-Curator Output**
```json
{
  "action": "approve_regex",
  "regex_name": "GITLAB_TOKEN",
  "regex_pattern": "glpat-[0-9A-Za-z\\-_]{20}",
  "confidence": 0.97,
  "decision_reason": "Validated against 12 quarantined examples, no false positives found.",
  "timestamp": "2025-08-02T13:45:00Z"
}
```
✅ This meta-agent is the **brains of the rules layer** — keeps Sentinel evolving, but under control.
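
A `decision_reason` like the one in the example implies the candidate regex was run against known samples. A minimal sketch of such a check follows, assuming the Meta-Curator has a list of strings that should match and a list that should not. The function name, the sample strings, and the simple accuracy figure are illustrative and are not the `confidence` score used elsewhere in this spec.

```python
import re

def validate_candidate(pattern: str, known_secrets: list, known_benign: list) -> dict:
    """Run a candidate regex against labelled samples and report mistakes."""
    rx = re.compile(pattern)
    missed = [s for s in known_secrets if not rx.search(s)]
    false_positives = [s for s in known_benign if rx.search(s)]
    total = len(known_secrets) + len(known_benign)
    correct = total - len(missed) - len(false_positives)
    return {
        "missed": missed,
        "false_positives": false_positives,
        "accuracy": correct / total if total else 0.0,
    }

# Obviously fake token: the glpat- prefix followed by 20 filler characters.
report = validate_candidate(
    r"glpat-[0-9A-Za-z\-_]{20}",
    known_secrets=["token=glpat-" + "x" * 20],
    known_benign=["glpat-short", "no token here"],
)
```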
---
## 🚀 **Now You Have:**
* ✅ **Migration Path** → Redis → PostgreSQL
* ✅ **Admin Dashboard Spec** → complete with endpoints & layout
* ✅ **Meta-Curator Prompt** → the agent that “manages the manager”

Alright, here's the next batch to lock this into a **real, buildable system**.
---
## 📂 **1️⃣ `patterns_history` Table Schema**
This tracks **every regex change** ever made — who/what made it, why, and when.
```sql
CREATE TABLE patterns_history (
id SERIAL PRIMARY KEY,
regex_name TEXT NOT NULL,
old_regex TEXT,
new_regex TEXT,
action TEXT CHECK (action IN ('add', 'update', 'remove')),
confidence NUMERIC(3,2),
status TEXT CHECK (status IN ('approved', 'pending', 'rejected')),
submitted_by TEXT NOT NULL,
approved_by TEXT,
decision_reason TEXT,
timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
```
### ✅ What this gives you:
* **Full audit trail** (critical for security compliance).
* You can run queries like:
* *“Show all regex changes made by Meta-Curator vs. humans.”*
* *“List all rules rejected in the last 90 days.”*
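
Both audit queries could be sketched as follows. SQLite stands in for PostgreSQL so the example runs without a server, only the columns the queries need are created, and the submitter names (`meta_curator`, `curator_2`) and sample rows are made up for illustration.

```python
import sqlite3

# SQLite stand-in with the subset of patterns_history columns used here.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patterns_history (
        id INTEGER PRIMARY KEY,
        regex_name TEXT NOT NULL,
        action TEXT,
        status TEXT,
        submitted_by TEXT NOT NULL,
        timestamp TEXT
    )
""")
conn.executemany(
    "INSERT INTO patterns_history (regex_name, action, status, submitted_by, timestamp) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("GITLAB_TOKEN", "add", "approved", "meta_curator", "2025-08-02T13:45:00Z"),
        ("AWS_ACCESS_KEY", "update", "pending", "curator_2", "2025-08-02T12:40:00Z"),
        ("JWT", "update", "rejected", "meta_curator", "2025-07-20T08:00:00Z"),
        ("SLACK_TOKEN", "update", "rejected", "curator_2", "2025-03-01T08:00:00Z"),
    ],
)

def changes_by_submitter(conn) -> dict:
    """Number of recorded regex changes per submitter (Meta-Curator vs. humans)."""
    rows = conn.execute(
        "SELECT submitted_by, COUNT(*) FROM patterns_history GROUP BY submitted_by"
    )
    return dict(rows)

def rejected_since(conn, cutoff: str) -> list:
    """Names of rules rejected on or after the cutoff timestamp."""
    rows = conn.execute(
        "SELECT regex_name FROM patterns_history "
        "WHERE status = 'rejected' AND timestamp >= ? ORDER BY timestamp",
        (cutoff,),
    )
    return [name for (name,) in rows]

per_submitter = changes_by_submitter(conn)
recent_rejections = rejected_since(conn, "2025-05-04")  # 90 days before 2025-08-02
```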
---
## 🖼 **2️⃣ Admin Dashboard Wireframes**
**Goal:** show your devs exactly what to build — no ambiguity.
---
### **🔒 Dashboard Home**
```
------------------------------------------------------
| [Sentinel Logo] Secrets Sentinel Dashboard |
------------------------------------------------------
| Quarantine | Rules | Revocations | Metrics | Admin |
------------------------------------------------------
| Welcome back, Security Admin! |
| |
| ▢ 32 Quarantined logs waiting review |
| ▢ 4 Pending regex updates |
| ▢ 2 Failed revocation hooks |
------------------------------------------------------
```
---
### **🗄 Quarantine View**
```
------------------------------------------------------
| Quarantine Logs |
------------------------------------------------------
| Search: [______________] [Filter ▼] |
------------------------------------------------------
| Log ID | Secret Type | Severity | Status |
------------------------------------------------------
| log_4287 | AWS_ACCESS_KEY| HIGH | PENDING |
| log_4288 | JWT | MEDIUM | REVIEWED|
| log_4289 | SSH_KEY | HIGH | PENDING |
------------------------------------------------------
[ View Details ] [ Mark as Reviewed ] [ Delete ]
```
**Clicking “View Details” →** shows full log snippet (with redacted secret).
---
### **📜 Regex Manager**
```
------------------------------------------------------
| Regex Rules |
------------------------------------------------------
| Name | Regex Pattern | Active |
------------------------------------------------------
| AWS_ACCESS_KEY | AKIA[0-9A-Z]{16} | ✔ |
| JWT | eyJ[A-Za-z0-9_-]+?\.[…] | ✔ |
| SLACK_TOKEN | xox[baprs]-[0-9A-Za-z-]{10,48} | ✔ |
------------------------------------------------------
[ Add New Regex ] [ View History ]
```
Clicking **View History** → pulls from `patterns_history`.
---
### **📊 Metrics View**
* **Line Chart:** “Secrets Detected Over Time”
* **Bar Chart:** “Secrets by Type” (AWS, GitHub, JWT, etc.)
* **KPIs:**
* 🔴 High Severity Leaks: 12 this week
* 🟢 Regex Accuracy: 94%
---
## ⚙️ **3️⃣ FastAPI Skeleton**
Here's the **starter code** for your dev team to run with.
```python
from typing import List

import psycopg2
import yaml  # PyYAML, needed to parse patterns.yaml
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Secrets Sentinel Dashboard API")

# --- Database Setup ---
# One shared connection keeps the skeleton short; use a pool in production.
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

# --- Models ---
class QuarantineEntry(BaseModel):
    id: int
    timestamp: str
    reason: str
    log_line: str
    reviewed: bool

class RegexRule(BaseModel):
    regex_name: str
    regex_pattern: str
    severity: str
    confidence: float
    active: bool

# --- Endpoints ---
@app.get("/quarantine", response_model=List[QuarantineEntry])
def get_quarantine():
    cursor.execute("SELECT id, timestamp, reason, log_line, reviewed FROM quarantine")
    rows = cursor.fetchall()
    return [
        QuarantineEntry(id=r[0], timestamp=str(r[1]), reason=r[2], log_line=r[3], reviewed=r[4])
        for r in rows
    ]

@app.post("/quarantine/review/{entry_id}")
def review_quarantine(entry_id: int):
    cursor.execute("UPDATE quarantine SET reviewed=true WHERE id=%s", (entry_id,))
    conn.commit()
    return {"status": "ok", "message": f"Quarantine entry {entry_id} marked reviewed"}

@app.get("/rules", response_model=List[RegexRule])
def get_rules():
    # Load from patterns.yaml; an empty file parses to None, so fall back to {}
    with open("patterns.yaml", "r") as f:
        patterns = yaml.safe_load(f) or {}
    rules = []
    for name, rule in patterns.get("patterns", {}).items():
        rules.append(RegexRule(
            regex_name=name,
            regex_pattern=rule["regex"],
            severity=rule["severity"],
            confidence=rule["confidence"],
            active=rule["active"]
        ))
    return rules

@app.post("/rules/update")
def update_rule(rule: RegexRule):
    # Append to patterns_history table; old_regex is left NULL in this skeleton
    cursor.execute("""
        INSERT INTO patterns_history (regex_name, old_regex, new_regex, action, confidence, status, submitted_by)
        VALUES (%s, %s, %s, 'update', %s, 'pending', 'admin')
    """, (rule.regex_name, None, rule.regex_pattern, rule.confidence))
    conn.commit()
    return {"status": "ok", "message": f"Regex {rule.regex_name} queued for update"}
```
**Why this skeleton works:**
* REST endpoints for **Quarantine** and **Rules**, with rule updates queued in the `patterns_history` table.
* Uses **Postgres for persistence**.
* Reads from `patterns.yaml` for active rules.
---
## 🚀 **Now You Have:**
* ✅ A **Postgres schema** for regex change history.
* ✅ **Wireframes** for the admin dashboard.
* ✅ A **FastAPI skeleton** your team can expand into a full API/UI stack.