Pre-cleanup snapshot - all current files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
tony
2025-08-05 02:32:45 +10:00
parent 26079aa8da
commit 4511f4c801
32 changed files with 5072 additions and 0 deletions

modules/shhh/README.md Normal file

@@ -0,0 +1,561 @@
🔥 Excellent. Let's push this all the way into a **production-grade spec**.
---
## 📂 **1⃣ Feedback Ingestion Spec**
This defines how curators/humans give feedback to the Sentinel so it can **update its detection rules (patterns.yaml)** safely.
---
### 🔄 **Feedback Flow**
1. **Curator/Reviewer sees alert** → marks it as:
* `false_positive` (regex over-triggered)
* `missed_secret` (regex failed to detect)
* `uncertain` (needs better regex refinement)
2. **Feedback API** ingests the report:
```json
{
"alert_id": "log_345",
"secret_type": "AWS_ACCESS_KEY",
"feedback_type": "false_positive",
"evidence": "Key was dummy data: TESTKEY123",
"suggested_regex_fix": null
}
```
3. **Meta-Learner** updates rules:
* `false_positive` → adds **exceptions** (e.g., allowlist prefixes like `TESTKEY`).
* `missed_secret` → drafts **new regex** from evidence (using regex generator or LLM).
* Writes changes to **patterns.yaml** under `pending_review`.
4. **Security admin approves** before the new regex is marked `active: true`.
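
Steps 2 to 4 of the flow can be sketched as a small function that turns one feedback report into a `pending_updates` entry. This is a minimal sketch, not the Sentinel's actual Meta-Learner: the function name `draft_pending_update` and the action labels (`add_exception`, `draft_regex`, `needs_refinement`) are assumptions made for illustration. The point it shows is that nothing produced here is ever marked active; the entry always waits for a security admin.

```python
from datetime import datetime, timezone

VALID_FEEDBACK_TYPES = {"false_positive", "missed_secret", "uncertain"}

# Hypothetical action labels; the spec only describes the behaviour in prose.
ACTION_FOR_FEEDBACK = {
    "false_positive": "add_exception",   # e.g. allowlist a test prefix
    "missed_secret": "draft_regex",      # a new pattern drafted from the evidence
    "uncertain": "needs_refinement",     # left for a human to look at
}


def draft_pending_update(feedback: dict, submitted_by: str) -> dict:
    """Turn one feedback report into a pending_updates entry (never active)."""
    feedback_type = feedback.get("feedback_type")
    if feedback_type not in VALID_FEEDBACK_TYPES:
        raise ValueError(f"unknown feedback_type: {feedback_type!r}")

    return {
        "regex_name": feedback["secret_type"],
        "action": ACTION_FOR_FEEDBACK[feedback_type],
        "new_regex": feedback.get("suggested_regex_fix"),
        "evidence": feedback.get("evidence"),
        "status": "pending human review",
        "submitted_by": submitted_by,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```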
---
### 🧠 **Feedback Schema in YAML**
```yaml
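pending_updates:
  - regex_name: AWS_ACCESS_KEY
    action: modify
    # The lookahead sits right after the AKIA prefix so it inspects the key body;
    # a trailing (?!TESTKEY) would only check the text that follows the key.
    new_regex: "AKIA(?!TESTKEY)[0-9A-Z]{16}"
    confidence: 0.82
    status: "pending human review"
    submitted_by: curator_2
    timestamp: 2025-08-02T12:40:00Z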
```
✅ This keeps **audit trails** & allows **safe hot updates**.
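
Where a negative lookahead sits decides whether an allowlist exception works at all. The sketch below assumes dummy keys carry a hypothetical `AKIATESTKEY` prefix and compares a lookahead placed right after `AKIA` with one placed after the 16-character body. Only the first inspects the key itself; the second merely checks whatever text follows the key.

```python
import re
from typing import List

# Assumption: test keys look like "AKIATESTKEY..." (hypothetical prefix).
LEADING = re.compile(r"AKIA(?!TESTKEY)[0-9A-Z]{16}")   # inspects the key body
TRAILING = re.compile(r"AKIA[0-9A-Z]{16}(?!TESTKEY)")  # inspects text after the key


def find_aws_keys(text: str, pattern: "re.Pattern[str]" = LEADING) -> List[str]:
    """Return every substring of text that the pattern flags as an AWS access key."""
    return pattern.findall(text)


real_line = "key=AKIAIOSFODNN7EXAMPLE"    # AWS's documented example key
dummy_line = "key=AKIATESTKEY123456789"   # dummy data that should be ignored

print(find_aws_keys(real_line))             # flagged by the leading-lookahead pattern
print(find_aws_keys(dummy_line))            # not flagged: the exception applies
print(find_aws_keys(dummy_line, TRAILING))  # still flagged: the exception has no effect
```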
---
## ⚙️ **2⃣ Real AWS/GitHub Webhook Payload Templates**
These are **example POST payloads** your Sentinel would send when it detects a leaked secret.
---
### 🔐 **AWS Access Key Revocation**
**Endpoint:**
`POST https://security.example.com/hooks/aws-revoke`
**Payload:**
```json
{
"event": "secret_leak_detected",
"secret_type": "AWS_ACCESS_KEY",
"redacted_key": "AKIA****XYZ",
"log_reference": "hyperlog:58321",
"recommended_action": "Revoke IAM access key immediately",
"severity": "HIGH",
"timestamp": "2025-08-02T12:45:00Z"
}
```
➡ Your security automation would call AWS CLI or IAM API:
```bash
aws iam update-access-key --access-key-id <redacted> --status Inactive
aws iam delete-access-key --access-key-id <redacted>
```
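
The webhook payloads in this section share one shape, so they could be built by a single helper that redacts the secret before anything leaves the process. This is a sketch under assumptions: `redact` and `build_leak_payload` are hypothetical names, and the "keep 4 leading and 3 trailing characters" rule is inferred from the `AKIA****XYZ` example rather than stated by the spec.

```python
from datetime import datetime, timezone


def redact(secret: str, keep_start: int = 4, keep_end: int = 3) -> str:
    """Keep a short prefix and suffix and mask the rest, e.g. AKIA****XYZ."""
    if len(secret) <= keep_start + keep_end:
        return "****"  # too short to reveal any part safely
    suffix = secret[-keep_end:] if keep_end else ""
    return f"{secret[:keep_start]}****{suffix}"


def build_leak_payload(secret: str, secret_type: str, log_reference: str,
                       recommended_action: str, severity: str = "HIGH") -> dict:
    """Assemble a secret_leak_detected payload that never contains the raw secret."""
    return {
        "event": "secret_leak_detected",
        "secret_type": secret_type,
        "redacted_key": redact(secret),
        "log_reference": log_reference,
        "recommended_action": recommended_action,
        "severity": severity,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
```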
---
### 🐙 **GitHub Token Revocation**
**Endpoint:**
`POST https://security.example.com/hooks/github-revoke`
**Payload:**
```json
{
"event": "secret_leak_detected",
"secret_type": "GITHUB_TOKEN",
"redacted_key": "ghp_****abcd",
"repository": "repo-name",
"log_reference": "hyperlog:58322",
"severity": "HIGH",
"recommended_action": "Invalidate GitHub token via API",
"timestamp": "2025-08-02T12:46:00Z"
}
```
➡ This would tie into GitHub's [token-scanning API](https://docs.github.com/en/developers/overview/secret-scanning) or use PAT revocation.
---
### 💬 **Slack Token Revocation**
**Endpoint:**
`POST https://security.example.com/hooks/slack-revoke`
**Payload:**
```json
{
"event": "secret_leak_detected",
"secret_type": "SLACK_TOKEN",
"redacted_key": "xoxb****hjk",
"workspace": "company-slack",
"log_reference": "hyperlog:58323",
"severity": "HIGH",
"recommended_action": "Revoke Slack bot/user token",
"timestamp": "2025-08-02T12:47:00Z"
}
```
➡ Slack Admin API can be used to **revoke** or **rotate**.
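
With three hooks defined, the Sentinel needs a way to pick the right one for each detection. One possible shape is a lookup table keyed by `secret_type`. The sketch below uses only the standard library and prepares the POST request without sending it; `build_revocation_request` is a hypothetical name, and the URLs are the placeholder endpoints from the templates above.

```python
import json
import urllib.request

# Endpoints copied from the payload templates; example.com is a placeholder host.
REVOCATION_HOOKS = {
    "AWS_ACCESS_KEY": "https://security.example.com/hooks/aws-revoke",
    "GITHUB_TOKEN": "https://security.example.com/hooks/github-revoke",
    "SLACK_TOKEN": "https://security.example.com/hooks/slack-revoke",
}


def build_revocation_request(payload: dict) -> urllib.request.Request:
    """Pick the hook for payload["secret_type"] and prepare (not send) the POST."""
    secret_type = payload.get("secret_type")
    try:
        url = REVOCATION_HOOKS[secret_type]
    except KeyError:
        raise ValueError(f"no revocation hook for secret_type {secret_type!r}") from None
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending would then be: urllib.request.urlopen(request, timeout=10)
```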
---
## 📡 **3⃣ Redis or PostgreSQL Quarantine Store**
Switching from memory to **persistent storage** means quarantined logs survive restarts.
---
### ✅ **Redis Option (Fast, Volatile)**
```python
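import json
from datetime import datetime  # was missing: quarantine_log needs it for timestamps
import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def quarantine_log(log_line, reason):
    # LPUSH puts the newest entry at the head of the "quarantine" list
    entry = {"timestamp": datetime.utcnow().isoformat() + "Z", "reason": reason, "log_line": log_line}
    r.lpush("quarantine", json.dumps(entry))
    print(f"[QUARANTINE] Stored in Redis: {reason}")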
```
* 🏎 **Pros:** Fast, easy to scale.
* ⚠️ **Cons:** Volatile unless persisted (RDB/AOF).
---
### ✅ **PostgreSQL Option (Auditable, Durable)**
**Schema:**
```sql
CREATE TABLE quarantine (
id SERIAL PRIMARY KEY,
timestamp TIMESTAMPTZ NOT NULL,
reason TEXT NOT NULL,
log_line TEXT NOT NULL,
reviewed BOOLEAN DEFAULT FALSE
);
```
**Python Insert:**
```python
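from datetime import datetime  # was missing: quarantine_log needs it for timestamps
import psycopg2

# Example credentials only; load real ones from the environment or a secrets manager
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def quarantine_log(log_line, reason):
    entry_time = datetime.utcnow().isoformat() + "Z"
    # Parameterized query: psycopg2 escapes the values, so log content cannot inject SQL
    cursor.execute(
        "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
        (entry_time, reason, log_line)
    )
    conn.commit()
    print(f"[QUARANTINE] Stored in PostgreSQL: {reason}")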
```
**Postgres is better for long-term auditing** — you can run reports like:
* “How many AWS keys leaked this month?”
* “Which agents generated the most HIGH-severity quarantines?”
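
The first report could be expressed as a parameterized query for `psycopg2`. This is a sketch under one loud assumption: the `quarantine` table has no `secret_type` column, so the secret type is assumed to appear inside the `reason` text. The helper name `monthly_count_query` is hypothetical; it only builds the SQL and its parameters, so it can be checked without a database.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def monthly_count_query(reason_fragment: str,
                        now: Optional[datetime] = None) -> Tuple[str, tuple]:
    """Build (sql, params) counting this month's quarantines whose reason mentions a type."""
    now = now or datetime.now(timezone.utc)
    # First instant of the current month, in UTC
    month_start = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    sql = (
        "SELECT COUNT(*) FROM quarantine "
        "WHERE reason ILIKE %s AND timestamp >= %s"
    )
    # Assumption: reason contains the secret type, e.g. "AWS_ACCESS_KEY detected"
    return sql, (f"%{reason_fragment}%", month_start)

# Usage with an open psycopg2 cursor:
#   cursor.execute(*monthly_count_query("AWS_ACCESS_KEY"))
#   (count,) = cursor.fetchone()
```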
---
We now have:
* **Detection → Redaction → Quarantine → Revocation → Feedback → Pattern Evolution**
* **patterns.yaml** for versioned regex
* **Webhooks** for real-time secret revocation
* **Persistent quarantine store** (Redis or Postgres)
---
## 📜 **1⃣ Migration Script: Redis → PostgreSQL**
This script will migrate existing quarantined log entries from **Redis** to **Postgres**.
```python
import json

import psycopg2
import redis

# Redis config
r = redis.Redis(host='localhost', port=6379, decode_responses=True)
# Postgres config
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()


def migrate_quarantine():
    count = 0
    while True:
        # Entries were added with LPUSH, so RPOP returns the oldest entry first
        entry_json = r.rpop("quarantine")
        if not entry_json:
            break
        entry = json.loads(entry_json)
        cursor.execute(
            "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
            (entry["timestamp"], entry["reason"], entry["log_line"])
        )
        count += 1
    # Single commit at the end. RPOP is destructive, so a crash before this line
    # loses the entries already popped; back up Redis before running.
    conn.commit()
    print(f"[MIGRATION] Moved {count} quarantined entries from Redis → PostgreSQL")


if __name__ == "__main__":
    migrate_quarantine()
```
**Run once** after Postgres is set up — empties Redis queue into the durable DB.
---
## 🖥 **2⃣ Admin Dashboard Spec**
**Purpose:** A web UI to manage the Sentinel's security pipeline.
---
### 🎯 **Core Features**
**Quarantine Browser**
* Paginated view of all quarantined logs
* Search/filter by `secret_type`, `source_agent`, `date`, `status`
* Mark quarantined logs as **reviewed** or **false alarm**
**Regex Rules Manager**
* Lists all regexes from `patterns.yaml`
* Add / update / deactivate rules via UI
* Shows `pending_updates` flagged by the Meta-Learner for human approval
**Revocation Status Board**
* See which secrets triggered revocations
* Status of revocation hooks (success/fail)
**Metrics Dashboard**
* Charts: “Secrets Detected Over Time”, “Top Sources of Leaks”
* KPIs: # HIGH severity secrets this month, # rules updated, # false positives
---
### 🏗 **Tech Stack Suggestion**
* **Backend:** FastAPI (Python)
* **Frontend:** React + Tailwind
* **DB:** PostgreSQL for quarantine + rules history
* **Auth:** OAuth (GitHub/Google) + RBAC (only security admins can approve regex changes)
---
### 🔌 **Endpoints**
```
GET /api/quarantine → list quarantined entries
POST /api/quarantine/review → mark entry as reviewed
GET /api/rules → list regex patterns
POST /api/rules/update → update or add a regex
GET /api/revocations → list revocation events
```
---
### 🖥 **Mock Dashboard Layout**
* **Left Nav:** Quarantine | Rules | Revocations | Metrics
* **Main Panel:**
* Data tables with sorting/filtering
* Inline editors for regex rules
* Approve/Reject buttons for pending regex updates
✅ Basically a **security control room** for Sentinel.
---
## 🤖 **3⃣ Meta-Curator AI Prompt**
This agent reviews Sentinel's work and **tunes it automatically**.
---
### **Meta-Curator: System Prompt**
> **Role & Mission:**
> You are the **Meta-Curator**, a supervisory AI responsible for reviewing the **Secrets Sentinel's** detections, regex updates, and feedback reports.
>
> **Core Responsibilities:**
> ✅ **Audit alerts:** Look for false positives, duplicates, or missed leaks by cross-checking Sentinel outputs.
> ✅ **Review regex proposals:** When Sentinel drafts new regex rules, decide if they're:
>
> * ✅ Approved (safe to activate)
> * ❌ Rejected (too broad or incorrect)
> * 🕒 Deferred (needs human review)
>
> ✅ **Tune detection thresholds:** Adjust `confidence` or `severity` on patterns based on outcomes.
> ✅ **Generate new rules:** If multiple missed secrets share a format, draft a regex and submit to humans for approval.
> ✅ **Report upstream:** Summarize changes to security admins weekly.
---
### **Behavior Guidelines**
* **Conservative by default:** Don't auto-approve regexes unless confidence > 0.95.
* **Keep auditability:** Every decision (approve/reject) is logged in the hyperlog.
* **Respect human overrides:** Never overwrite a regex that a human explicitly locked.
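
These guidelines could be enforced in code as a guard around whatever the Meta-Curator proposes, so the threshold and the lock do not depend on the prompt alone. A minimal sketch, assuming a hypothetical `human_locked` flag on each proposal and hypothetical action names `reject_regex` and `defer_regex` (only `approve_regex` appears in the spec). Writing the decision to the hyperlog is left out.

```python
import re
from dataclasses import dataclass

AUTO_APPROVE_THRESHOLD = 0.95  # from the "Conservative by default" guideline


@dataclass
class Proposal:
    regex_name: str
    regex_pattern: str
    confidence: float
    human_locked: bool = False  # hypothetical flag: a human explicitly locked this rule


def decide(proposal: Proposal) -> dict:
    """Return an approve/reject/defer decision that follows the behavior guidelines."""
    try:
        re.compile(proposal.regex_pattern)
    except re.error as exc:
        action, reason = "reject_regex", f"pattern does not compile: {exc}"
    else:
        if proposal.human_locked:
            action, reason = "defer_regex", "rule is locked by a human"
        elif proposal.confidence > AUTO_APPROVE_THRESHOLD:
            action, reason = "approve_regex", "confidence above auto-approve threshold"
        else:
            action, reason = "defer_regex", "confidence too low for auto-approval"
    return {
        "action": action,
        "regex_name": proposal.regex_name,
        "regex_pattern": proposal.regex_pattern,
        "confidence": proposal.confidence,
        "decision_reason": reason,
    }
```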
---
### **Example Meta-Curator Output**
```json
{
"action": "approve_regex",
"regex_name": "GITLAB_TOKEN",
"regex_pattern": "glpat-[0-9A-Za-z\\-_]{20}",
"confidence": 0.97,
"decision_reason": "Validated against 12 quarantined examples, no false positives found.",
"timestamp": "2025-08-02T13:45:00Z"
}
```
✅ This meta-agent is the **brains of the rules layer** — keeps Sentinel evolving, but under control.
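
A `decision_reason` such as "validated against 12 quarantined examples" implies a check that can be reproduced. The sketch below shows one way to run it: test a candidate pattern against strings that should match and strings that should not. The function name `validate_pattern` and the sample strings in the usage comment are made up for illustration.

```python
import re
from typing import Dict, List


def validate_pattern(pattern: str, positives: List[str], negatives: List[str]) -> Dict:
    """Check a candidate regex against known secrets and known non-secrets."""
    rx = re.compile(pattern)
    missed = [s for s in positives if not rx.search(s)]       # should match, did not
    false_hits = [s for s in negatives if rx.search(s)]       # should not match, did
    return {
        "missed": missed,
        "false_positives": false_hits,
        "ok": not missed and not false_hits,
    }

# Example call (sample strings are invented):
#   validate_pattern(r"glpat-[0-9A-Za-z\-_]{20}",
#                    positives=["token=glpat-abcdefghij0123456789"],
#                    negatives=["glpat-short"])
```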
---
## 🚀 **Now You Have:**
* **Migration Path** → Redis → PostgreSQL
* **Admin Dashboard Spec** → complete with endpoints & layout
* **Meta-Curator Prompt** → the agent that “manages the manager”
Alright. Here's the next batch to lock this into a **real, buildable system**.
---
## 📂 **1⃣ `patterns_history` Table Schema**
This tracks **every regex change** ever made — who/what made it, why, and when.
```sql
CREATE TABLE patterns_history (
id SERIAL PRIMARY KEY,
regex_name TEXT NOT NULL,
old_regex TEXT,
new_regex TEXT,
action TEXT CHECK (action IN ('add', 'update', 'remove')),
confidence NUMERIC(3,2),
status TEXT CHECK (status IN ('approved', 'pending', 'rejected')),
submitted_by TEXT NOT NULL,
approved_by TEXT,
decision_reason TEXT,
timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);
```
### ✅ What this gives you:
* **Full audit trail** (critical for security compliance).
* You can run queries like:
* *“Show all regex changes made by Meta-Curator vs. humans.”*
* *“List all rules rejected in the last 90 days.”*
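
Both example queries could be written against `patterns_history` roughly as follows. This is a sketch for use with a `psycopg2` cursor. The names `CHANGES_BY_SUBMITTER_SQL` and `rejected_rules_query` are hypothetical, and grouping by `submitted_by` assumes the Meta-Curator writes its own identifier into that column.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# "Show all regex changes made by Meta-Curator vs. humans":
# assumes submitted_by distinguishes the Meta-Curator from human users.
CHANGES_BY_SUBMITTER_SQL = (
    "SELECT submitted_by, COUNT(*) FROM patterns_history "
    "GROUP BY submitted_by ORDER BY COUNT(*) DESC"
)


def rejected_rules_query(days: int = 90,
                         now: Optional[datetime] = None) -> Tuple[str, tuple]:
    """Build (sql, params) listing rules rejected within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    sql = (
        "SELECT regex_name, new_regex, submitted_by, decision_reason, timestamp "
        "FROM patterns_history "
        "WHERE status = %s AND timestamp >= %s "
        "ORDER BY timestamp DESC"
    )
    return sql, ("rejected", cutoff)

# Usage with an open psycopg2 cursor:
#   cursor.execute(*rejected_rules_query())
#   rows = cursor.fetchall()
```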
---
## 🖼 **2⃣ Admin Dashboard Wireframes**
**Goal:** show your devs exactly what to build — no ambiguity.
---
### **🔒 Dashboard Home**
```
------------------------------------------------------
| [Sentinel Logo] Secrets Sentinel Dashboard |
------------------------------------------------------
| Quarantine | Rules | Revocations | Metrics | Admin |
------------------------------------------------------
| Welcome back, Security Admin! |
| |
| ▢ 32 Quarantined logs waiting review |
| ▢ 4 Pending regex updates |
| ▢ 2 Failed revocation hooks |
------------------------------------------------------
```
---
### **🗄 Quarantine View**
```
------------------------------------------------------
| Quarantine Logs |
------------------------------------------------------
| Search: [______________] [Filter ▼] |
------------------------------------------------------
| Log ID | Secret Type | Severity | Status |
------------------------------------------------------
| log_4287 | AWS_ACCESS_KEY| HIGH | PENDING |
| log_4288 | JWT | MEDIUM | REVIEWED|
| log_4289 | SSH_KEY | HIGH | PENDING |
------------------------------------------------------
[ View Details ] [ Mark as Reviewed ] [ Delete ]
```
**Clicking “View Details” →** shows full log snippet (with redacted secret).
---
### **📜 Regex Manager**
```
------------------------------------------------------
| Regex Rules |
------------------------------------------------------
| Name | Regex Pattern | Active |
------------------------------------------------------
| AWS_ACCESS_KEY | AKIA[0-9A-Z]{16} | ✔ |
| JWT | eyJ[A-Za-z0-9_-]+?\.[…] | ✔ |
| SLACK_TOKEN | xox[baprs]-[0-9A-Za-z-]{10,48} | ✔ |
------------------------------------------------------
[ Add New Regex ] [ View History ]
```
Clicking **View History** → pulls from `patterns_history`.
---
### **📊 Metrics View**
* **Line Chart:** “Secrets Detected Over Time”
* **Bar Chart:** “Secrets by Type” (AWS, GitHub, JWT, etc.)
* **KPIs:**
* 🔴 High Severity Leaks: 12 this week
* 🟢 Regex Accuracy: 94%
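
The two charts reduce to simple counts over detection events. A minimal sketch, assuming each event is a dict carrying the `secret_type` and ISO-8601 `timestamp` fields used in the webhook payloads. How "Regex Accuracy" is computed is not defined in this spec, so it is left out.

```python
from collections import Counter
from typing import Dict, List


def secrets_by_type(detections: List[dict]) -> Dict[str, int]:
    """Counts for the "Secrets by Type" bar chart."""
    return dict(Counter(d["secret_type"] for d in detections))


def detections_per_day(detections: List[dict]) -> Dict[str, int]:
    """Counts per calendar day for the "Secrets Detected Over Time" line chart."""
    # The first 10 characters of an ISO-8601 timestamp are the date (YYYY-MM-DD)
    counts = Counter(d["timestamp"][:10] for d in detections)
    return dict(sorted(counts.items()))
```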
---
## ⚙️ **3⃣ FastAPI Skeleton**
Here's the **starter code** for your dev team to run with.
```python
from typing import List

import psycopg2
import yaml  # PyYAML, used to parse patterns.yaml
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Secrets Sentinel Dashboard API")

# --- Database Setup ---
# One shared connection/cursor keeps the skeleton short; use a pool for real traffic.
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()


# --- Models ---
class QuarantineEntry(BaseModel):
    id: int
    timestamp: str
    reason: str
    log_line: str
    reviewed: bool


class RegexRule(BaseModel):
    regex_name: str
    regex_pattern: str
    severity: str
    confidence: float
    active: bool


# --- Endpoints ---
@app.get("/quarantine", response_model=List[QuarantineEntry])
def get_quarantine():
    cursor.execute("SELECT id, timestamp, reason, log_line, reviewed FROM quarantine")
    rows = cursor.fetchall()
    return [QuarantineEntry(id=r[0], timestamp=str(r[1]), reason=r[2], log_line=r[3], reviewed=r[4]) for r in rows]


@app.post("/quarantine/review/{entry_id}")
def review_quarantine(entry_id: int):
    cursor.execute("UPDATE quarantine SET reviewed=true WHERE id=%s", (entry_id,))
    conn.commit()
    return {"status": "ok", "message": f"Quarantine entry {entry_id} marked reviewed"}


@app.get("/rules", response_model=List[RegexRule])
def get_rules():
    # Load from patterns.yaml. The file is YAML, so parse it with a YAML loader;
    # the earlier json.load call also read the file twice and always failed.
    with open("patterns.yaml", "r") as f:
        patterns = yaml.safe_load(f) or {}
    rules = []
    for name, rule in patterns.get("patterns", {}).items():
        rules.append(RegexRule(
            regex_name=name,
            regex_pattern=rule["regex"],
            severity=rule["severity"],
            confidence=rule["confidence"],
            active=rule["active"]
        ))
    return rules


@app.post("/rules/update")
def update_rule(rule: RegexRule):
    # Append to patterns_history table; old_regex is left NULL in this skeleton
    cursor.execute("""
        INSERT INTO patterns_history (regex_name, old_regex, new_regex, action, confidence, status, submitted_by)
        VALUES (%s, %s, %s, 'update', %s, 'pending', 'admin')
    """, (rule.regex_name, None, rule.regex_pattern, rule.confidence))
    conn.commit()
    return {"status": "ok", "message": f"Regex {rule.regex_name} queued for update"}
```
**Why this skeleton works:**
* REST endpoints for **Quarantine** and **Rules**; rule updates are recorded in `patterns_history`.
* Uses **Postgres for persistence**.
* Reads from `patterns.yaml` for active rules.
---
## 🚀 **Now You Have:**
✅ A **Postgres schema** for regex change history.
✅ **Wireframes** for the admin dashboard.
✅ A **FastAPI skeleton** your team can expand into a full API/UI stack.