chorus-services/README.md at 4511f4c8017f6c519418abeed898ab0f6acf10c8

tony 4511f4c801 Pre-cleanup snapshot - all current files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-08-05 02:32:45 +10:00


🔥 Excellent — let’s push this all the way into a production-grade spec.

📂 1️⃣ Feedback Ingestion Spec

This defines how curators/humans give feedback to the Sentinel so it can update its detection rules (patterns.yaml) safely.

🔄 Feedback Flow

Curator/Reviewer sees alert → marks it as:
- false_positive (regex over-triggered)
- missed_secret (regex failed to detect)
- uncertain (needs better regex refinement)
Feedback API ingests the report:

{
  "alert_id": "log_345",
  "secret_type": "AWS_ACCESS_KEY",
  "feedback_type": "false_positive",
  "evidence": "Key was dummy data: TESTKEY123",
  "suggested_regex_fix": null
}

Meta-Learner updates rules:

false_positive → adds exceptions (e.g., allowlist prefixes like TESTKEY).
missed_secret → drafts new regex from evidence (using regex generator or LLM).
Writes changes to patterns.yaml under pending_review.

Security admin approves before the new regex is marked active: true.
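
The flow above could be sketched roughly as follows. This is a minimal illustration, not the Sentinel's actual implementation: `Feedback` and `to_pending_update` are hypothetical names, and only the field names are taken from the feedback JSON above. The key point it shows is that feedback only ever produces a pending entry, never an active rule.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

VALID_FEEDBACK_TYPES = {"false_positive", "missed_secret", "uncertain"}


@dataclass
class Feedback:
    """One feedback report, mirroring the JSON payload fields."""
    alert_id: str
    secret_type: str
    feedback_type: str
    evidence: str
    suggested_regex_fix: Optional[str] = None


def to_pending_update(fb: Feedback, submitted_by: str) -> dict:
    """Turn a feedback report into a pending entry (hypothetical helper)."""
    if fb.feedback_type not in VALID_FEEDBACK_TYPES:
        raise ValueError(f"unknown feedback_type: {fb.feedback_type!r}")
    return {
        "regex_name": fb.secret_type,
        "action": "modify",
        "new_regex": fb.suggested_regex_fix,  # may stay None until a draft exists
        "feedback_type": fb.feedback_type,
        "evidence": fb.evidence,
        "status": "pending human review",  # activation needs admin approval
        "submitted_by": submitted_by,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


fb = Feedback("log_345", "AWS_ACCESS_KEY", "false_positive",
              "Key was dummy data: TESTKEY123")
update = to_pending_update(fb, submitted_by="curator_2")
```

Rejecting unknown `feedback_type` values at ingestion keeps malformed reports out of patterns.yaml entirely.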

🧠 Feedback Schema in YAML

pending_updates:
  - regex_name: AWS_ACCESS_KEY
    action: modify
    new_regex: "AKIA(?!TESTKEY)[0-9A-Z]{16}"
    confidence: 0.82
    status: "pending human review"
    submitted_by: curator_2
    timestamp: 2025-08-02T12:40:00Z

✅ This keeps audit trails & allows safe hot updates.
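
A negative lookahead only excludes a marker if it sits where that marker would begin. The sketch below uses Python's `re` module to compare two placements of `(?!TESTKEY)`: after the 16-character body, and directly after the `AKIA` prefix. Both sample keys are made up for illustration.

```python
import re

# Lookahead after the key body: inspects what FOLLOWS the 20-character key.
after_body = re.compile(r"AKIA[0-9A-Z]{16}(?!TESTKEY)")
# Lookahead right after the prefix: rejects keys whose body STARTS with TESTKEY.
after_prefix = re.compile(r"AKIA(?!TESTKEY)[0-9A-Z]{16}")

dummy_key = "AKIATESTKEY123456789"     # made-up dummy key (20 characters)
real_looking = "AKIAABCDEFGHIJKLMNOP"  # made-up key without the marker

matches_after_body = bool(after_body.search(dummy_key))
matches_after_prefix = bool(after_prefix.search(dummy_key))
still_detects_real = bool(after_prefix.search(real_looking))
```

With the lookahead after the body, the dummy key is still flagged, because nothing follows the key. Placing it after the prefix suppresses the dummy key while keys without the marker are still detected.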

⚙️ 2️⃣ Real AWS/GitHub Webhook Payload Templates

These are example POST payloads your Sentinel would send when it detects a leaked secret.

🔐 AWS Access Key Revocation

Endpoint: POST https://security.example.com/hooks/aws-revoke

Payload:

{
  "event": "secret_leak_detected",
  "secret_type": "AWS_ACCESS_KEY",
  "redacted_key": "AKIA****XYZ",
  "log_reference": "hyperlog:58321",
  "recommended_action": "Revoke IAM access key immediately",
  "severity": "HIGH",
  "timestamp": "2025-08-02T12:45:00Z"
}

➡ Your security automation would call AWS CLI or IAM API:

aws iam update-access-key --access-key-id <redacted> --status Inactive
aws iam delete-access-key --access-key-id <redacted>
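
The payload above carries only a redacted key. A minimal sketch of how such a payload might be assembled is shown here. `redact` and `build_leak_payload` are hypothetical helpers, and keeping the first four and last three characters is an assumption inferred from the `AKIA****XYZ` example. The sample key is made up.

```python
import json
from datetime import datetime, timezone


def redact(secret: str, keep_start: int = 4, keep_end: int = 3) -> str:
    """Keep a short prefix and suffix and mask everything in between."""
    if len(secret) <= keep_start + keep_end:
        return "*" * len(secret)  # too short to reveal any part safely
    return f"{secret[:keep_start]}****{secret[-keep_end:]}"


def build_leak_payload(secret: str, secret_type: str, log_reference: str,
                       recommended_action: str, severity: str = "HIGH") -> dict:
    """Build the webhook body; the raw secret never enters the payload."""
    return {
        "event": "secret_leak_detected",
        "secret_type": secret_type,
        "redacted_key": redact(secret),
        "log_reference": log_reference,
        "recommended_action": recommended_action,
        "severity": severity,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


payload = build_leak_payload(
    "AKIAABCDEFGHIJKLMXYZ",  # made-up key for illustration
    "AWS_ACCESS_KEY",
    "hyperlog:58321",
    "Revoke IAM access key immediately",
)
body = json.dumps(payload)  # string that would be POSTed to the hook
```

Because only the redacted form leaves the Sentinel, the receiving automation would need another way to resolve the full access key ID, for example by looking it up from `log_reference`.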

🐙 GitHub Token Revocation

Endpoint: POST https://security.example.com/hooks/github-revoke

Payload:

{
  "event": "secret_leak_detected",
  "secret_type": "GITHUB_TOKEN",
  "redacted_key": "ghp_****abcd",
  "repository": "repo-name",
  "log_reference": "hyperlog:58322",
  "severity": "HIGH",
  "recommended_action": "Invalidate GitHub token via API",
  "timestamp": "2025-08-02T12:46:00Z"
}

➡ This would tie into GitHub’s secret scanning API (formerly called token scanning) or use PAT revocation.

💬 Slack Token Revocation

Endpoint: POST https://security.example.com/hooks/slack-revoke

Payload:

{
  "event": "secret_leak_detected",
  "secret_type": "SLACK_TOKEN",
  "redacted_key": "xoxb****hjk",
  "workspace": "company-slack",
  "log_reference": "hyperlog:58323",
  "severity": "HIGH",
  "recommended_action": "Revoke Slack bot/user token",
  "timestamp": "2025-08-02T12:47:00Z"
}

➡ Slack Admin API can be used to revoke or rotate.

📡 3️⃣ Redis or PostgreSQL Quarantine Store

Switching from memory to persistent storage means quarantined logs survive restarts.

✅ Redis Option (Fast, Volatile)

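import json
from datetime import datetime, timezone

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def quarantine_log(log_line, reason):
    # Timezone-aware UTC timestamp, e.g. 2025-08-02T12:45:00Z
    timestamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    entry = {"timestamp": timestamp, "reason": reason, "log_line": log_line}
    r.lpush("quarantine", json.dumps(entry))  # newest entries sit at the head of the list
    print(f"[QUARANTINE] Stored in Redis: {reason}")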

🏎 Pros: Fast, easy to scale.
⚠️ Cons: Volatile unless persisted (RDB/AOF).

✅ PostgreSQL Option (Auditable, Durable)

Schema:

CREATE TABLE quarantine (
    id SERIAL PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    reason TEXT NOT NULL,
    log_line TEXT NOT NULL,
    reviewed BOOLEAN DEFAULT FALSE
);

Python Insert:

from datetime import datetime, timezone

import psycopg2

# Example credentials only; load real ones from the environment or a secrets manager.
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def quarantine_log(log_line, reason):
    entry_time = datetime.now(timezone.utc)  # aware datetime maps to TIMESTAMPTZ
    cursor.execute(
        "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
        (entry_time, reason, log_line)
    )
    conn.commit()
    print(f"[QUARANTINE] Stored in PostgreSQL: {reason}")

✅ Postgres is better for long-term auditing — you can run reports like:

“How many AWS keys leaked this month?”
“Which agents generated the most HIGH-severity quarantines?”
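
A report like the first one could be a single aggregate query. The sketch below runs against an in-memory SQLite database purely so that it is self-contained. The real table lives in PostgreSQL, where psycopg2 uses `%s` placeholders rather than `?`. It also assumes the `reason` column holds the secret type (for example `AWS_ACCESS_KEY`), which the schema above does not guarantee. `count_by_reason` and the sample rows are hypothetical.

```python
import sqlite3


def count_by_reason(conn, reason: str, since_iso: str) -> int:
    """Count quarantined entries for one reason since a given UTC timestamp."""
    row = conn.execute(
        "SELECT COUNT(*) FROM quarantine WHERE reason = ? AND timestamp >= ?",
        (reason, since_iso),
    ).fetchone()
    return row[0]


# SQLite stand-in for the PostgreSQL quarantine table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quarantine ("
    "id INTEGER PRIMARY KEY, timestamp TEXT NOT NULL, reason TEXT NOT NULL, "
    "log_line TEXT NOT NULL, reviewed INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (?, ?, ?)",
    [
        ("2025-07-30T10:00:00Z", "AWS_ACCESS_KEY", "redacted line 1"),
        ("2025-08-02T12:45:00Z", "AWS_ACCESS_KEY", "redacted line 2"),
        ("2025-08-02T12:46:00Z", "GITHUB_TOKEN", "redacted line 3"),
        ("2025-08-03T09:00:00Z", "AWS_ACCESS_KEY", "redacted line 4"),
    ],
)

# ISO-8601 UTC strings in one fixed format compare correctly as text.
aws_this_month = count_by_reason(conn, "AWS_ACCESS_KEY", "2025-08-01T00:00:00Z")
```

The second report ("which agents") would need a source column that the quarantine schema does not yet define.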

We now have:

✅ Detection → Redaction → Quarantine → Revocation → Feedback → Pattern Evolution
✅ patterns.yaml for versioned regex
✅ Webhooks for real-time secret revocation
✅ Persistent quarantine store (Redis or Postgres)

📜 1️⃣ Migration Script: Redis → PostgreSQL

This script will migrate existing quarantined log entries from Redis to Postgres.

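import json

import psycopg2
import redis

# Redis config
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Postgres config
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def migrate_quarantine():
    # Read without removing, so a failed insert cannot lose entries.
    entries = r.lrange("quarantine", 0, -1)
    if not entries:
        print("[MIGRATION] Nothing to migrate")
        return
    try:
        # lpush stores the newest entry first, so reverse to insert oldest first.
        for entry_json in reversed(entries):
            entry = json.loads(entry_json)
            cursor.execute(
                "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
                (entry["timestamp"], entry["reason"], entry["log_line"])
            )
        conn.commit()
    except Exception:
        conn.rollback()  # Redis is untouched, so the migration can be retried
        raise
    # Drop only the migrated entries: the oldest len(entries) items at the tail.
    r.ltrim("quarantine", 0, -(len(entries) + 1))
    print(f"[MIGRATION] Moved {len(entries)} quarantined entries from Redis → PostgreSQL")

if __name__ == "__main__":
    migrate_quarantine()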

✅ Run once after Postgres is set up — empties Redis queue into the durable DB.

🖥 2️⃣ Admin Dashboard Spec

Purpose: A web UI to manage the Sentinel’s security pipeline.

🎯 Core Features

✅ Quarantine Browser

Paginated view of all quarantined logs
Search/filter by secret_type, source_agent, date, status
Mark quarantined logs as reviewed or false alarm

✅ Regex Rules Manager

Lists all regexes from patterns.yaml
Add / update / deactivate rules via UI
Shows pending_updates flagged by the Meta-Learner for human approval

✅ Revocation Status Board

See which secrets triggered revocations
Status of revocation hooks (success/fail)

✅ Metrics Dashboard

Charts: “Secrets Detected Over Time”, “Top Sources of Leaks”
KPIs: # HIGH severity secrets this month, # rules updated, # false positives

🏗 Tech Stack Suggestion

Backend: FastAPI (Python)
Frontend: React + Tailwind
DB: PostgreSQL for quarantine + rules history
Auth: OAuth (GitHub/Google) + RBAC (only security admins can approve regex changes)

🔌 Endpoints

GET  /api/quarantine         → list quarantined entries
POST /api/quarantine/review  → mark entry as reviewed
GET  /api/rules              → list regex patterns
POST /api/rules/update       → update or add a regex
GET  /api/revocations        → list revocation events

🖥 Mock Dashboard Layout

Left Nav: Quarantine | Rules | Revocations | Metrics
Main Panel:
- Data tables with sorting/filtering
- Inline editors for regex rules
- Approve/Reject buttons for pending regex updates

✅ Basically a security control room for Sentinel.

🤖 3️⃣ Meta-Curator AI Prompt

This agent reviews Sentinel’s work and tunes it automatically.

Meta-Curator: System Prompt

Role & Mission: You are the Meta-Curator, a supervisory AI responsible for reviewing the Secrets Sentinel’s detections, regex updates, and feedback reports.

Core Responsibilities:

✅ Audit alerts – Look for false positives, duplicates, or missed leaks by cross-checking Sentinel outputs.
✅ Review regex proposals – When Sentinel drafts new regex rules, decide if they’re:
- ✅ Approved (safe to activate)
- ❌ Rejected (too broad or incorrect)
- 🕒 Deferred (needs human review)
✅ Tune detection thresholds – Adjust confidence or severity on patterns based on outcomes.
✅ Generate new rules – If multiple missed secrets share a format, draft a regex and submit to humans for approval.
✅ Report upstream – Summarize changes to security admins weekly.

Behavior Guidelines

Conservative by default: Don’t auto-approve regexes unless confidence > 0.95.
Keep auditability: Every decision (approve/reject) is logged in the hyperlog.
Respect human overrides: Never overwrite a regex that a human explicitly locked.
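
These guidelines could be enforced in code rather than left to the prompt alone. The sketch below is one possible shape, under stated assumptions: `Proposal`, `decide`, the `locked_by_human` flag, and the `reject_below` cutoff are all hypothetical. Only the 0.95 threshold and the three outcomes come from the text above. The in-memory `audit_log` list stands in for the hyperlog.

```python
from dataclasses import dataclass

AUTO_APPROVE_THRESHOLD = 0.95  # from the guideline above


@dataclass
class Proposal:
    regex_name: str
    regex_pattern: str
    confidence: float
    locked_by_human: bool = False  # hypothetical flag for human-locked rules


def decide(p: Proposal, reject_below: float = 0.5) -> str:
    """Return 'approve', 'reject', or 'defer' for one regex proposal."""
    if p.locked_by_human:
        return "defer"  # never overwrite a regex a human explicitly locked
    if p.confidence > AUTO_APPROVE_THRESHOLD:
        return "approve"
    if p.confidence < reject_below:  # assumed cutoff, not from the spec
        return "reject"
    return "defer"  # conservative default: hand off to human review


audit_log = []  # stand-in for the hyperlog


def decide_and_log(p: Proposal) -> str:
    """Record every decision, whatever the outcome."""
    decision = decide(p)
    audit_log.append({"regex_name": p.regex_name, "decision": decision,
                      "confidence": p.confidence})
    return decision


result = decide_and_log(Proposal("GITLAB_TOKEN", r"glpat-[0-9A-Za-z\-_]{20}", 0.97))
```

Note that a confidence of exactly 0.95 is deferred, since the guideline requires a value strictly greater than 0.95.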

Example Meta-Curator Output

{
  "action": "approve_regex",
  "regex_name": "GITLAB_TOKEN",
  "regex_pattern": "glpat-[0-9A-Za-z\\-_]{20}",
  "confidence": 0.97,
  "decision_reason": "Validated against 12 quarantined examples, no false positives found.",
  "timestamp": "2025-08-02T13:45:00Z"
}

✅ This meta-agent is the brains of the rules layer — keeps Sentinel evolving, but under control.
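
The `decision_reason` in the example refers to validating a regex against quarantined examples. A minimal sketch of such a check follows. `evaluate_regex` is a hypothetical helper and the sample strings are made up. A real run would use redacted samples from the quarantine store.

```python
import re


def evaluate_regex(pattern: str, should_match: list, should_not_match: list) -> dict:
    """Check a candidate regex against known positives and known negatives."""
    compiled = re.compile(pattern)
    missed = [s for s in should_match if not compiled.search(s)]
    false_positives = [s for s in should_not_match if compiled.search(s)]
    return {
        "missed": missed,
        "false_positives": false_positives,
        "passed": not missed and not false_positives,
    }


report = evaluate_regex(
    r"glpat-[0-9A-Za-z\-_]{20}",
    should_match=["token=glpat-" + "a1B2" * 5],  # 20 made-up characters
    should_not_match=["glpat-short", "no token here"],
)
```

Keeping the lists of missed samples and false positives, rather than only a pass/fail flag, gives the human reviewer concrete evidence for an approve or reject decision.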

🚀 Now You Have:

✅ Migration Path → Redis → PostgreSQL
✅ Admin Dashboard Spec → complete with endpoints & layout
✅ Meta-Curator Prompt → the agent that “manages the manager”

Alright — here’s the next batch to lock this into a real, buildable system.

📂 1️⃣ `patterns_history` Table Schema

This tracks every regex change ever made — who/what made it, why, and when.

CREATE TABLE patterns_history (
    id SERIAL PRIMARY KEY,
    regex_name TEXT NOT NULL,
    old_regex TEXT,
    new_regex TEXT,
    action TEXT CHECK (action IN ('add', 'update', 'remove')),
    confidence NUMERIC(3,2),
    status TEXT CHECK (status IN ('approved', 'pending', 'rejected')),
    submitted_by TEXT NOT NULL,
    approved_by TEXT,
    decision_reason TEXT,
    timestamp TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

✅ What this gives you:

Full audit trail (critical for security compliance).
You can run queries like:
- “Show all regex changes made by Meta-Curator vs. humans.”
- “List all rules rejected in the last 90 days.”

🖼 2️⃣ Admin Dashboard Wireframes

Goal: show your devs exactly what to build — no ambiguity.

🔒 Dashboard Home

------------------------------------------------------
|  [Sentinel Logo]  Secrets Sentinel Dashboard       |
------------------------------------------------------
| Quarantine | Rules | Revocations | Metrics | Admin |
------------------------------------------------------
|  Welcome back, Security Admin!                     |
|                                                    |
|   ▢  32 Quarantined logs waiting review            |
|   ▢  4 Pending regex updates                       |
|   ▢  2 Failed revocation hooks                     |
------------------------------------------------------

🗄 Quarantine View

------------------------------------------------------
| Quarantine Logs                                     |
------------------------------------------------------
| Search: [______________] [Filter ▼]                 |
------------------------------------------------------
| Log ID     | Secret Type   | Severity | Status  |
------------------------------------------------------
| log_4287   | AWS_ACCESS_KEY| HIGH     | PENDING |
| log_4288   | JWT           | MEDIUM   | REVIEWED|
| log_4289   | SSH_KEY       | HIGH     | PENDING |
------------------------------------------------------
[ View Details ] [ Mark as Reviewed ] [ Delete ]

Clicking “View Details” → shows full log snippet (with redacted secret).

📜 Regex Manager

------------------------------------------------------
| Regex Rules                                         |
------------------------------------------------------
| Name            | Regex Pattern                   | Active |
------------------------------------------------------
| AWS_ACCESS_KEY  | AKIA[0-9A-Z]{16}                | ✔      |
| JWT             | eyJ[A-Za-z0-9_-]+?\.[…]         | ✔      |
| SLACK_TOKEN     | xox[baprs]-[0-9A-Za-z-]{10,48}  | ✔      |
------------------------------------------------------
[ Add New Regex ] [ View History ]

Clicking View History → pulls from patterns_history.

📊 Metrics View

Line Chart: “Secrets Detected Over Time”
Bar Chart: “Secrets by Type” (AWS, GitHub, JWT, etc.)
KPIs:
- 🔴 High Severity Leaks: 12 this week
- 🟢 Regex Accuracy: 94%

⚙️ 3️⃣ FastAPI Skeleton

Here’s the starter code for your dev team to run with.

from typing import List

import psycopg2
import yaml  # PyYAML; patterns.yaml is YAML, not JSON
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Secrets Sentinel Dashboard API")

# --- Database Setup ---
# One shared connection keeps the skeleton short; use a connection pool in production.
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

# --- Models ---
class QuarantineEntry(BaseModel):
    id: int
    timestamp: str
    reason: str
    log_line: str
    reviewed: bool

class RegexRule(BaseModel):
    regex_name: str
    regex_pattern: str
    severity: str
    confidence: float
    active: bool

# --- Endpoints ---
@app.get("/quarantine", response_model=List[QuarantineEntry])
def get_quarantine():
    cursor.execute("SELECT id, timestamp, reason, log_line, reviewed FROM quarantine")
    rows = cursor.fetchall()
    return [QuarantineEntry(id=r[0], timestamp=str(r[1]), reason=r[2], log_line=r[3], reviewed=r[4]) for r in rows]

@app.post("/quarantine/review/{entry_id}")
def review_quarantine(entry_id: int):
    cursor.execute("UPDATE quarantine SET reviewed=true WHERE id=%s", (entry_id,))
    conn.commit()
    return {"status": "ok", "message": f"Quarantine entry {entry_id} marked reviewed"}

@app.get("/rules", response_model=List[RegexRule])
def get_rules():
    # Load from patterns.yaml; an empty file parses to None, so fall back to {}
    with open("patterns.yaml", "r") as f:
        patterns = yaml.safe_load(f) or {}
    rules = []
    for name, rule in patterns.get("patterns", {}).items():
        rules.append(RegexRule(
            regex_name=name,
            regex_pattern=rule["regex"],
            severity=rule["severity"],
            confidence=rule["confidence"],
            active=rule["active"]
        ))
    return rules

@app.post("/rules/update")
def update_rule(rule: RegexRule):
    # Append to patterns_history table
    cursor.execute("""
        INSERT INTO patterns_history (regex_name, old_regex, new_regex, action, confidence, status, submitted_by)
        VALUES (%s, %s, %s, 'update', %s, 'pending', 'admin')
    """, (rule.regex_name, None, rule.regex_pattern, rule.confidence))
    conn.commit()
    return {"status": "ok", "message": f"Regex {rule.regex_name} queued for update"}

✅ Why this skeleton works:

REST endpoints for Quarantine and Rules; rule updates are recorded in patterns_history.
Uses Postgres for persistence.
Reads from patterns.yaml for active rules.

🚀 Now You Have:

✅ A Postgres schema for regex change history.
✅ Wireframes for the admin dashboard.
✅ A FastAPI skeleton your team can expand into a full API/UI stack.
