Here’s a **clean, production-ready system prompt** for that agent:

---

**🛡️ System Prompt – “Secrets Sentinel” Agent**

> **Role & Mission**:
> You are the **Secrets Sentinel**, an autonomous security agent tasked with **monitoring all incoming log entries** for any potential leaks of **API keys, passwords, tokens, or other sensitive credentials**. Your primary goal is to **detect and prevent secret exposure** before it propagates further through the system.
>
> **Core Responsibilities**:
>
> * **Scan all log streams in real time** for:
>
>   * API keys (common formats: AWS, GCP, Azure, etc.)
>   * OAuth tokens
>   * SSH keys
>   * Passwords (plain text or encoded)
>   * JWTs or other bearer tokens
>   * Database connection strings
> * **Immediately flag** any suspicious entries.
> * **Classify severity** (e.g., HIGH – AWS root key; MEDIUM – temporary token).
> * **Sanitize or redact** leaked secrets before they’re written to persistent storage or shared further.
> * **Notify designated security channels or agents** of leaks, providing minimal necessary context.
>
> **Guidelines**:
>
> * Never expose the full secret in your alerts; redact most of it (e.g., `AKIA************XYZ`).
> * Be cautious of **false positives** (e.g., test data, dummy keys); err on the side of safety but include a “confidence score.”
> * Respect **privacy and operational integrity**: do not log or store the full value of any detected secret.
> * Assume the system may expand; be prepared to recognize **new secret formats** and learn from curator feedback.
>
> **Behavior Under Edge Cases**:
>
> * If unsure whether a string is a secret, flag it as **LOW severity** with a note for human review.
> * If you detect a high-severity leak, **trigger immediate alerts** and halt propagation of the compromised entry.
>
> **Your Output**:
>
> * A **structured alert** (JSON preferred) with:
>
>   * `timestamp`
>   * `source` (which log/agent)
>   * `type` of suspected secret
>   * `redacted_sample`
>   * `confidence_score` (0–1)
>   * `recommended_action` (e.g., “revoke key,” “rotate password,” “ignore dummy”)
>
> **Tone & Style**:
>
> * Precise, neutral, security-minded.
> * Avoid speculation beyond what you can confidently identify.

---
## 📂 **Version-Controlled `patterns.yaml` Format**

This lets you add/update/remove detection patterns **without touching code**.

```yaml
version: 1.2
last_updated: 2025-08-02

patterns:
  AWS_ACCESS_KEY:
    regex: "AKIA[0-9A-Z]{16}"
    description: "AWS Access Key ID"
    severity: HIGH
    confidence: 0.99
    active: true

  AWS_SECRET_KEY:
    # Scoped flag (?i:...) instead of (?i)...(?-i): Python's re rejects a bare (?-i)
    regex: "(?i:aws)(.{0,20})?['\"][0-9a-zA-Z/+]{40}['\"]"
    description: "AWS Secret Key"
    severity: HIGH
    confidence: 0.99
    active: true

  GITHUB_TOKEN:
    regex: "gh[pousr]_[0-9A-Za-z]{36}"
    description: "GitHub token (ghp_, gho_, ghu_, ghs_, ghr_ prefixes)"
    severity: HIGH
    confidence: 0.97
    active: true

  JWT:
    regex: "eyJ[A-Za-z0-9_-]+?\\.[A-Za-z0-9._-]+?\\.[A-Za-z0-9._-]+"
    description: "JSON Web Token"
    severity: MEDIUM
    confidence: 0.95
    active: true

meta:
  allow_feedback_learning: true
  require_human_review_above_confidence: 0.8
```
✅ **Advantages:**

- Regexes are editable without code changes.
- Can be versioned in Git for full audit trail.
- Can toggle `active: false` for deprecating broken rules.
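
Because the rules live in data rather than code, a malformed regex would otherwise only surface at scan time. A minimal sketch of a pre-flight check, assuming the YAML has already been parsed into a dict shaped like the `patterns` mapping above; `validate_patterns` and the sample rule names are hypothetical, not part of any existing tool:

```python
import re

def validate_patterns(patterns: dict) -> dict:
    """Compile every active rule and return {rule_name: error} for the ones that fail."""
    errors = {}
    for name, props in patterns.items():
        if not props.get("active", True):
            continue  # deactivated rules are skipped, as the scanner would skip them
        try:
            re.compile(props["regex"])
        except KeyError:
            errors[name] = "missing 'regex' field"
        except re.error as exc:
            errors[name] = str(exc)
    return errors

# Hypothetical stand-in for a parsed patterns.yaml
sample = {
    "AWS_ACCESS_KEY": {"regex": "AKIA[0-9A-Z]{16}", "active": True},
    "BROKEN_RULE": {"regex": "([a-z", "active": True},
    "OLD_RULE": {"regex": "([a-z", "active": False},
}
print(validate_patterns(sample))
```

Running a check like this in CI before merging a `patterns.yaml` change would keep a bad rule from reaching the scanner.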

---

## 🖼 **Flow Diagram (Secrets Sentinel)**

**Secrets Flow**

```
               ┌───────────────┐
Logs Stream →  │    Secrets    │
               │   Sentinel    │
               └──────┬────────┘
                      │
            ┌─────────┼─────────┐
            │                   │
      [Quarantine]      [Sanitized Logs]
            │                   │
     ┌──────┴──────┐       ┌────┴─────┐
     │High Severity│       │ Safe Data│
     │Secrets Only │       │ Storage  │
     └──────┬──────┘       └────┬─────┘
            │                   │
   ┌────────┼─────────┐         │
   │ Revocation Hooks │         │
   │ (AWS, GitHub,    │         │
   │ Slack, etc.)     │         │
   └────────┬─────────┘         │
            │                   │
       ┌────┴─────┐             │
       │ Webhooks │             │
       │ Key Kill │             │
       └────┬─────┘             │
            │
  ┌─────────┼─────────┐
  │   Feedback Loop   │
  │  (Curator/Human)  │
  └─────────┬─────────┘
            │
     ┌──────┴──────┐
     │ Meta-Learner│
     │ (new regex) │
     └──────┬──────┘
            │
     ┌──────┴───────┐
     │ patterns.yaml│
     └──────────────┘
```

---
## 🧪 **Test Harness Script**

This script simulates log scanning, quarantining, and revocation.

```python
import json
import re
from datetime import datetime, timezone

import yaml  # PyYAML

# --- Load patterns.yaml ---
with open("patterns.yaml", "r") as f:
    patterns_config = yaml.safe_load(f)

PATTERNS = patterns_config["patterns"]

QUARANTINE = []
SANITIZED_LOGS = []

def utc_now():
    """Current UTC time in ISO 8601 form with a trailing Z."""
    return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")

def redact(secret):
    """Keep the first 4 and last 3 characters; fully mask very short matches."""
    if len(secret) <= 7:
        return "*" * len(secret)
    return secret[:4] + "*" * (len(secret) - 7) + secret[-3:]

def scan_log(log_line, log_id, source_agent):
    alerts = []
    for secret_type, props in PATTERNS.items():
        if not props.get("active", True):
            continue
        match = re.search(props["regex"], log_line)
        if match:
            secret = match.group(0)
            severity = props["severity"]
            alert = {
                "timestamp": utc_now(),
                "source_agent": source_agent,
                "log_line_id": log_id,
                "secret_type": secret_type,
                "redacted_sample": redact(secret),
                "confidence_score": props["confidence"],
                "severity": severity,
                "recommended_action": "Revoke key/rotate credentials" if severity == "HIGH" else "Review"
            }
            alerts.append(alert)

            # Quarantine if severity is HIGH
            if severity == "HIGH":
                quarantine_log(log_line, f"High severity secret detected: {secret_type}")
                trigger_revocation(secret_type, redact(secret))
    return alerts

def quarantine_log(log_line, reason):
    entry = {"timestamp": utc_now(), "reason": reason, "log_line": log_line}
    QUARANTINE.append(entry)
    print(f"[QUARANTINE] {reason}")

def trigger_revocation(secret_type, redacted_sample):
    # Simulated webhook call
    print(f"[REVOCATION] Simulated revocation triggered for {secret_type} ({redacted_sample})")

def process_logs(logs):
    for i, log_line in enumerate(logs):
        alerts = scan_log(log_line, f"log_{i}", "agent_demo")
        if alerts:
            print(json.dumps(alerts, indent=2))
        else:
            # Only lines with no match at all are passed through as safe
            SANITIZED_LOGS.append(log_line)

# --- Test Run ---
sample_logs = [
    "INFO User logged in successfully",
    # AWS's documented example key ID: AKIA followed by exactly 16 characters
    "WARNING Found AWS key AKIAIOSFODNN7EXAMPLE in commit",
    "DEBUG JWT detected eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.fake.fake"
]

process_logs(sample_logs)

print("\n--- Quarantine Store ---")
print(json.dumps(QUARANTINE, indent=2))
```
✅ **What this does:**

- Reads `patterns.yaml`
- Scans logs, prints alerts, quarantines high-severity entries
- Simulates a revocation call for every HIGH-severity detection (AWS and GitHub with the patterns above)
- Keeps clean logs (no match at all) separate from quarantined logs

---

## ✅ Next Expansions (Optional)

- 📦 **Redis/DB backend** for QUARANTINE instead of memory.
- 📡 **Real webhook integrations** (AWS IAM, GitHub API, Slack API).
- 🧠 **Feedback ingestion module** (e.g., curator submits: `"false_positive": "AWS_ACCESS_KEY"` → adjusts regex in `patterns.yaml`).
- 🔄 **Auto-replay from Hyperlog** so Sentinel can retroactively scan old logs with new regex rules.

---

🔥 **production-grade spec**.

---
## 📂 **1️⃣ Feedback Ingestion Spec**

This defines how curators/humans give feedback to the Sentinel so it can **update its detection rules (patterns.yaml)** safely.

---

### 🔄 **Feedback Flow**

1. **Curator/Reviewer sees alert** → marks it as:

    - `false_positive` (regex over-triggered)
    - `missed_secret` (regex failed to detect)
    - `uncertain` (needs better regex refinement)

2. **Feedback API** ingests the report:

    ```json
    {
      "alert_id": "log_345",
      "secret_type": "AWS_ACCESS_KEY",
      "feedback_type": "false_positive",
      "evidence": "Key was dummy data: TESTKEY123",
      "suggested_regex_fix": null
    }
    ```

3. **Meta-Learner** updates rules:

    - `false_positive` → adds **exceptions** (e.g., allowlist prefixes like `TESTKEY`).
    - `missed_secret` → drafts **new regex** from evidence (using regex generator or LLM).
    - Writes changes to **patterns.yaml** under `pending_review`.

4. **Security admin approves** before the new regex is marked `active: true`.
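
The flow above can be sketched as a small function that turns a feedback report into a pending entry without ever activating a rule. This is an illustration under assumptions: `to_pending_update`, the `VALID_FEEDBACK` set, and the mapping from feedback type to action are hypothetical choices, not a defined API.

```python
from datetime import datetime, timezone

VALID_FEEDBACK = {"false_positive", "missed_secret", "uncertain"}

def to_pending_update(report: dict, submitted_by: str) -> dict:
    """Map a curator feedback report to a pending entry; approval happens elsewhere."""
    feedback_type = report["feedback_type"]
    if feedback_type not in VALID_FEEDBACK:
        raise ValueError(f"unknown feedback_type: {feedback_type}")
    # Assumption: a missed secret proposes a new rule, anything else modifies an existing one
    action = "add" if feedback_type == "missed_secret" else "modify"
    return {
        "regex_name": report["secret_type"],
        "action": action,
        "new_regex": report.get("suggested_regex_fix"),  # may stay None until drafted
        "status": "pending human review",
        "submitted_by": submitted_by,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }

report = {
    "alert_id": "log_345",
    "secret_type": "AWS_ACCESS_KEY",
    "feedback_type": "false_positive",
    "evidence": "Key was dummy data: TESTKEY123",
    "suggested_regex_fix": None,
}
print(to_pending_update(report, "curator_2"))
```

Keeping the status fixed at "pending human review" means the only path to `active: true` is the admin approval in step 4.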

---

### 🧠 **Feedback Schema in YAML**

```yaml
pending_updates:
  - regex_name: AWS_ACCESS_KEY
    action: modify
    # The lookahead sits right after AKIA so it excludes keys whose body starts with TESTKEY
    new_regex: "AKIA(?!TESTKEY)[0-9A-Z]{16}"
    confidence: 0.82
    status: "pending human review"
    submitted_by: curator_2
    timestamp: 2025-08-02T12:40:00Z
```

✅ This keeps **audit trails** & allows **safe hot updates**.
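
Before a reviewer approves a pending rule, it helps to run it against known good and bad examples. A minimal sketch, assuming the reviewer keeps two small lists of sample strings; `evaluate_rule` is a hypothetical helper and the test strings are made up for illustration. Note that the negative lookahead has to come before the character class to exclude a `TESTKEY` prefix.

```python
import re

def evaluate_rule(regex: str, should_match: list, should_not_match: list) -> dict:
    """Report examples the rule misses and examples it wrongly flags."""
    compiled = re.compile(regex)
    missed = [s for s in should_match if not compiled.search(s)]
    false_hits = [s for s in should_not_match if compiled.search(s)]
    return {"missed": missed, "false_hits": false_hits, "ok": not missed and not false_hits}

result = evaluate_rule(
    r"AKIA(?!TESTKEY)[0-9A-Z]{16}",
    should_match=["key=AKIAIOSFODNN7EXAMPLE"],        # AWS's documented example key ID
    should_not_match=["key=AKIATESTKEY123456789A"],   # made-up dummy key
)
print(result)
```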

---

## ⚙️ **2️⃣ Real AWS/GitHub Webhook Payload Templates**

These are **example POST payloads** your Sentinel would send when it detects a leaked secret.

---

### 🔐 **AWS Access Key Revocation**

**Endpoint:**
`POST https://security.example.com/hooks/aws-revoke`

**Payload:**

```json
{
  "event": "secret_leak_detected",
  "secret_type": "AWS_ACCESS_KEY",
  "redacted_key": "AKIA****XYZ",
  "log_reference": "hyperlog:58321",
  "recommended_action": "Revoke IAM access key immediately",
  "severity": "HIGH",
  "timestamp": "2025-08-02T12:45:00Z"
}
```

➡ Your security automation would call AWS CLI or IAM API. These calls need the full access key ID, which the payload does not carry, so the automation has to resolve it from the quarantined entry named in `log_reference`:

```bash
# Deactivate first so the change can be reversed, then delete once nothing depends on the key
aws iam update-access-key --access-key-id <ACCESS_KEY_ID> --status Inactive
aws iam delete-access-key --access-key-id <ACCESS_KEY_ID>
```

---
### 🐙 **GitHub Token Revocation**

**Endpoint:**
`POST https://security.example.com/hooks/github-revoke`

**Payload:**

```json
{
  "event": "secret_leak_detected",
  "secret_type": "GITHUB_TOKEN",
  "redacted_key": "ghp_****abcd",
  "repository": "repo-name",
  "log_reference": "hyperlog:58322",
  "severity": "HIGH",
  "recommended_action": "Invalidate GitHub token via API",
  "timestamp": "2025-08-02T12:46:00Z"
}
```

➡ This would tie into GitHub’s [token-scanning API](https://docs.github.com/en/developers/overview/secret-scanning) or use PAT revocation.

---

### 💬 **Slack Token Revocation**

**Endpoint:**
`POST https://security.example.com/hooks/slack-revoke`

**Payload:**

```json
{
  "event": "secret_leak_detected",
  "secret_type": "SLACK_TOKEN",
  "redacted_key": "xoxb****hjk",
  "workspace": "company-slack",
  "log_reference": "hyperlog:58323",
  "severity": "HIGH",
  "recommended_action": "Revoke Slack bot/user token",
  "timestamp": "2025-08-02T12:47:00Z"
}
```

➡ Slack’s Web API (for example the `auth.revoke` method) can be used to **revoke** a token; **rotation** depends on the token type.
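
The three payloads share a common core, so they can be derived from a Sentinel alert rather than written by hand. A minimal sketch, assuming the alert fields used elsewhere in this document; `build_revocation_payload` is a hypothetical helper, and provider-specific fields such as `repository` or `workspace` are passed in as extras.

```python
def build_revocation_payload(alert: dict, log_reference: str, **extra) -> dict:
    """Build the common webhook body from an alert; never includes the raw secret."""
    payload = {
        "event": "secret_leak_detected",
        "secret_type": alert["secret_type"],
        "redacted_key": alert["redacted_sample"],
        "log_reference": log_reference,
        "severity": alert["severity"],
        "recommended_action": alert["recommended_action"],
        "timestamp": alert["timestamp"],
    }
    payload.update(extra)  # e.g. repository="repo-name" or workspace="company-slack"
    return payload

alert = {
    "timestamp": "2025-08-02T12:46:00Z",
    "secret_type": "GITHUB_TOKEN",
    "redacted_sample": "ghp_****abcd",
    "severity": "HIGH",
    "recommended_action": "Invalidate GitHub token via API",
}
print(build_revocation_payload(alert, "hyperlog:58322", repository="repo-name"))
```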

---

## 📡 **3️⃣ Redis or PostgreSQL Quarantine Store**

Switching from memory to **persistent storage** means quarantined logs survive restarts.

---

### ✅ **Redis Option (Fast, Volatile)**

```python
import json
from datetime import datetime, timezone

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def quarantine_log(log_line, reason):
    timestamp = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")
    entry = {"timestamp": timestamp, "reason": reason, "log_line": log_line}
    # LPUSH adds to the head of the list, so the oldest entry sits at the tail
    r.lpush("quarantine", json.dumps(entry))
    print(f"[QUARANTINE] Stored in Redis: {reason}")
```

- 🏎 **Pros:** Fast, easy to scale.
- ⚠️ **Cons:** Volatile unless persisted (RDB/AOF).

---
### ✅ **PostgreSQL Option (Auditable, Durable)**

**Schema:**

```sql
CREATE TABLE quarantine (
    id SERIAL PRIMARY KEY,
    timestamp TIMESTAMPTZ NOT NULL,
    reason TEXT NOT NULL,
    log_line TEXT NOT NULL,
    reviewed BOOLEAN DEFAULT FALSE
);
```

**Python Insert:**

```python
from datetime import datetime, timezone

import psycopg2

# Example DSN only; in a real deployment load credentials from the environment or a secrets manager
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def quarantine_log(log_line, reason):
    entry_time = datetime.now(timezone.utc)  # psycopg2 adapts aware datetimes to TIMESTAMPTZ
    cursor.execute(
        "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
        (entry_time, reason, log_line)
    )
    conn.commit()
    print(f"[QUARANTINE] Stored in PostgreSQL: {reason}")
```

✅ **Postgres is better for long-term auditing**: you can run reports like:

- “How many AWS keys leaked this month?”
- “Which agents generated the most HIGH-severity quarantines?”
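
A report of the first kind can be sketched as a grouped count. The example below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so it runs anywhere; the table mirrors the schema above, the rows are made-up sample data, and `count_by_reason` is a hypothetical helper. The per-agent report would additionally need a `source_agent` column, which the schema as given does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.execute(
    """CREATE TABLE quarantine (
        id INTEGER PRIMARY KEY,
        timestamp TEXT NOT NULL,
        reason TEXT NOT NULL,
        log_line TEXT NOT NULL,
        reviewed INTEGER DEFAULT 0
    )"""
)
sample_rows = [  # illustrative data only
    ("2025-07-15T08:00:00Z", "High severity secret detected: AWS_ACCESS_KEY", "<redacted>"),
    ("2025-08-01T09:00:00Z", "High severity secret detected: AWS_ACCESS_KEY", "<redacted>"),
    ("2025-08-02T12:45:00Z", "High severity secret detected: AWS_ACCESS_KEY", "<redacted>"),
    ("2025-08-02T12:46:00Z", "High severity secret detected: GITHUB_TOKEN", "<redacted>"),
]
conn.executemany(
    "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (?, ?, ?)", sample_rows
)

def count_by_reason(conn, month_prefix: str):
    """Count quarantined entries per reason for one month, e.g. '2025-08'."""
    cur = conn.execute(
        "SELECT reason, COUNT(*) FROM quarantine WHERE timestamp LIKE ? "
        "GROUP BY reason ORDER BY COUNT(*) DESC",
        (month_prefix + "%",),
    )
    return cur.fetchall()

print(count_by_reason(conn, "2025-08"))
```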

---

## 🚀 **What’s Next?**

We now have:
✅ **Detection → Redaction → Quarantine → Revocation → Feedback → Pattern Evolution**
✅ **patterns.yaml** for versioned regex
✅ **Webhooks** for real-time secret revocation
✅ **Persistent quarantine store** (Redis or Postgres)

---
## 🛡️ **Expanded System Prompt for Secrets Sentinel**

> **Role & Mission**:
> You are the **Secrets Sentinel**, a security-focused agent monitoring all log streams for potential leaks of sensitive information (API keys, passwords, tokens, etc.). Your mission: **detect**, **sanitize**, and **prevent** secret exposure while keeping operations secure and auditable.
>
> **Core Responsibilities**:
> ✅ Scan **all log entries** for API keys, passwords, JWTs, database strings, and private keys.
> ✅ **Redact** any detected secrets in-flight before writing them to storage or forwarding.
> ✅ **Generate structured alerts** for each detection with relevant metadata.
> ✅ **Quarantine** log lines that contain **high-severity** secrets (so they aren’t distributed further).
> ✅ Support **continuous learning** by flagging uncertain cases for human/curator review.
>
> **Secret Detection Targets**:
>
> - **Cloud Keys** (AWS, GCP, Azure, etc.)
> - **OAuth Tokens** (Bearer, Slack, Discord, GitHub, etc.)
> - **JWTs** (header.payload.signature format)
> - **SSH Private Keys** (`-----BEGIN PRIVATE KEY-----`)
> - **Database Connection Strings** (Postgres, MySQL, MongoDB, etc.)
> - **Generic Passwords** (detected from common prefixes, e.g. `pwd=`, `password:`).
>
> **Detection Rules**:
>
> - Use **regex patterns** for known key formats.
> - Score detections with a **confidence metric** (0–1).
> - If a string doesn’t fully match, classify as **LOW confidence** for review.
>
> **Redaction Policy**:
>
> - Always redact most of the secret (`AKIA************XYZ`).
> - Never store or transmit the **full secret**.
>
> **Alert Format (JSON)**:
>
> ```json
> {
>   "timestamp": "2025-08-02T10:12:34Z",
>   "source_agent": "agent_42",
>   "log_line_id": "hyperlog:134593",
>   "secret_type": "AWS_ACCESS_KEY",
>   "redacted_sample": "AKIA********XYZ",
>   "confidence_score": 0.95,
>   "severity": "HIGH",
>   "recommended_action": "Revoke AWS key immediately and rotate credentials"
> }
> ```
>
> **Behavior Under Edge Cases**:
>
> - If unsure: flag as LOW severity with `"recommended_action": "Manual review"`.
> - If a secret is clearly fake (like `TESTKEY123`), still alert but tag as `test_credential: true`.
>
> **Tone & Style**:
>
> - Precise, security-minded, and concise in reporting.
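
The `test_credential` tag from the edge-case rules could be applied with a simple marker check. A minimal sketch, assuming that substrings such as `TEST` or `EXAMPLE` are a reasonable signal of a fake credential; the marker list and `tag_test_credential` are hypothetical and would need tuning against real data.

```python
TEST_MARKERS = ("TEST", "EXAMPLE", "DUMMY", "FAKE")  # assumed markers, not an exhaustive list

def tag_test_credential(alert: dict, secret: str) -> dict:
    """Return a copy of the alert with a test_credential flag; the alert is still emitted."""
    tagged = dict(alert)
    upper = secret.upper()
    tagged["test_credential"] = any(marker in upper for marker in TEST_MARKERS)
    return tagged

alert = {"secret_type": "AWS_ACCESS_KEY", "severity": "HIGH"}
print(tag_test_credential(alert, "AKIAIOSFODNN7EXAMPLE"))
```

The raw secret is only inspected, never copied into the alert, which keeps the redaction policy intact.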

---

## 📚 **Regex Patterns Library (Starter Set)**

```python
REGEX_PATTERNS = {
    "AWS_ACCESS_KEY": r"AKIA[0-9A-Z]{16}",
    # Scoped (?i:...) because Python's re rejects a bare (?-i)
    "AWS_SECRET_KEY": r"(?i:aws)(.{0,20})?['\"][0-9a-zA-Z/+]{40}['\"]",
    # Hyphen goes last in the class so it is a literal, not a range
    "GCP_API_KEY": r"AIza[0-9A-Za-z_-]{35}",
    "GITHUB_TOKEN": r"gh[pousr]_[0-9A-Za-z]{36}",
    "SLACK_TOKEN": r"xox[baprs]-[0-9A-Za-z-]{10,48}",
    "JWT": r"eyJ[A-Za-z0-9_-]+?\.[A-Za-z0-9._-]+?\.[A-Za-z0-9._-]+",
    # Optional algorithm prefix so the plain PKCS#8 header also matches
    "SSH_PRIVATE_KEY": r"-----BEGIN (?:(?:RSA|DSA|EC|OPENSSH) )?PRIVATE KEY-----",
    "GENERIC_PASSWORD": r"(?:password|pwd|pass|secret)\s*[:=]\s*['\"]?[^\s'\";]+['\"]?",
    # Also accepts the postgresql:// and mongodb+srv:// schemes
    "DB_CONN_STRING": r"(?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|mssql|redis)://[^\s]+"
}
```

---
## 🛠 **Python Skeleton Implementation**

```python
import json
import re
from datetime import datetime, timezone

REGEX_PATTERNS = {
    "AWS_ACCESS_KEY": r"AKIA[0-9A-Z]{16}",
    # Scoped (?i:...) because Python's re rejects a bare (?-i)
    "AWS_SECRET_KEY": r"(?i:aws)(.{0,20})?['\"][0-9a-zA-Z/+]{40}['\"]",
    "GCP_API_KEY": r"AIza[0-9A-Za-z_-]{35}",
    "GITHUB_TOKEN": r"gh[pousr]_[0-9A-Za-z]{36}",
    "SLACK_TOKEN": r"xox[baprs]-[0-9A-Za-z-]{10,48}",
    "JWT": r"eyJ[A-Za-z0-9_-]+?\.[A-Za-z0-9._-]+?\.[A-Za-z0-9._-]+",
    "SSH_PRIVATE_KEY": r"-----BEGIN (?:(?:RSA|DSA|EC|OPENSSH) )?PRIVATE KEY-----",
    "GENERIC_PASSWORD": r"(?:password|pwd|pass|secret)\s*[:=]\s*['\"]?[^\s'\";]+['\"]?",
    "DB_CONN_STRING": r"(?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?|mssql|redis)://[^\s]+"
}

# Secret types treated as HIGH severity (matches the webhook payloads, which mark
# GitHub and Slack tokens HIGH as well)
HIGH_SEVERITY_TYPES = {"AWS_ACCESS_KEY", "AWS_SECRET_KEY", "GITHUB_TOKEN", "SLACK_TOKEN", "SSH_PRIVATE_KEY"}

def redact(secret: str) -> str:
    """Redact a secret, leaving only the first 4 and last 3 chars."""
    if len(secret) <= 7:
        return "*" * len(secret)  # too short to reveal any part safely
    return secret[:4] + "*" * (len(secret) - 7) + secret[-3:]

def scan_log_line(log_line: str, log_id: str, source_agent: str):
    alerts = []
    for secret_type, pattern in REGEX_PATTERNS.items():
        match = re.search(pattern, log_line)
        if match:
            secret = match.group(0)
            alert = {
                "timestamp": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
                "source_agent": source_agent,
                "log_line_id": log_id,
                "secret_type": secret_type,
                "redacted_sample": redact(secret),
                "confidence_score": 0.95,
                "severity": "HIGH" if secret_type in HIGH_SEVERITY_TYPES else "MEDIUM",
                "recommended_action": "Revoke key/rotate credentials"
            }
            alerts.append(alert)
    return alerts

# Example usage (AWS's documented example key ID: AKIA plus exactly 16 characters):
log = "User pushed code with AWS key: AKIAIOSFODNN7EXAMPLE"
alerts = scan_log_line(log, "hyperlog:42", "agent_7")
print(json.dumps(alerts, indent=2))
```
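
The skeleton raises alerts but does not yet rewrite the log line, which the prompt's "redact in-flight" responsibility calls for. A minimal sketch of that step, assuming a small subset of the patterns and a redaction helper like the one above; `sanitize_line` and the trimmed `PATTERNS` dict are hypothetical names used only for this illustration.

```python
import re

PATTERNS = {  # subset of the starter set, for illustration
    "AWS_ACCESS_KEY": r"AKIA[0-9A-Z]{16}",
    "GITHUB_TOKEN": r"gh[pousr]_[0-9A-Za-z]{36}",
}

def redact(secret: str) -> str:
    """Keep the first 4 and last 3 characters; fully mask very short matches."""
    if len(secret) <= 7:
        return "*" * len(secret)
    return secret[:4] + "*" * (len(secret) - 7) + secret[-3:]

def sanitize_line(log_line: str) -> str:
    """Replace every match of every pattern with its redacted form."""
    for pattern in PATTERNS.values():
        log_line = re.sub(pattern, lambda m: redact(m.group(0)), log_line)
    return log_line

print(sanitize_line("pushed AKIAIOSFODNN7EXAMPLE to repo"))
```

Using `re.sub` rather than `re.search` means several secrets on one line are all masked, not just the first.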

---

## 🚨 **Optional Extensions**

- ✅ **Automatic Quarantine** → If severity=HIGH, forward to a “quarantine pipeline” (prevent further log distribution).
- ✅ **Secret Revocation Hooks** → Webhooks/scripts to auto-disable compromised keys (AWS IAM revoke, GitHub token reset, etc.).
- ✅ **Curator Feedback Loop** → If many false positives occur, agents/curators can “vote” on regex quality → improves future detection.

---

👉 Do you want me to:

- **📡 Add “quarantine mode” logic** (e.g. isolate log lines and mark them unsafe)?
- **🔗 Draft a webhook design** for automatic secret revocation (AWS/GitHub/Slack)?
- **🧠 Write a “meta-prompt” for this agent** so it can _learn new secret formats_ over time from curator feedback?

Here’s the **full build-out** with all three requested additions:
## 🛡 **Secrets Sentinel: Extended Prompt**

> **Role & Mission**:
> You are the **Secrets Sentinel**, a security agent tasked with scanning all logs for leaked secrets. You **detect**, **redact**, **quarantine**, and optionally **revoke** compromised credentials. You also **evolve your detection rules** over time by learning from curator and human feedback.
>
> **Extended Responsibilities**:
> ✅ **Quarantine:** When you find a **HIGH severity** secret, isolate the entire log entry from normal processing so it doesn’t spread.
> ✅ **Revocation Hooks:** Trigger a webhook or automation script to disable the compromised key (e.g., AWS IAM revoke).
> ✅ **Adaptive Learning:** If you get “false positive” or “missed secret” feedback, update or request updates to your regex rules and detection logic.

---
### 1️⃣ **Quarantine Mode Logic**

Add this to the **Python skeleton**:

```python
from datetime import datetime, timezone

QUARANTINE_STORE = []  # This could be a Redis list, DB table, etc.

# Secret types treated as HIGH severity; GitHub and Slack tokens are included
# so their revocation hooks can actually fire
HIGH_SEVERITY_TYPES = {"AWS_ACCESS_KEY", "AWS_SECRET_KEY", "GITHUB_TOKEN", "SLACK_TOKEN", "SSH_PRIVATE_KEY"}

def utc_now() -> str:
    """Current UTC time in ISO 8601 form with a trailing Z."""
    return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")

def quarantine_log(log_line: str, reason: str):
    """Move a sensitive log line to quarantine for review."""
    quarantine_entry = {
        "timestamp": utc_now(),
        "reason": reason,
        "log_line": log_line
    }
    QUARANTINE_STORE.append(quarantine_entry)
    print(f"[QUARANTINE] Log quarantined: {reason}")
    return quarantine_entry

def scan_log_line(log_line: str, log_id: str, source_agent: str):
    alerts = []
    for secret_type, pattern in REGEX_PATTERNS.items():
        match = re.search(pattern, log_line)
        if match:
            secret = match.group(0)
            severity = "HIGH" if secret_type in HIGH_SEVERITY_TYPES else "MEDIUM"
            alert = {
                "timestamp": utc_now(),
                "source_agent": source_agent,
                "log_line_id": log_id,
                "secret_type": secret_type,
                "redacted_sample": redact(secret),
                "confidence_score": 0.95,
                "severity": severity,
                "recommended_action": "Revoke key/rotate credentials"
            }
            alerts.append(alert)
            if severity == "HIGH":
                quarantine_log(log_line, f"High severity secret detected: {secret_type}")
    return alerts
```

➡ **Now, any HIGH-severity find triggers quarantine.**

---
### 2️⃣ **Webhook Design for Secret Revocation**

We’ll use a **simple webhook dispatcher** that calls different revocation endpoints depending on the secret type.

```python
import requests

REVOCATION_HOOKS = {
    "AWS_ACCESS_KEY": "https://security.example.com/hooks/aws-revoke",
    "GITHUB_TOKEN": "https://security.example.com/hooks/github-revoke",
    "SLACK_TOKEN": "https://security.example.com/hooks/slack-revoke"
}

def trigger_revocation(secret_type: str, redacted_sample: str):
    """Send webhook request to revoke compromised credentials."""
    if secret_type in REVOCATION_HOOKS:
        payload = {
            "secret_type": secret_type,
            "sample": redacted_sample,
            "reason": "Detected secret leak"
        }
        try:
            r = requests.post(REVOCATION_HOOKS[secret_type], json=payload, timeout=5)
            if r.status_code == 200:
                print(f"[REVOCATION] Successfully triggered revocation for {secret_type}")
            else:
                print(f"[REVOCATION] Failed revocation for {secret_type}: {r.status_code}")
        except requests.RequestException as e:
            # Network errors and timeouts must not stop the scan loop
            print(f"[REVOCATION ERROR] {e}")
```

Then, in the alert loop:

```python
            if severity == "HIGH":
                quarantine_log(log_line, f"High severity secret detected: {secret_type}")
                trigger_revocation(secret_type, redact(secret))
```

➡ **Now, if the sentinel finds an AWS key or GitHub token that is classified HIGH, it can hit the right webhook to revoke it immediately.**

---
### 3️⃣ **Meta-Prompt for Learning New Secret Formats**

This allows the Sentinel to **expand its detection library** dynamically.

---

**🧠 Adaptive Learning Meta-Prompt**

> **Role**: You are the “Meta-Learner” function of Secrets Sentinel.
>
> **Mission**: Continuously expand the Sentinel’s knowledge of new secret formats and detection rules.
>
> **Responsibilities**:
>
> * Accept **feedback reports** from curators or human reviewers (false positives, missed secrets).
> * If multiple missed secrets share a pattern, propose a **new regex** (or modify an existing one).
> * Version-control detection rules (e.g., store regex patterns in a JSON or YAML file with version numbers).
> * Generate a **confidence score** for new rules before promoting them to active use.
> * Flag uncertain cases for human approval.
>
> **Behavior**:
>
> * If false positives keep happening for a regex → tighten the pattern or add exception clauses.
> * If missed secrets are reported → extract examples, infer format, draft new regex, test on past logs.
>
> **Output**:
>
> ```json
> {
>   "action": "add|update|remove",
>   "regex_name": "NEW_PROVIDER_KEY",
>   "regex_pattern": "XXX",
>   "confidence": 0.87,
>   "status": "pending human review"
> }
> ```
>
> **Goal**: Turn curator/human feedback into a **self-evolving secret-detection engine**.
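
The "extract examples, infer format, draft new regex" step can be approximated with a very simple heuristic: take the prefix the examples share, then describe the remainder by character class and length. This is only a sketch of one possible approach, not how the Meta-Learner must work; `draft_regex` and the `acme_` sample tokens are invented for illustration, and any draft would still go through human review.

```python
import os
import re

def draft_regex(examples: list) -> str:
    """Draft a regex from missed-secret examples: shared prefix + char class + length range."""
    if len(examples) < 2:
        raise ValueError("need at least two examples to generalise")
    prefix = os.path.commonprefix(examples)  # character-wise common prefix
    tails = [e[len(prefix):] for e in examples]
    if not all(tails):
        raise ValueError("examples have no variable part after the shared prefix")
    chars = set("".join(tails))
    cls = ""
    if any("0" <= c <= "9" for c in chars):
        cls += "0-9"
    if any("A" <= c <= "Z" for c in chars):
        cls += "A-Z"
    if any("a" <= c <= "z" for c in chars):
        cls += "a-z"
    # Anything else (e.g. '-' or '_') is added as an escaped literal
    others = sorted(c for c in chars if not (c.isascii() and c.isalnum()))
    cls += "".join(re.escape(c) for c in others)
    lo, hi = min(map(len, tails)), max(map(len, tails))
    quant = f"{{{lo}}}" if lo == hi else f"{{{lo},{hi}}}"
    return f"{re.escape(prefix)}[{cls}]{quant}"

examples = ["acme_9fK2xQ7LmP", "acme_Zr81TbCw0e", "acme_h3Lk92MnQx"]  # made-up tokens
pattern = draft_regex(examples)
print(pattern)
```

A heuristic like this tends to over-fit a handful of examples, which is one more reason drafts should stay in `pending human review` until tested on past logs.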

---

### 🧩 How It All Fits

✅ **Secrets Sentinel core** = Detect + Redact + Quarantine
✅ **Revocation Hooks** = Automated security response
✅ **Meta-Learning** = Regex patterns evolve over time

---
## 📜 **1️⃣ Migration Script: Redis → PostgreSQL**

This script will migrate existing quarantined log entries from **Redis** to **Postgres**.

```python
import json

import psycopg2
import redis

# Redis config
r = redis.Redis(host='localhost', port=6379, decode_responses=True)

# Postgres config (example DSN only; load real credentials from the environment)
conn = psycopg2.connect("dbname=sentinel user=postgres password=secret")
cursor = conn.cursor()

def migrate_quarantine():
    count = 0
    while True:
        entry_json = r.rpop("quarantine")  # pop oldest entry from Redis
        if not entry_json:
            break
        entry = json.loads(entry_json)
        try:
            cursor.execute(
                "INSERT INTO quarantine (timestamp, reason, log_line) VALUES (%s, %s, %s)",
                (entry["timestamp"], entry["reason"], entry["log_line"])
            )
            conn.commit()  # commit per entry so a later failure cannot undo earlier moves
        except Exception:
            conn.rollback()
            r.rpush("quarantine", entry_json)  # put the entry back so it is not lost
            raise
        count += 1
    print(f"[MIGRATION] Moved {count} quarantined entries from Redis → PostgreSQL")

if __name__ == "__main__":
    migrate_quarantine()
```

✅ **Run once** after Postgres is set up; it empties the Redis queue into the durable DB.

---
## 🖥 **2️⃣ Admin Dashboard Spec**

**Purpose:** A web UI to manage the Sentinel’s security pipeline.

---

### 🎯 **Core Features**

✅ **Quarantine Browser**

- Paginated view of all quarantined logs
- Search/filter by `secret_type`, `source_agent`, `date`, `status`
- Mark quarantined logs as **reviewed** or **false alarm**

✅ **Regex Rules Manager**

- Lists all regexes from `patterns.yaml`
- Add / update / deactivate rules via UI
- Shows `pending_updates` flagged by the Meta-Learner for human approval

✅ **Revocation Status Board**

- See which secrets triggered revocations
- Status of revocation hooks (success/fail)

✅ **Metrics Dashboard**

- Charts: “Secrets Detected Over Time”, “Top Sources of Leaks”
- KPIs: # HIGH severity secrets this month, # rules updated, # false positives

---

### 🏗 **Tech Stack Suggestion**

- **Backend:** FastAPI (Python)
- **Frontend:** React + Tailwind
- **DB:** PostgreSQL for quarantine + rules history
- **Auth:** OAuth (GitHub/Google) + RBAC (only security admins can approve regex changes)

---

### 🔌 **Endpoints**

```
GET  /api/quarantine         → list quarantined entries
POST /api/quarantine/review  → mark entry as reviewed
GET  /api/rules              → list regex patterns
POST /api/rules/update       → update or add a regex
GET  /api/revocations        → list revocation events
```

---

### 🖥 **Mock Dashboard Layout**

- **Left Nav:** Quarantine | Rules | Revocations | Metrics
- **Main Panel:**

    - Data tables with sorting/filtering
    - Inline editors for regex rules
    - Approve/Reject buttons for pending regex updates

✅ Basically a **security control room** for Sentinel.

---
## 🤖 **3️⃣ Meta-Curator AI Prompt**

This agent reviews Sentinel’s work and **tunes it automatically**.

---

### **Meta-Curator: System Prompt**

> **Role & Mission:**
> You are the **Meta-Curator**, a supervisory AI responsible for reviewing the **Secrets Sentinel’s** detections, regex updates, and feedback reports.
>
> **Core Responsibilities:**
> ✅ **Audit alerts** – Look for false positives, duplicates, or missed leaks by cross-checking Sentinel outputs.
> ✅ **Review regex proposals** – When Sentinel drafts new regex rules, decide if they’re:
>
> - ✅ Approved (safe to activate)
> - ❌ Rejected (too broad or incorrect)
> - 🕒 Deferred (needs human review)
>
> ✅ **Tune detection thresholds** – Adjust `confidence` or `severity` on patterns based on outcomes.
> ✅ **Generate new rules** – If multiple missed secrets share a format, draft a regex and submit to humans for approval.
> ✅ **Report upstream** – Summarize changes to security admins weekly.

---

### **Behavior Guidelines**

- **Conservative by default:** Don’t auto-approve regexes unless confidence > 0.95.
- **Keep auditability:** Every decision (approve/reject) is logged in the hyperlog.
- **Respect human overrides:** Never overwrite a regex that a human explicitly locked.
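
These three guidelines translate fairly directly into a decision function. A minimal sketch, assuming proposals carry `regex_name` and `confidence` fields as in the Meta-Learner output; `review_proposal`, the `locked_rules` set, and the in-memory `audit_log` list (standing in for the hyperlog) are hypothetical.

```python
AUTO_APPROVE_THRESHOLD = 0.95  # from the "conservative by default" guideline

def review_proposal(proposal: dict, locked_rules: set, audit_log: list) -> str:
    """Decide on a regex proposal and record the decision for auditability."""
    if proposal["regex_name"] in locked_rules:
        decision = "deferred"  # a human lock always wins; hand the change to a person
        reason = "rule is locked by a human"
    elif proposal["confidence"] > AUTO_APPROVE_THRESHOLD:
        decision = "approved"
        reason = "confidence above auto-approve threshold"
    else:
        decision = "deferred"
        reason = "confidence too low for auto-approval"
    audit_log.append({
        "regex_name": proposal["regex_name"],
        "confidence": proposal["confidence"],
        "decision": decision,
        "decision_reason": reason,
    })
    return decision

audit_log = []
print(review_proposal({"regex_name": "GITLAB_TOKEN", "confidence": 0.97}, set(), audit_log))
```

Every branch appends to the audit log, so deferred decisions are as traceable as approvals.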

---

### **Example Meta-Curator Output**

```json
{
  "action": "approve_regex",
  "regex_name": "GITLAB_TOKEN",
  "regex_pattern": "glpat-[0-9A-Za-z\\-_]{20}",
  "confidence": 0.97,
  "decision_reason": "Validated against 12 quarantined examples, no false positives found.",
  "timestamp": "2025-08-02T13:45:00Z"
}
```

✅ This meta-agent is the **brains of the rules layer**: it keeps Sentinel evolving, but under control.

---

## 🚀 **Now You Have:**

✅ **Migration Path** → Redis → PostgreSQL
✅ **Admin Dashboard Spec** → complete with endpoints & layout
✅ **Meta-Curator Prompt** → the agent that “manages the manager”

---