# CHORUS Architecture Overview

**System:** CHORUS - Container-First P2P Task Coordination  
**Version:** 0.5.0-dev  
**Architecture Type:** Distributed, Peer-to-Peer, Event-Driven

---

## Table of Contents

1. [System Overview](#system-overview)
2. [Core Principles](#core-principles)
3. [Architecture Layers](#architecture-layers)
4. [Key Components](#key-components)
5. [Data Flow](#data-flow)
6. [Deployment Models](#deployment-models)
7. [Related Documents](#related-documents)

---

## System Overview

CHORUS is a **distributed task coordination system** that enables both autonomous AI agents and human operators to collaborate on software development tasks through a peer-to-peer network. The system provides:

### Primary Capabilities

- **Autonomous Agent Execution**: AI agents that can execute code tasks in isolated Docker sandboxes
- **Human-Agent Collaboration**: Human Agent Portal (HAP) for human participation in agent networks
- **Distributed Coordination**: P2P mesh networking with democratic leader election
- **Context Addressing**: UCXL (Universal Context Addressing) for immutable decision tracking
- **Secure Execution**: Multi-layer sandboxing with Docker containers and security policies
- **Collaborative Reasoning**: HMMM protocol for meta-discussion and consensus building
- **Encrypted Storage**: DHT-based encrypted storage for sensitive data

### System Philosophy

CHORUS follows these key principles:

1. **Container-First**: All configuration via environment variables, no file-based config
2. **P2P by Default**: No central server; agents form democratic mesh networks
3. **Zero-Trust Security**: Every operation validated, credentials never stored in containers
4. **Immutable Decisions**: All agent decisions recorded in content-addressed storage
5. **Human-in-the-Loop**: Humans as first-class peers in the agent network

---

## Core Principles

### 1. Container-Native Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                      CHORUS Container                       │
│                                                             │
│  Environment Variables  →  Runtime Configuration            │
│  Volume Mounts          →  Prompts & Secrets                │
│  Network Policies       →  Zero-Egress by Default           │
│  Signal Handling        →  Dynamic Reconfiguration (SIGHUP) │
└─────────────────────────────────────────────────────────────┘
```

**Key Features:**
- No config files inside containers
- All settings via environment variables
- Secrets injected via secure volumes
- Dynamic assignment loading from WHOOSH
- SIGHUP-triggered reconfiguration

### 2. Peer-to-Peer Mesh Network

```
           Agent-1 (Alice)
                 /|\
                / | \
               /  |  \
              /   |   \
         Agent-2  |  Agent-4
          (Bob)   |  (Dave)
              \   |   /
               \  |  /
                \ | /
                 \|/
           Agent-3 (Carol)

All agents are equal peers
No central coordinator
Democratic leader election
mDNS local discovery
DHT global discovery
```

### 3. Multi-Layer Security

```
Layer 1: License Validation (KACHING)
    ↓
Layer 2: P2P Encryption (libp2p TLS)
    ↓
Layer 3: DHT Encryption (age encryption)
    ↓
Layer 4: Docker Sandboxing (namespaces, cgroups)
    ↓
Layer 5: Network Isolation (zero-egress)
    ↓
Layer 6: SHHH Secrets Detection (scan & redact)
    ↓
Layer 7: UCXL Validation (immutable audit trail)
    ↓
Layer 8: Credential Mediation (agent uploads, not container)
```

---

## Architecture Layers

CHORUS is organized into distinct architectural layers:

### Layer 1: P2P Infrastructure

**Components:**
- libp2p Host (networking)
- mDNS Discovery (local peers)
- DHT (global peer discovery)
- PubSub (message broadcasting)

**Responsibilities:**
- Peer discovery and connection management
- Encrypted peer-to-peer communication
- Message routing and delivery
- Network resilience and failover

**See:** [P2P Infrastructure](../internal/p2p.md)

### Layer 2: Coordination & Consensus

**Components:**
- Election Manager (leader election)
- Task Coordinator (work distribution)
- HMMM Router (meta-discussion)
- SLURP (distributed orchestration)

**Responsibilities:**
- Democratic leader election
- Task assignment and tracking
- Collaborative reasoning protocols
- Work distribution algorithms

**See:** [Coordination](../packages/coordination.md), [SLURP](../packages/slurp/README.md)

### Layer 3: Execution Engine

**Components:**
- Task Execution Engine
- Docker Sandbox
- Image Selector
- Command Executor

**Responsibilities:**
- Isolated code execution in Docker containers
- Language-specific environment selection
- Resource limits and monitoring
- Result capture and validation

**See:** [Execution Engine](../packages/execution.md), [Task Execution Engine Module](../../Modules/TaskExecutionEngine.md)

### Layer 4: AI Integration

**Components:**
- AI Provider Interface
- Provider Implementations (Ollama, ResetData)
- Model Selection Logic
- Prompt Management

**Responsibilities:**
- Abstract AI provider differences
- Route requests to appropriate models
- Manage system prompts and context
- Handle AI provider failover

**See:** [AI Providers](../packages/ai.md), [Providers](../packages/providers.md)

### Layer 5: Storage & State

**Components:**
- DHT Storage (distributed)
- Encrypted Storage (age encryption)
- UCXL Decision Publisher
- Hypercore Log (append-only)

**Responsibilities:**
- Distributed data storage
- Encryption and key management
- Immutable decision recording
- Event log persistence

**See:** [DHT](../packages/dht.md), [UCXL](../packages/ucxl.md)

### Layer 6: Security & Validation

**Components:**
- License Validator (KACHING)
- SHHH Sentinel (secrets detection)
- Crypto Layer (encryption)
- Security Policies

**Responsibilities:**
- License enforcement
- Secrets scanning and redaction
- Cryptographic operations
- Security policy enforcement

**See:** [Crypto](../packages/crypto.md), [SHHH](../packages/shhh.md), [Licensing](../internal/licensing.md)

### Layer 7: Observability

**Components:**
- Metrics Collector (CHORUS Metrics)
- Health Checks (liveness, readiness)
- BACKBEAT Integration (P2P telemetry)
- Hypercore Log (coordination events)

**Responsibilities:**
- System metrics collection
- Health monitoring
- P2P operation tracking
- Event logging and audit trails

**See:** [Metrics](../packages/metrics.md), [Health](../packages/health.md)

### Layer 8: External Interfaces

**Components:**
- HTTP API Server
- UCXI Server (content resolution)
- HAP Terminal Interface
- HAP Web Interface [STUB]

**Responsibilities:**
- REST API endpoints
- UCXL content resolution
- Human interaction interfaces
- External system integration

**See:** [API](../api/README.md), [UCXI](../packages/ucxi.md), [HAP UI](../internal/hapui.md)

---

## Key Components

### Runtime Architecture

```
┌──────────────────────────────────────────────────────────────┐
│  main.go (cmd/agent or cmd/hap)                              │
│       │                                                      │
│       └─→ internal/runtime.Initialize()                      │
│            │                                                 │
│            ├─→ Config Loading (environment)                  │
│            ├─→ License Validation (KACHING)                  │
│            ├─→ AI Provider Setup (Ollama/ResetData)          │
│            ├─→ P2P Node Creation (libp2p)                    │
│            ├─→ PubSub Initialization                         │
│            ├─→ DHT Setup (optional)                          │
│            ├─→ Election Manager                              │
│            ├─→ Task Coordinator                              │
│            ├─→ HTTP API Server                               │
│            ├─→ UCXI Server (optional)                        │
│            └─→ Health & Metrics                              │
│                                                              │
│  SharedRuntime                                               │
│  ├── Context & Cancellation                                  │
│  ├── Logger (SimpleLogger)                                   │
│  ├── Config (*config.Config)                                 │
│  ├── RuntimeConfig (dynamic assignments)                     │
│  ├── P2P Node (*p2p.Node)                                    │
│  ├── PubSub (*pubsub.PubSub)                                 │
│  ├── DHT (*dht.LibP2PDHT)                                    │
│  ├── Encrypted Storage (*dht.EncryptedDHTStorage)            │
│  ├── Election Manager (*election.ElectionManager)            │
│  ├── Task Coordinator (*coordinator.TaskCoordinator)         │
│  ├── HTTP Server (*api.HTTPServer)                           │
│  ├── UCXI Server (*ucxi.Server)                              │
│  ├── Health Manager (*health.Manager)                        │
│  ├── Metrics (*metrics.CHORUSMetrics)                        │
│  ├── SHHH Sentinel (*shhh.Sentinel)                          │
│  ├── BACKBEAT Integration (*backbeat.Integration)            │
│  └── Decision Publisher (*ucxl.DecisionPublisher)            │
└──────────────────────────────────────────────────────────────┘
```

### Binary Separation

CHORUS provides three binaries with shared infrastructure:

| Binary | Purpose | Mode | Status |
|--------|---------|------|--------|
| **chorus-agent** | Autonomous AI agent | Agent Mode | ✅ Production |
| **chorus-hap** | Human Agent Portal | HAP Mode | 🔶 Beta |
| **chorus** | Compatibility wrapper | N/A | 🔴 Deprecated |

All binaries share:
- P2P infrastructure (libp2p, PubSub, DHT)
- Election and coordination systems
- Security and encryption layers
- Configuration and licensing

Differences:
- **Agent**: Automatic task execution, autonomous reasoning
- **HAP**: Terminal/web UI for human interaction, manual task approval

**See:** [Commands](../commands/README.md)

---

## Data Flow

### Task Execution Flow

```
1. Task Request Arrives
   │
   ├─→ Via PubSub (from another agent)
   ├─→ Via HTTP API (from external system)
   └─→ Via HAP (from human operator)
   │
   ↓
2. Task Coordinator Receives Task
   │
   ├─→ Check agent availability
   ├─→ Validate task structure
   └─→ Assign to execution engine
   │
   ↓
3. Execution Engine Processes
   │
   ├─→ Detect language (Go, Rust, Python, etc.)
   ├─→ Select Docker image
   ├─→ Create sandbox configuration
   ├─→ Start container
   │   │
   │   ├─→ Mount /workspace/input (read-only source)
   │   ├─→ Mount /workspace/data (working directory)
   │   └─→ Mount /workspace/output (deliverables)
   │
   ├─→ Execute commands via Docker Exec API
   ├─→ Stream stdout/stderr
   ├─→ Monitor resource usage
   └─→ Capture exit codes
   │
   ↓
4. Result Processing
   │
   ├─→ Collect artifacts from /workspace/output
   ├─→ Generate task summary
   ├─→ Create UCXL decision record
   └─→ Publish to DHT (encrypted)
   │
   ↓
5. Result Distribution
   │
   ├─→ Broadcast completion via PubSub
   ├─→ Update task tracker (availability)
   ├─→ Notify requester (if HTTP API)
   └─→ Log to Hypercore (audit trail)
```

### Decision Publishing Flow

```
Agent Decision Made
   │
   ↓
Generate UCXL Context Address
   │
   ├─→ Hash decision content (SHA-256)
   ├─→ Create ucxl:// URI
   └─→ Add metadata (agent ID, timestamp)
   │
   ↓
Encrypt Decision Data
   │
   ├─→ Use age encryption
   ├─→ Derive key from shared secret
   └─→ Create encrypted blob
   │
   ↓
Store in DHT
   │
   ├─→ Key: UCXL hash
   ├─→ Value: Encrypted decision
   └─→ TTL: Configured expiration
   │
   ↓
Announce on PubSub
   │
   ├─→ Topic: "chorus/decisions"
   ├─→ Payload: UCXL address only
   └─→ Interested peers can fetch from DHT
```

### Election Flow

```
Agent Startup
   │
   ↓
Join Election Topic
   │
   ├─→ Subscribe to "chorus/election/v1"
   ├─→ Announce presence
   └─→ Share capabilities
   │
   ↓
Send Heartbeats
   │
   ├─→ Every 5 seconds
   ├─→ Include: Node ID, Uptime, Load
   └─→ Track other peers' heartbeats
   │
   ↓
Monitor Admin Status
   │
   ├─→ Track last admin heartbeat
   ├─→ Timeout: 15 seconds
   └─→ If timeout → Trigger election
   │
   ↓
Election Triggered
   │
   ├─→ All agents propose themselves
   ├─→ Vote for highest uptime
   ├─→ Consensus on winner
   └─→ Winner becomes admin
   │
   ↓
Admin Elected
   │
   ├─→ Winner assumes admin role
   ├─→ Applies admin configuration
   ├─→ Enables SLURP coordination
   └─→ Continues heartbeat at higher frequency
```

---

## Deployment Models

### Model 1: Local Development

```
┌─────────────────────────────────────────┐
│            Developer Laptop             │
│                                         │
│  ┌──────────────┐   ┌──────────────┐    │
│  │ chorus-agent │   │ chorus-hap   │    │
│  │ (Alice)      │   │ (Human)      │    │
│  └──────┬───────┘   └──────┬───────┘    │
│         │                  │            │
│         └────────┬─────────┘            │
│                  │                      │
│           mDNS Discovery                │
│          P2P Mesh (local)               │
│                                         │
│  Ollama: localhost:11434                │
│  Docker: /var/run/docker.sock           │
└─────────────────────────────────────────┘
```

**Characteristics:**
- Single machine deployment
- mDNS for peer discovery
- Local Ollama instance
- Shared Docker socket
- No DHT required

**Use Cases:**
- Local testing
- Development workflows
- Single-user tasks

### Model 2: Docker Swarm Cluster

```
┌────────────────────────────────────────────────────────────┐
│                    Docker Swarm Cluster                    │
│                                                            │
│  Manager Node 1         Manager Node 2         Worker 1    │
│  ┌──────────────┐       ┌──────────────┐       ┌─────────┐ │
│  │ chorus-agent │←─────→│ chorus-agent │←─────→│ chorus  │ │
│  │ (Leader)     │       │ (Follower)   │       │ -agent  │ │
│  └──────────────┘       └──────────────┘       └─────────┘ │
│         ↑                       ↑                     ↑    │
│         │                       │                     │    │
│         └───────────────────────┴─────────────────────┘    │
│                Docker Swarm Overlay Network                │
│                       P2P Mesh + DHT                       │
│                                                            │
│  Shared Services:                                          │
│  - Docker Registry (private)                               │
│  - Ollama Distributed (5 nodes)                            │
│  - NFS Storage (/rust)                                     │
│  - WHOOSH (assignment server)                              │
│  - KACHING (license server)                                │
└────────────────────────────────────────────────────────────┘
```

**Characteristics:**
- Multi-node cluster
- DHT for global discovery
- Bootstrap peers for network joining
- Overlay networking
- Shared storage via NFS
- Centralized license validation

**Use Cases:**
- Production deployments
- Team collaboration
- High availability
- Scalable workloads

### Model 3: Hybrid (Agent + HAP)

```
┌──────────────────────────────────────────────────────────┐
│                  Production Environment                  │
│                                                          │
│  Docker Swarm                   Developer Workstation    │
│  ┌──────────────┐               ┌──────────────┐         │
│  │ chorus-agent │               │ chorus-hap   │         │
│  │ (Alice)      │←─────P2P─────→│ (Human-Bob)  │         │
│  └──────┬───────┘               └──────────────┘         │
│         │                                                │
│  ┌──────┴───────┐                                        │
│  │ chorus-agent │                                        │
│  │ (Carol)      │                                        │
│  └──────────────┘                                        │
│                                                          │
│  Autonomous agents run in swarm                          │
│  Human operator joins via HAP (local or remote)          │
│  Same P2P protocol, equal participants                   │
└──────────────────────────────────────────────────────────┘
```

**Characteristics:**
- Autonomous agents in production
- Human operators join as needed
- Collaborative decision-making
- HMMM meta-discussion
- Humans can override or guide

**Use Cases:**
- Supervised automation
- Human-in-the-loop workflows
- Critical decision points
- Training and oversight

---

## Related Documents

### Getting Started
- [Commands Overview](../commands/README.md) - Entry points and CLI tools
- [Deployment Guide](../deployment/README.md) - How to deploy CHORUS
- [Configuration](../deployment/configuration.md) - Environment variables and settings

### Core Systems
- [Task Execution Engine](../../Modules/TaskExecutionEngine.md) - Complete execution engine documentation
- [P2P Infrastructure](../internal/p2p.md) - libp2p networking details
- [SLURP System](../packages/slurp/README.md) - Distributed coordination

### Security
- [Security Architecture](security.md) - Security layers and threat model
- [Crypto Package](../packages/crypto.md) - Encryption and key management
- [SHHH](../packages/shhh.md) - Secrets detection and redaction
- [Licensing](../internal/licensing.md) - License validation

### Integration
- [API Reference](../api/reference.md) - HTTP API endpoints
- [UCXL System](../packages/ucxl.md) - Context addressing
- [AI Providers](../packages/ai.md) - AI integration

---

## Next Steps

For detailed information on specific components:

1. **New to CHORUS?** Start with [System Architecture](system-architecture.md)
2. **Want to deploy?** See [Deployment Guide](../deployment/README.md)
3. **Developing features?** Review [Component Map](component-map.md)
4. **Understanding execution?** Read [Task Execution Engine](../../Modules/TaskExecutionEngine.md)