CHORUS Architecture Overview
System: CHORUS - Container-First P2P Task Coordination
Version: 0.5.0-dev
Architecture Type: Distributed, Peer-to-Peer, Event-Driven
Table of Contents
- System Overview
- Core Principles
- Architecture Layers
- Key Components
- Data Flow
- Deployment Models
- Related Documents
System Overview
CHORUS is a distributed task coordination system that enables both autonomous AI agents and human operators to collaborate on software development tasks through a peer-to-peer network. The system provides:
Primary Capabilities
- Autonomous Agent Execution: AI agents that can execute code tasks in isolated Docker sandboxes
- Human-Agent Collaboration: Human Agent Portal (HAP) for human participation in agent networks
- Distributed Coordination: P2P mesh networking with democratic leader election
- Context Addressing: UCXL (Universal Context Addressing) for immutable decision tracking
- Secure Execution: Multi-layer sandboxing with Docker containers and security policies
- Collaborative Reasoning: HMMM protocol for meta-discussion and consensus building
- Encrypted Storage: DHT-based encrypted storage for sensitive data
System Philosophy
CHORUS follows these key principles:
- Container-First: All configuration via environment variables, no file-based config
- P2P by Default: No central server; agents form democratic mesh networks
- Zero-Trust Security: Every operation validated, credentials never stored in containers
- Immutable Decisions: All agent decisions recorded in content-addressed storage
- Human-in-the-Loop: Humans as first-class peers in the agent network
Core Principles
1. Container-Native Architecture
┌─────────────────────────────────────────────────────────────┐
│ CHORUS Container                                             │
│                                                               │
│  Environment Variables  →  Runtime Configuration             │
│  Volume Mounts          →  Prompts & Secrets                 │
│  Network Policies       →  Zero-Egress by Default            │
│  Signal Handling        →  Dynamic Reconfiguration (SIGHUP)  │
└─────────────────────────────────────────────────────────────┘
Key Features:
- No config files inside containers
- All settings via environment variables
- Secrets injected via secure volumes
- Dynamic assignment loading from WHOOSH
- SIGHUP-triggered reconfiguration
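The environment-only configuration and SIGHUP reload described above can be sketched roughly as follows. This is a minimal illustration, not the CHORUS implementation: the `Config` fields, the `EXAMPLE_*` variable names, and the `loadConfig` and `reloadOn` helpers are all hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"strconv"
	"syscall"
)

// Config holds settings read from the environment.
// Field and variable names are hypothetical, not the real CHORUS keys.
type Config struct {
	AgentID string
	APIPort int
}

// loadConfig reads settings through lookup (normally os.LookupEnv) and
// falls back to defaults when a variable is unset.
func loadConfig(lookup func(string) (string, bool)) (Config, error) {
	cfg := Config{AgentID: "agent-local", APIPort: 8080}
	if v, ok := lookup("EXAMPLE_AGENT_ID"); ok && v != "" {
		cfg.AgentID = v
	}
	if v, ok := lookup("EXAMPLE_API_PORT"); ok {
		port, err := strconv.Atoi(v)
		if err != nil {
			return Config{}, fmt.Errorf("EXAMPLE_API_PORT: %w", err)
		}
		cfg.APIPort = port
	}
	return cfg, nil
}

// reloadOn re-reads the environment each time a value arrives on sig and
// hands the new configuration to apply. It returns when sig is closed.
func reloadOn(sig <-chan os.Signal, lookup func(string) (string, bool), apply func(Config)) {
	for range sig {
		cfg, err := loadConfig(lookup)
		if err != nil {
			fmt.Fprintln(os.Stderr, "reload skipped:", err)
			continue
		}
		apply(cfg)
	}
}

func main() {
	cfg, err := loadConfig(os.LookupEnv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("initial config: %+v\n", cfg)

	hup := make(chan os.Signal, 2)
	signal.Notify(hup, syscall.SIGHUP)
	// A long-running agent would keep ranging over hup. This demo stops
	// delivery, injects one signal value, and closes the channel so that
	// the program terminates.
	signal.Stop(hup)
	hup <- syscall.SIGHUP
	close(hup)
	reloadOn(hup, os.LookupEnv, func(c Config) {
		fmt.Printf("reloaded config: %+v\n", c)
	})
}
```

Passing the lookup function as a parameter keeps the loader testable without touching the process environment.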
2. Peer-to-Peer Mesh Network
        Agent-1 (Alice)
           /|\
          / | \
         /  |  \
        /   |   \
   Agent-2  |  Agent-4
    (Bob)   |   (Dave)
        \   |   /
         \  |  /
          \ | /
           \|/
        Agent-3 (Carol)
- All agents are equal peers
- No central coordinator
- Democratic leader election
- mDNS local discovery
- DHT global discovery
3. Multi-Layer Security
Layer 1: License Validation (KACHING)
    ↓
Layer 2: P2P Encryption (libp2p TLS)
    ↓
Layer 3: DHT Encryption (age encryption)
    ↓
Layer 4: Docker Sandboxing (namespaces, cgroups)
    ↓
Layer 5: Network Isolation (zero-egress)
    ↓
Layer 6: SHHH Secrets Detection (scan & redact)
    ↓
Layer 7: UCXL Validation (immutable audit trail)
    ↓
Layer 8: Credential Mediation (agent uploads, not container)
Architecture Layers
CHORUS is organized into distinct architectural layers:
Layer 1: P2P Infrastructure
Components:
- libp2p Host (networking)
- mDNS Discovery (local peers)
- DHT (global peer discovery)
- PubSub (message broadcasting)
Responsibilities:
- Peer discovery and connection management
- Encrypted peer-to-peer communication
- Message routing and delivery
- Network resilience and failover
See: P2P Infrastructure
Layer 2: Coordination & Consensus
Components:
- Election Manager (leader election)
- Task Coordinator (work distribution)
- HMMM Router (meta-discussion)
- SLURP (distributed orchestration)
Responsibilities:
- Democratic leader election
- Task assignment and tracking
- Collaborative reasoning protocols
- Work distribution algorithms
See: Coordination, SLURP
Layer 3: Execution Engine
Components:
- Task Execution Engine
- Docker Sandbox
- Image Selector
- Command Executor
Responsibilities:
- Isolated code execution in Docker containers
- Language-specific environment selection
- Resource limits and monitoring
- Result capture and validation
See: Execution Engine, Task Execution Engine Module
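As a rough illustration of language-specific environment selection, the sketch below guesses a language from file extensions and maps it to a Docker image. The detection heuristic, the image tags, the fallback image, and the function names are assumptions made for this example. The actual Image Selector may work differently.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// imagesByLanguage is an illustrative mapping, not the CHORUS image list.
var imagesByLanguage = map[string]string{
	"go":     "golang:1.22",
	"rust":   "rust:1.77",
	"python": "python:3.12-slim",
}

// fallbackImage is used when no language could be detected (assumption).
const fallbackImage = "debian:bookworm-slim"

// detectLanguage counts known source-file extensions and returns the most
// common language, or "" when none of the files is recognized.
func detectLanguage(files []string) string {
	counts := map[string]int{}
	for _, f := range files {
		switch strings.ToLower(filepath.Ext(f)) {
		case ".go":
			counts["go"]++
		case ".rs":
			counts["rust"]++
		case ".py":
			counts["python"]++
		}
	}
	best, bestCount := "", 0
	// Fixed iteration order: on a tie, the language listed first wins.
	for _, lang := range []string{"go", "python", "rust"} {
		if counts[lang] > bestCount {
			best, bestCount = lang, counts[lang]
		}
	}
	return best
}

// selectImage returns the image for lang, or the fallback image.
func selectImage(lang string) string {
	if img, ok := imagesByLanguage[lang]; ok {
		return img
	}
	return fallbackImage
}

func main() {
	files := []string{"main.go", "internal/runtime/shared.go", "scripts/setup.py"}
	lang := detectLanguage(files)
	fmt.Printf("language=%q image=%s\n", lang, selectImage(lang))
}
```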
Layer 4: AI Integration
Components:
- AI Provider Interface
- Provider Implementations (Ollama, ResetData)
- Model Selection Logic
- Prompt Management
Responsibilities:
- Abstract AI provider differences
- Route requests to appropriate models
- Manage system prompts and context
- Handle AI provider failover
See: AI Providers, Providers
Layer 5: Storage & State
Components:
- DHT Storage (distributed)
- Encrypted Storage (age encryption)
- UCXL Decision Publisher
- Hypercore Log (append-only)
Responsibilities:
- Distributed data storage
- Encryption and key management
- Immutable decision recording
- Event log persistence
Layer 6: Security & Validation
Components:
- License Validator (KACHING)
- SHHH Sentinel (secrets detection)
- Crypto Layer (encryption)
- Security Policies
Responsibilities:
- License enforcement
- Secrets scanning and redaction
- Cryptographic operations
- Security policy enforcement
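Secrets scanning and redaction can be sketched with a couple of regular expressions, as below. The two patterns, the `[REDACTED]` placeholder, and the `redact` function are assumptions chosen for illustration. The rule set and behavior of the SHHH Sentinel itself are not described in this document.

```go
package main

import (
	"fmt"
	"regexp"
)

// secretPatterns is a deliberately small, illustrative rule set.
var secretPatterns = []*regexp.Regexp{
	// Strings shaped like an AWS access key ID.
	regexp.MustCompile(`AKIA[0-9A-Z]{16}`),
	// key=value or key: value pairs with a sensitive-looking key.
	regexp.MustCompile(`(?i)\b(password|token|secret)\s*[:=]\s*\S+`),
}

// redact replaces every match of every pattern with a placeholder and
// returns the cleaned text together with the number of replacements.
func redact(text string) (string, int) {
	found := 0
	for _, re := range secretPatterns {
		text = re.ReplaceAllStringFunc(text, func(string) string {
			found++
			return "[REDACTED]"
		})
	}
	return text, found
}

func main() {
	clean, n := redact("deploy with password=hunter2 on host build-01")
	fmt.Printf("%d finding(s): %s\n", n, clean)
}
```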
Layer 7: Observability
Components:
- Metrics Collector (CHORUS Metrics)
- Health Checks (liveness, readiness)
- BACKBEAT Integration (P2P telemetry)
- Hypercore Log (coordination events)
Responsibilities:
- System metrics collection
- Health monitoring
- P2P operation tracking
- Event logging and audit trails
Layer 8: External Interfaces
Components:
- HTTP API Server
- UCXI Server (content resolution)
- HAP Terminal Interface
- HAP Web Interface [STUB]
Responsibilities:
- REST API endpoints
- UCXL content resolution
- Human interaction interfaces
- External system integration
Key Components
Runtime Architecture
┌──────────────────────────────────────────────────────────────┐
│ main.go (cmd/agent or cmd/hap)                               │
│   │                                                            │
│   └─→ internal/runtime.Initialize()                           │
│          │                                                     │
│          ├─→ Config Loading (environment)                     │
│          ├─→ License Validation (KACHING)                     │
│          ├─→ AI Provider Setup (Ollama/ResetData)            │
│          ├─→ P2P Node Creation (libp2p)                       │
│          ├─→ PubSub Initialization                            │
│          ├─→ DHT Setup (optional)                             │
│          ├─→ Election Manager                                 │
│          ├─→ Task Coordinator                                 │
│          ├─→ HTTP API Server                                  │
│          ├─→ UCXI Server (optional)                           │
│          └─→ Health & Metrics                                 │
│                                                                │
│   SharedRuntime                                               │
│   ├── Context & Cancellation                                  │
│   ├── Logger (SimpleLogger)                                   │
│   ├── Config (*config.Config)                                 │
│   ├── RuntimeConfig (dynamic assignments)                     │
│   ├── P2P Node (*p2p.Node)                                    │
│   ├── PubSub (*pubsub.PubSub)                                │
│   ├── DHT (*dht.LibP2PDHT)                                    │
│   ├── Encrypted Storage (*dht.EncryptedDHTStorage)           │
│   ├── Election Manager (*election.ElectionManager)           │
│   ├── Task Coordinator (*coordinator.TaskCoordinator)        │
│   ├── HTTP Server (*api.HTTPServer)                           │
│   ├── UCXI Server (*ucxi.Server)                              │
│   ├── Health Manager (*health.Manager)                        │
│   ├── Metrics (*metrics.CHORUSMetrics)                        │
│   ├── SHHH Sentinel (*shhh.Sentinel)                          │
│   ├── BACKBEAT Integration (*backbeat.Integration)           │
│   └── Decision Publisher (*ucxl.DecisionPublisher)           │
└──────────────────────────────────────────────────────────────┘
Binary Separation
CHORUS provides three binaries with shared infrastructure:
| Binary | Purpose | Mode | Status | 
|---|---|---|---|
| chorus-agent | Autonomous AI agent | Agent Mode | ✅ Production | 
| chorus-hap | Human Agent Portal | HAP Mode | 🔶 Beta | 
| chorus | Compatibility wrapper | N/A | 🔴 Deprecated | 
All binaries share:
- P2P infrastructure (libp2p, PubSub, DHT)
- Election and coordination systems
- Security and encryption layers
- Configuration and licensing
Differences:
- Agent: Automatic task execution, autonomous reasoning
- HAP: Terminal/web UI for human interaction, manual task approval
See: Commands
Data Flow
Task Execution Flow
1. Task Request Arrives
   │
   ├─→ Via PubSub (from another agent)
   ├─→ Via HTTP API (from external system)
   └─→ Via HAP (from human operator)
   │
   ↓
2. Task Coordinator Receives Task
   │
   ├─→ Check agent availability
   ├─→ Validate task structure
   └─→ Assign to execution engine
   │
   ↓
3. Execution Engine Processes
   │
   ├─→ Detect language (Go, Rust, Python, etc.)
   ├─→ Select Docker image
   ├─→ Create sandbox configuration
   ├─→ Start container
   │   │
   │   ├─→ Mount /workspace/input (read-only source)
   │   ├─→ Mount /workspace/data (working directory)
   │   └─→ Mount /workspace/output (deliverables)
   │
   ├─→ Execute commands via Docker Exec API
   ├─→ Stream stdout/stderr
   ├─→ Monitor resource usage
   └─→ Capture exit codes
   │
   ↓
4. Result Processing
   │
   ├─→ Collect artifacts from /workspace/output
   ├─→ Generate task summary
   ├─→ Create UCXL decision record
   └─→ Publish to DHT (encrypted)
   │
   ↓
5. Result Distribution
   │
   ├─→ Broadcast completion via PubSub
   ├─→ Update task tracker (availability)
   ├─→ Notify requester (if HTTP API)
   └─→ Log to Hypercore (audit trail)
Decision Publishing Flow
Agent Decision Made
   │
   ↓
Generate UCXL Context Address
   │
   ├─→ Hash decision content (SHA-256)
   ├─→ Create ucxl:// URI
   └─→ Add metadata (agent ID, timestamp)
   │
   ↓
Encrypt Decision Data
   │
   ├─→ Use age encryption
   ├─→ Derive key from shared secret
   └─→ Create encrypted blob
   │
   ↓
Store in DHT
   │
   ├─→ Key: UCXL hash
   ├─→ Value: Encrypted decision
   └─→ TTL: Configured expiration
   │
   ↓
Announce on PubSub
   │
   ├─→ Topic: "chorus/decisions"
   ├─→ Payload: UCXL address only
   └─→ Interested peers can fetch from DHT
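The addressing and announcement steps of this flow can be sketched as follows. The address layout `ucxl://<sha256-hex>`, the `Announcement` fields, and the helper names are hypothetical, since the real UCXL address grammar is not specified here. The age encryption step is omitted because it needs a third-party library.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"time"
)

// decisionsTopic is the PubSub topic named in the flow above.
const decisionsTopic = "chorus/decisions"

// Announcement carries only the address and metadata, never the content.
// The field set is an assumption for this sketch.
type Announcement struct {
	Address   string    `json:"address"`
	AgentID   string    `json:"agent_id"`
	Timestamp time.Time `json:"timestamp"`
}

// contentHash returns the hex-encoded SHA-256 digest of content.
func contentHash(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// decisionAddress builds a content-derived address.
// The "ucxl://<hash>" layout is hypothetical.
func decisionAddress(content []byte) string {
	return "ucxl://" + contentHash(content)
}

func main() {
	decision := []byte(`{"task":"build","result":"success"}`)
	ann := Announcement{
		Address:   decisionAddress(decision),
		AgentID:   "agent-alice",
		Timestamp: time.Now().UTC(),
	}
	payload, err := json.Marshal(ann)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Printf("publish on %q: %s\n", decisionsTopic, payload)
}
```

Because the address is derived from the content, two peers that hold the same decision bytes compute the same address, which is what allows the announcement to carry the address alone.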
Election Flow
Agent Startup
   │
   ↓
Join Election Topic
   │
   ├─→ Subscribe to "chorus/election/v1"
   ├─→ Announce presence
   └─→ Share capabilities
   │
   ↓
Send Heartbeats
   │
   ├─→ Every 5 seconds
   ├─→ Include: Node ID, Uptime, Load
   └─→ Track other peers' heartbeats
   │
   ↓
Monitor Admin Status
   │
   ├─→ Track last admin heartbeat
   ├─→ Timeout: 15 seconds
   └─→ If timeout → Trigger election
   │
   ↓
Election Triggered
   │
   ├─→ All agents propose themselves
   ├─→ Vote for highest uptime
   ├─→ Consensus on winner
   └─→ Winner becomes admin
   │
   ↓
Admin Elected
   │
   ├─→ Winner assumes admin role
   ├─→ Applies admin configuration
   ├─→ Enables SLURP coordination
   └─→ Continues heartbeat at higher frequency
Deployment Models
Model 1: Local Development
┌─────────────────────────────────────────┐
│ Developer Laptop                         │
│                                          │
│  ┌──────────────┐  ┌──────────────┐    │
│  │ chorus-agent │  │ chorus-hap   │    │
│  │  (Alice)     │  │  (Human)     │    │
│  └──────┬───────┘  └──────┬───────┘    │
│         │                  │             │
│         └────────┬─────────┘             │
│                  │                       │
│           mDNS Discovery                 │
│           P2P Mesh (local)               │
│                                          │
│  Ollama: localhost:11434                │
│  Docker: /var/run/docker.sock           │
└─────────────────────────────────────────┘
Characteristics:
- Single machine deployment
- mDNS for peer discovery
- Local Ollama instance
- Shared Docker socket
- No DHT required
Use Cases:
- Local testing
- Development workflows
- Single-user tasks
Model 2: Docker Swarm Cluster
┌────────────────────────────────────────────────────────────┐
│ Docker Swarm Cluster                                        │
│                                                              │
│  Manager Node 1          Manager Node 2          Worker 1   │
│  ┌──────────────┐       ┌──────────────┐       ┌─────────┐ │
│  │ chorus-agent │←─────→│ chorus-agent │←─────→│ chorus  │ │
│  │ (Leader)     │       │ (Follower)   │       │ -agent  │ │
│  └──────────────┘       └──────────────┘       └─────────┘ │
│         ↑                       ↑                     ↑      │
│         │                       │                     │      │
│         └───────────────────────┴─────────────────────┘      │
│                     Docker Swarm Overlay Network             │
│                     P2P Mesh + DHT                           │
│                                                              │
│  Shared Services:                                           │
│  - Docker Registry (private)                                │
│  - Ollama Distributed (5 nodes)                             │
│  - NFS Storage (/rust)                                      │
│  - WHOOSH (assignment server)                               │
│  - KACHING (license server)                                 │
└────────────────────────────────────────────────────────────┘
Characteristics:
- Multi-node cluster
- DHT for global discovery
- Bootstrap peers for network joining
- Overlay networking
- Shared storage via NFS
- Centralized license validation
Use Cases:
- Production deployments
- Team collaboration
- High availability
- Scalable workloads
Model 3: Hybrid (Agent + HAP)
┌──────────────────────────────────────────────────────────┐
│ Production Environment                                    │
│                                                            │
│  Docker Swarm                    Developer Workstation    │
│  ┌──────────────┐               ┌──────────────┐         │
│  │ chorus-agent │               │ chorus-hap   │         │
│  │ (Alice)      │←─────P2P─────→│ (Human-Bob)  │         │
│  └──────┬───────┘               └──────────────┘         │
│         │                                                  │
│  ┌──────┴───────┐                                         │
│  │ chorus-agent │                                         │
│  │ (Carol)      │                                         │
│  └──────────────┘                                         │
│                                                            │
│  Autonomous agents run in swarm                           │
│  Human operator joins via HAP (local or remote)           │
│  Same P2P protocol, equal participants                    │
└──────────────────────────────────────────────────────────┘
Characteristics:
- Autonomous agents in production
- Human operators join as needed
- Collaborative decision-making
- HMMM meta-discussion
- Humans can override or guide
Use Cases:
- Supervised automation
- Human-in-the-loop workflows
- Critical decision points
- Training and oversight
Related Documents
Getting Started
- Commands Overview - Entry points and CLI tools
- Deployment Guide - How to deploy CHORUS
- Configuration - Environment variables and settings
Core Systems
- Task Execution Engine - Complete execution engine documentation
- P2P Infrastructure - libp2p networking details
- SLURP System - Distributed coordination
Security
- Security Architecture - Security layers and threat model
- Crypto Package - Encryption and key management
- SHHH - Secrets detection and redaction
- Licensing - License validation
Integration
- API Reference - HTTP API endpoints
- UCXL System - Context addressing
- AI Providers - AI integration
Next Steps
For detailed information on specific components:
- New to CHORUS? Start with System Architecture
- Want to deploy? See Deployment Guide
- Developing features? Review Component Map
- Understanding execution? Read Task Execution Engine