CHORUS/docs/comprehensive/commands/chorus-agent.md
anthonyrawlins bd19709b31 docs: Add comprehensive documentation foundation (Phase 1: Architecture & Commands)
Created complete documentation infrastructure with master index and detailed
command-line tool documentation.

Documentation Structure:
- docs/comprehensive/README.md - Master index with navigation
- docs/comprehensive/architecture/README.md - System architecture overview
- docs/comprehensive/commands/chorus-agent.md - Autonomous agent binary (Production)
- docs/comprehensive/commands/chorus-hap.md - Human Agent Portal (🔶 Beta)
- docs/comprehensive/commands/chorus.md - Deprecated wrapper (⚠️ Deprecated)

Coverage Statistics:
- 3 command binaries fully documented (3,056 lines, ~14,500 words)
- Complete source code analysis with line numbers
- Configuration reference for all environment variables
- Runtime behavior and execution flows
- P2P networking details
- Health checks and monitoring
- Example deployments (local, Docker, Swarm)
- Troubleshooting guides
- Cross-references between docs

Key Features Documented:
- Container-first architecture
- P2P mesh networking
- Democratic leader election
- Docker sandbox execution
- HMMM collaborative reasoning
- UCXL decision publishing
- DHT encrypted storage
- Multi-layer security
- Human-agent collaboration

Implementation Status Tracking:
- Production features marked
- 🔶 Beta features identified
- Stubbed components noted
- ⚠️ Deprecated code flagged

Next Phase: Package documentation (30+ packages in pkg/)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 13:49:46 +10:00


chorus-agent - Autonomous Agent Binary

Binary: chorus-agent
Source: cmd/agent/main.go
Status: Production
Purpose: Autonomous AI agent for P2P task coordination


Overview

chorus-agent is the primary executable for running autonomous AI agents in the CHORUS system. Agents participate in peer-to-peer networks, execute tasks in isolated Docker sandboxes, collaborate with other agents via the HMMM protocol, and maintain distributed state through DHT storage.

Key Features

  • Autonomous Operation: Executes tasks without human intervention
  • P2P Networking: Participates in distributed mesh network
  • Docker Sandboxing: Isolated code execution environments
  • HMMM Reasoning: Collaborative meta-discussion protocol
  • DHT Storage: Encrypted distributed data storage
  • UCXL Publishing: Immutable decision recording
  • Democratic Elections: Participates in leader election
  • Health Monitoring: Self-reporting health status

Usage

Basic Invocation

# With required environment variables
CHORUS_LICENSE_ID=dev-123 \
CHORUS_AGENT_ID=chorus-agent-1 \
./chorus-agent

Help Output

$ ./chorus-agent --help
CHORUS-agent 0.5.0-dev (build: abc123, 2025-09-30)

Usage:
  chorus-agent [--help] [--version]

CHORUS Autonomous Agent - P2P Task Coordination

This binary runs autonomous AI agents that participate in P2P task coordination,
collaborative reasoning via HMMM, and distributed decision making.

Environment (common):
  CHORUS_LICENSE_ID              (required)
  CHORUS_AGENT_ID                (optional; auto-generated if empty)
  CHORUS_P2P_PORT                (default 9000)
  CHORUS_API_PORT                (default 8080)
  CHORUS_HEALTH_PORT             (default 8081)
  CHORUS_DHT_ENABLED             (default true)
  CHORUS_BOOTSTRAP_PEERS         (comma-separated multiaddrs)
  OLLAMA_ENDPOINT                (default http://localhost:11434)

Example:
  CHORUS_LICENSE_ID=dev-123 \
  CHORUS_AGENT_ID=chorus-agent-1 \
  CHORUS_P2P_PORT=9000 CHORUS_API_PORT=8080 ./chorus-agent

Agent Features:
  - Autonomous task execution
  - P2P mesh networking
  - HMMM collaborative reasoning
  - DHT encrypted storage
  - UCXL context addressing
  - Democratic leader election
  - Health monitoring

Version Information

$ ./chorus-agent --version
CHORUS-agent 0.5.0-dev (build: abc123, 2025-09-30)

Source Code Analysis

File: cmd/agent/main.go

Lines: 79
Package: main
Imports:

  • chorus/internal/runtime - Shared P2P runtime infrastructure

Build-Time Variables

// Lines 11-16
var (
    version    = "0.5.0-dev"
    commitHash = "unknown"
    buildDate  = "unknown"
)

Set via ldflags:

# Run from the repository root so the package path resolves
go build -ldflags "-X main.version=1.0.0 -X main.commitHash=$(git rev-parse --short HEAD) -X main.buildDate=$(date -u +%Y-%m-%d)" -o chorus-agent ./cmd/agent

main() Function Flow

func main() {
    // 1. CLI Argument Handling (lines 19-59)
    //    - Check for --help, -h, help
    //    - Check for --version, -v
    //    - Print usage and exit early if found

    // 2. Set Build Information (lines 61-64)
    runtime.AppVersion = version
    runtime.AppCommitHash = commitHash
    runtime.AppBuildDate = buildDate

    // 3. Initialize Shared Runtime (lines 66-72)
    sharedRuntime, err := runtime.Initialize("agent")
    if err != nil {
        // Fatal error, exit 1
    }
    defer sharedRuntime.Cleanup()

    // 4. Start Agent Mode (lines 74-78)
    if err := sharedRuntime.StartAgentMode(); err != nil {
        // Fatal error, exit 1
    }
}

Execution Phases

Phase 1: Early CLI Handling (lines 19-59)

Purpose: Handle help/version requests without loading configuration

Code:

for _, a := range os.Args[1:] {
    switch a {
    case "--help", "-h", "help":
        // Print detailed help message
        fmt.Printf("%s-agent %s (build: %s, %s)\n\n", runtime.AppName, version, commitHash, buildDate)
        // ... usage information ...
        return
    case "--version", "-v":
        fmt.Printf("%s-agent %s (build: %s, %s)\n", runtime.AppName, version, commitHash, buildDate)
        return
    }
}

Why It Matters: Users can get help or version output without a valid license or any other configuration.

Phase 2: Runtime Initialization (line 67)

Function Call: runtime.Initialize("agent")

What Happens:

  1. Load configuration from environment variables
  2. Validate CHORUS license with KACHING server
  3. Initialize AI provider (Ollama or ResetData)
  4. Create P2P libp2p node
  5. Start mDNS discovery
  6. Initialize PubSub messaging
  7. Set up DHT (if enabled)
  8. Start election manager
  9. Create task coordinator
  10. Start HTTP API server
  11. Start UCXI server (if enabled)
  12. Initialize health checks
  13. Set up SHHH sentinel (secrets detection)
  14. Configure metrics collection

Returns: *runtime.SharedRuntime containing all initialized components

See: internal/runtime Documentation for complete initialization details

Phase 3: Agent Mode Activation (line 75)

Function Call: sharedRuntime.StartAgentMode()

What Happens:

  1. Agent registers itself as available for tasks
  2. Begins listening for task assignments via PubSub
  3. Starts autonomous task execution loops
  4. Enables automatic decision making
  5. Activates HMMM meta-discussion participation
  6. Begins heartbeat broadcasting for election

Implementation: See internal/runtime/agent_support.go

Behavior Differences from HAP:

  • Agent: Automatically accepts and executes tasks
  • HAP: Prompts human for task approval

Configuration

Required Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| CHORUS_LICENSE_ID | License key from KACHING | dev-123 |

Optional Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| CHORUS_AGENT_ID | Auto-generated | Unique agent identifier |
| CHORUS_P2P_PORT | 9000 | libp2p listening port |
| CHORUS_API_PORT | 8080 | HTTP API port |
| CHORUS_HEALTH_PORT | 8081 | Health check port |
| CHORUS_DHT_ENABLED | true | Enable distributed hash table |
| CHORUS_BOOTSTRAP_PEERS | "" | Comma-separated multiaddrs |
| OLLAMA_ENDPOINT | http://localhost:11434 | Ollama API endpoint |

Role-Based Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| CHORUS_AGENT_ROLE | "" | Agent role (admin, developer, reviewer) |
| CHORUS_AGENT_EXPERTISE | "" | Comma-separated expertise areas |
| CHORUS_AGENT_REPORTS_TO | "" | Supervisor agent ID |
| CHORUS_AGENT_SPECIALIZATION | "general" | Task specialization |
| CHORUS_AGENT_MAX_TASKS | 3 | Max concurrent tasks |
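
As a rough illustration of how the role-based variables and their defaults could be read, here is a minimal Go sketch. The AgentConfig struct, the helper names, and the decision to reject a non-numeric or non-positive CHORUS_AGENT_MAX_TASKS are all assumptions made for this example; the actual loader in the CHORUS source may behave differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// AgentConfig is a hypothetical struct used only for this sketch.
type AgentConfig struct {
	Role           string
	Expertise      []string
	ReportsTo      string
	Specialization string
	MaxTasks       int
}

// splitList turns "rust, go," into ["rust" "go"], dropping empty entries.
func splitList(s string) []string {
	var out []string
	for _, part := range strings.Split(s, ",") {
		if p := strings.TrimSpace(part); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// loadAgentConfig reads the role-based variables through getenv and applies
// the defaults from the table above. Rejecting a non-numeric or non-positive
// CHORUS_AGENT_MAX_TASKS is an assumption of this sketch.
func loadAgentConfig(getenv func(string) string) (AgentConfig, error) {
	cfg := AgentConfig{
		Role:           getenv("CHORUS_AGENT_ROLE"),
		Expertise:      splitList(getenv("CHORUS_AGENT_EXPERTISE")),
		ReportsTo:      getenv("CHORUS_AGENT_REPORTS_TO"),
		Specialization: "general",
		MaxTasks:       3,
	}
	if v := getenv("CHORUS_AGENT_SPECIALIZATION"); v != "" {
		cfg.Specialization = v
	}
	if v := getenv("CHORUS_AGENT_MAX_TASKS"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil || n < 1 {
			return AgentConfig{}, fmt.Errorf("CHORUS_AGENT_MAX_TASKS must be a positive integer, got %q", v)
		}
		cfg.MaxTasks = n
	}
	return cfg, nil
}

func main() {
	cfg, err := loadAgentConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing the lookup function as a parameter keeps the loader easy to exercise with a map in tests instead of mutating the process environment.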

AI Provider Configuration

Ollama (Default)

export CHORUS_AI_PROVIDER=ollama
export OLLAMA_ENDPOINT=http://192.168.1.72:11434

ResetData

export CHORUS_AI_PROVIDER=resetdata
export RESETDATA_API_KEY=your-api-key-here
export RESETDATA_BASE_URL=https://api.resetdata.ai
export RESETDATA_MODEL=claude-3-5-sonnet-20250930

Assignment Loading

Agents can load dynamic configuration from WHOOSH:

export ASSIGN_URL=https://whoosh.example.com/api/assignments/agent-123.json

When configured, agents:

  1. Fetch assignment JSON on startup
  2. Merge with environment config
  3. Listen for SIGHUP to reload
  4. Update configuration without restart

See: Configuration Management for assignment schema


Runtime Behavior

Startup Sequence

1. Parse CLI arguments
   ├─→ --help → print help, exit 0
   ├─→ --version → print version, exit 0
   └─→ (none) → continue

2. Set build information in runtime package

3. Initialize shared runtime
   ├─→ Load environment configuration
   ├─→ Validate license with KACHING
   │   └─→ FAIL → print error, exit 1
   ├─→ Configure AI provider
   ├─→ Create P2P node
   ├─→ Start mDNS discovery
   ├─→ Initialize PubSub
   ├─→ Set up DHT (optional)
   ├─→ Start election manager
   ├─→ Create task coordinator
   ├─→ Start HTTP API server
   └─→ Initialize health checks

4. Start agent mode
   ├─→ Register as available agent
   ├─→ Join task coordination topics
   ├─→ Begin heartbeat broadcasting
   ├─→ Enable autonomous task execution
   └─→ Activate HMMM participation

5. Run until signal (SIGINT, SIGTERM)

6. Cleanup on shutdown
   ├─→ Stop accepting new tasks
   ├─→ Complete in-flight tasks
   ├─→ Close P2P connections
   ├─→ Flush DHT cache
   ├─→ Stop HTTP servers
   └─→ Exit gracefully

Signal Handling

| Signal | Behavior |
|--------|----------|
| SIGINT | Graceful shutdown (complete current tasks) |
| SIGTERM | Graceful shutdown (complete current tasks) |
| SIGHUP | Reload configuration from ASSIGN_URL |
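
The signal table can be sketched as a small dispatch loop in Go. Everything here is illustrative: actionFor, run, preload, and demoSequence are hypothetical names, and the reload and shutdown callbacks stand in for whatever the runtime actually does. In a real process the channel would be fed by signal.Notify from os/signal; this sketch preloads it so the program terminates on its own.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// Action labels are hypothetical names for this sketch.
const (
	actionShutdown = "shutdown"
	actionReload   = "reload"
	actionIgnore   = "ignore"
)

// actionFor maps a signal to the behavior listed in the signal table.
func actionFor(sig os.Signal) string {
	switch sig {
	case syscall.SIGINT, syscall.SIGTERM:
		return actionShutdown
	case syscall.SIGHUP:
		return actionReload
	default:
		return actionIgnore
	}
}

// run consumes signals until a shutdown signal arrives or the channel closes.
func run(sigs <-chan os.Signal, reload, shutdown func()) {
	for sig := range sigs {
		switch actionFor(sig) {
		case actionReload:
			reload()
		case actionShutdown:
			shutdown()
			return
		}
	}
}

// preload returns a closed channel that already holds seq, standing in for
// a channel registered with signal.Notify.
func preload(seq ...os.Signal) <-chan os.Signal {
	ch := make(chan os.Signal, len(seq))
	for _, s := range seq {
		ch <- s
	}
	close(ch)
	return ch
}

// demoSequence simulates two reloads followed by a termination request.
// The trailing SIGHUP is never handled because run returns on SIGTERM.
var demoSequence = []os.Signal{syscall.SIGHUP, syscall.SIGHUP, syscall.SIGTERM, syscall.SIGHUP}

func main() {
	run(preload(demoSequence...),
		func() { fmt.Println("reload: re-fetch assignment from ASSIGN_URL (placeholder)") },
		func() { fmt.Println("shutdown: finish in-flight tasks, then exit (placeholder)") },
	)
}
```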

Task Execution Loop

Once in agent mode:

Loop Forever:
│
├─→ Listen for tasks on PubSub topic "chorus/tasks"
│
├─→ Task received:
│   ├─→ Check agent availability (< max tasks)
│   ├─→ Check task matches specialization
│   └─→ Accept or decline
│
├─→ Task accepted:
│   ├─→ Increment active task count
│   ├─→ Log task start to Hypercore
│   ├─→ Invoke execution engine
│   │   ├─→ Select Docker image based on language
│   │   ├─→ Create sandbox container
│   │   ├─→ Execute commands via Docker Exec API
│   │   ├─→ Stream output
│   │   ├─→ Monitor resource usage
│   │   └─→ Capture results
│   ├─→ Generate task summary
│   ├─→ Create UCXL decision record
│   ├─→ Publish decision to DHT
│   ├─→ Broadcast completion on PubSub
│   ├─→ Decrement active task count
│   └─→ Log task completion to Hypercore
│
└─→ Continue listening

See: Task Execution Engine for execution details
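
The accept-or-decline step of the loop (capacity check, then specialization check, then slot bookkeeping) might look roughly like the following Go sketch. Task, Agent, TryAccept, and Done are hypothetical names, and treating an empty expertise list as "accept any language" is an assumption, not documented behavior.

```go
package main

import (
	"fmt"
	"sync"
)

// Task and Agent are hypothetical types for this sketch.
type Task struct {
	ID       string
	Language string
}

type Agent struct {
	mu        sync.Mutex
	active    int
	MaxTasks  int
	Expertise []string // assumption: empty means "accept any language"
}

// matches reports whether the task language is within the agent's expertise.
func (a *Agent) matches(t Task) bool {
	if len(a.Expertise) == 0 {
		return true
	}
	for _, e := range a.Expertise {
		if e == t.Language {
			return true
		}
	}
	return false
}

// TryAccept reserves a slot when the agent has capacity and the task matches.
// It returns false when the task is declined.
func (a *Agent) TryAccept(t Task) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.active >= a.MaxTasks || !a.matches(t) {
		return false
	}
	a.active++
	return true
}

// Done releases a slot after a task completes or fails.
func (a *Agent) Done() {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.active > 0 {
		a.active--
	}
}

func main() {
	agent := &Agent{MaxTasks: 1, Expertise: []string{"go", "rust"}}
	tasks := []Task{
		{ID: "t1", Language: "go"},
		{ID: "t2", Language: "rust"},
		{ID: "t3", Language: "python"},
	}
	for _, t := range tasks {
		fmt.Printf("%s accepted=%v\n", t.ID, agent.TryAccept(t))
	}
}
```

Checking capacity and incrementing the counter under one lock avoids two concurrent task messages both claiming the last free slot.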


P2P Networking

Peer Discovery

mDNS (Local):

  • Discovers peers on local network
  • Service name: chorus-peer-discovery
  • No configuration required
  • Automatic peer connection

DHT (Global):

  • Discovers peers across networks
  • Requires bootstrap peers
  • Content-addressed routing
  • Kademlia-based DHT

Bootstrap Peers:

export CHORUS_BOOTSTRAP_PEERS="/ip4/192.168.1.100/tcp/9000/p2p/12D3KooWABC...,/ip4/192.168.1.101/tcp/9000/p2p/12D3KooWXYZ..."
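
For illustration, the comma-separated value can be split and sanity-checked with the standard library alone, as in the Go sketch below. parseBootstrapPeers is a hypothetical helper, and the check (leading "/" plus a "/p2p/" component) is only a shallow heuristic; a real implementation would more likely parse each entry with a multiaddr library.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseBootstrapPeers splits the CHORUS_BOOTSTRAP_PEERS value into individual
// addresses. The validation is a shallow heuristic chosen for this sketch,
// not a full multiaddr parse.
func parseBootstrapPeers(raw string) ([]string, error) {
	var peers []string
	for _, part := range strings.Split(raw, ",") {
		addr := strings.TrimSpace(part)
		if addr == "" {
			continue // tolerate trailing commas and empty values
		}
		if !strings.HasPrefix(addr, "/") || !strings.Contains(addr, "/p2p/") {
			return nil, fmt.Errorf("not a peer multiaddr: %q", addr)
		}
		peers = append(peers, addr)
	}
	return peers, nil
}

func main() {
	// Placeholder peer IDs copied from the example above; they are not real.
	raw := "/ip4/192.168.1.100/tcp/9000/p2p/12D3KooWABC...,/ip4/192.168.1.101/tcp/9000/p2p/12D3KooWXYZ..."
	peers, err := parseBootstrapPeers(raw)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, p := range peers {
		fmt.Println(p)
	}
}
```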

Topics Subscribed

| Topic | Purpose |
|-------|---------|
| chorus/coordination/v1 | Task coordination messages |
| hmmm/meta-discussion/v1 | Collaborative reasoning |
| chorus/election/v1 | Leader election heartbeats |
| chorus/decisions | Decision announcements |
| chorus/health | Health status broadcasts |

Role-Based Topics (Optional)

If CHORUS_AGENT_ROLE is set, agent also joins:

| Topic | Purpose |
|-------|---------|
| chorus/role/{role} | Role-specific coordination |
| chorus/expertise/{expertise} | Expertise-based routing |
| chorus/reports/{supervisor} | Reporting hierarchy |
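
A minimal Go sketch of how the role-based topic names could be derived from configuration follows. roleTopics is a hypothetical helper; joining one topic per expertise area and skipping the reports topic when no supervisor is set are assumptions of this example.

```go
package main

import "fmt"

// roleTopics expands the role-based topic templates. It returns nil when no
// role is configured, matching the rule that these topics are optional.
func roleTopics(role string, expertise []string, reportsTo string) []string {
	if role == "" {
		return nil
	}
	topics := []string{"chorus/role/" + role}
	for _, e := range expertise {
		// Assumption: one topic per expertise area.
		topics = append(topics, "chorus/expertise/"+e)
	}
	if reportsTo != "" {
		// Assumption: no reports topic without a supervisor.
		topics = append(topics, "chorus/reports/"+reportsTo)
	}
	return topics
}

func main() {
	for _, t := range roleTopics("developer", []string{"rust", "go"}, "agent-admin") {
		fmt.Println(t)
	}
}
```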

Health Checks

HTTP Endpoints

Liveness Probe:

curl http://localhost:8081/healthz
# Returns: 200 OK if agent is alive

Readiness Probe:

curl http://localhost:8081/ready
# Returns: 200 OK if agent is ready for tasks
# Returns: 503 Service Unavailable if at max capacity

Health Details:

curl http://localhost:8081/health
# Returns JSON with:
# - P2P connectivity status
# - DHT reachability
# - Active task count
# - Available capacity
# - Last heartbeat time

Health Criteria

Agent is healthy when:

  • License valid
  • P2P node connected
  • At least 1 peer discovered
  • Election manager running
  • Task coordinator active
  • HTTP API responding

Agent is ready when:

  • All health checks pass
  • Active tasks < max tasks
  • Docker daemon reachable
  • AI provider accessible

See: Health Package
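
The readiness rule described above (all health checks pass, spare task capacity, Docker and AI provider reachable) can be sketched as a plain function wrapped in an HTTP handler. The Status struct and function names are hypothetical; this is not the health package's actual API, only an illustration of the 200/503 decision.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Status is a hypothetical snapshot of the checks listed above.
type Status struct {
	Healthy             bool // all health criteria pass
	ActiveTasks         int
	MaxTasks            int
	DockerReachable     bool
	AIProviderReachable bool
}

// readinessCode returns 200 when the agent can take another task and 503 otherwise.
func readinessCode(s Status) int {
	if s.Healthy && s.ActiveTasks < s.MaxTasks && s.DockerReachable && s.AIProviderReachable {
		return http.StatusOK
	}
	return http.StatusServiceUnavailable
}

// readyHandler serves the readiness decision for the current status.
func readyHandler(current func() Status) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(readinessCode(current()))
	}
}

func main() {
	// An agent at max capacity: healthy, but not ready for more work.
	s := Status{Healthy: true, ActiveTasks: 3, MaxTasks: 3, DockerReachable: true, AIProviderReachable: true}

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/ready", nil)
	readyHandler(func() Status { return s })(rec, req)
	fmt.Println("GET /ready ->", rec.Code)
}
```

Keeping the decision in readinessCode, separate from the handler, makes the rule testable without starting a server.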


Monitoring & Metrics

Prometheus Metrics

Exposed on http://localhost:8080/metrics:

Task Metrics:

  • chorus_tasks_active - Current active tasks
  • chorus_tasks_completed_total - Total completed tasks
  • chorus_tasks_failed_total - Total failed tasks
  • chorus_task_duration_seconds - Task execution duration histogram

P2P Metrics:

  • chorus_peers_connected - Number of connected peers
  • chorus_pubsub_messages_sent_total - PubSub messages sent
  • chorus_pubsub_messages_received_total - PubSub messages received
  • chorus_dht_queries_total - DHT query count
  • chorus_dht_cache_hits_total - DHT cache hits
  • chorus_dht_cache_misses_total - DHT cache misses

Execution Metrics:

  • chorus_sandbox_containers_active - Active Docker containers
  • chorus_sandbox_cpu_usage - Container CPU usage
  • chorus_sandbox_memory_usage_bytes - Container memory usage

Security Metrics:

  • chorus_shhh_findings_total - Secrets detected by SHHH
  • chorus_license_checks_total - License validation attempts
  • chorus_license_failures_total - Failed license validations

See: Metrics Package
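
To check a single metric programmatically, the text returned by the metrics endpoint can be scanned line by line. The Go sketch below assumes the standard Prometheus text exposition format and only handles samples without labels; metricValue is a hypothetical helper and the sample text is made up for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// metricValue returns the value of the first unlabeled sample called name.
// Comment lines (# HELP, # TYPE) never match because their first field is "#".
// Samples with labels, such as name{label="x"}, are not handled by this sketch.
func metricValue(exposition, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 || fields[0] != name {
			continue
		}
		v, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

func main() {
	// Made-up sample; a real check would read the body of GET /metrics.
	sample := "# HELP chorus_tasks_active Current active tasks\n" +
		"# TYPE chorus_tasks_active gauge\n" +
		"chorus_tasks_active 2\n" +
		"chorus_tasks_completed_total 41\n"

	if v, ok := metricValue(sample, "chorus_tasks_active"); ok {
		fmt.Println("active tasks:", v)
	}
}
```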


Integration Points

WHOOSH Assignment System

Agents can load dynamic assignments from WHOOSH:

# Set assignment URL
export ASSIGN_URL=https://whoosh.example.com/api/assignments/agent-123.json

# Agent fetches assignment on startup
# Assignment JSON structure:
{
  "agent_id": "agent-123",
  "role": "developer",
  "expertise": ["rust", "go"],
  "reports_to": "agent-admin",
  "max_tasks": 5,
  "bootstrap_peers": [
    "/ip4/192.168.1.100/tcp/9000/p2p/12D3KooWABC..."
  ],
  "join_stagger_ms": 5000
}

# Reload with SIGHUP
kill -HUP $(pidof chorus-agent)
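
A hedged Go sketch of decoding the assignment JSON shown above and overlaying it on environment-derived settings follows. The Assignment field names come from the example JSON; the Config struct, the merge function, and the precedence rule (non-empty assignment values win over environment values) are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Assignment mirrors the example JSON structure shown above.
type Assignment struct {
	AgentID        string   `json:"agent_id"`
	Role           string   `json:"role"`
	Expertise      []string `json:"expertise"`
	ReportsTo      string   `json:"reports_to"`
	MaxTasks       int      `json:"max_tasks"`
	BootstrapPeers []string `json:"bootstrap_peers"`
	JoinStaggerMS  int      `json:"join_stagger_ms"`
}

// Config is a hypothetical environment-derived base configuration.
type Config struct {
	AgentID        string
	Role           string
	MaxTasks       int
	BootstrapPeers []string
}

// decodeAssignment parses an assignment document.
func decodeAssignment(data string) (Assignment, error) {
	var a Assignment
	err := json.Unmarshal([]byte(data), &a)
	return a, err
}

// merge overlays non-empty assignment fields on base.
// The precedence is an assumption of this sketch.
func merge(base Config, a Assignment) Config {
	if a.AgentID != "" {
		base.AgentID = a.AgentID
	}
	if a.Role != "" {
		base.Role = a.Role
	}
	if a.MaxTasks > 0 {
		base.MaxTasks = a.MaxTasks
	}
	if len(a.BootstrapPeers) > 0 {
		base.BootstrapPeers = a.BootstrapPeers
	}
	return base
}

func main() {
	doc := `{"agent_id": "agent-123", "role": "developer", "max_tasks": 5}`
	a, err := decodeAssignment(doc)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", merge(Config{AgentID: "env-agent", MaxTasks: 3}, a))
}
```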

KACHING License Server

All agents validate licenses on startup:

# License validation flow
1. Agent starts with CHORUS_LICENSE_ID
2. Connects to KACHING server (from config)
3. Validates that the license:
   - Is valid and not expired
   - Is assigned to the correct cluster
   - Has the required permissions
4. If invalid: agent exits with error
5. If valid: agent continues startup

See: Licensing

BACKBEAT Integration

Optional telemetry system for P2P operations:

export CHORUS_BACKBEAT_ENABLED=true
export CHORUS_BACKBEAT_ENDPOINT=http://backbeat.example.com

# When enabled, agent tracks:
# - P2P operation phases
# - DHT bootstrap timing
# - Election progression
# - Task execution phases

See: BACKBEAT Integration


Example Deployments

Local Development

#!/bin/bash
# Run local agent for development

export CHORUS_LICENSE_ID=dev-local-123
export CHORUS_AGENT_ID=dev-agent-1
export CHORUS_P2P_PORT=9000
export CHORUS_API_PORT=8080
export CHORUS_HEALTH_PORT=8081
export OLLAMA_ENDPOINT=http://localhost:11434
export CHORUS_DHT_ENABLED=false  # Disable DHT for local dev

./chorus-agent

Docker Container

FROM debian:bookworm-slim

# Install runtime dependencies
RUN apt-get update && apt-get install -y \
    ca-certificates \
    docker.io \
    && rm -rf /var/lib/apt/lists/*

# Copy binary
COPY chorus-agent /usr/local/bin/chorus-agent

# Expose ports
EXPOSE 9000 8080 8081

# Run as non-root
USER nobody

ENTRYPOINT ["/usr/local/bin/chorus-agent"]

Run the container:

docker run -d \
  --name chorus-agent-1 \
  -e CHORUS_LICENSE_ID=prod-123 \
  -e CHORUS_AGENT_ID=agent-1 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -p 9000:9000 \
  -p 8080:8080 \
  -p 8081:8081 \
  chorus-agent:latest

Docker Swarm Service

version: "3.8"
services:
  chorus-agent:
    image: registry.example.com/chorus-agent:1.0.0
    environment:
      CHORUS_LICENSE_ID: ${CHORUS_LICENSE_ID}
      CHORUS_P2P_PORT: 9000
      CHORUS_API_PORT: 8080
      CHORUS_DHT_ENABLED: "true"
      CHORUS_BOOTSTRAP_PEERS: "/ip4/192.168.1.100/tcp/9000/p2p/12D3KooWABC..."
      ASSIGN_URL: "https://whoosh.example.com/api/assignments/{{.Service.Name}}.{{.Task.Slot}}.json"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /rust/containers/WHOOSH/prompts:/prompts:ro
    deploy:
      replicas: 3
      placement:
        constraints:
          - node.role == worker
    networks:
      - chorus-mesh
    ports:
      - target: 9000
        published: 9000
        mode: host

Troubleshooting

Agent Won't Start

Symptom: Agent exits immediately with error

Possible Causes:

  1. Invalid or missing license

    ❌ Failed to initialize CHORUS agent: license validation failed

    Fix: Check CHORUS_LICENSE_ID and KACHING server connectivity

  2. Docker socket not accessible

    ❌ Failed to create P2P node: failed to create Docker client

    Fix: Mount /var/run/docker.sock or check Docker daemon

  3. Port already in use

    ❌ Failed to initialize: bind: address already in use

    Fix: Change CHORUS_P2P_PORT or kill process on port

No Peer Discovery

Symptom: Agent starts but shows 0 connected peers

Possible Causes:

  1. mDNS blocked by firewall

    Fix: Allow UDP port 5353, or use bootstrap peers

  2. No bootstrap peers configured

    Fix: Set CHORUS_BOOTSTRAP_PEERS with valid multiaddrs

  3. Network isolation

    Fix: Ensure agents can reach each other on P2P ports

Tasks Not Executing

Symptom: Agent receives tasks but doesn't execute

Possible Causes:

  1. Agent at max capacity

    Check: curl localhost:8080/metrics | grep chorus_tasks_active

    Fix: Increase CHORUS_AGENT_MAX_TASKS

  2. Docker images not available

    Check: docker images | grep chorus

    Fix: Pull images: docker pull anthonyrawlins/chorus-rust-dev:latest

  3. Wrong specialization

    Check: Task language doesn't match agent expertise

    Fix: Adjust CHORUS_AGENT_EXPERTISE or remove specialization

High Memory Usage

Symptom: Agent consuming excessive memory

Possible Causes:

  1. DHT cache size too large

    Fix: Reduce CHORUS_DHT_CACHE_SIZE (default 100MB)

  2. Too many concurrent tasks

    Fix: Reduce CHORUS_AGENT_MAX_TASKS

  3. Memory leak in long-running containers

    Fix: Restart agent periodically or investigate task code



Implementation Status

| Feature | Status | Notes |
|---------|--------|-------|
| P2P Networking | Production | libp2p, mDNS, DHT |
| Task Execution | Production | Docker sandboxing |
| License Validation | Production | KACHING integration |
| HMMM Reasoning | 🔶 Beta | Collaborative meta-discussion |
| UCXL Publishing | Production | Decision recording |
| Election | Production | Democratic leader election |
| Health Checks | Production | Liveness & readiness |
| Metrics | Production | Prometheus format |
| Assignment Loading | Production | WHOOSH integration |
| SIGHUP Reload | Production | Dynamic reconfiguration |
| BACKBEAT Telemetry | 🔶 Beta | Optional P2P tracking |

Last Updated: 2025-09-30