anthonyrawlins c5b7311a8b docs: Add Phase 3 coordination and infrastructure documentation
Comprehensive documentation for coordination, messaging, discovery, and internal systems.

Core Coordination Packages:
- pkg/election - Democratic leader election (uptime-based, heartbeat mechanism, SLURP integration)
- pkg/coordination - Meta-coordination with dependency detection (4 built-in rules)
- coordinator/ - Task orchestration and assignment (AI-powered scoring)
- discovery/ - mDNS peer discovery (automatic LAN detection)

Messaging & P2P Infrastructure:
- pubsub/ - GossipSub messaging (31 message types, role-based topics, HMMM integration)
- p2p/ - libp2p networking (DHT modes, connection management, security)

Monitoring & Health:
- pkg/metrics - Prometheus metrics (80+ metrics across 12 categories)
- pkg/health - Health monitoring (4 HTTP endpoints, enhanced checks, graceful degradation)

Internal Systems:
- internal/licensing - License validation (KACHING integration, cluster leases, fail-closed)
- internal/hapui - Human Agent Portal UI (9 commands, HMMM wizard, UCXL browser, decision voting)
- internal/backbeat - P2P operation telemetry (6 phases, beat synchronization, health reporting)

Documentation Statistics (Phase 3):
- 10 packages documented (~18,000 lines)
- 31 PubSub message types cataloged
- 80+ Prometheus metrics documented
- Complete API references with examples
- Integration patterns and best practices

Key Features Documented:
- Election: 5 triggers, candidate scoring (5 weighted components), stability windows
- Coordination: AI-powered dependency detection, cross-repo sessions, escalation handling
- PubSub: Topic patterns, message envelopes, SHHH redaction, Hypercore logging
- Metrics: All metric types with labels, Prometheus scrape config, alert rules
- Health: Liveness vs readiness, critical checks, Kubernetes integration
- Licensing: Grace periods, circuit breaker, cluster lease management
- HAP UI: Interactive terminal commands, HMMM composition wizard, web interface (beta)
- BACKBEAT: 6-phase operation tracking, beat budget estimation, drift detection

Implementation Status Marked:
-  Production: Election, metrics, health, licensing, pubsub, p2p, discovery, coordinator
- 🔶 Beta: HAP web interface, BACKBEAT telemetry, advanced coordination
- 🔷 Alpha: SLURP election scoring
- ⚠️ Experimental: Meta-coordination, AI-powered dependency detection

Progress: 22/62 files complete (35%)

Next Phase: AI providers, SLURP system, API layer, reasoning engine

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 18:27:39 +10:00
2025-09-02 19:53:33 +10:00
2025-09-06 14:47:41 +10:00

CHORUS Container-First Context Platform (Alpha)

CHORUS is the runtime that ties the CHORUS ecosystem together: libp2p mesh, DHT-backed storage, council/task coordination, and (eventually) SLURP contextual intelligence. The repository you are looking at is the in-progress container-first refactor. Several core systems boot today, but higher-level services (SLURP, SHHH, full HMMM routing) are still landing.

Current Status

Area Status Notes
libp2p node + PubSub Running internal/runtime/shared.go spins up the mesh, hypercore logging, availability broadcasts.
DHT + DecisionPublisher Running Encrypted storage wired through pkg/dht; decisions written via ucxl.DecisionPublisher.
Leader Election System FULLY FUNCTIONAL 🎉 MILESTONE: Complete admin election with consensus, discovery protocol, heartbeats, and SLURP activation!
SLURP (context intelligence) 🚧 Stubbed pkg/slurp/slurp.go contains TODOs for resolver, temporal graphs, intelligence. Leader integration scaffolding exists but uses placeholder IDs/request forwarding.
SHHH (secrets sentinel) 🚧 Sentinel live pkg/shhh redacts hypercore + PubSub payloads with audit + metrics hooks (policy replay TBD).
HMMM routing 🚧 Partial PubSub topics join, but capability/role announcements and HMMM router wiring are placeholders (internal/runtime/agent_support.go).

See docs/progress/CHORUS-WHOOSH-development-plan.md for the detailed build plan and docs/progress/CHORUS-WHOOSH-roadmap.md for sequencing.

Quick Start (Alpha)

The container-first workflows are still evolving; expect frequent changes.

git clone https://gitea.chorus.services/tony/CHORUS.git
cd CHORUS
cp docker/chorus.env.example docker/chorus.env
# adjust env vars (KACHING license, bootstrap peers, etc.)
docker compose -f docker/docker-compose.yml up --build

Youll get a single agent container with:

  • libp2p networking (mDNS + configured bootstrap peers)
  • election heartbeat
  • DHT storage (AGE-encrypted)
  • HTTP API + health endpoints

Missing today: SLURP context resolution, advanced SHHH policy replay, HMMM per-issue routing. Expect log warnings/TODOs for those paths.

🎉 Leader Election System (NEW!)

CHORUS now features a complete, production-ready leader election system:

Core Features

  • Consensus-based election with weighted scoring (uptime, capabilities, resources)
  • Admin discovery protocol for network-wide leader identification
  • Heartbeat system with automatic failover (15-second intervals)
  • Concurrent election prevention with randomized delays
  • SLURP activation on elected admin nodes

How It Works

  1. Bootstrap: Nodes start in idle state, no admin known
  2. Discovery: Nodes send discovery requests to find existing admin
  3. Election trigger: If no admin found after grace period, trigger election
  4. Candidacy: Eligible nodes announce themselves with capability scores
  5. Consensus: Network selects winner based on highest score
  6. Leadership: Winner starts heartbeats, activates SLURP functionality
  7. Monitoring: Nodes continuously verify admin health via heartbeats

Debugging

Use these log patterns to monitor election health:

# Monitor WHOAMI messages and leader identification
docker service logs CHORUS_chorus | grep "🤖 WHOAMI\|👑\|📡.*Discovered"

# Track election cycles
docker service logs CHORUS_chorus | grep "🗳️\|📢.*candidacy\|🏆.*winner"

# Watch discovery protocol
docker service logs CHORUS_chorus | grep "📩\|📤\|📥"

Roadmap Highlights

  1. Security substrate land SHHH sentinel, finish SLURP leader-only operations, validate COOEE enrolment (see roadmap Phase 1).
  2. Autonomous teams coordinate with WHOOSH for deployment telemetry + SLURP context export.
  3. UCXL + KACHING hook runtime telemetry into KACHING and enforce UCXL validator.

Track progress via the shared roadmap and weekly burndown dashboards.

  • WHOOSH council/team orchestration
  • KACHING telemetry/licensing
  • SLURP contextual intelligence prototypes
  • HMMM meta-discussion layer

Contributing

This repo is still alpha. Please coordinate via the roadmap tickets before landing changes. Major security/runtime decisions should include a Decision Record with a UCXL address so SLURP/BUBBLE can ingest it later.

Description
Container-First P2P Task Coordination System - Next generation distributed AI agent coordination designed for Docker/Kubernetes deployments
Readme 292 MiB
Languages
Go 97.7%
HTML 1.9%
Python 0.2%
Makefile 0.1%