feat: Implement complete CHORUS leader election system

Major milestone: CHORUS leader election is now fully functional!

## Key Features Implemented:

### 🗳️ Leader Election Core
- Fixed root cause: nodes now trigger elections when no admin exists
- Added randomized election delays to prevent simultaneous elections
- Implemented concurrent election prevention (only one election at a time)
- Added proper election state management and transitions

### 📡 Admin Discovery System
- Enhanced discovery requests with "WHOAMI" debug messages
- Fixed discovery responses to properly include current leader ID
- Added comprehensive discovery request/response logging
- Implemented admin confirmation from multiple sources

### 🔧 Configuration Improvements
- Increased discovery timeout from 3s to 15s for better reliability
- Added proper Docker Hub image deployment workflow
- Updated build process to use correct chorus-agent binary (not deprecated chorus)
- Added static compilation flags for Alpine Linux compatibility

### 🐛 Critical Fixes
- Fixed build process confusion between chorus vs chorus-agent binaries
- Added missing admin_election capability to enable leader elections
- Corrected discovery logic to handle zero admin responses
- Enhanced debugging with detailed state and timing information

## Current Operational Status:
✅ Admin Election: Working with proper consensus
✅ Heartbeat System: 15-second intervals from elected admin
✅ Discovery Protocol: Nodes can find and confirm current admin
✅ P2P Connectivity: 5+ connected peers with libp2p
✅ SLURP Functionality: Enabled on admin nodes
✅ BACKBEAT Integration: Tempo synchronization working
✅ Container Health: All health checks passing

## Technical Details:
- Election uses weighted scoring based on uptime, capabilities, and resources
- Randomized delays prevent election storms (30-45s wait periods)
- Discovery responses include current leader ID for network-wide consensus
- State management prevents multiple concurrent elections
- Enhanced logging provides full visibility into election process

🎉 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@@ -2,7 +2,7 @@ version: "3.9"
 
 services:
   chorus:
-    image: anthonyrawlins/chorus:resetdata-secrets-v1.0.6
+    image: anthonyrawlins/chorus:discovery-debug
 
     # REQUIRED: License configuration (CHORUS will not start without this)
     environment:
@@ -15,7 +15,7 @@ services:
       - CHORUS_AGENT_ID=${CHORUS_AGENT_ID:-} # Auto-generated if not provided
       - CHORUS_SPECIALIZATION=${CHORUS_SPECIALIZATION:-general_developer}
       - CHORUS_MAX_TASKS=${CHORUS_MAX_TASKS:-3}
-      - CHORUS_CAPABILITIES=${CHORUS_CAPABILITIES:-general_development,task_coordination}
+      - CHORUS_CAPABILITIES=${CHORUS_CAPABILITIES:-general_development,task_coordination,admin_election}
 
       # Network configuration
       - CHORUS_API_PORT=8080
@@ -71,7 +71,7 @@ services:
     # Container resource limits
     deploy:
       mode: replicated
-      replicas: ${CHORUS_REPLICAS:-1}
+      replicas: ${CHORUS_REPLICAS:-9}
       update_config:
         parallelism: 1
         delay: 10s
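
The `CHORUS_CAPABILITIES` change above adds `admin_election` to the default list. A minimal Go sketch of how an agent might read that variable and decide whether it may stand for election is shown below. How CHORUS actually consumes the variable is not shown in this diff, so `parseCapabilities` and `canStandForElection` are hypothetical helpers; the fallback in `main` mirrors Compose's `${VAR:-default}` rule, which applies the default when the variable is unset or empty.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// defaultCapabilities mirrors the compose default after this change.
const defaultCapabilities = "general_development,task_coordination,admin_election"

// parseCapabilities splits a comma-separated list, trimming spaces and
// dropping empty entries. (Hypothetical helper, not from the CHORUS source.)
func parseCapabilities(raw string) []string {
	var caps []string
	for _, c := range strings.Split(raw, ",") {
		if c = strings.TrimSpace(c); c != "" {
			caps = append(caps, c)
		}
	}
	return caps
}

// canStandForElection reports whether the admin_election capability is present.
func canStandForElection(caps []string) bool {
	for _, c := range caps {
		if c == "admin_election" {
			return true
		}
	}
	return false
}

func main() {
	raw := os.Getenv("CHORUS_CAPABILITIES")
	if raw == "" { // same rule as ${CHORUS_CAPABILITIES:-...}: unset or empty
		raw = defaultCapabilities
	}
	caps := parseCapabilities(raw)
	fmt.Println("capabilities:", caps)
	fmt.Println("may stand for election:", canStandForElection(caps))
}
```

With the previous default (`general_development,task_coordination`), the check would return false, which is consistent with the "missing admin_election capability" fix named in the commit message.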