Commit Graph

6 Commits

Author SHA1 Message Date
anthonyrawlins
007aeb149a Replace all Printf logging with structured zerolog in runtime files
Migrates CHORUS logging to 100% structured JSON format with ISO 8601 timestamps
for all runtime-critical subsystems.

Files modified:
- internal/runtime/shared.go: SimpleTaskTracker task completion logging
- api/http_server.go: HTTP server, council opportunity, and status logging
- pubsub/pubsub.go: PubSub initialization, topic management, and message handlers
- discovery/mdns.go: mDNS peer discovery and connection logging

All Printf calls replaced with structured zerolog logging using:
- .Info() for informational messages
- .Warn() for warnings and errors
- .Debug() for verbose debug output
- Structured fields: peer_id, topic_name, council_id, etc.
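
A minimal sketch of the Printf-to-structured-fields pattern described above. To stay standard-library only, it uses Go's `log/slog` JSON handler as a stand-in for zerolog; the message text, field values, and helper names (`renderPeerConnected`, `fieldOf`) are hypothetical and not taken from the CHORUS code.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
	"strings"
)

// renderPeerConnected logs one structured event and returns the JSON line.
// With zerolog the equivalent call would look roughly like:
//   log.Info().Str("peer_id", peerID).Str("topic_name", topic).Msg("connected to peer")
func renderPeerConnected(peerID, topic string) string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))

	// Before: fmt.Printf("Connected to peer %s on topic %s\n", peerID, topic)
	logger.Info("connected to peer",
		slog.String("peer_id", peerID),
		slog.String("topic_name", topic),
	)
	return strings.TrimSpace(buf.String())
}

// fieldOf extracts a single string field from a JSON log line.
// It returns "" if the line is not valid JSON or the field is not a string.
func fieldOf(line, key string) string {
	var rec map[string]any
	if err := json.Unmarshal([]byte(line), &rec); err != nil {
		return ""
	}
	s, _ := rec[key].(string)
	return s
}

func main() {
	// Values here are placeholders for illustration.
	fmt.Println(renderPeerConnected("peer-1", "example-topic"))
}
```

Because each value is a named field rather than text interpolated into the message, downstream tooling can filter on `peer_id` or `topic_name` without parsing the message string.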

Version bumped to 0.5.40

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-10-19 17:04:27 +11:00
anthonyrawlins
63dab5c4d4 Add LightRAG MCP integration for RAG-enhanced AI reasoning
This commit integrates LightRAG (Retrieval-Augmented Generation) MCP server
support into CHORUS, enabling graph-based knowledge retrieval to enrich AI
reasoning and context resolution.

## New Components

1. **LightRAG Client** (pkg/mcp/lightrag_client.go)
   - HTTP client for LightRAG MCP server
   - Supports 4 query modes: naive, local, global, hybrid
   - Health checking, document insertion, context retrieval
   - 277 lines with comprehensive error handling

2. **Integration Tests** (pkg/mcp/lightrag_client_test.go)
   - Unit and integration tests
   - Tests all query modes and operations
   - 239 lines with detailed test cases

3. **SLURP Context Enricher** (pkg/slurp/context/lightrag.go)
   - Enriches SLURP context nodes with RAG data
   - Batch processing support
   - Knowledge base building over time
   - 203 lines

4. **Documentation** (docs/LIGHTRAG_INTEGRATION.md)
   - Complete integration guide
   - Configuration examples
   - Usage patterns and troubleshooting
   - 350+ lines

## Modified Components

1. **Configuration** (pkg/config/config.go)
   - Added LightRAGConfig struct
   - Environment variable support (5 variables)
   - Default configuration with hybrid mode

2. **Reasoning Engine** (reasoning/reasoning.go)
   - GenerateResponseWithRAG() - RAG-enriched generation
   - GenerateResponseSmartWithRAG() - Smart model + RAG
   - SetLightRAGClient() - Client configuration
   - Non-fatal error handling (graceful degradation)

3. **Runtime Initialization** (internal/runtime/shared.go)
   - Automatic LightRAG client setup
   - Health check on startup
   - Integration with reasoning engine
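
The non-fatal error handling noted for the reasoning engine can be sketched as follows. This is an illustration under assumptions, not the code in reasoning/reasoning.go: `contextRetriever`, `buildPrompt`, and the prompt layout are hypothetical names and choices.

```go
package main

import (
	"errors"
	"fmt"
)

// contextRetriever is a hypothetical stand-in for a LightRAG client query.
type contextRetriever func(query string) (string, error)

// buildPrompt prepends retrieved context when it is available. A nil
// retriever, a retrieval error, or an empty result is treated as non-fatal:
// the plain prompt is returned and the bool reports that no enrichment happened.
func buildPrompt(prompt string, retrieve contextRetriever) (string, bool) {
	if retrieve == nil {
		return prompt, false
	}
	ctx, err := retrieve(prompt)
	if err != nil || ctx == "" {
		return prompt, false
	}
	return "Context:\n" + ctx + "\n\nQuestion:\n" + prompt, true
}

// failingRetriever simulates an unreachable LightRAG server.
func failingRetriever(string) (string, error) {
	return "", errors.New("lightrag unreachable")
}

// staticRetriever returns a retriever that always yields the given context.
func staticRetriever(ctx string) contextRetriever {
	return func(string) (string, error) { return ctx, nil }
}

func main() {
	p, enriched := buildPrompt("How do elections work?", failingRetriever)
	fmt.Println("enriched:", enriched)
	fmt.Println(p)
}
```

Returning the enrichment flag alongside the prompt lets the caller log a degraded-mode warning without turning a retrieval failure into a failed generation.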

## Configuration

Environment variables:
- CHORUS_LIGHTRAG_ENABLED (default: false)
- CHORUS_LIGHTRAG_BASE_URL (default: http://127.0.0.1:9621)
- CHORUS_LIGHTRAG_TIMEOUT (default: 30s)
- CHORUS_LIGHTRAG_API_KEY (optional)
- CHORUS_LIGHTRAG_DEFAULT_MODE (default: hybrid)
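
As an illustration of how these five variables and their defaults might be read, here is a small sketch. The struct field names and the `loadLightRAGConfig` helper are assumptions; the actual LightRAGConfig in pkg/config/config.go may be shaped differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// LightRAGConfig mirrors the five variables listed above.
// Field names are assumptions made for this sketch.
type LightRAGConfig struct {
	Enabled     bool
	BaseURL     string
	Timeout     time.Duration
	APIKey      string
	DefaultMode string
}

// loadLightRAGConfig reads settings through getenv (os.Getenv in production)
// and keeps the documented defaults for any variable that is unset.
func loadLightRAGConfig(getenv func(string) string) (LightRAGConfig, error) {
	cfg := LightRAGConfig{
		Enabled:     false,
		BaseURL:     "http://127.0.0.1:9621",
		Timeout:     30 * time.Second,
		DefaultMode: "hybrid",
	}
	if v := getenv("CHORUS_LIGHTRAG_ENABLED"); v != "" {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return cfg, fmt.Errorf("CHORUS_LIGHTRAG_ENABLED: %w", err)
		}
		cfg.Enabled = b
	}
	if v := getenv("CHORUS_LIGHTRAG_BASE_URL"); v != "" {
		cfg.BaseURL = v
	}
	if v := getenv("CHORUS_LIGHTRAG_TIMEOUT"); v != "" {
		d, err := time.ParseDuration(v)
		if err != nil {
			return cfg, fmt.Errorf("CHORUS_LIGHTRAG_TIMEOUT: %w", err)
		}
		cfg.Timeout = d
	}
	cfg.APIKey = getenv("CHORUS_LIGHTRAG_API_KEY") // optional, may stay empty
	if v := getenv("CHORUS_LIGHTRAG_DEFAULT_MODE"); v != "" {
		switch v {
		case "naive", "local", "global", "hybrid":
			cfg.DefaultMode = v
		default:
			return cfg, fmt.Errorf("CHORUS_LIGHTRAG_DEFAULT_MODE: unknown mode %q", v)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := loadLightRAGConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	// The API key is deliberately not printed.
	fmt.Println(cfg.Enabled, cfg.BaseURL, cfg.Timeout, cfg.DefaultMode)
}
```

Passing the lookup function in, rather than calling `os.Getenv` directly, keeps the loader testable without mutating the process environment.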

## Features

- Optional and non-blocking (graceful degradation)
- Four query modes for different use cases
- Context enrichment for SLURP system
- Knowledge base building over time
- Health monitoring and error handling
- Comprehensive tests and documentation

## Testing

LightRAG server tested at http://127.0.0.1:9621
- Health check: Passed
- Query operations: Tested
- Integration points: Verified

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-30 23:56:09 +10:00
anthonyrawlins
17673c38a6 fix: P2P connectivity regression + dynamic versioning system
## P2P Connectivity Fixes
- **Root Cause**: mDNS discovery was conditionally disabled in Task Execution Engine implementation
- **Solution**: Restored always-enabled mDNS discovery from working baseline (eb2e05f)
- **Result**: 9/9 Docker Swarm replicas with working P2P mesh, democratic elections, and leader consensus

## Dynamic Version System
- **Problem**: Hardcoded version "0.1.0-dev" in 1000+ builds made debugging impossible
- **Solution**: Implemented build-time version injection via ldflags
- **Features**: Shows commit hash, build date, and semantic version
- **Example**: `CHORUS-agent 0.5.5 (build: 9dbd361, 2025-09-26_05:55:55)`
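
For readers unfamiliar with ldflags injection, a minimal sketch follows. The variable names and the `main.` import path are assumptions; the commit places its version variables in internal/runtime/shared.go, so the real `-X` targets would use that package's import path.

```go
package main

import "fmt"

// These defaults are overridden at build time, for example:
//
//   go build -ldflags "-X main.version=0.5.5 -X main.commitHash=9dbd361 -X main.buildDate=2025-09-26_05:55:55" .
//
// The -X linker flag only works on package-level string variables.
// Variable names here are hypothetical.
var (
	version    = "dev"
	commitHash = "unknown"
	buildDate  = "unknown"
)

// versionString formats the banner in the same layout as the example above.
func versionString(name string) string {
	return fmt.Sprintf("%s %s (build: %s, %s)", name, version, commitHash, buildDate)
}

func main() {
	fmt.Println(versionString("CHORUS-agent"))
}
```

A plain `go build` without the flags still works and reports the placeholder values, which makes unlabelled local builds easy to spot.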

## Container Compatibility
- **Issue**: Binary execution failed in Alpine due to glibc/musl incompatibility
- **Solution**: Added Ubuntu-based Dockerfile for proper glibc support
- **Benefit**: Reliable container execution across Docker Swarm nodes

## Key Changes
- `internal/runtime/shared.go`: Always enable mDNS discovery, dynamic version vars
- `cmd/agent/main.go`: Build-time version injection and display
- `p2p/node.go`: Restored working "🐝 Bzzz Node Status" logging format
- `Makefile`: Updated version to 0.5.5, proper ldflags configuration
- `Dockerfile.ubuntu`: New glibc-compatible container base
- `docker-compose.yml`: Updated to latest image tag for Watchtower auto-updates

## Verification
- P2P mesh connectivity: Peers exchanging availability broadcasts
- Democratic elections: Candidacy announcements and leader selection
- BACKBEAT integration: Beat synchronization and degraded mode handling
- Dynamic versioning: All containers show v0.5.5 with build metadata
- Task Execution Engine: All Phase 4 functionality preserved and working

Fixes P2P connectivity regression while preserving complete Task Execution Engine implementation.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-26 16:05:25 +10:00
anthonyrawlins
e523c4b543 feat: Implement CHORUS scaling improvements for robust autoscaling
Address WHOOSH issue #7 with comprehensive scaling optimizations to prevent
license server, bootstrap peer, and control plane collapse during fast scale-out.

HIGH-RISK FIXES (Must-Do):
- License gate already implemented with cache + circuit breaker + grace window
- mDNS disabled in container environments (CHORUS_MDNS_ENABLED=false)
- Connection rate limiting (5 dials/sec, 16 concurrent DHT queries)
- Connection manager with watermarks (32 low, 128 high)
- AutoNAT enabled for container networking

MEDIUM-RISK FIXES (Next Priority):
- Assignment merge layer with HTTP/file config + SIGHUP reload
- Runtime configuration system with WHOOSH assignment API support
- Election stability windows to prevent churn:
  - CHORUS_ELECTION_MIN_TERM=30s (minimum time between elections)
  - CHORUS_LEADER_MIN_TERM=45s (minimum time before challenging healthy leader)
- Bootstrap pool JSON support with priority sorting and join stagger
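
One way to read the two election stability windows is as a pair of guards checked before an election may start. The sketch below is an interpretation of the descriptions above, not the logic in pkg/election/election.go; the type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// stabilityWindows holds the two settings quoted above.
type stabilityWindows struct {
	ElectionMinTerm time.Duration // CHORUS_ELECTION_MIN_TERM
	LeaderMinTerm   time.Duration // CHORUS_LEADER_MIN_TERM
}

// exampleWindows returns the values shown in this commit message.
func exampleWindows() stabilityWindows {
	return stabilityWindows{
		ElectionMinTerm: 30 * time.Second,
		LeaderMinTerm:   45 * time.Second,
	}
}

// canStartElection reports whether a new election is allowed.
//   - sinceLastElection: time elapsed since the previous election
//   - leaderTenure: how long the current leader has held the role
//   - leaderHealthy: whether the current leader is still responding
func canStartElection(w stabilityWindows, sinceLastElection, leaderTenure time.Duration, leaderHealthy bool) bool {
	// Guard 1: never hold elections closer together than the minimum term.
	if sinceLastElection < w.ElectionMinTerm {
		return false
	}
	// Guard 2: a healthy leader cannot be challenged until it has served
	// the minimum leader term. An unhealthy leader skips this guard.
	if leaderHealthy && leaderTenure < w.LeaderMinTerm {
		return false
	}
	return true
}

func main() {
	w := exampleWindows()
	fmt.Println(canStartElection(w, 10*time.Second, 10*time.Second, false))
	fmt.Println(canStartElection(w, 40*time.Second, 40*time.Second, true))
	fmt.Println(canStartElection(w, 60*time.Second, 60*time.Second, true))
}
```

Under this reading, the first window damps election storms after a leader loss, while the second keeps a working leader from being displaced during scale-out churn.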

NEW FEATURES:
- Runtime config system with assignment overrides from WHOOSH
- SIGHUP reload handler for live configuration updates
- JSON bootstrap configuration with peer metadata (region, roles, priority)
- Configurable election stability windows with environment variables
- Multi-format bootstrap support: Assignment → JSON → CSV
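
The Assignment → JSON → CSV precedence and priority sorting can be sketched as below. The JSON schema, the meaning of `priority` (lower value dials first), and all function names are assumptions for illustration; docker/bootstrap.json may use a different layout.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"sort"
	"strings"
)

// bootstrapPeer uses a hypothetical schema with the metadata named above.
type bootstrapPeer struct {
	Address  string   `json:"address"`
	Region   string   `json:"region"`
	Roles    []string `json:"roles"`
	Priority int      `json:"priority"`
}

// peersFromJSON parses the pool and orders it by ascending priority value
// (assumed here to mean "dial first").
func peersFromJSON(doc string) ([]string, error) {
	var pool struct {
		Peers []bootstrapPeer `json:"peers"`
	}
	if err := json.Unmarshal([]byte(doc), &pool); err != nil {
		return nil, err
	}
	sort.SliceStable(pool.Peers, func(i, j int) bool {
		return pool.Peers[i].Priority < pool.Peers[j].Priority
	})
	addrs := make([]string, 0, len(pool.Peers))
	for _, p := range pool.Peers {
		addrs = append(addrs, p.Address)
	}
	return addrs, nil
}

// peersFromCSV splits a comma-separated list and drops empty entries.
func peersFromCSV(csv string) []string {
	var addrs []string
	for _, part := range strings.Split(csv, ",") {
		if s := strings.TrimSpace(part); s != "" {
			addrs = append(addrs, s)
		}
	}
	return addrs
}

// resolveBootstrap picks the first non-empty source in precedence order.
// In this sketch, malformed JSON falls through to CSV instead of failing.
func resolveBootstrap(assignment []string, jsonDoc, csv string) ([]string, error) {
	if len(assignment) > 0 {
		return assignment, nil
	}
	if jsonDoc != "" {
		if addrs, err := peersFromJSON(jsonDoc); err == nil && len(addrs) > 0 {
			return addrs, nil
		}
	}
	if addrs := peersFromCSV(csv); len(addrs) > 0 {
		return addrs, nil
	}
	return nil, errors.New("no bootstrap peers configured")
}

func main() {
	doc := `{"peers":[{"address":"peer-b","priority":2},{"address":"peer-a","priority":1}]}`
	peers, err := resolveBootstrap(nil, doc, "peer-c")
	fmt.Println(peers, err)
}
```

Whether a malformed JSON pool should fall through or abort startup is a policy decision; falling through favors availability, while aborting surfaces configuration mistakes earlier.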

FILES MODIFIED:
- pkg/config/assignment.go (NEW): Runtime assignment merge system
- docker/bootstrap.json (NEW): Example JSON bootstrap configuration
- pkg/election/election.go: Added stability windows and churn prevention
- internal/runtime/shared.go: Integrated assignment loading and conditional mDNS
- p2p/node.go: Added connection management and rate limiting
- pkg/config/hybrid_config.go: Added rate limiting configuration fields
- docker/docker-compose.yml: Updated environment variables and configs
- README.md: Updated status table with scaling milestone

This implementation enables wave-based autoscaling without system collapse,
addressing all scaling concerns from WHOOSH issue #7.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-23 17:50:40 +10:00
anthonyrawlins
1bb736c09a Harden CHORUS security and messaging stack
2025-09-20 23:21:35 +10:00
anthonyrawlins
1806a4fe09 feat(prompts): load system prompts and defaults from Docker volume; set runtime system prompt; add BACKBEAT standards
2025-09-06 15:42:41 +10:00