docs: Add Phase 2 core package documentation (Execution, Config, Runtime, Security)

Comprehensive documentation for 7 critical packages covering execution engine, configuration management, runtime infrastructure, and security layers.

Package Documentation Added:
- pkg/execution - Complete task execution engine API (Docker sandboxing, image selection)
- pkg/config - Configuration management (80+ env vars, dynamic assignments, SIGHUP reload)
- internal/runtime - Shared P2P runtime (initialization, lifecycle, agent mode)
- pkg/dht - Distributed hash table (LibP2P DHT, encrypted storage, bootstrap)
- pkg/crypto - Cryptography (age encryption, key derivation, secure random)
- pkg/ucxl - UCXL validation (decision publishing, content addressing, immutable audit)
- pkg/shhh - Secrets management (sentinel, pattern matching, redaction, audit logging)

Documentation Statistics (Phase 2):
- 7 package files created (~12,000 lines total)
- Complete API reference for all exported symbols
- Line-by-line source code analysis
- 30+ usage examples across packages
- Implementation status tracking (Production/Beta/Alpha/TODO)
- Cross-references to 20+ related documents

Key Features Documented:
- Docker Exec API usage (not SSH) for sandboxed execution
- 4-tier language detection priority system
- RuntimeConfig vs static Config with merge semantics
- SIGHUP signal handling for dynamic reconfiguration
- Graceful shutdown with dependency ordering
- Age encryption integration (filippo.io/age)
- DHT cache management and cleanup
- UCXL address format (ucxl://) and decision schema
- SHHH pattern matching and severity levels
- Bootstrap peer priority (assignment > config > env)
- Join stagger for thundering herd prevention

Progress Tracking:
- PROGRESS.md added with detailed completion status
- Phase 1: 5 files complete (Foundation)
- Phase 2: 7 files complete (Core Packages)
- Total: 12 files, ~16,000 lines documented
- Overall: 15% complete (12/62 planned files)

Next Phase: Coordination & AI packages (pkg/slurp, pkg/election, pkg/ai, pkg/providers)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
941
docs/comprehensive/internal/runtime.md
Normal file
@@ -0,0 +1,941 @@
# internal/runtime - Shared P2P Runtime Infrastructure

**Package:** `internal/runtime`
**Files:** `shared.go` (687 lines), `agent_support.go` (324 lines)
**Status:** ✅ Production
**Purpose:** Shared initialization and lifecycle management for all CHORUS binaries

---

## Overview

The `internal/runtime` package provides the **unified initialization and lifecycle management** infrastructure used by all CHORUS binaries (`chorus-agent`, `chorus-hap`). It consolidates:

- **Configuration loading** from environment variables
- **License validation** with KACHING server
- **P2P networking** setup (libp2p, mDNS, DHT)
- **Component initialization** (PubSub, Election, Coordinator, API servers)
- **Health monitoring** and graceful shutdown
- **Dynamic reconfiguration** via SIGHUP signal

### Key Responsibilities

✅ Single initialization path for all binaries
✅ Consistent component lifecycle management
✅ Graceful shutdown with dependency ordering
✅ Health monitoring and readiness checks
✅ Dynamic assignment loading from WHOOSH
✅ BACKBEAT telemetry integration
✅ SHHH secrets detection setup

---

## Package Structure

### Files

| File | Lines | Purpose |
|------|-------|---------|
| `shared.go` | 687 | Main initialization, SharedRuntime, component setup |
| `agent_support.go` | 324 | Agent mode behaviors, announcements, health checks |

### Build Variables

```go
// Lines 36-42 in shared.go
var (
    AppName       = "CHORUS"
    AppVersion    = "0.1.0-dev"
    AppCommitHash = "unknown"
    AppBuildDate  = "unknown"
)
```

**Set by main packages:**
```go
// In cmd/agent/main.go or cmd/hap/main.go
runtime.AppVersion = version
runtime.AppCommitHash = commitHash
runtime.AppBuildDate = buildDate
```

---

## Core Type: SharedRuntime

### Definition

```go
// Lines 108-133 in shared.go
type SharedRuntime struct {
    Config              *config.Config
    RuntimeConfig       *config.RuntimeConfig
    Logger              *SimpleLogger
    Context             context.Context
    Cancel              context.CancelFunc
    Node                *p2p.Node
    PubSub              *pubsub.PubSub
    HypercoreLog        *logging.HypercoreLog
    MDNSDiscovery       *discovery.MDNSDiscovery
    BackbeatIntegration *backbeat.Integration
    DHTNode             *dht.LibP2PDHT
    EncryptedStorage    *dht.EncryptedDHTStorage
    DecisionPublisher   *ucxl.DecisionPublisher
    ElectionManager     *election.ElectionManager
    TaskCoordinator     *coordinator.TaskCoordinator
    HTTPServer          *api.HTTPServer
    UCXIServer          *ucxi.Server
    HealthManager       *health.Manager
    EnhancedHealth      *health.EnhancedHealthChecks
    ShutdownManager     *shutdown.Manager
    TaskTracker         *SimpleTaskTracker
    Metrics             *metrics.CHORUSMetrics
    Shhh                *shhh.Sentinel
}
```

### Field Descriptions

| Field | Type | Purpose | Optional |
|-------|------|---------|----------|
| `Config` | `*config.Config` | Static configuration from env | No |
| `RuntimeConfig` | `*config.RuntimeConfig` | Dynamic assignments | No |
| `Logger` | `*SimpleLogger` | Basic logging interface | No |
| `Context` | `context.Context` | Root context | No |
| `Cancel` | `context.CancelFunc` | Cancellation function | No |
| `Node` | `*p2p.Node` | libp2p host | No |
| `PubSub` | `*pubsub.PubSub` | Message broadcasting | No |
| `HypercoreLog` | `*logging.HypercoreLog` | Append-only event log | No |
| `MDNSDiscovery` | `*discovery.MDNSDiscovery` | Local peer discovery | No |
| `BackbeatIntegration` | `*backbeat.Integration` | P2P telemetry | Yes |
| `DHTNode` | `*dht.LibP2PDHT` | Distributed hash table | Yes |
| `EncryptedStorage` | `*dht.EncryptedDHTStorage` | Encrypted DHT wrapper | Yes |
| `DecisionPublisher` | `*ucxl.DecisionPublisher` | UCXL decision recording | Yes |
| `ElectionManager` | `*election.ElectionManager` | Leader election | No |
| `TaskCoordinator` | `*coordinator.TaskCoordinator` | Task distribution | No |
| `HTTPServer` | `*api.HTTPServer` | REST API | No |
| `UCXIServer` | `*ucxi.Server` | UCXL content resolution | Yes |
| `HealthManager` | `*health.Manager` | Health monitoring | No |
| `EnhancedHealth` | `*health.EnhancedHealthChecks` | Advanced checks | Yes |
| `ShutdownManager` | `*shutdown.Manager` | Graceful shutdown | No |
| `TaskTracker` | `*SimpleTaskTracker` | Active task tracking | No |
| `Metrics` | `*metrics.CHORUSMetrics` | Metrics collection | No |
| `Shhh` | `*shhh.Sentinel` | Secrets detection | No |

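
Fields marked optional in the table may be nil when their feature is disabled or unavailable, so callers should guard access before use. The following is a minimal sketch of that guard. The `storage` and `runtimeView` types are hypothetical stand-ins invented for illustration; they are not the real `pkg/dht` or `SharedRuntime` types.

```go
package main

import "fmt"

// storage is a hypothetical stand-in for an optional component such as
// EncryptedStorage; it is not the real pkg/dht type.
type storage struct {
    data map[string][]byte
}

func (s *storage) Put(key string, value []byte) {
    s.data[key] = value
}

// runtimeView is a hypothetical, trimmed-down stand-in for SharedRuntime.
type runtimeView struct {
    Storage *storage // nil when the DHT is disabled
}

// storeIfAvailable writes to the optional component only when it was
// initialized, and reports whether the write happened.
func storeIfAvailable(rt *runtimeView, key string, value []byte) bool {
    if rt == nil || rt.Storage == nil {
        return false
    }
    rt.Storage.Put(key, value)
    return true
}

func main() {
    withDHT := &runtimeView{Storage: &storage{data: map[string][]byte{}}}
    withoutDHT := &runtimeView{}

    fmt.Println("stored with DHT:", storeIfAvailable(withDHT, "key", []byte("value")))
    fmt.Println("stored without DHT:", storeIfAvailable(withoutDHT, "key", []byte("value")))
}
```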

---

## Initialization Flow

### Function: Initialize()

```go
// Line 136 in shared.go
func Initialize(appMode string) (*SharedRuntime, error)
```

**Parameters:**
- `appMode`: Either `"agent"` or `"hap"` to distinguish binary type

**Returns:**
- `*SharedRuntime`: Fully initialized runtime with all components
- `error`: If any critical component fails to initialize

### Initialization Phases

```
Phase 1: Configuration (lines 136-199)
├─→ Create SharedRuntime struct
├─→ Initialize SimpleLogger
├─→ Create root context
├─→ Load configuration from environment (LoadFromEnvironment)
├─→ Initialize RuntimeConfig for dynamic assignments
├─→ Load assignment from WHOOSH if ASSIGN_URL set
├─→ Start SIGHUP reload handler for runtime reconfiguration
└─→ CRITICAL: Validate license with KACHING (lines 182-191)
    └─→ FATAL if license invalid

Phase 2: AI Provider (lines 193-198)
├─→ Configure AI provider (Ollama or ResetData)
├─→ Set model selection webhook
└─→ Initialize prompt sources

Phase 3: Security (lines 201-213)
├─→ Initialize metrics collector
├─→ Create SHHH sentinel for secrets detection
└─→ Set audit sink for redaction logging

Phase 4: BACKBEAT (lines 215-229)
├─→ Create BACKBEAT integration (optional)
├─→ Start beat synchronization if available
└─→ Warn if unavailable (non-fatal)

Phase 5: P2P Node (lines 231-252)
├─→ Create libp2p node (p2p.NewNode)
├─→ Log node ID and listening addresses
├─→ Initialize Hypercore append-only log
└─→ Set SHHH redactor on Hypercore log

Phase 6: Discovery (lines 254-259)
├─→ Create mDNS discovery service
└─→ Service name: "chorus-peer-discovery"

Phase 7: PubSub (lines 261-284)
├─→ Initialize PubSub with Hypercore logging
├─→ Set SHHH redactor on PubSub
├─→ Subscribe to default topics
└─→ Join role-based topics if role configured

Phase 8: Election System (lines 286-289)
├─→ Call initializeElectionSystem()
└─→ See Election Initialization section below

Phase 9: DHT Storage (lines 291-293)
├─→ Call initializeDHTStorage()
└─→ See DHT Initialization section below

Phase 10: Services (lines 295-297)
├─→ Call initializeServices()
└─→ See Services Initialization section below

Return: Fully initialized SharedRuntime
```

### Election Initialization

```go
// Lines 347-401 in shared.go
func (r *SharedRuntime) initializeElectionSystem() error
```

**Process:**

1. **Create Election Manager** (line 349)
   ```go
   electionManager := election.NewElectionManager(
       r.Context,
       r.Config,
       r.Node.Host(),
       r.PubSub,
       r.Node.ID().ShortString(),
   )
   ```

2. **Set Callbacks** (lines 352-392)
   - **OnAdminChange**: Fired when admin changes
     - Logs admin transition
     - Tracks with BACKBEAT if available
     - If this node becomes admin:
       - Enables SLURP functionality
       - Applies admin role configuration

   - **OnElectionComplete**: Fired when election finishes
     - Logs winner
     - Tracks with BACKBEAT if available

3. **Start Election Manager** (lines 394-399)
   ```go
   if err := electionManager.Start(); err != nil {
       return fmt.Errorf("failed to start election manager: %v", err)
   }
   ```

4. **Store Reference** (line 397)

### DHT Initialization

```go
// Lines 403-521 in shared.go
func (r *SharedRuntime) initializeDHTStorage() error
```

**Process:**

1. **Check if DHT Enabled** (line 409)
   ```go
   if r.Config.V2.DHT.Enabled {
   ```

2. **Create DHT Node** (lines 411-417)
   ```go
   dhtNode, err = dht.NewLibP2PDHT(r.Context, r.Node.Host())
   ```

3. **Bootstrap DHT** (lines 419-435)
   - Track with BACKBEAT if available
   - Call `dhtNode.Bootstrap()`
   - Handle errors gracefully

4. **Connect to Bootstrap Peers** (lines 437-487)
   - Get bootstrap peers from RuntimeConfig (assignment overrides)
   - Fall back to static config if no assignment
   - Apply join stagger delay if configured (thundering herd prevention)
   - For each bootstrap peer:
     - Parse multiaddr
     - Extract peer info
     - Track with BACKBEAT if available
     - Connect via `r.Node.Host().Connect()`

5. **Initialize Encrypted Storage** (lines 489-500)
   ```go
   encryptedStorage = dht.NewEncryptedDHTStorage(
       r.Context,
       r.Node.Host(),
       dhtNode,
       r.Config,
       r.Node.ID().ShortString(),
   )
   encryptedStorage.StartCacheCleanup(5 * time.Minute)
   ```

6. **Initialize Decision Publisher** (lines 502-510)
   ```go
   decisionPublisher = ucxl.NewDecisionPublisher(
       r.Context,
       r.Config,
       encryptedStorage,
       r.Node.ID().ShortString(),
       r.Config.Agent.ID,
   )
   ```

7. **Store References** (lines 516-518)

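
Step 4 above combines two ideas: choosing bootstrap peers by source priority, and delaying the join so that many nodes starting at once do not dial the same peers simultaneously. The sketch below illustrates both. The function names are hypothetical, the multiaddr is a placeholder, and randomizing the delay is an assumption of this sketch; the source only states that a stagger delay is applied when configured.

```go
package main

import (
    "fmt"
    "math/rand"
    "time"
)

// pickBootstrapPeers returns the first non-empty source, so sources must be
// passed in priority order (assignment first, then static config).
func pickBootstrapPeers(sources ...[]string) []string {
    for _, peers := range sources {
        if len(peers) > 0 {
            return peers
        }
    }
    return nil
}

// staggerDelay returns a random delay in [0, max). Randomization is an
// assumption of this sketch, not a documented behavior.
func staggerDelay(max time.Duration) time.Duration {
    if max <= 0 {
        return 0
    }
    return time.Duration(rand.Int63n(int64(max)))
}

func main() {
    assignment := []string{} // nothing assigned in this run
    static := []string{"/ip4/10.0.0.5/tcp/9000/p2p/<peer-id>"} // placeholder address

    peers := pickBootstrapPeers(assignment, static)
    delay := staggerDelay(50 * time.Millisecond)

    fmt.Printf("waiting %v before dialing %d bootstrap peer(s)\n", delay, len(peers))
    time.Sleep(delay)
    for _, addr := range peers {
        fmt.Println("would connect to", addr)
    }
}
```

Spreading joins over a window keeps bootstrap peers from receiving every connection attempt in the same instant after a mass restart.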

### Services Initialization

```go
// Lines 523-598 in shared.go
func (r *SharedRuntime) initializeServices() error
```

**Process:**

1. **Create Task Tracker** (lines 524-535)
   ```go
   taskTracker := &SimpleTaskTracker{
       maxTasks:    r.Config.Agent.MaxTasks,
       activeTasks: make(map[string]bool),
   }
   if r.DecisionPublisher != nil {
       taskTracker.decisionPublisher = r.DecisionPublisher
   }
   ```

2. **Create Task Coordinator** (lines 537-550)
   ```go
   taskCoordinator := coordinator.NewTaskCoordinator(
       r.Context,
       r.PubSub,
       r.HypercoreLog,
       r.Config,
       r.Node.ID().ShortString(),
       nil, // HMMM router placeholder
       taskTracker,
   )
   taskCoordinator.Start()
   ```

3. **Start HTTP API Server** (lines 552-560)
   ```go
   httpServer := api.NewHTTPServer(
       r.Config.Network.APIPort,
       r.HypercoreLog,
       r.PubSub,
   )
   go func() {
       if err := httpServer.Start(); err != nil && err != http.ErrServerClosed {
           r.Logger.Error("❌ HTTP server error: %v", err)
       }
   }()
   ```

4. **Start UCXI Server (Optional)** (lines 562-596)
   - Only if UCXL enabled and server enabled in config
   - Create content storage directory
   - Initialize address resolver
   - Create UCXI server config
   - Start server in goroutine

---

## Agent Mode

### Function: StartAgentMode()

```go
// Lines 33-84 in agent_support.go
func (r *SharedRuntime) StartAgentMode() error
```

**Purpose:** Activates autonomous agent behaviors after initialization

**Process:**

1. **Start Background Goroutines** (lines 34-37)
   ```go
   go r.announceAvailability()          // Broadcast work capacity every 30s
   go r.announceCapabilitiesOnChange()  // Announce capabilities once
   go r.announceRoleOnStartup()         // Announce role if configured
   ```

2. **Start Status Reporter** (line 40)
   ```go
   go r.statusReporter()  // Log peer count every 60s
   ```

3. **Setup Health & Shutdown** (lines 46-75)
   - Create shutdown manager (30s graceful timeout)
   - Create health manager
   - Register health checks (setupHealthChecks)
   - Register shutdown components (setupGracefulShutdown)
   - Start health monitoring
   - Start health HTTP server (port 8081)
   - Start shutdown manager

4. **Wait for Shutdown** (line 80)
   ```go
   shutdownManager.Wait()  // Blocks until SIGINT/SIGTERM
   ```

### Availability Broadcasting

```go
// Lines 86-116 in agent_support.go
func (r *SharedRuntime) announceAvailability()
```

**Behavior:**
- Runs every 30 seconds
- Publishes to PubSub topic: `AvailabilityBcast`
- Payload (`available_for_work` is a boolean; `status` takes one of the values listed below):
  ```json
  {
    "node_id": "12D3Koo...",
    "available_for_work": true,
    "current_tasks": 2,
    "max_tasks": 3,
    "last_activity": 1727712345,
    "status": "working",
    "timestamp": 1727712345
  }
  ```

**Status Values:**
- `"ready"`: 0 active tasks
- `"working"`: 1+ tasks but < max
- `"busy"`: At max capacity

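
The status values above follow directly from the current and maximum task counts. A minimal sketch of that mapping is shown below. The function names are hypothetical, and deriving `available_for_work` as "fewer tasks than the maximum" is an assumption of this sketch rather than something the source states.

```go
package main

import "fmt"

// availabilityStatus maps task counts to the status strings listed above.
func availabilityStatus(current, max int) string {
    switch {
    case current <= 0:
        return "ready"
    case current < max:
        return "working"
    default:
        return "busy"
    }
}

// availableForWork assumes a node accepts work while below its task limit.
func availableForWork(current, max int) bool {
    return current < max
}

func main() {
    const maxTasks = 3
    for _, current := range []int{0, 2, 3} {
        fmt.Printf("%d/%d tasks: status=%s available=%t\n",
            current, maxTasks,
            availabilityStatus(current, maxTasks),
            availableForWork(current, maxTasks))
    }
}
```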

### Capabilities Broadcasting

```go
// Lines 129-165 in agent_support.go
func (r *SharedRuntime) announceCapabilitiesOnChange()
```

**Behavior:**
- Runs once on startup
- Publishes to PubSub topic: `CapabilityBcast`
- Payload:
  ```json
  {
    "agent_id": "chorus-agent-1",
    "node_id": "12D3Koo...",
    "version": "0.5.0-dev",
    "capabilities": ["code_execution", "git_operations"],
    "expertise": ["rust", "go"],
    "models": ["qwen2.5-coder:32b"],
    "specialization": "backend",
    "max_tasks": 3,
    "current_tasks": 0,
    "timestamp": 1727712345,
    "availability": "ready"
  }
  ```

**TODO** (line 164): Watch for live capability changes and re-broadcast

### Role Broadcasting

```go
// Lines 167-204 in agent_support.go
func (r *SharedRuntime) announceRoleOnStartup()
```

**Behavior:**
- Runs once on startup (only if role configured)
- Publishes to PubSub topic: `RoleAnnouncement`
- Uses role-based message options
- Payload:
  ```json
  {
    "agent_id": "chorus-agent-1",
    "node_id": "12D3Koo...",
    "role": "developer",
    "expertise": ["rust", "go"],
    "capabilities": ["code_execution"],
    "reports_to": "admin-agent",
    "specialization": "backend",
    "timestamp": 1727712345
  }
  ```

### Health Checks Setup

```go
// Lines 206-264 in agent_support.go
func (r *SharedRuntime) setupHealthChecks(healthManager *health.Manager)
```

**Registered Checks:**

1. **BACKBEAT Health Check** (lines 208-236)
   - Name: `"backbeat"`
   - Interval: 30 seconds
   - Timeout: 10 seconds
   - Critical: No
   - Checks: Connection to BACKBEAT server
   - Only registered if BACKBEAT integration available

2. **Enhanced Health Checks** (lines 248-263)
   - Requires: PubSub, ElectionManager, DHTNode
   - Creates: `EnhancedHealthChecks` instance
   - Registers: Election, DHT, PubSub, Replication checks
   - See: `pkg/health` package for details

### Graceful Shutdown Setup

```go
// Lines 266-323 in agent_support.go
func (r *SharedRuntime) setupGracefulShutdown(
    shutdownManager *shutdown.Manager,
    healthManager *health.Manager,
)
```

**Shutdown Order** (ascending priority; components with lower values stop first):

| Priority | Component | Timeout | Critical |
|----------|-----------|---------|----------|
| 10 | HTTP API Server | Default | Yes |
| 15 | Health Manager | Default | Yes |
| 20 | UCXI Server | Default | Yes |
| 30 | PubSub | Default | Yes |
| 35 | DHT Node | Default | Yes |
| 40 | P2P Node | Default | Yes |
| 45 | Election Manager | Default | Yes |
| 50 | BACKBEAT Integration | Default | Yes |

**Why This Order:**
1. Stop accepting new requests (HTTP)
2. Stop health reporting
3. Stop content resolution (UCXI)
4. Stop broadcasting messages (PubSub)
5. Stop DHT queries/storage
6. Close P2P connections
7. Stop election participation
8. Disconnect BACKBEAT telemetry

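
Priority-ordered shutdown can be sketched as a sort followed by a sequential stop. The `component` type and `shutdownAll` function below are hypothetical and do not reflect the real `pkg/shutdown` API; continuing past a failed component is a choice made for this sketch, not documented behavior.

```go
package main

import (
    "fmt"
    "sort"
)

// component is a hypothetical registration record; the real pkg/shutdown
// types may differ.
type component struct {
    Name     string
    Priority int
    Stop     func() error
}

// shutdownAll stops components in ascending priority order and returns the
// names in the order they were stopped. A failing component is reported but
// does not prevent later components from stopping.
func shutdownAll(components []component) []string {
    ordered := append([]component(nil), components...)
    sort.SliceStable(ordered, func(i, j int) bool {
        return ordered[i].Priority < ordered[j].Priority
    })

    stopped := make([]string, 0, len(ordered))
    for _, c := range ordered {
        if err := c.Stop(); err != nil {
            fmt.Printf("stopping %s failed: %v\n", c.Name, err)
        }
        stopped = append(stopped, c.Name)
    }
    return stopped
}

func noop() error { return nil }

func main() {
    // Registered out of order on purpose; priorities come from the table above.
    components := []component{
        {Name: "p2p-node", Priority: 40, Stop: noop},
        {Name: "http-api", Priority: 10, Stop: noop},
        {Name: "pubsub", Priority: 30, Stop: noop},
        {Name: "health-manager", Priority: 15, Stop: noop},
    }
    fmt.Println(shutdownAll(components))
}
```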

---

## Cleanup Flow

### Function: Cleanup()

```go
// Lines 302-344 in shared.go
func (r *SharedRuntime) Cleanup()
```

**Manual Cleanup** (used if StartAgentMode not called):

```
1. Stop BACKBEAT Integration (line 306)
2. Close mDNS Discovery (lines 310-312)
3. Close PubSub (lines 314-316)
4. Close DHT Node (lines 318-320)
5. Close P2P Node (lines 322-324)
6. Stop HTTP Server (lines 326-328)
7. Stop UCXI Server (lines 330-332)
8. Stop Election Manager (lines 334-336)
9. Cancel Context (lines 338-340)
10. Log completion (line 343)
```

**Note:** If `StartAgentMode()` is called, graceful shutdown manager handles cleanup automatically.

---

## Helper Types

### SimpleLogger

```go
// Lines 44-57 in shared.go
type SimpleLogger struct{}

func (l *SimpleLogger) Info(msg string, args ...interface{})
func (l *SimpleLogger) Warn(msg string, args ...interface{})
func (l *SimpleLogger) Error(msg string, args ...interface{})
```

**Purpose:** Basic logging implementation for runtime components

**Output:** Uses `log.Printf()` with level prefixes

### SimpleTaskTracker

```go
// Lines 59-106 in shared.go
type SimpleTaskTracker struct {
    maxTasks          int
    activeTasks       map[string]bool
    decisionPublisher *ucxl.DecisionPublisher
}
```

**Methods:**

| Method | Purpose |
|--------|---------|
| `GetActiveTasks() []string` | Returns list of active task IDs |
| `GetMaxTasks() int` | Returns max concurrent tasks |
| `AddTask(taskID string)` | Marks task as active |
| `RemoveTask(taskID string)` | Marks task complete, publishes decision |

**Decision Publishing:**
- When task completes, publishes to DHT via UCXL
- Only if `decisionPublisher` is set
- Includes: task ID, success status, summary, modified files

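
The tracker's behavior can be illustrated with a small stand-in. In the sketch below, `taskTracker` is a hypothetical type, the `onComplete` callback stands in for decision publishing, and the mutex is an addition of this sketch; the struct shown above has no lock field, so how the real type handles concurrent access is not known from this document.

```go
package main

import (
    "fmt"
    "sort"
    "sync"
)

// taskTracker is a hypothetical stand-in for SimpleTaskTracker.
type taskTracker struct {
    mu          sync.Mutex
    maxTasks    int
    activeTasks map[string]bool
    onComplete  func(taskID string) // stand-in for decision publishing; may be nil
}

func newTaskTracker(maxTasks int, onComplete func(string)) *taskTracker {
    return &taskTracker{
        maxTasks:    maxTasks,
        activeTasks: make(map[string]bool),
        onComplete:  onComplete,
    }
}

// AddTask marks a task as active.
func (t *taskTracker) AddTask(taskID string) {
    t.mu.Lock()
    defer t.mu.Unlock()
    t.activeTasks[taskID] = true
}

// RemoveTask marks a task complete and, if a callback is set, reports it.
func (t *taskTracker) RemoveTask(taskID string) {
    t.mu.Lock()
    delete(t.activeTasks, taskID)
    t.mu.Unlock()

    if t.onComplete != nil {
        t.onComplete(taskID)
    }
}

// GetActiveTasks returns the active task IDs in sorted order.
func (t *taskTracker) GetActiveTasks() []string {
    t.mu.Lock()
    defer t.mu.Unlock()
    ids := make([]string, 0, len(t.activeTasks))
    for id := range t.activeTasks {
        ids = append(ids, id)
    }
    sort.Strings(ids)
    return ids
}

func main() {
    tracker := newTaskTracker(3, func(id string) {
        fmt.Println("would publish decision for", id)
    })
    tracker.AddTask("task-1")
    tracker.AddTask("task-2")
    tracker.RemoveTask("task-1")
    fmt.Println("active:", tracker.GetActiveTasks())
}
```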

---

## AI Provider Configuration

### Function: initializeAIProvider()

```go
// Lines 620-686 in shared.go
func initializeAIProvider(cfg *config.Config, logger *SimpleLogger) error
```

**Supported Providers:**

1. **ResetData** (lines 627-640)
   ```go
   reasoning.SetAIProvider("resetdata")
   reasoning.SetResetDataConfig(reasoning.ResetDataConfig{
       BaseURL: cfg.AI.ResetData.BaseURL,
       APIKey:  cfg.AI.ResetData.APIKey,
       Model:   cfg.AI.ResetData.Model,
       Timeout: cfg.AI.ResetData.Timeout,
   })
   ```

2. **Ollama** (lines 642-644)
   ```go
   reasoning.SetAIProvider("ollama")
   reasoning.SetOllamaEndpoint(cfg.AI.Ollama.Endpoint)
   ```

3. **Default** (lines 646-660)
   - Falls back to ResetData if unknown provider
   - Logs warning

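
The provider selection described above reduces to a switch with a fallback. The sketch below shows only that decision; `resolveProvider` is a hypothetical helper, and exact string matching on the provider name is an assumption.

```go
package main

import "fmt"

// resolveProvider returns the provider to use and whether a fallback to the
// default (ResetData) occurred because the configured name was not recognized.
func resolveProvider(configured string) (provider string, fellBack bool) {
    switch configured {
    case "resetdata", "ollama":
        return configured, false
    default:
        return "resetdata", true
    }
}

func main() {
    for _, name := range []string{"ollama", "resetdata", "something-else"} {
        provider, fellBack := resolveProvider(name)
        if fellBack {
            fmt.Printf("warning: unknown provider %q, using %s\n", name, provider)
            continue
        }
        fmt.Printf("using provider %s\n", provider)
    }
}
```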

**Model Configuration** (lines 662-667):
```go
reasoning.SetModelConfig(
    cfg.Agent.Models,
    cfg.Agent.ModelSelectionWebhook,
    cfg.Agent.DefaultReasoningModel,
)
```

**Prompt Initialization** (lines 669-683):
- Read prompts from `CHORUS_PROMPTS_DIR`
- Read default instructions from `CHORUS_DEFAULT_INSTRUCTIONS_PATH`
- Compose role-specific system prompt if role configured
- Fall back to default instructions if no role

---

## SHHH Integration

### Audit Sink

```go
// Lines 609-618 in shared.go
type shhhAuditSink struct {
    logger *SimpleLogger
}

func (s *shhhAuditSink) RecordRedaction(_ context.Context, event shhh.AuditEvent)
```

**Purpose:** Logs all SHHH redaction events

**Log Format:**
```
[WARN] 🔒 SHHH redaction applied (rule=api_key severity=high path=/workspace/data/config.json)
```

### Findings Observer

```go
// Lines 600-607 in shared.go
func (r *SharedRuntime) handleShhhFindings(ctx context.Context, findings []shhh.Finding)
```

**Purpose:** Records SHHH findings in metrics

**Implementation:**
```go
for _, finding := range findings {
    r.Metrics.IncrementSHHHFindings(
        finding.Rule,
        string(finding.Severity),
        finding.Count,
    )
}
```

---

## Configuration Integration

### Environment Loading

**Performed in Initialize()** (line 149):
```go
cfg, err := config.LoadFromEnvironment()
```

**See:** `pkg/config` documentation for complete environment variable reference

### Assignment Loading

**Dynamic Assignment** (lines 160-176):
```go
if assignURL := os.Getenv("ASSIGN_URL"); assignURL != "" {
    runtime.Logger.Info("📡 Loading assignment from WHOOSH: %s", assignURL)

    ctx, cancel := context.WithTimeout(runtime.Context, 10*time.Second)
    if err := runtime.RuntimeConfig.LoadAssignment(ctx, assignURL); err != nil {
        runtime.Logger.Warn("⚠️ Failed to load assignment: %v", err)
    } else {
        runtime.Logger.Info("✅ Assignment loaded successfully")
    }
    cancel()

    // Start reload handler for SIGHUP
    runtime.RuntimeConfig.StartReloadHandler(runtime.Context, assignURL)
}
```

**SIGHUP Reload:**
- Send `kill -HUP <pid>` to reload assignment
- No restart required
- Updates: bootstrap peers, role, expertise, max tasks, etc.

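
A SIGHUP-driven reload is typically built on `os/signal`. The sketch below shows the general pattern under that assumption and is not the actual `StartReloadHandler` implementation. The `reload` callback is a placeholder for whatever re-fetches the assignment, the 10-second timeout mirrors the initial load shown above, and `demoReload` exists only to exercise the handler by signalling the current process (Unix-only, since it uses `syscall.Kill`).

```go
package main

import (
    "context"
    "fmt"
    "os"
    "os/signal"
    "syscall"
    "time"
)

// startReloadHandler calls reload every time the process receives SIGHUP,
// until ctx is cancelled.
func startReloadHandler(ctx context.Context, reload func(context.Context) error) {
    sigCh := make(chan os.Signal, 1)
    signal.Notify(sigCh, syscall.SIGHUP)

    go func() {
        defer signal.Stop(sigCh)
        for {
            select {
            case <-ctx.Done():
                return
            case <-sigCh:
                reloadCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
                if err := reload(reloadCtx); err != nil {
                    fmt.Fprintf(os.Stderr, "reload failed: %v\n", err)
                }
                cancel()
            }
        }
    }()
}

// demoReload sends SIGHUP to the current process and reports whether the
// handler ran within two seconds.
func demoReload() bool {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    reloaded := make(chan struct{}, 1)
    startReloadHandler(ctx, func(context.Context) error {
        reloaded <- struct{}{}
        return nil
    })

    if err := syscall.Kill(os.Getpid(), syscall.SIGHUP); err != nil {
        return false
    }
    select {
    case <-reloaded:
        return true
    case <-time.After(2 * time.Second):
        return false
    }
}

func main() {
    fmt.Println("reloaded:", demoReload())
}
```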

---

## Usage Examples

### Example 1: Basic Initialization (Agent)

```go
package main

import (
    "fmt"
    "os"

    "chorus/internal/runtime"
)

func main() {
    if err := run(); err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
}

// run keeps the deferred Cleanup reachable: calling os.Exit in the same
// function would skip deferred calls.
func run() error {
    // Set build info
    runtime.AppVersion = "1.0.0"
    runtime.AppCommitHash = "abc123"
    runtime.AppBuildDate = "2025-09-30"

    // Initialize runtime
    rt, err := runtime.Initialize("agent")
    if err != nil {
        return fmt.Errorf("failed to initialize: %w", err)
    }
    defer rt.Cleanup()

    // Start agent mode (blocks until shutdown)
    if err := rt.StartAgentMode(); err != nil {
        return fmt.Errorf("agent mode failed: %w", err)
    }
    return nil
}
```

### Example 2: Custom HAP Mode

```go
// Imports omitted for brevity; hapui is the HAP terminal UI package.
func main() {
    runtime.AppVersion = "1.0.0"

    rt, err := runtime.Initialize("hap")
    if err != nil {
        fmt.Fprintf(os.Stderr, "Failed to initialize: %v\n", err)
        os.Exit(1)
    }

    // HAP mode: manual interaction instead of StartAgentMode()
    terminal := hapui.NewTerminalInterface(rt)
    err = terminal.Start()

    // Clean up explicitly: os.Exit would skip a deferred call.
    rt.Cleanup()
    if err != nil {
        fmt.Fprintf(os.Stderr, "Terminal failed: %v\n", err)
        os.Exit(1)
    }
}
```

### Example 3: Accessing Components

```go
// Imports omitted for brevity.
func main() {
    rt, err := runtime.Initialize("agent")
    if err != nil {
        log.Fatalf("Failed to initialize: %v", err)
    }
    defer rt.Cleanup()

    // Access initialized components
    nodeID := rt.Node.ID().ShortString()
    fmt.Printf("Node ID: %s\n", nodeID)

    // Publish custom message
    rt.PubSub.Publish("chorus/custom", []byte("hello"))

    // Store data in DHT (EncryptedStorage is optional and may be nil)
    if rt.EncryptedStorage != nil {
        rt.EncryptedStorage.Put(context.Background(), "key", []byte("value"))
    }

    // Check if this node is admin
    if rt.ElectionManager.IsAdmin() {
        fmt.Println("This node is admin")
    }

    // Start agent behaviors (blocks until shutdown)
    if err := rt.StartAgentMode(); err != nil {
        log.Printf("Agent mode failed: %v", err)
    }
}
```

---

## Implementation Status

| Feature | Status | Notes |
|---------|--------|-------|
| **Initialization** | ✅ Production | Complete initialization flow |
| **Configuration Loading** | ✅ Production | Environment + assignments |
| **License Validation** | ✅ Production | KACHING integration |
| **P2P Node Setup** | ✅ Production | libp2p, mDNS, DHT |
| **PubSub Initialization** | ✅ Production | Topic subscriptions |
| **Election System** | ✅ Production | Democratic election |
| **DHT Storage** | ✅ Production | Encrypted distributed storage |
| **Task Coordination** | ✅ Production | Work distribution |
| **HTTP API Server** | ✅ Production | REST endpoints |
| **UCXI Server** | 🔶 Beta | Optional content resolution |
| **Health Monitoring** | ✅ Production | Liveness & readiness |
| **Graceful Shutdown** | ✅ Production | Dependency-ordered cleanup |
| **BACKBEAT Integration** | 🔶 Beta | Optional P2P telemetry |
| **SHHH Sentinel** | ✅ Production | Secrets detection |
| **Metrics Collection** | ✅ Production | Prometheus format |
| **Agent Mode** | ✅ Production | Autonomous behaviors |
| **Availability Broadcasting** | ✅ Production | Every 30s |
| **Capabilities Broadcasting** | ✅ Production | On startup |
| **Role Broadcasting** | ✅ Production | On startup if configured |
| **SIGHUP Reload** | ✅ Production | Dynamic reconfiguration |
| **Live Capability Updates** | ❌ TODO | Re-broadcast on config change |

---

## Error Handling

### Critical Errors (Fatal)

These errors cause immediate exit:

1. **Configuration Loading Failure** (line 151)
   ```
   ❌ Configuration error: <details>
   ```

2. **License Validation Failure** (line 189)
   ```
   ❌ License validation failed: <details>
   ```

3. **P2P Node Creation Failure** (line 234)
   ```
   ❌ Failed to create P2P node: <details>
   ```

4. **PubSub Initialization Failure** (line 264)
   ```
   ❌ Failed to create PubSub: <details>
   ```

### Non-Critical Errors (Warnings)

These errors log warnings but allow startup to continue:

1. **Assignment Loading Failure** (line 166)
   ```
   ⚠️ Failed to load assignment (continuing with base config): <details>
   ```

2. **BACKBEAT Initialization Failure** (line 219)
   ```
   ⚠️ BACKBEAT integration initialization failed: <details>
   📍 P2P operations will run without beat synchronization
   ```

3. **DHT Bootstrap Failure** (line 426)
   ```
   ⚠️ DHT bootstrap failed: <details>
   ```

4. **Bootstrap Peer Connection Failure** (line 473)
   ```
   ⚠️ Failed to connect to bootstrap peer <addr>: <details>
   ```

5. **UCXI Storage Creation Failure** (line 572)
   ```
   ⚠️ Failed to create UCXI storage: <details>
   ```
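
The split between fatal and non-fatal failures can be sketched as a step runner that aborts on critical errors and collects warnings for the rest. Everything in the block below is hypothetical: `initStep`, `runSteps`, and the step names are invented for illustration and do not mirror the real `Initialize()` code.

```go
package main

import (
    "errors"
    "fmt"
)

// initStep is a hypothetical description of one initialization step.
type initStep struct {
    Name     string
    Critical bool
    Run      func() error
}

// runSteps executes steps in order. A failing critical step aborts and
// returns an error; a failing non-critical step is recorded as a warning
// and startup continues.
func runSteps(steps []initStep) (warnings []string, err error) {
    for _, s := range steps {
        stepErr := s.Run()
        if stepErr == nil {
            continue
        }
        if s.Critical {
            return warnings, fmt.Errorf("%s: %w", s.Name, stepErr)
        }
        warnings = append(warnings, fmt.Sprintf("%s: %v", s.Name, stepErr))
    }
    return warnings, nil
}

func succeed() error { return nil }

func failWith(msg string) func() error {
    return func() error { return errors.New(msg) }
}

func main() {
    warnings, err := runSteps([]initStep{
        {Name: "config", Critical: true, Run: succeed},
        {Name: "backbeat", Critical: false, Run: failWith("server unreachable")},
        {Name: "p2p", Critical: true, Run: succeed},
    })
    fmt.Println("warnings:", warnings, "error:", err)

    _, err = runSteps([]initStep{
        {Name: "license", Critical: true, Run: failWith("invalid license")},
        {Name: "p2p", Critical: true, Run: succeed},
    })
    fmt.Println("error:", err)
}
```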

---

## Related Documentation

- [Commands: chorus-agent](../commands/chorus-agent.md) - Uses Initialize("agent")
- [Commands: chorus-hap](../commands/chorus-hap.md) - Uses Initialize("hap")
- [pkg/config](../packages/config.md) - Configuration structures
- [pkg/health](../packages/health.md) - Health monitoring
- [pkg/shutdown](../packages/shutdown.md) - Graceful shutdown
- [pkg/election](../packages/election.md) - Leader election
- [pkg/dht](../packages/dht.md) - Distributed hash table
- [internal/licensing](licensing.md) - License validation
- [internal/backbeat](backbeat.md) - P2P telemetry

---

## Summary

The `internal/runtime` package is the **backbone** of CHORUS:

✅ **Single Initialization**: All binaries use same initialization path
✅ **Component Lifecycle**: Consistent startup, operation, shutdown
✅ **Health Monitoring**: Liveness, readiness, and enhanced checks
✅ **Graceful Shutdown**: Dependency-ordered cleanup with timeouts
✅ **Dynamic Configuration**: SIGHUP reload without restart
✅ **Agent Behaviors**: Availability, capabilities, role broadcasting
✅ **Security Integration**: License validation, secrets detection
✅ **P2P Foundation**: libp2p, DHT, PubSub, Election, Coordination

This package ensures **consistent, reliable, and production-ready** initialization for all CHORUS components.