# Package: coordinator

**Location**: `/home/tony/chorus/project-queues/active/CHORUS/coordinator/`

## Overview

The `coordinator` package provides the **TaskCoordinator** - the main orchestrator for distributed task management in CHORUS. It handles task discovery, intelligent assignment, execution coordination, and real-time progress tracking across multiple repositories and agents. The coordinator integrates with the PubSub system for role-based collaboration and uses AI-powered execution engines for autonomous task completion.

## Core Components

### TaskCoordinator

The central orchestrator managing task lifecycle across the distributed CHORUS network.

```go
type TaskCoordinator struct {
    pubsub     *pubsub.PubSub
    hlog       *logging.HypercoreLog
    ctx        context.Context
    config     *config.Config
    hmmmRouter *hmmm.Router

    // Repository management
    providers    map[int]repository.TaskProvider // projectID -> provider
    providerLock sync.RWMutex
    factory      repository.ProviderFactory

    // Task management
    activeTasks map[string]*ActiveTask // taskKey -> active task
    taskLock    sync.RWMutex
    taskMatcher repository.TaskMatcher
    taskTracker TaskProgressTracker

    // Task execution
    executionEngine execution.TaskExecutionEngine

    // Agent tracking
    nodeID    string
    agentInfo *repository.AgentInfo

    // Sync settings
    syncInterval time.Duration
    lastSync     map[int]time.Time
    syncLock     sync.RWMutex
}
```

**Key Responsibilities:**
- Discover available tasks across multiple repositories
- Score and assign tasks based on agent capabilities and expertise
- Coordinate task execution with AI-powered execution engines
- Track active tasks and broadcast progress updates
- Request and coordinate multi-agent collaboration
- Integrate with HMMM for meta-discussion and coordination

### ActiveTask

Represents a task currently being worked on by an agent.

```go
type ActiveTask struct {
    Task      *repository.Task
    Provider  repository.TaskProvider
    ProjectID int
    ClaimedAt time.Time
    Status    string // claimed, working, completed, failed
    AgentID   string
    Results   map[string]interface{}
}
```

**Task Lifecycle States:**
1. **claimed** - Task has been claimed by an agent
2. **working** - Agent is actively executing the task
3. **completed** - Task finished successfully
4. **failed** - Task execution failed

### TaskProgressTracker Interface

Callback interface for tracking task progress and updating availability broadcasts.

```go
type TaskProgressTracker interface {
    AddTask(taskID string)
    RemoveTask(taskID string)
}
```

This interface ensures availability broadcasts accurately reflect current workload.

## Task Coordination Flow

### 1. Initialization

```go
coordinator := NewTaskCoordinator(
    ctx,
    ps,         // PubSub instance
    hlog,       // Hypercore log
    cfg,        // Agent configuration
    nodeID,     // P2P node ID
    hmmmRouter, // HMMM router for meta-discussion
    tracker,    // Task progress tracker
)

coordinator.Start()
```

**Initialization Process:**
1. Creates agent info from configuration
2. Sets up task execution engine with AI providers
3. Announces agent role and capabilities via PubSub
4. Starts task discovery loop
5. Begins listening for role-based messages

### 2. Task Discovery and Assignment

**Discovery Loop** (runs every 30 seconds):

```
taskDiscoveryLoop() ->
  (Discovery now handled by WHOOSH integration)
```

**Task Evaluation** (`shouldProcessTask`):

```go
func (tc *TaskCoordinator) shouldProcessTask(task *repository.Task) bool {
    // 1. Check capacity: currentTasks < maxTasks
    // 2. Check if already assigned to this agent
    // 3. Score task fit for agent capabilities
    // 4. Return true if score > 0.5 threshold
}
```

**Task Scoring:**
- Agent role matches required role
- Agent expertise matches required expertise
- Current workload vs capacity
- Task priority level
- Historical performance scores

### 3. Task Claiming and Processing

```
processTask() flow:
1. Evaluate if collaboration needed (shouldRequestCollaboration)
2. Request collaboration via PubSub if needed
3. Claim task through repository provider
4. Create ActiveTask and store in activeTasks map
5. Log claim to Hypercore
6. Announce claim via PubSub (TaskProgress message)
7. Seed HMMM meta-discussion room for task
8. Start execution in background goroutine
```

**Collaboration Request Criteria:**
- Task priority >= 8 (high priority)
- Task requires expertise the agent lacks
- Complex multi-component tasks

### 4. Task Execution

**AI-Powered Execution** (`executeTaskWithAI`):

```go
executionRequest := &execution.TaskExecutionRequest{
    ID:          "repo:taskNumber",
    Type:        determineTaskType(task), // bug_fix, feature_development, etc.
    Description: buildTaskDescription(task),
    Context:     buildTaskContext(task),
    Requirements: &execution.TaskRequirements{
        AIModel:       "", // Auto-selected based on role
        SandboxType:   "docker",
        RequiredTools: []string{"git", "curl"},
        EnvironmentVars: map[string]string{
            "TASK_ID":    taskID,
            "REPOSITORY": repoName,
            "AGENT_ID":   agentID,
            "AGENT_ROLE": agentRole,
        },
    },
    Timeout: 10 * time.Minute,
}

result := tc.executionEngine.ExecuteTask(ctx, executionRequest)
```

**Task Type Detection:**
- **bug_fix** - Keywords: "bug", "fix"
- **feature_development** - Keywords: "feature", "implement"
- **testing** - Keywords: "test"
- **documentation** - Keywords: "doc", "documentation"
- **refactoring** - Keywords: "refactor"
- **code_review** - Keywords: "review"
- **development** - Default for general tasks

**Fallback Mock Execution:**

If the AI execution engine is unavailable or fails, the coordinator falls back to mock execution with simulated work time.

### 5. Task Completion

```
executeTask() completion flow:
1. Update ActiveTask status to "completed"
2. Complete task through repository provider
3. Remove from activeTasks map
4. Update TaskProgressTracker
5. Log completion to Hypercore
6. Announce completion via PubSub
```

**Task Result Structure:**

```go
type TaskResult struct {
    Success  bool
    Message  string
    Metadata map[string]interface{}
    // Includes:
    // - execution_type (ai_powered/mock)
    // - duration
    // - commands_executed
    // - files_generated
    // - resource_usage
    // - artifacts
}
```

## PubSub Integration

### Published Message Types

#### 1. RoleAnnouncement

**Topic**: `hmmm/meta-discussion/v1`
**Frequency**: Once on startup and whenever capabilities change

```json
{
  "type": "role_announcement",
  "from": "peer_id",
  "from_role": "Senior Backend Developer",
  "data": {
    "agent_id": "agent-001",
    "node_id": "Qm...",
    "role": "Senior Backend Developer",
    "expertise": ["Go", "PostgreSQL", "Kubernetes"],
    "capabilities": ["code", "test", "deploy"],
    "max_tasks": 3,
    "current_tasks": 0,
    "status": "ready",
    "specialization": "microservices"
  }
}
```

#### 2. TaskProgress

**Topic**: `CHORUS/coordination/v1`
**Frequency**: On claim, start, completion

**Task Claim:**

```json
{
  "type": "task_progress",
  "from": "peer_id",
  "from_role": "Senior Backend Developer",
  "thread_id": "task-myrepo-42",
  "data": {
    "task_number": 42,
    "repository": "myrepo",
    "title": "Add authentication endpoint",
    "agent_id": "agent-001",
    "agent_role": "Senior Backend Developer",
    "claim_time": "2025-09-30T10:00:00Z",
    "estimated_completion": "2025-09-30T11:00:00Z"
  }
}
```

**Task Status Update:**

```json
{
  "type": "task_progress",
  "from": "peer_id",
  "from_role": "Senior Backend Developer",
  "thread_id": "task-myrepo-42",
  "data": {
    "task_number": 42,
    "repository": "myrepo",
    "agent_id": "agent-001",
    "agent_role": "Senior Backend Developer",
    "status": "started" | "completed",
    "timestamp": "2025-09-30T10:05:00Z"
  }
}
```

#### 3. TaskHelpRequest

**Topic**: `hmmm/meta-discussion/v1`
**Frequency**: When collaboration needed

```json
{
  "type": "task_help_request",
  "from": "peer_id",
  "from_role": "Senior Backend Developer",
  "to_roles": ["Database Specialist"],
  "required_expertise": ["PostgreSQL", "Query Optimization"],
  "priority": "high",
  "thread_id": "task-myrepo-42",
  "data": {
    "task_number": 42,
    "repository": "myrepo",
    "title": "Optimize database queries",
    "required_role": "Database Specialist",
    "required_expertise": ["PostgreSQL", "Query Optimization"],
    "priority": 8,
    "requester_role": "Senior Backend Developer",
    "reason": "expertise_gap"
  }
}
```

### Received Message Types

#### 1. TaskHelpRequest

**Handler**: `handleTaskHelpRequest`

**Response Logic:**
1. Check if agent has required expertise
2. Verify agent has available capacity (currentTasks < maxTasks)
3. If the agent can help, send a TaskHelpResponse
4. Reflect offer into HMMM per-issue room

**Response Message:**

```json
{
  "type": "task_help_response",
  "from": "peer_id",
  "from_role": "Database Specialist",
  "thread_id": "task-myrepo-42",
  "data": {
    "agent_id": "agent-002",
    "agent_role": "Database Specialist",
    "expertise": ["PostgreSQL", "Query Optimization", "Indexing"],
    "availability": 2,
    "offer_type": "collaboration",
    "response_to": { /* original help request data */ }
  }
}
```

#### 2. ExpertiseRequest

**Handler**: `handleExpertiseRequest`

Processes requests for specific expertise areas.

#### 3. CoordinationRequest

**Handler**: `handleCoordinationRequest`

Handles coordination requests for multi-agent tasks.

#### 4. RoleAnnouncement

**Handler**: `handleRoleAnnouncement`

Logs when other agents announce their roles and capabilities.

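Returning to the `handleTaskHelpRequest` response logic listed under Received Message Types, the expertise and capacity checks can be sketched in isolation. This is a minimal illustration, not the package's actual handler: `agentState`, `hasAnyExpertise`, `availableSlots`, and `canOfferHelp` are hypothetical names, and treating a single matching expertise area as sufficient is an assumption (the real handler may require all of them).

```go
package main

import (
	"fmt"
	"strings"
)

// agentState is a hypothetical, trimmed-down stand-in for repository.AgentInfo.
type agentState struct {
	Role         string
	Expertise    []string
	CurrentTasks int
	MaxTasks     int
}

// hasAnyExpertise reports whether the agent knows at least one required area.
// Assumption: matching is case-insensitive and one match is enough.
func hasAnyExpertise(have, required []string) bool {
	known := make(map[string]struct{}, len(have))
	for _, e := range have {
		known[strings.ToLower(e)] = struct{}{}
	}
	for _, r := range required {
		if _, ok := known[strings.ToLower(r)]; ok {
			return true
		}
	}
	return false
}

// availableSlots returns the remaining task capacity, never negative.
func availableSlots(a agentState) int {
	if free := a.MaxTasks - a.CurrentTasks; free > 0 {
		return free
	}
	return 0
}

// canOfferHelp applies the two documented checks in order: expertise first,
// then capacity (currentTasks < maxTasks). The second return value is the
// number that would go into the "availability" field of the response.
func canOfferHelp(a agentState, required []string) (bool, int) {
	if !hasAnyExpertise(a.Expertise, required) {
		return false, 0
	}
	slots := availableSlots(a)
	return slots > 0, slots
}

func main() {
	dba := agentState{
		Role:         "Database Specialist",
		Expertise:    []string{"PostgreSQL", "Query Optimization", "Indexing"},
		CurrentTasks: 1,
		MaxTasks:     3,
	}
	ok, slots := canOfferHelp(dba, []string{"PostgreSQL", "Query Optimization"})
	fmt.Printf("offer=%v availability=%d\n", ok, slots) // offer=true availability=2
}
```

With one of three slots in use, the sketch yields an availability of 2, which lines up with the `"availability": 2` value in the response message example. An agent at full capacity or without any overlapping expertise would stay silent.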
## HMMM Integration

### Per-Issue Room Seeding

When a task is claimed, the coordinator seeds a HMMM meta-discussion room:

```go
seedMsg := hmmm.Message{
    Version:   1,
    Type:      "meta_msg",
    IssueID:   int64(taskNumber),
    ThreadID:  fmt.Sprintf("issue-%d", taskNumber),
    MsgID:     uuid.New().String(),
    NodeID:    nodeID,
    HopCount:  0,
    Timestamp: time.Now().UTC(),
    Message:   "Seed: Task 'title' claimed. Description: ...",
}

hmmmRouter.Publish(ctx, seedMsg)
```

**Purpose:**
- Creates dedicated discussion space for task
- Enables agents to coordinate on specific tasks
- Integrates with broader meta-coordination system
- Provides context for SLURP event generation

### Help Offer Reflection

When agents offer help, the offer is reflected into the HMMM room:

```go
hmsg := hmmm.Message{
    Version:   1,
    Type:      "meta_msg",
    IssueID:   issueID,
    ThreadID:  fmt.Sprintf("issue-%d", issueID),
    MsgID:     uuid.New().String(),
    NodeID:    nodeID,
    HopCount:  0,
    Timestamp: time.Now().UTC(),
    Message: fmt.Sprintf("Help offer from %s (availability %d)",
        agentRole, availableSlots),
}
```

## Availability Tracking

The coordinator tracks task progress to keep availability broadcasts accurate:

```go
// When task is claimed:
if tc.taskTracker != nil {
    tc.taskTracker.AddTask(taskKey)
}

// When task completes:
if tc.taskTracker != nil {
    tc.taskTracker.RemoveTask(taskKey)
}
```

This ensures the availability broadcaster (in `internal/runtime`) has accurate real-time data:

```json
{
  "type": "availability_broadcast",
  "data": {
    "node_id": "Qm...",
    "available_for_work": true,
    "current_tasks": 1,
    "max_tasks": 3,
    "last_activity": 1727692800,
    "status": "working",
    "timestamp": 1727692800
  }
}
```

## Task Assignment Algorithm

### Scoring System

The `TaskMatcher` scores tasks for agents based on multiple factors:

```
Score = (roleMatch * 0.4) +
        (expertiseMatch * 0.3) +
        (availabilityScore * 0.2) +
        (performanceScore * 0.1)

Where:
- roleMatch: 1.0 if agent role matches required role, 0.5 for partial match
- expertiseMatch: fraction of required expertise the agent possesses (0.0-1.0)
- availabilityScore: (maxTasks - currentTasks) / maxTasks
- performanceScore: agent's historical performance metric (0.0-1.0)
```

**Threshold**: Tasks with score > 0.5 are considered for assignment.

### Assignment Priority

Tasks are prioritized by:

1. **Priority Level** (task.Priority field, 0-10)
2. **Task Score** (calculated by matcher)
3. **Age** (older tasks first)
4. **Dependencies** (tasks blocking others)

### Claim Race Condition Handling

Multiple agents may attempt to claim the same task:

```
1. Agent A evaluates task: score = 0.8, attempts claim
2. Agent B evaluates task: score = 0.7, attempts claim
3. Repository provider uses atomic claim operation
4. First successful claim wins
5. Other agents receive claim failure
6. Failed agents continue to next task
```

## Error Handling

### Task Execution Failures

```go
// On AI execution failure:
if err := tc.executeTaskWithAI(activeTask); err != nil {
    // Fall back to mock execution
    taskResult = tc.executeMockTask(activeTask)
}

// On completion failure:
if err := provider.CompleteTask(task, result); err != nil {
    // Update status to failed
    activeTask.Status = "failed"
    activeTask.Results = map[string]interface{}{
        "error": err.Error(),
    }
}
```

### Collaboration Request Failures

```go
err := tc.pubsub.PublishRoleBasedMessage(
    pubsub.TaskHelpRequest, data, opts)
if err != nil {
    // Log error but continue with task
    fmt.Printf("⚠️ Failed to request collaboration: %v\n", err)
    // Task execution proceeds without collaboration
}
```

### HMMM Seeding Failures

```go
if err := tc.hmmmRouter.Publish(ctx, seedMsg); err != nil {
    // Log error to Hypercore
    tc.hlog.AppendString("system_error", map[string]interface{}{
        "error":       "hmmm_seed_failed",
        "task_number": taskNumber,
        "repository":  repository,
        "message":     err.Error(),
    })
    // Task execution continues without HMMM room
}
```

## Agent Configuration

### Required Configuration

```yaml
agent:
  id: "agent-001"
  role: "Senior Backend Developer"
  expertise:
    - "Go"
    - "PostgreSQL"
    - "Docker"
    - "Kubernetes"
  capabilities:
    - "code"
    - "test"
    - "deploy"
  max_tasks: 3
  specialization: "microservices"
  models:
    - name: "llama3.1:70b"
      provider: "ollama"
      endpoint: "http://192.168.1.72:11434"
```

### AgentInfo Structure

```go
type AgentInfo struct {
    ID           string
    Role         string
    Expertise    []string
    CurrentTasks int
    MaxTasks     int
    Status       string // ready, working, busy, offline
    LastSeen     time.Time
    Performance  map[string]interface{} // score: 0.8
    Availability string // available, busy, offline
}
```

## Hypercore Logging

All coordination events are logged to Hypercore:

### Task Claimed

```go
hlog.Append(logging.TaskClaimed, map[string]interface{}{
    "task_number":   taskNumber,
    "repository":    repository,
    "title":         title,
    "required_role": requiredRole,
    "priority":      priority,
})
```

### Task Completed

```go
hlog.Append(logging.TaskCompleted, map[string]interface{}{
    "task_number": taskNumber,
    "repository":  repository,
    "duration":    durationSeconds,
    "results":     resultsMap,
})
```

## Status Reporting

### Coordinator Status

```go
status := coordinator.GetStatus()
// Returns:
{
  "agent_id": "agent-001",
  "role": "Senior Backend Developer",
  "expertise": ["Go", "PostgreSQL", "Docker"],
  "current_tasks": 1,
  "max_tasks": 3,
  "active_providers": 2,
  "status": "working",
  "active_tasks": [
    {
      "repository": "myrepo",
      "number": 42,
      "title": "Add authentication",
      "status": "working",
      "claimed_at": "2025-09-30T10:00:00Z"
    }
  ]
}
```

## Best Practices

### Task Coordinator Usage

1. **Initialize Early**: Create coordinator during agent startup
2. **Set Task Tracker**: Always provide TaskProgressTracker for accurate availability
3. **Configure HMMM**: Wire up hmmmRouter for meta-discussion integration
4. **Monitor Status**: Periodically check GetStatus() for health monitoring
5. **Handle Failures**: Implement proper error handling for degraded operation

### Configuration Tuning

1. **Max Tasks**: Set based on agent resources (CPU, memory, AI model capacity)
2. **Sync Interval**: Balance between responsiveness and network overhead (default: 30s)
3. **Task Scoring**: Adjust threshold (default: 0.5) based on task availability
4. **Collaboration**: Enable for high-priority or expertise-gap tasks

### Performance Optimization

1. **Task Discovery**: Delegate to WHOOSH for efficient search and indexing
2. **Concurrent Execution**: Use goroutines for parallel task execution
3. **Lock Granularity**: Minimize lock contention with separate locks for providers/tasks
4. **Caching**: Cache agent info and provider connections

## Integration Points

### With PubSub
- Publishes: RoleAnnouncement, TaskProgress, TaskHelpRequest, TaskHelpResponse
- Subscribes: TaskHelpRequest, ExpertiseRequest, CoordinationRequest, RoleAnnouncement
- Topics: CHORUS/coordination/v1, hmmm/meta-discussion/v1

### With HMMM
- Seeds per-issue discussion rooms
- Reflects help offers into rooms
- Enables agent coordination on specific tasks

### With Repository Providers
- Claims tasks atomically
- Fetches task details
- Updates task status
- Completes tasks with results

### With Execution Engine
- Converts repository tasks to execution requests
- Executes tasks with AI providers
- Handles sandbox environments
- Collects execution metrics and artifacts

### With Hypercore
- Logs task claims
- Logs task completions
- Logs coordination errors
- Provides audit trail

## Task Message Format

### PubSub Task Messages

All task-related messages follow the standard PubSub Message format:

```go
type Message struct {
    Type              MessageType            // e.g., "task_progress"
    From              string                 // Peer ID
    Timestamp         time.Time
    Data              map[string]interface{} // Message payload
    HopCount          int
    FromRole          string   // Agent role
    ToRoles           []string // Target roles
    RequiredExpertise []string // Required expertise
    ProjectID         string
    Priority          string // low, medium, high, urgent
    ThreadID          string // Conversation thread
}
```

### Task Assignment Message Flow

```
1. TaskAnnouncement (WHOOSH → PubSub)
   ├─ Available task discovered
   └─ Broadcast to coordination topic

2. Task Evaluation (Local)
   ├─ Score task for agent
   └─ Decide whether to claim

3. TaskClaim (Agent → Repository)
   ├─ Atomic claim operation
   └─ Only one agent succeeds

4. TaskProgress (Agent → PubSub)
   ├─ Announce claim to network
   └─ Status: "claimed"

5. TaskHelpRequest (Optional, Agent → PubSub)
   ├─ Request collaboration if needed
   └─ Target specific roles/expertise

6. TaskHelpResponse (Other Agents → PubSub)
   ├─ Offer assistance
   └─ Include availability info

7. TaskProgress (Agent → PubSub)
   ├─ Announce work started
   └─ Status: "started"

8. Task Execution (Local with AI Engine)
   ├─ Execute task in sandbox
   └─ Generate artifacts

9. TaskProgress (Agent → PubSub)
   ├─ Announce completion
   └─ Status: "completed"
```

## See Also

- [discovery/](discovery.md) - mDNS peer discovery for local network
- [pkg/coordination/](coordination.md) - Coordination primitives and dependency detection
- [pubsub/](../pubsub.md) - PubSub messaging system
- [pkg/execution/](execution.md) - Task execution engine
- [pkg/hmmm/](hmmm.md) - Meta-discussion and coordination
- [internal/runtime](../internal/runtime.md) - Agent runtime and availability broadcasting