# Package: coordinator

**Location**: `/home/tony/chorus/project-queues/active/CHORUS/coordinator/`

## Overview

The `coordinator` package provides the **TaskCoordinator**, the main orchestrator for distributed task management in CHORUS. It handles task discovery, capability-based task assignment, execution coordination, and real-time progress tracking across multiple repositories and agents. The coordinator integrates with the PubSub system for role-based collaboration and uses AI-powered execution engines for autonomous task completion.

## Core Components

### TaskCoordinator

The central orchestrator managing task lifecycle across the distributed CHORUS network.

```go
type TaskCoordinator struct {
    pubsub     *pubsub.PubSub
    hlog       *logging.HypercoreLog
    ctx        context.Context
    config     *config.Config
    hmmmRouter *hmmm.Router

    // Repository management
    providers    map[int]repository.TaskProvider // projectID -> provider
    providerLock sync.RWMutex
    factory      repository.ProviderFactory

    // Task management
    activeTasks map[string]*ActiveTask // taskKey -> active task
    taskLock    sync.RWMutex
    taskMatcher repository.TaskMatcher
    taskTracker TaskProgressTracker

    // Task execution
    executionEngine execution.TaskExecutionEngine

    // Agent tracking
    nodeID    string
    agentInfo *repository.AgentInfo

    // Sync settings
    syncInterval time.Duration
    lastSync     map[int]time.Time
    syncLock     sync.RWMutex
}
```

**Key Responsibilities:**
- Discover available tasks across multiple repositories
- Score and assign tasks based on agent capabilities and expertise
- Coordinate task execution with AI-powered execution engines
- Track active tasks and broadcast progress updates
- Request and coordinate multi-agent collaboration
- Integrate with HMMM for meta-discussion and coordination

### ActiveTask

Represents a task currently being worked on by an agent.

```go
type ActiveTask struct {
    Task      *repository.Task
    Provider  repository.TaskProvider
    ProjectID int
    ClaimedAt time.Time
    Status    string // claimed, working, completed, failed
    AgentID   string
    Results   map[string]interface{}
}
```

**Task Lifecycle States:**
1. **claimed** - Task has been claimed by an agent
2. **working** - Agent is actively executing the task
3. **completed** - Task finished successfully
4. **failed** - Task execution failed
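
The four states above imply a small state machine. The following is a minimal sketch, assuming transitions only move forward and that a claimed task may fail before work starts. The `canTransition` helper and the transition table are hypothetical and are not part of the package.

```go
package main

// Status values mirror the strings stored in ActiveTask.Status.
const (
    StatusClaimed   = "claimed"
    StatusWorking   = "working"
    StatusCompleted = "completed"
    StatusFailed    = "failed"
)

// allowedTransitions is an assumed transition table; the coordinator
// itself may not enforce these rules.
var allowedTransitions = map[string][]string{
    StatusClaimed: {StatusWorking, StatusFailed},
    StatusWorking: {StatusCompleted, StatusFailed},
}

// canTransition reports whether moving from one status to another is
// permitted by the table above. Terminal states ("completed", "failed")
// have no outgoing transitions.
func canTransition(from, to string) bool {
    for _, next := range allowedTransitions[from] {
        if next == to {
            return true
        }
    }
    return false
}
```

A guard like this could run before any assignment to `ActiveTask.Status`, so that a completed task is never moved back to `working`.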

### TaskProgressTracker Interface

Callback interface for tracking task progress and updating availability broadcasts.

```go
type TaskProgressTracker interface {
    AddTask(taskID string)
    RemoveTask(taskID string)
}
```

The coordinator calls `AddTask` when it claims a task and `RemoveTask` when the task finishes, so availability broadcasts reflect the current workload.
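
As an illustration, a tracker can be as small as a mutex-protected set. This sketch assumes an in-memory implementation; `memoryTracker`, `newMemoryTracker`, and `CurrentTasks` are hypothetical names, and the real tracker lives outside this package.

```go
package main

import "sync"

// TaskProgressTracker is repeated here so the sketch compiles on its own.
type TaskProgressTracker interface {
    AddTask(taskID string)
    RemoveTask(taskID string)
}

// memoryTracker is a hypothetical in-memory tracker keyed by task ID.
type memoryTracker struct {
    mu    sync.Mutex
    tasks map[string]struct{}
}

// Compile-time check that memoryTracker satisfies the interface.
var _ TaskProgressTracker = (*memoryTracker)(nil)

func newMemoryTracker() *memoryTracker {
    return &memoryTracker{tasks: make(map[string]struct{})}
}

// AddTask records a task as in progress. Adding the same ID twice is a no-op.
func (t *memoryTracker) AddTask(taskID string) {
    t.mu.Lock()
    defer t.mu.Unlock()
    t.tasks[taskID] = struct{}{}
}

// RemoveTask forgets a task. Removing an unknown ID is a no-op.
func (t *memoryTracker) RemoveTask(taskID string) {
    t.mu.Lock()
    defer t.mu.Unlock()
    delete(t.tasks, taskID)
}

// CurrentTasks returns how many tasks are currently tracked.
func (t *memoryTracker) CurrentTasks() int {
    t.mu.Lock()
    defer t.mu.Unlock()
    return len(t.tasks)
}
```

Using a set rather than a counter keeps the count correct even if the same task ID is added or removed more than once.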

## Task Coordination Flow

### 1. Initialization

```go
coordinator := NewTaskCoordinator(
    ctx,
    ps,           // PubSub instance
    hlog,         // Hypercore log
    cfg,          // Agent configuration
    nodeID,       // P2P node ID
    hmmmRouter,   // HMMM router for meta-discussion
    tracker,      // Task progress tracker
)

coordinator.Start()
```

**Initialization Process:**
1. Creates agent info from configuration
2. Sets up task execution engine with AI providers
3. Announces agent role and capabilities via PubSub
4. Starts task discovery loop
5. Begins listening for role-based messages

### 2. Task Discovery and Assignment

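**Discovery Loop** (`taskDiscoveryLoop`, runs every 30 seconds). The loop still ticks on this interval, but task discovery itself is now handled by the WHOOSH integration:
```
taskDiscoveryLoop() ->
  (Discovery now handled by WHOOSH integration)
```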

**Task Evaluation** (`shouldProcessTask`):
```go
func (tc *TaskCoordinator) shouldProcessTask(task *repository.Task) bool {
    // 1. Check capacity: currentTasks < maxTasks
    // 2. Check if already assigned to this agent
    // 3. Score task fit for agent capabilities
    // 4. Return true if score > 0.5 threshold
}
```

**Task Scoring:**
- Agent role matches required role
- Agent expertise matches required expertise
- Current workload vs capacity
- Task priority level
- Historical performance scores

### 3. Task Claiming and Processing

```
processTask() flow:
  1. Evaluate if collaboration needed (shouldRequestCollaboration)
  2. Request collaboration via PubSub if needed
  3. Claim task through repository provider
  4. Create ActiveTask and store in activeTasks map
  5. Log claim to Hypercore
  6. Announce claim via PubSub (TaskProgress message)
  7. Seed HMMM meta-discussion room for task
  8. Start execution in background goroutine
```

**Collaboration Request Criteria:**
- Task priority >= 8 (high priority)
- Task requires expertise the agent lacks
- Complex multi-component tasks
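
The first two criteria are mechanical and can be sketched as follows. `needsCollaboration` is a hypothetical helper; it assumes case-insensitive expertise matching and leaves out the "complex multi-component" criterion, which this document does not define precisely.

```go
package main

import "strings"

// highPriority is the priority level at or above which help is requested.
const highPriority = 8

// needsCollaboration returns true when the task is high priority or
// requires at least one expertise area the agent does not list.
func needsCollaboration(priority int, required, agentExpertise []string) bool {
    if priority >= highPriority {
        return true
    }
    have := make(map[string]bool, len(agentExpertise))
    for _, e := range agentExpertise {
        have[strings.ToLower(e)] = true
    }
    for _, r := range required {
        if !have[strings.ToLower(r)] {
            return true // expertise gap
        }
    }
    return false
}
```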

### 4. Task Execution

**AI-Powered Execution** (`executeTaskWithAI`):

```go
executionRequest := &execution.TaskExecutionRequest{
    ID:          "repo:taskNumber",
    Type:        determineTaskType(task), // bug_fix, feature_development, etc.
    Description: buildTaskDescription(task),
    Context:     buildTaskContext(task),
    Requirements: &execution.TaskRequirements{
        AIModel:        "", // Auto-selected based on role
        SandboxType:    "docker",
        RequiredTools:  []string{"git", "curl"},
        EnvironmentVars: map[string]string{
            "TASK_ID":    taskID,
            "REPOSITORY": repoName,
            "AGENT_ID":   agentID,
            "AGENT_ROLE": agentRole,
        },
    },
    Timeout: 10 * time.Minute,
}

result := tc.executionEngine.ExecuteTask(ctx, executionRequest)
```

**Task Type Detection:**
- **bug_fix** - Keywords: "bug", "fix"
- **feature_development** - Keywords: "feature", "implement"
- **testing** - Keywords: "test"
- **documentation** - Keywords: "doc", "documentation"
- **refactoring** - Keywords: "refactor"
- **code_review** - Keywords: "review"
- **development** - Default for general tasks
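
A keyword table like the one above can be checked with a simple ordered scan. The sketch below is an assumption about how such detection might work, not the package's `determineTaskType`: it matches lowercase substrings in the title and description, and it resolves overlaps by taking the first rule that matches in the listed order.

```go
package main

import "strings"

// typeKeywords lists the rules in the order they are checked.
var typeKeywords = []struct {
    taskType string
    keywords []string
}{
    {"bug_fix", []string{"bug", "fix"}},
    {"feature_development", []string{"feature", "implement"}},
    {"testing", []string{"test"}},
    {"documentation", []string{"doc"}}, // "doc" also covers "documentation"
    {"refactoring", []string{"refactor"}},
    {"code_review", []string{"review"}},
}

// detectTaskType returns the first matching task type, or "development"
// when no keyword is found.
func detectTaskType(title, description string) string {
    text := strings.ToLower(title + " " + description)
    for _, rule := range typeKeywords {
        for _, kw := range rule.keywords {
            if strings.Contains(text, kw) {
                return rule.taskType
            }
        }
    }
    return "development"
}
```

Substring matching is deliberately naive here: a title containing "prefix" would match the "fix" keyword, so a production version would likely match on word boundaries or labels instead.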

**Fallback Mock Execution:**
If the AI execution engine is unavailable or fails, the coordinator falls back to mock execution with simulated work time.

### 5. Task Completion

```
executeTask() completion flow:
  1. Update ActiveTask status to "completed"
  2. Complete task through repository provider
  3. Remove from activeTasks map
  4. Update TaskProgressTracker
  5. Log completion to Hypercore
  6. Announce completion via PubSub
```

**Task Result Structure:**
```go
type TaskResult struct {
    Success  bool
    Message  string
    Metadata map[string]interface{} // Includes:
                                     // - execution_type (ai_powered/mock)
                                     // - duration
                                     // - commands_executed
                                     // - files_generated
                                     // - resource_usage
                                     // - artifacts
}
```

## PubSub Integration

### Published Message Types

#### 1. RoleAnnouncement
**Topic**: `hmmm/meta-discussion/v1`
**Frequency**: Once on startup and whenever capabilities change

```json
{
  "type": "role_announcement",
  "from": "peer_id",
  "from_role": "Senior Backend Developer",
  "data": {
    "agent_id": "agent-001",
    "node_id": "Qm...",
    "role": "Senior Backend Developer",
    "expertise": ["Go", "PostgreSQL", "Kubernetes"],
    "capabilities": ["code", "test", "deploy"],
    "max_tasks": 3,
    "current_tasks": 0,
    "status": "ready",
    "specialization": "microservices"
  }
}
```

#### 2. TaskProgress
**Topic**: `CHORUS/coordination/v1`
**Frequency**: On claim, start, completion

**Task Claim:**
```json
{
  "type": "task_progress",
  "from": "peer_id",
  "from_role": "Senior Backend Developer",
  "thread_id": "task-myrepo-42",
  "data": {
    "task_number": 42,
    "repository": "myrepo",
    "title": "Add authentication endpoint",
    "agent_id": "agent-001",
    "agent_role": "Senior Backend Developer",
    "claim_time": "2025-09-30T10:00:00Z",
    "estimated_completion": "2025-09-30T11:00:00Z"
  }
}
```

**Task Status Update** (`status` is either `"started"` or `"completed"`):
```json
{
  "type": "task_progress",
  "from": "peer_id",
  "from_role": "Senior Backend Developer",
  "thread_id": "task-myrepo-42",
  "data": {
    "task_number": 42,
    "repository": "myrepo",
    "agent_id": "agent-001",
    "agent_role": "Senior Backend Developer",
    "status": "started",
    "timestamp": "2025-09-30T10:05:00Z"
  }
}
```

#### 3. TaskHelpRequest
**Topic**: `hmmm/meta-discussion/v1`
**Frequency**: When collaboration is needed

```json
{
  "type": "task_help_request",
  "from": "peer_id",
  "from_role": "Senior Backend Developer",
  "to_roles": ["Database Specialist"],
  "required_expertise": ["PostgreSQL", "Query Optimization"],
  "priority": "high",
  "thread_id": "task-myrepo-42",
  "data": {
    "task_number": 42,
    "repository": "myrepo",
    "title": "Optimize database queries",
    "required_role": "Database Specialist",
    "required_expertise": ["PostgreSQL", "Query Optimization"],
    "priority": 8,
    "requester_role": "Senior Backend Developer",
    "reason": "expertise_gap"
  }
}
```

### Received Message Types

#### 1. TaskHelpRequest
**Handler**: `handleTaskHelpRequest`

**Response Logic:**
1. Check if agent has required expertise
2. Verify agent has available capacity (currentTasks < maxTasks)
3. If can help, send TaskHelpResponse
4. Reflect offer into HMMM per-issue room

**Response Message** (`response_to` carries the original help request data, shown empty here):
```json
{
  "type": "task_help_response",
  "from": "peer_id",
  "from_role": "Database Specialist",
  "thread_id": "task-myrepo-42",
  "data": {
    "agent_id": "agent-002",
    "agent_role": "Database Specialist",
    "expertise": ["PostgreSQL", "Query Optimization", "Indexing"],
    "availability": 2,
    "offer_type": "collaboration",
    "response_to": {}
  }
}
```

#### 2. ExpertiseRequest
**Handler**: `handleExpertiseRequest`

Processes requests for specific expertise areas.

#### 3. CoordinationRequest
**Handler**: `handleCoordinationRequest`

Handles coordination requests for multi-agent tasks.

#### 4. RoleAnnouncement
**Handler**: `handleRoleAnnouncement`

Logs when other agents announce their roles and capabilities.

## HMMM Integration

### Per-Issue Room Seeding

When a task is claimed, the coordinator seeds a HMMM meta-discussion room:

```go
seedMsg := hmmm.Message{
    Version:   1,
    Type:      "meta_msg",
    IssueID:   int64(taskNumber),
    ThreadID:  fmt.Sprintf("issue-%d", taskNumber),
    MsgID:     uuid.New().String(),
    NodeID:    nodeID,
    HopCount:  0,
    Timestamp: time.Now().UTC(),
    Message:   "Seed: Task 'title' claimed. Description: ...",
}

hmmmRouter.Publish(ctx, seedMsg)
```

**Purpose:**
- Creates dedicated discussion space for task
- Enables agents to coordinate on specific tasks
- Integrates with broader meta-coordination system
- Provides context for SLURP event generation

### Help Offer Reflection

When agents offer help, the offer is reflected into the HMMM room:

```go
hmsg := hmmm.Message{
    Version:   1,
    Type:      "meta_msg",
    IssueID:   issueID,
    ThreadID:  fmt.Sprintf("issue-%d", issueID),
    MsgID:     uuid.New().String(),
    NodeID:    nodeID,
    HopCount:  0,
    Timestamp: time.Now().UTC(),
    Message:   fmt.Sprintf("Help offer from %s (availability %d)",
                          agentRole, availableSlots),
}
```

## Availability Tracking

The coordinator tracks task progress to keep availability broadcasts accurate:

```go
// When task is claimed:
if tc.taskTracker != nil {
    tc.taskTracker.AddTask(taskKey)
}

// When task completes:
if tc.taskTracker != nil {
    tc.taskTracker.RemoveTask(taskKey)
}
```

This ensures the availability broadcaster (in `internal/runtime`) has accurate real-time data:

```json
{
  "type": "availability_broadcast",
  "data": {
    "node_id": "Qm...",
    "available_for_work": true,
    "current_tasks": 1,
    "max_tasks": 3,
    "last_activity": 1727692800,
    "status": "working",
    "timestamp": 1727692800
  }
}
```

## Task Assignment Algorithm

### Scoring System

The `TaskMatcher` scores tasks for agents based on multiple factors:

```
Score = (roleMatch * 0.4) +
        (expertiseMatch * 0.3) +
        (availabilityScore * 0.2) +
        (performanceScore * 0.1)

Where:
- roleMatch: 1.0 if agent role matches required role, 0.5 for partial match
- expertiseMatch: percentage of required expertise agent possesses
- availabilityScore: (maxTasks - currentTasks) / maxTasks
- performanceScore: agent's historical performance metric (0.0-1.0)
```

**Threshold**: Tasks with score > 0.5 are considered for assignment.
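
The weighted sum can be written out directly. The sketch below is not the `TaskMatcher` implementation: the type and function names are hypothetical, a "partial" role match is assumed to mean that one role string contains the other, and a task with no required expertise is assumed to count as a full expertise match.

```go
package main

import "strings"

type agentProfile struct {
    Role         string
    Expertise    []string
    CurrentTasks int
    MaxTasks     int
    Performance  float64 // historical performance, 0.0-1.0
}

type taskNeeds struct {
    RequiredRole      string
    RequiredExpertise []string
}

// roleMatch returns 1.0 for an exact match, 0.5 for a partial match
// (assumed: substring containment), and 0.0 otherwise.
func roleMatch(agentRole, required string) float64 {
    a, r := strings.ToLower(agentRole), strings.ToLower(required)
    switch {
    case a == r:
        return 1.0
    case a != "" && r != "" && (strings.Contains(a, r) || strings.Contains(r, a)):
        return 0.5
    default:
        return 0.0
    }
}

// expertiseMatch returns the fraction of required expertise the agent has.
func expertiseMatch(agentExpertise, required []string) float64 {
    if len(required) == 0 {
        return 1.0 // assumption: nothing required counts as a full match
    }
    have := make(map[string]bool, len(agentExpertise))
    for _, e := range agentExpertise {
        have[strings.ToLower(e)] = true
    }
    matched := 0
    for _, r := range required {
        if have[strings.ToLower(r)] {
            matched++
        }
    }
    return float64(matched) / float64(len(required))
}

// availabilityScore is (maxTasks - currentTasks) / maxTasks, floored at 0.
func availabilityScore(currentTasks, maxTasks int) float64 {
    if maxTasks <= 0 || currentTasks >= maxTasks {
        return 0
    }
    return float64(maxTasks-currentTasks) / float64(maxTasks)
}

// scoreTask combines the four factors with the weights given above.
func scoreTask(a agentProfile, t taskNeeds) float64 {
    return roleMatch(a.Role, t.RequiredRole)*0.4 +
        expertiseMatch(a.Expertise, t.RequiredExpertise)*0.3 +
        availabilityScore(a.CurrentTasks, a.MaxTasks)*0.2 +
        a.Performance*0.1
}
```

For example, an idle agent with an exact role match, half of the required expertise, and a performance of 0.8 scores 0.4 + 0.15 + 0.2 + 0.08 = 0.83, which clears the 0.5 threshold.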

### Assignment Priority

Tasks are prioritized by:
1. **Priority Level** (task.Priority field, 0-10)
2. **Task Score** (calculated by matcher)
3. **Age** (older tasks first)
4. **Dependencies** (tasks blocking others)
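
One way to apply this ordering is a multi-key sort in which each later criterion only breaks ties left by the earlier ones. That tie-breaking interpretation is an assumption, as are the `candidate` type, its field names, and the use of a Unix timestamp for age.

```go
package main

import "sort"

// candidate is a hypothetical view of a task awaiting assignment.
type candidate struct {
    ID          string
    Priority    int     // 0-10, higher first
    Score       float64 // matcher score, higher first
    CreatedUnix int64   // creation time in Unix seconds, older first
    Blocks      int     // number of tasks this one blocks, more first
}

// sortCandidates orders tasks by priority, then score, then age,
// then the number of tasks they block.
func sortCandidates(cs []candidate) {
    sort.SliceStable(cs, func(i, j int) bool {
        a, b := cs[i], cs[j]
        if a.Priority != b.Priority {
            return a.Priority > b.Priority
        }
        if a.Score != b.Score {
            return a.Score > b.Score
        }
        if a.CreatedUnix != b.CreatedUnix {
            return a.CreatedUnix < b.CreatedUnix
        }
        return a.Blocks > b.Blocks
    })
}
```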

### Claim Race Condition Handling

Multiple agents may attempt to claim the same task:

```
1. Agent A evaluates task: score = 0.8, attempts claim
2. Agent B evaluates task: score = 0.7, attempts claim
3. Repository provider uses atomic claim operation
4. First successful claim wins
5. Other agents receive claim failure
6. Failed agents continue to next task
```

## Error Handling

### Task Execution Failures

```go
// On AI execution failure:
taskResult, err := tc.executeTaskWithAI(activeTask)
if err != nil {
    // Fall back to mock execution
    taskResult = tc.executeMockTask(activeTask)
}

// On completion failure:
if err := provider.CompleteTask(task, result); err != nil {
    // Update status to failed
    activeTask.Status = "failed"
    activeTask.Results = map[string]interface{}{
        "error": err.Error(),
    }
}
```

### Collaboration Request Failures

```go
err := tc.pubsub.PublishRoleBasedMessage(
    pubsub.TaskHelpRequest, data, opts)
if err != nil {
    // Log error but continue with task
    fmt.Printf("⚠️ Failed to request collaboration: %v\n", err)
    // Task execution proceeds without collaboration
}
```

### HMMM Seeding Failures

```go
if err := tc.hmmmRouter.Publish(ctx, seedMsg); err != nil {
    // Log error to Hypercore
    tc.hlog.AppendString("system_error", map[string]interface{}{
        "error":       "hmmm_seed_failed",
        "task_number": taskNumber,
        "repository":  repository,
        "message":     err.Error(),
    })
    // Task execution continues without HMMM room
}
```

## Agent Configuration

### Required Configuration

```yaml
agent:
  id: "agent-001"
  role: "Senior Backend Developer"
  expertise:
    - "Go"
    - "PostgreSQL"
    - "Docker"
    - "Kubernetes"
  capabilities:
    - "code"
    - "test"
    - "deploy"
  max_tasks: 3
  specialization: "microservices"
  models:
    - name: "llama3.1:70b"
      provider: "ollama"
      endpoint: "http://192.168.1.72:11434"
```

### AgentInfo Structure

```go
type AgentInfo struct {
    ID           string
    Role         string
    Expertise    []string
    CurrentTasks int
    MaxTasks     int
    Status       string // ready, working, busy, offline
    LastSeen     time.Time
    Performance  map[string]interface{} // e.g. {"score": 0.8}
    Availability string // available, busy, offline
}
```

## Hypercore Logging

All coordination events are logged to Hypercore:

### Task Claimed
```go
hlog.Append(logging.TaskClaimed, map[string]interface{}{
    "task_number":   taskNumber,
    "repository":    repository,
    "title":         title,
    "required_role": requiredRole,
    "priority":      priority,
})
```

### Task Completed
```go
hlog.Append(logging.TaskCompleted, map[string]interface{}{
    "task_number": taskNumber,
    "repository":  repository,
    "duration":    durationSeconds,
    "results":     resultsMap,
})
```

## Status Reporting

### Coordinator Status

```go
status := coordinator.GetStatus()
// Returns (shown as JSON):
// {
//     "agent_id":         "agent-001",
//     "role":             "Senior Backend Developer",
//     "expertise":        ["Go", "PostgreSQL", "Docker"],
//     "current_tasks":    1,
//     "max_tasks":        3,
//     "active_providers": 2,
//     "status":           "working",
//     "active_tasks": [
//         {
//             "repository": "myrepo",
//             "number":     42,
//             "title":      "Add authentication",
//             "status":     "working",
//             "claimed_at": "2025-09-30T10:00:00Z"
//         }
//     ]
// }
```

## Best Practices

### Task Coordinator Usage

1. **Initialize Early**: Create coordinator during agent startup
2. **Set Task Tracker**: Always provide TaskProgressTracker for accurate availability
3. **Configure HMMM**: Wire up hmmmRouter for meta-discussion integration
4. **Monitor Status**: Periodically check GetStatus() for health monitoring
5. **Handle Failures**: Implement proper error handling for degraded operation

### Configuration Tuning

1. **Max Tasks**: Set based on agent resources (CPU, memory, AI model capacity)
2. **Sync Interval**: Balance between responsiveness and network overhead (default: 30s)
3. **Task Scoring**: Adjust threshold (default: 0.5) based on task availability
4. **Collaboration**: Enable for high-priority or expertise-gap tasks

### Performance Optimization

1. **Task Discovery**: Delegate to WHOOSH for efficient search and indexing
2. **Concurrent Execution**: Use goroutines for parallel task execution
3. **Lock Granularity**: Minimize lock contention with separate locks for providers/tasks
4. **Caching**: Cache agent info and provider connections

## Integration Points

### With PubSub
- Publishes: RoleAnnouncement, TaskProgress, TaskHelpRequest, TaskHelpResponse
- Subscribes: TaskHelpRequest, ExpertiseRequest, CoordinationRequest, RoleAnnouncement
- Topics: CHORUS/coordination/v1, hmmm/meta-discussion/v1

### With HMMM
- Seeds per-issue discussion rooms
- Reflects help offers into rooms
- Enables agent coordination on specific tasks

### With Repository Providers
- Claims tasks atomically
- Fetches task details
- Updates task status
- Completes tasks with results

### With Execution Engine
- Converts repository tasks to execution requests
- Executes tasks with AI providers
- Handles sandbox environments
- Collects execution metrics and artifacts

### With Hypercore
- Logs task claims
- Logs task completions
- Logs coordination errors
- Provides audit trail

## Task Message Format

### PubSub Task Messages

All task-related messages follow the standard PubSub Message format:

```go
type Message struct {
    Type              MessageType            // e.g., "task_progress"
    From              string                 // Peer ID
    Timestamp         time.Time
    Data              map[string]interface{} // Message payload
    HopCount          int
    FromRole          string                 // Agent role
    ToRoles           []string               // Target roles
    RequiredExpertise []string               // Required expertise
    ProjectID         string
    Priority          string                 // low, medium, high, urgent
    ThreadID          string                 // Conversation thread
}
```

### Task Assignment Message Flow

```
1. TaskAnnouncement (WHOOSH → PubSub)
   ├─ Available task discovered
   └─ Broadcast to coordination topic

2. Task Evaluation (Local)
   ├─ Score task for agent
   └─ Decide whether to claim

3. TaskClaim (Agent → Repository)
   ├─ Atomic claim operation
   └─ Only one agent succeeds

4. TaskProgress (Agent → PubSub)
   ├─ Announce claim to network
   └─ Status: "claimed"

5. TaskHelpRequest (Optional, Agent → PubSub)
   ├─ Request collaboration if needed
   └─ Target specific roles/expertise

6. TaskHelpResponse (Other Agents → PubSub)
   ├─ Offer assistance
   └─ Include availability info

7. TaskProgress (Agent → PubSub)
   ├─ Announce work started
   └─ Status: "started"

8. Task Execution (Local with AI Engine)
   ├─ Execute task in sandbox
   └─ Generate artifacts

9. TaskProgress (Agent → PubSub)
   ├─ Announce completion
   └─ Status: "completed"
```

## See Also

- [discovery/](discovery.md) - mDNS peer discovery for local network
- [pkg/coordination/](coordination.md) - Coordination primitives and dependency detection
- [pubsub/](../pubsub.md) - PubSub messaging system
- [pkg/execution/](execution.md) - Task execution engine
- [pkg/hmmm/](hmmm.md) - Meta-discussion and coordination
- [internal/runtime](../internal/runtime.md) - Agent runtime and availability broadcasting