> Legacy module reference (WHOOSH-era architecture).  
> Current coordinator implementation is SWOOSH in `/home/tony/chorus/SWOOSH`.  
> This file is retained for historical context and migration lineage.

# WHOOSH: Autonomous AI Development Teams Architecture

## Executive Summary

WHOOSH is evolving from project kickoff council formation to **self-organizing AI development teams** that mirror human collaboration patterns. Currently implemented as a **Council Formation Engine**, WHOOSH automatically detects new project Design Briefs and assembles specialized councils of CHORUS agents to handle project kickoffs. This foundation enables future expansion to autonomous teams that collaborate through P2P channels, reach consensus on solutions, and submit high-quality deliverables to SLURP.

**Current Implementation**: WHOOSH monitors Gitea repositories for "Design Brief" issues labeled `chorus-entrypoint`, then intelligently composes kickoff councils using role definitions from human-roles.yaml. CHORUS agents are deployed via Docker Swarm to collaborate on project initialization, producing kickoff artifacts that define project direction and requirements.

**Future Vision**: Extend beyond project kickoffs to ongoing team management where CHORUS agents autonomously join teams based on capabilities, collaborate democratically through HMMM protocol, and deliver solutions without central orchestration points of failure.

### Current Implementation Snapshot (2025-10)

- **Bootstrap rendezvous & topology** – WHOOSH now exposes `/api/v1/bootstrap-peers` and `/api/v1/topology` (see `internal/server/bootstrap.go`) powered by a combined Swarm + QUIC discovery layer (`internal/p2p/discovery.go`, `internal/p2p/quic_client.go`). Returned entries include transport preference, certificate hash, and prioritised multiaddrs so CHORUS containers can form meshes without hard-coded peers.
- **Assignment broker for CHORUS replicas** – `internal/orchestrator/assignment_broker.go` maintains template-driven runtime assignments (role, model, bootstrap peers, join stagger) and serves them via `/api/v1/assignments`. CHORUS’s runtime (`pkg/config.RuntimeConfig`) consumes the broker to merge WHOOSH-defined overrides into live container config.
- **Backbeat-integrated orchestration** – A dedicated BACKBEAT client (`internal/backbeat/integration.go`) tracks beat cadence, reports search/analysis operations, and emits WHOOSH health claims. Tempo hints feed the scaling controller so wave launches respect cluster rhythm.
- **Wave-based scaling & health gates** – The orchestrator stack (`internal/orchestrator/*.go`) coordinates Docker Swarm deployment, health gating (KACHING, BACKBEAT, self), bootstrap pool management, and metrics exports. Scaling decisions are captured through `ScalingMetricsCollector` and surfaced by the `ScalingAPI`.
- **Spec-Kit enterprise plugin** – The spec-kit HTTP client (`internal/composer/spec_kit_client.go`) wraps the external Spec-Kit service with retries, circuit-breaker toggles, and structured artifact ingestion. Outputs are normalised into council artefacts so SLURP/BUBBLE can ingest enterprise deliverables alongside community workflows.

## Architecture Overview

### Current Implementation: Council Formation Engine

```
┌─────────────────────────────────────────────────────────────────┐
│                 WHOOSH COUNCIL FORMATION                        │
│          (Issue Detection + Council Composition)                │
└─────────────────────┬───────────────────────────────────────────┘
                      │ detects Design Brief issues
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                    GITEA MONITORING                             │
│         (chorus-entrypoint Labels + Webhook Triggers)           │
└─────────────────────┬───────────────────────────────────────────┘
                      │ triggers council deployment
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                 CHORUS AGENT DEPLOYMENT                         │
│            (Docker Swarm + human-roles.yaml)                   │
└─────────────────────┬───────────────────────────────────────────┘
                      │ councils collaborate via P2P
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│               COUNCIL P2P COLLABORATION                         │
│              (HMMM Protocol + UCXL Addressing)                 │
└─────────────────────┬───────────────────────────────────────────┘
                      │ produces kickoff artifacts
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                PROJECT KICKOFF DELIVERABLES                     │
│        (Manifests, DRs, Scaffold Plans, Gate Tests)            │
└─────────────────────────────────────────────────────────────────┘
```

### Future Vision: Autonomous Team Architecture

```
┌─────────────────────────────────────────────────────────────────┐
│              WHOOSH TEAM COMPOSER (Phase 2)                     │
│              (LLM-Powered Team Formation)                       │
└─────────────────────┬───────────────────────────────────────────┘
                      │ extends council formation to ongoing teams
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                    GITEA TEAM MANAGEMENT                        │
│           (Team Issues + Role Assignments)                      │
└─────────────────────┬───────────────────────────────────────────┘
                      │ agents monitor & self-assign
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                 AUTONOMOUS CHORUS AGENTS                        │
│              (Self-Aware Capability Matching)                   │
└─────────────────────┬───────────────────────────────────────────┘
                      │ join team collaboration channels
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                P2P TEAM COLLABORATION                           │
│              (Dedicated Team Channels)                          │
└─────────────────────┬───────────────────────────────────────────┘
                      │ consensus-driven completion
                      ▼
┌─────────────────────────────────────────────────────────────────┐
│                   SLURP INTEGRATION                             │
│            (Artifact Submission + Context)                      │
└─────────────────────────────────────────────────────────────────┘
```

## WHOOSH Team Composer

### Purpose
Uses LLM reasoning to analyze incoming tasks and determine optimal team compositions based on:
- Task complexity and scope
- Required skill domains
- Estimated effort and timeline
- Quality requirements

### Team Composition Logic

**Example Task Analysis:**
```
Task: "Implement secure user authentication system with OAuth2 integration"

LLM Analysis:
- Complexity: High
- Domains: Security, Backend API, Frontend UI, Database, Testing
- Estimated Timeline: 2-3 days
- Quality Requirements: High (security-critical)

Recommended Team Composition:
├── Security Architect (1x)
│   ├── Role: Define security requirements and review implementation
│   ├── Skills: OAuth2, JWT, encryption, security best practices
│   └── AI Model: deepseek-coder-v2 (security focus)
├── Backend Developer (1x)
│   ├── Role: Implement API endpoints and authentication logic
│   ├── Skills: REST APIs, database integration, middleware
│   └── AI Model: qwen2.5-coder:32b (backend specialization)
├── Frontend Developer (1x)
│   ├── Role: Build login UI and authentication flows
│   ├── Skills: React/Vue, state management, form validation
│   └── AI Model: starcoder2:15b (frontend focus)
├── Database Engineer (1x)
│   ├── Role: Design user tables and session management
│   ├── Skills: SQL, database design, performance optimization
│   └── AI Model: granite3-dense:8b (data modeling)
└── QA Engineer (1x)
    ├── Role: Security testing and integration validation
    ├── Skills: Testing frameworks, security scanning, automation
    └── AI Model: phi4:14b (testing focus)
```

### Team Templates

**Pre-defined team configurations for common scenarios:**

1. **Feature Development Team**
   - Backend Developer + Frontend Developer + QA Engineer

2. **Bug Fix Team**
   - Debugger + Code Reviewer

3. **Architecture Design Team**
   - System Architect + Security Architect + Performance Engineer

4. **Documentation Team**
   - Technical Writer + Code Reviewer + Subject Matter Expert

5. **Refactoring Team**
   - Code Analyzer + Refactoring Specialist + QA Engineer

### Scaling and Resilience

WHOOSH is designed for robust, automated scaling and resilience. The system is governed by a set of Service Level Objectives (SLOs) and includes automated testing, failure drills, and operational guardrails to ensure stability under pressure.

#### Golden Signals & SLOs

The following SLOs are defined and monitored, with alerts triggered upon breach:

*   **KACHING**: p95 lease issuance < 250 ms; error/429 rate < 1%.
*   **Join success**: ≥ 95% of new replicas join mesh within 30 s of container Ready.
*   **BackBeat (JetStream)**: per-subject **consumer lag** < 200 msgs; publish acks < 100 ms.
*   **Election stability**: ≤ 1 leader change / 10 min per cluster during steady state.
*   **Swarm**: task start success ≥ 99%; median start→assigned<10 s.

Metrics exposed to monitor these SLOs include:
*   `chorus_license_lease_latency_ms`, `chorus_license_breaker_open_total`
*   `chorus_bootstrap_join_duration_ms`, `chorus_join_success_total{result=...}`
*   `backbeat_stream_lag`, `backbeat_ack_latency_ms`
*   `whoosh_wave_size`, `whoosh_wave_backoff_ms`, `whoosh_gate_block_seconds`
*   `chorus_election_changes_total`

#### Synthetic Scale Test

A nightly, repeatable synthetic scale test is scripted to validate scaling behavior:

1.  Scale from N=3 → 3 + 12 in **waves**, following the WHOOSH policy.
2.  Hold for 5 minutes under synthetic load (CHORUS pulls a fixed assignment that triggers normal P2P + small workload).
3.  Scale back to 3.

**Pass/Fail Criteria**:
*   No gate held > 2 min.
*   No breaker > 60 s open.
*   Join p95 < 25 s.
*   0% orphan tasks after scale-down.

#### Canary Config Reload

To prove the runtime assignment merge works without restarts:

*   A prompt/model change is triggered via WHOOSH for a **10% canary** (by assignment).
*   Success/error rates and JetStream lag are watched for 2 minutes.
*   The change is then rolled out to 100%.

#### Failure Drills

Cheap chaos engineering drills are run to test resilience:

*   **KACHING brownout**: Inject 500 ms latency + 5% 429s for 2 minutes. Expect grace-window starts, brief breaker openings, and no P2P join until the lease is acquired. The system should auto-heal and join within 30s after recovery.
*   **Bootstrap peer loss**: Take 50% of the pool out mid-wave. Expect WHOOSH to pause the next wave, with existing joiners still succeeding via a subset. The pool health should recover before continuing.
*   **BackBeat clog**: Cap consumer read to simulate lag > threshold. Expect the WHOOSH gate to block scaling, while replicas continue local work without dropping messages.
*   **Leader eviction**: Kill the current leader. Verify the stability window prevents thrashing and a new leader is elected in < 5s.

#### Ops Guardrails

*   **Admission control** in WHOOSH (implied by gates): Hard cap on "max replicas added / 5 min".
*   **Per-node placement**: Use Swarm labels so waves don’t pile onto one box (e.g., `placement.max_replicas_per_node: 1` for critical roles).
*   **Quarantine mode** in CHORUS: When a license fails after grace or bootstrap joins time out, expose `/health?quarantine=1` and refuse task intake until cleared.

#### Rollback & Recovery Runbook

1.  **Parameter rollback**: WHOOSH re-points `ASSIGNMENT_EPOCH` to the last-good configuration, and `POST /v1/reload` is sent to all replicas.
2.  **Scale rollback**: `target = previous_replicas`. WHOOSH drains the newest assignments first. Requires `join_success ≥ 90%` before any further changes.
3.  **KACHING outage**: Flip the cluster to **cached-lease only** for up to 10 minutes, block new waves, and page the on-call team.
4.  **Bootstrap meltdown**: Promote 3 healthy workers to **temporary bootstrap** (via label + static list), then resume.

#### WHOOSH Autoscale Policy (Example)

The scaler configuration is kept out of the code in a YAML file:

```yaml
cluster: prod
service: chorus
wave:
  max_per_wave: 8
  min_per_wave: 3
  period_sec: 25
  placement:
    max_replicas_per_node: 1
gates:
  kaching:
    p95_latency_ms: 250
    max_error_rate: 0.01
  backbeat:
    max_stream_lag: 200
  bootstrap:
    min_healthy_peers: 3
  join:
    min_success_rate: 0.80
backoff:
  initial_ms: 15000
  factor: 2.0
  jitter: 0.2
  max_ms: 120000
quarantine:
  enable: true
  exit_on: "kaching_ok && bootstrap_ok"
canary:
  fraction: 0.1
  promote_after_sec: 120
```


## CHORUS Agent Self-Organization

### Agent Self-Awareness
Each CHORUS agent maintains awareness of:
- **Primary Capabilities**: Core skills and specializations
- **Secondary Capabilities**: Additional skills they can contribute
- **Current Load**: Active team memberships and availability
- **Performance History**: Success rates and peer feedback
- **Preferred AI Models**: Best-performing models for their tasks

## TODO

- Team Composer API: Implement llama3.1-based team analysis as a Dockerized service (task analysis → capability mapping → team proposals) with unit/integration tests and metrics.
- SLURP integration: Add endpoints for curated bundle ingest/retrieval and document contracts/auth; validate E2E with BUBBLE/DHT.
- CHORUS connectivity: Enable and validate live consensus/task flows (configure `chorus_endpoints`) with health checks and error handling; remove reliance on mocked data.
- Replace mocked UI test routes with real backend calls for agent lifecycle and health checks.
- Hardware-driven model selection: Add agent-side hardware discovery to drive model selection; avoid hardcoded cluster IPs or model names in configs.
- MCP integration later: Keep MCP optional; maintain clean API boundary for now.
- **Collaboration Style**: Team role preferences

### Autonomous Team Joining Process

1. **Monitoring Phase**
   - Agents continuously monitor GITEA for team formation issues
   - Filter by matching capabilities and availability
   - Assess team composition gaps they could fill

2. **Self-Assessment Phase**
   ```
   Agent Self-Evaluation:
   - "This team needs a frontend developer"
   - "I have React/TypeScript skills (confidence: 85%)"
   - "My current load: 1 active team (capacity available)"
   - "Team timeline: 3 days (fits my schedule)"
   - "Decision: JOIN TEAM"
   ```

3. **Team Application Phase**
   - Agent comments on GITEA issue with capability summary
   - Provides availability window and estimated contribution
   - Existing team members can review and approve/decline

4. **Integration Phase**
   - Agent joins P2P team channel
   - Introduces capabilities and proposes initial approach
   - Begins collaborative work with team

### Capability Matching Algorithm
```python
def assess_team_fit(agent, team_requirement):
    skill_match = calculate_skill_overlap(agent.capabilities, team_requirement.skills)
    availability_match = check_schedule_compatibility(agent.schedule, team_requirement.timeline)
    team_chemistry = assess_collaboration_history(agent, team_requirement.existing_members)
    
    fit_score = (skill_match * 0.5) + (availability_match * 0.3) + (team_chemistry * 0.2)
    return fit_score
```

## GITEA Team Management

### Team Issue Structure
Each team is represented by a GITEA issue with structured metadata:

```yaml
Title: "Team Formation: Secure Authentication System Implementation"

Labels:
- team:auth-system-v2
- complexity:high  
- timeline:3-days
- domain:security
- domain:backend
- domain:frontend

Team Composition:
- [ ] Security Architect (required)
- [x] Backend Developer (@agent-backend-specialist)
- [ ] Frontend Developer (required)
- [ ] QA Engineer (required)
- [ ] Code Reviewer (optional)

Timeline: 2024-08-15 to 2024-08-18
P2P Channel: team-auth-system-v2-channel
SLURP Address: ucxl://teams/auth-system-v2/artifacts
```

### Role Status Management
- **Open**: Role available for assignment
- **Applied**: Agent has expressed interest
- **Assigned**: Agent confirmed for role
- **Active**: Agent currently working
- **Completed**: Role deliverables finished
- **Blocked**: Role waiting on dependencies

### Progress Tracking
Teams update GITEA issue with:
- Daily progress summaries
- Milestone achievements
- Blocker identification
- Resource requests
- Quality gate completions

## P2P Team Collaboration Channels

#### HMMM in the loop 
**Reasoning channels, not just chat.** Team channels carry **structured thought** (HMMM) as well as messages: intermediate chains, critiques, and mini-memos are timestamped, attributed, and ingested by SLURP for later DRs. This enables consensus with evidence, not vibes.

### Channel Architecture
Each team gets dedicated communication infrastructure:

```
Team Channel: team-auth-system-v2-channel
├── Topic Streams:
│   ├── #planning (initial design discussions)
│   ├── #implementation (development coordination)  
│   ├── #review (code/design reviews)
│   ├── #testing (QA coordination)
│   └── #integration (final assembly)
├── File Sharing: Distributed artifact storage
├── Screen Sharing: Real-time collaboration sessions
└── Voice Channels: Synchronous discussion capability
```


```

### Context Preservation
All team communications are automatically:
- Timestamped and attributed to agents
- Categorized by topic stream
- Indexed for searchability
- Ingested by SLURP into Hypercore distributed log

## Consensus Mechanisms

> For quorum rules, vote semantics (green/yellow/red), tempo (beats), and the front‑of‑house review/delivery API contracts, see the WHOOSH Review & Consensus Policy: [../Policy/WHOOSH-Review-Policy.md](../Policy/WHOOSH-Review-Policy.md).

### Democratic Decision Making

Refer to the Review Policy for project‑configurable defaults and API shapes: [../Policy/WHOOSH-Review-Policy.md](../Policy/WHOOSH-Review-Policy.md).

**1. Voting Systems**
- **Simple Majority**: Basic feature decisions
- **Supermajority (2/3)**: Architecture changes
- **Unanimous**: Security-critical decisions
- **Technical Lead Override**: Deadlock resolution

**2. Quality Gates**
Before task completion, teams must achieve consensus on:
- **Functional Requirements**: All specified features implemented
- **Quality Standards**: Code review, testing, documentation complete
- **Security Review**: Security-sensitive changes approved by security role
- **Performance Benchmarks**: Performance requirements met
- **Integration Testing**: End-to-end functionality verified

**3. Completion Criteria**
```yaml
Completion Checklist:
- [ ] All assigned roles have marked deliverables complete
- [ ] Peer reviews completed by at least 2 team members
- [ ] Automated tests passing (unit + integration)
- [ ] Security review approved (if applicable)
- [ ] Documentation updated
- [ ] Team consensus vote: "Ready for submission" (majority required)
```

### Conflict Resolution

**1. Technical Disagreements**
- Structured debate with evidence presentation
- Prototype/spike development for comparison
- Expert agent consultation by posting to ...
- Escalation to WHOOSH Admin User (human) for tie-breaking

**2. Resource Conflicts**
- Workload re-balancing among team members
- Additional agent recruitment if needed
- Scope reduction with consensus approval and Issue lodgement

**3. Quality Disputes**
- Independent review by WHOOSH Admin User (human)
- Automated quality metric evaluation
- Compromise solution development
- Innovation agent inclusion to team

## CHORUS Integration

### UCXL-based Messaging Address Structure**

eg. For the following address:

ucxl://any:role@project:task/#/

**@project:task** *is* the Team ID.

This means any inter-agent discussions published to **@project:task** are seen by those CHORUS team members.

We use the *[antennae protocol]* (for libp2p) to pub / sub messaging between agents by sending the reasoning component to the other members.

So a communications log might look like this...

publish to chat room **@website:architecture-design**

PeerID = D0019:senior-software-architect

```
{
	"channel": "**@website:architecture-design**",
	"from-agentid": "**D0019:senior-software-architect**",
	"reponding-to": "None", 
	"thoughts": "
		<thinking>
		...
		</thinking>"
}
```

So as noted in our system prompts to every agent, between each step we gather any thoughts of our peers.

GET from API endpoint /api/v1/antennae/@website:architecture-design**

## Implementation Phases

### Phase 1: Foundation (WHOOSH Team Composer)
- **LLM-powered task analysis service**
- **Team composition templates and logic**
- **GITEA issue creation with team metadata**
- **Basic team formation workflows**

### Phase 2: Agent Enhancement (CHORUS Self-Organization)
- **Agent capability self-assessment systems**
- **GITEA monitoring and team application logic**
- **Autonomous team joining decision algorithms**
- **Agent-to-agent communication protocols**

### Phase 3: Collaboration Infrastructure (P2P Channels)
- **Team-specific communication channel creation**
- **Message routing and topic organization**
- **Real-time collaboration tools integration**
- **Communication archival for SLURP submission**

### Phase 4: Consensus Systems (Democratic Decision Making)
- **Voting mechanisms and quorum rules**
- **Quality gate automation and verification**
- **Conflict resolution procedures**
- **Completion criteria validation**

### Phase 5: Integration (SLURP Connectivity)
- **Artifact packaging and submission workflows**
- **UCXL address management and organization**
- **Context preservation and knowledge extraction**
- **Performance analytics and optimization**

## Benefits & Considerations

### Key Benefits

✅ **Fault Tolerance**: No single points of failure - teams operate independently
✅ **Scalability**: Teams form and dissolve dynamically based on demand
✅ **Quality**: Consensus-driven decisions improve deliverable quality
✅ **Knowledge Preservation**: Full context captured for future learning
✅ **Natural Collaboration**: Mirrors effective human team patterns
✅ **Autonomous Operation**: Minimal human intervention required
✅ **Adaptive**: Teams adjust composition based on task evolution
✅ **Observable**: Full transparency through GITEA and P2P channels

### Considerations & Challenges

⚠️ **Initial Complexity**: Sophisticated system requiring careful implementation
⚠️ **Coordination Overhead**: Team formation and consensus processes take time
⚠️ **Agent Training**: CHORUS agents need enhanced self-awareness capabilities
⚠️ **Network Dependencies**: P2P channels require reliable connectivity
⚠️ **Quality Variance**: Team effectiveness may vary based on composition
⚠️ **Resource Competition**: Popular agents may become bottlenecks
⚠️ **Conflict Resolution**: Complex disputes may require escalation mechanisms

### Success Metrics

**Team Formation Efficiency:**
- Time from task request to team formation
- Percentage of teams that form successfully
- Quality of initial team composition decisions

**Collaboration Effectiveness:**
- Team productivity metrics (velocity, quality)
- Communication frequency and engagement
- Consensus achievement rates

**Deliverable Quality:**
- Automated quality metrics (test coverage, security scores)
- Peer review feedback scores
- Stakeholder satisfaction ratings

**System Resilience:**
- Team reformation after agent failures
- Graceful degradation under load
- Recovery from network partitions

**Knowledge Accumulation:**
- Reuse of solutions and patterns
- Agent skill development over time
- Continuous improvement in team formation

## Future Evolution

### Advanced Capabilities
- **Cross-Team Coordination**: Teams collaborating on larger initiatives
- **Agent Specialization**: Agents developing deep expertise in specific domains
- **Dynamic Reconfiguration**: Teams adapting composition mid-task
- **Predictive Formation**: AI predicting optimal teams before task assignment
- **Quality Prediction**: Estimating deliverable quality during team formation

### Integration Opportunities
- **External Stakeholders**: Human team members or external AI services
- **Compliance Integration**: Automated regulatory and policy compliance
- **Performance Optimization**: ML-driven team composition optimization
- **Resource Management**: Intelligent compute and storage allocation
- **Governance**: Auditable decision trails and accountability mechanisms

This evolution represents a fundamental shift toward truly autonomous AI development capabilities that augment and eventually potentially replace traditional software development team structures, while maintaining the collaborative, consensus-driven decision-making that ensures high-quality outcomes.

---

# 🏗️ Technical Architecture

## Current Implementation Architecture

WHOOSH is currently implemented as a specialized council formation system integrated into the existing CHORUS stack, with clear separation between detection, composition, deployment, and monitoring concerns.

### High-Level System Flow
```mermaid
graph TB
    subgraph "Gitea Repository Monitoring"
        REPO[Repository] --> ISSUE[Design Brief Issue]
        ISSUE --> LABEL[chorus-entrypoint]
    end
    
    subgraph "WHOOSH Council Formation"
        LABEL --> MONITOR[WHOOSH Monitor]
        MONITOR --> DETECT[Issue Detection]
        DETECT --> COMPOSE[Council Composition]
        COMPOSE --> ROLES[human-roles.yaml]
    end
    
    subgraph "CHORUS Deployment"
        COMPOSE --> DEPLOY[Docker Swarm Deploy]
        DEPLOY --> CHORUS1[CHORUS Agent 1]
        DEPLOY --> CHORUS2[CHORUS Agent 2] 
        DEPLOY --> CHORUS3[CHORUS Agent N]
    end
    
    subgraph "P2P Collaboration"
        CHORUS1 --> P2P[P2P Network]
        CHORUS2 --> P2P
        CHORUS3 --> P2P
        P2P --> ARTIFACTS[Council Artifacts]
    end
    
    subgraph "Persistence Layer"
        DETECT --> PGDB[(PostgreSQL)]
        ARTIFACTS --> PGDB
        PGDB --> COUNCILS[Councils Table]
        PGDB --> AGENTS[Council Agents Table] 
        PGDB --> OUTPUTS[Council Artifacts Table]
    end
```

## Ecosystem Integration Points

### BZZZ Task Management Integration
**Current**: Council artifacts provide structured input for BZZZ task creation
- Council deliverables (manifests, DRs, scaffold plans) inform task breakdown structure
- Project context and constraints flow from council decisions to task specifications
- Council role recommendations influence team composition for ongoing development

**Future**: Direct handoff mechanisms between councils and BZZZ teams
- Automatic task generation based on scaffold plans
- Agent transition from council roles to development team roles
- Progress tracking continuity from kickoff through delivery

### SLURP Knowledge Integration
**Current**: Council communications and artifacts preserved via UCXL addressing
- All HMMM protocol messages stored with proper addressing for future reference
- Decision rationale and evidence captured in structured format
- Artifacts tagged with council ID and role attribution

**Future**: Enhanced knowledge graph integration
- Automated DR generation from council consensus decisions
- Cross-project pattern recognition and reuse recommendations
- Council effectiveness analytics based on project outcomes

### CHORUS Agent Ecosystem
**Current**: CHORUS agents configured with council-specific roles and context
- Role identifiers passed via environment variables from human-roles.yaml
- Design Brief content provided as task context
- P2P network access for inter-council communication

**Future**: Enhanced agent capabilities for team transitions
- Agent memory persistence across council and team phases
- Specialized council expertise development over time
- Cross-council knowledge sharing and best practice propagation

## Institutional Quality Gates

- **Provenance present:** artifacts reference UCXL addresses and cite prior DRs.
- **Secrets clean:** SHHH pass on channel logs and artifacts.
- **Temporal pin:** decisions pin the **addressed** time slice (`~~/`, `#/`) used.

## Council Formation → SLURP Integration

> WHOOSH composes councils → HMMM captures structured reasoning → **SLURP** ingests and packages kickoff artifacts for DR publication and future project reference.

### Current Artifact Flow
```
Design Brief Detection → Council Formation → HMMM Reasoning → Artifact Production → SLURP Ingestion
                                                                              ↓
                                                                    UCXL-addressed storage
                                                                              ↓  
                                                                    Future project reference
```

## Current Services Architecture

### WHOOSH Council Formation Stack
```mermaid
graph TB
    subgraph "CHORUS Unified Stack"
        subgraph "Frontend Layer"
            UI[WHOOSH Dashboard]
            WS[WebSocket Council Updates]
            API[Council API]
        end
        
        subgraph "WHOOSH Services"
            MONITOR[Repository Monitor]
            COMPOSER[Council Composer]
            DEPLOYER[Agent Deployer]
            TRACKER[Progress Tracker]
        end
        
        subgraph "Data Layer"
            POSTGRES[(PostgreSQL)]
            COUNCILS[Councils Table]
            AGENTS[Council Agents Table]
            ARTIFACTS[Artifacts Table]
        end
        
        subgraph "CHORUS Agent Network"
            LEAD[Lead Design Director]
            ARCH[Senior Software Architect]
            SEC[Security Expert]
            DB[Database Engineer]
            STRAT[Marketing Strategist]
        end
    end
    
    subgraph "External Integrations"
        GITEA[Gitea Repository]
        DOCKER[Docker Swarm]
        P2P[P2P Network]
        SLURP[SLURP Knowledge Store]
    end
    
    GITEA --> MONITOR
    MONITOR --> COMPOSER
    COMPOSER --> DEPLOYER
    DEPLOYER --> DOCKER
    
    DOCKER --> LEAD
    DOCKER --> ARCH
    DOCKER --> SEC
    DOCKER --> DB
    DOCKER --> STRAT
    
    LEAD --> P2P
    ARCH --> P2P
    SEC --> P2P
    DB --> P2P
    STRAT --> P2P
    
    P2P --> ARTIFACTS
    ARTIFACTS --> SLURP
    
    MONITOR --> POSTGRES
    COMPOSER --> POSTGRES
    TRACKER --> POSTGRES
```

### Future: Full Autonomous Team Architecture
```mermaid
graph TB
    subgraph "Enhanced WHOOSH Platform"
        subgraph "Frontend Layer"
            DASH[Team Dashboard]
            METRICS[Analytics UI]
            CONTROL[Control Panel]
        end
        
        subgraph "Core Services"
            TEAM_COMPOSER[Team Composer]
            AGENT_MANAGER[Agent Manager]
            WORKFLOW_ENGINE[Workflow Engine]
            CONSENSUS[Consensus Engine]
        end
        
        subgraph "Intelligence Layer"
            CAPABILITY_MATCHER[Capability Matcher]
            PERFORMANCE_ANALYZER[Performance Analyzer]
            PREDICTOR[Team Success Predictor]
        end
    end
    
    subgraph "Autonomous Agent Ecosystem"
        SELF_ORG[Self-Organizing Agents]
        SPECIALIST[Domain Specialists]
        GENERALIST[Generalist Agents]
    end
```

## Component Specifications

### Current Implementation Components

#### 🗺️ Repository Monitor

**Purpose**: Continuously monitors Gitea repositories for Design Brief issues that trigger council formation.

**Key Responsibilities**:
- Webhook-based repository event processing
- Design Brief issue detection (`chorus-entrypoint` labels)
- Repository sync status management (initial vs incremental)
- Issue content extraction and context building
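
The label-based detection step can be sketched as follows. This is a minimal illustration, not WHOOSH's actual handler: the function name `is_design_brief` is hypothetical, and the payload shape (an `issue` object with a `state` field and a `labels` array of objects carrying a `name`) is an assumption about the webhook body.

```python
from typing import Any, Dict

ENTRYPOINT_LABEL = "chorus-entrypoint"  # label named in this document


def is_design_brief(payload: Dict[str, Any]) -> bool:
    """Return True if the payload describes an open issue with the entrypoint label.

    Assumed payload shape: {"issue": {"state": ..., "labels": [{"name": ...}]}}.
    """
    issue = payload.get("issue") or {}
    if issue.get("state", "open") != "open":
        return False
    labels = issue.get("labels") or []
    return any(label.get("name") == ENTRYPOINT_LABEL for label in labels)


# Hypothetical webhook body for a newly opened Design Brief
example = {
    "action": "opened",
    "issue": {
        "title": "Design Brief: marketplace",
        "state": "open",
        "labels": [{"name": "chorus-entrypoint"}],
    },
}
print(is_design_brief(example))
```

Issues that pass this check would then go through content extraction and context building before council formation.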

**Current API Endpoints**:
```bash
GET    /api/repositories          # List monitored repositories
POST   /api/repositories          # Add repository to monitoring
GET    /api/repositories/{id}/issues  # Get repository issues
POST   /webhooks/gitea            # Gitea webhook endpoint
```

**Database Schema (Current)**:
```sql
-- Repository monitoring
CREATE TABLE repositories (
    id UUID PRIMARY KEY,
    full_name VARCHAR(255) NOT NULL,
    gitea_id INTEGER NOT NULL,
    sync_status VARCHAR(50) DEFAULT 'pending',
    last_issue_sync TIMESTAMP,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Design Brief issues that trigger councils
CREATE TABLE issues (
    id UUID PRIMARY KEY,
    repository_id UUID REFERENCES repositories(id),
    gitea_id INTEGER NOT NULL,
    title VARCHAR(255) NOT NULL,
    body TEXT,
    labels JSONB,
    state VARCHAR(20) DEFAULT 'open',
    created_at TIMESTAMP DEFAULT NOW()
);
```

#### 🏢 Council Composer

**Purpose**: Analyzes Design Briefs and determines optimal council composition based on project requirements.

**Key Responsibilities**:
- Design Brief content analysis
- Role mapping from human-roles.yaml
- Council size and composition optimization
- Resource availability checking

**Current Implementation**:
```go
// Council composition logic
func ComposeCouncil(designBrief DesignBrief) (*CouncilComposition, error) {
    projectType := analyzeProjectType(designBrief.Content)
    requiredDomains := extractDomains(designBrief.Content)
    
    baseRoles := getBaseRolesForType(projectType)
    additionalRoles := getAdditionalRoles(requiredDomains)
    
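    // Cap baseRoles' capacity so append copies into a new array instead of
    // writing into one that other callers of getBaseRolesForType may share.
    roles := append(baseRoles[:len(baseRoles):len(baseRoles)], additionalRoles...)

    return &CouncilComposition{
        Roles:          roles,
        Size:           len(roles),
        ProjectContext: designBrief.Content,
    }, nil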
}
```

#### 🚀 Agent Deployer

**Purpose**: Deploys CHORUS agents via Docker Swarm with council-specific configuration.

**Key Responsibilities**:
- Docker Swarm service creation and management
- Agent environment variable configuration
- P2P network setup for council communication
- Service health monitoring and recovery

**Current Deployment Logic**:
```go
// DeployCouncilAgent creates a Docker Swarm service for one council role.
// The third parameter is named taskContext so it does not shadow the
// standard library context package used in the ServiceCreate call.
func DeployCouncilAgent(role string, councilID string, taskContext string) error {
    serviceName := fmt.Sprintf("council-%s-%s", councilID, role)
    
    serviceSpec := swarm.ServiceSpec{
        Annotations: swarm.Annotations{
            Name: serviceName,
        },
        TaskTemplate: swarm.TaskSpec{
            ContainerSpec: &swarm.ContainerSpec{
                Image: "anthonyrawlins/chorus:latest",
                Env: []string{
                    fmt.Sprintf("CHORUS_ROLE=%s", role),
                    fmt.Sprintf("CHORUS_TASK_CONTEXT=%s", taskContext),
                    fmt.Sprintf("P2P_NETWORK=council-%s", councilID),
                },
                Mounts: []mount.Mount{{
                    Type:   mount.TypeBind,
                    Source: "/rust/containers/WHOOSH/prompts",
                    Target: "/app/prompts",
                }},
            },
        },
    }
    
    // ServiceCreate returns (response, error); only the error is propagated.
    _, err := dockerClient.ServiceCreate(context.Background(), serviceSpec, types.ServiceCreateOptions{})
    return err
}
```

#### 📊 Progress Tracker

**Purpose**: Monitors council progress and artifact production throughout the kickoff process.

**Key Responsibilities**:
- Council agent deployment status tracking
- Artifact production monitoring
- Decision consensus tracking
- Council completion detection
- Error handling and recovery coordination
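
Council completion detection could look roughly like the sketch below. The completion rule is an assumption made for illustration (no agent has failed, and every required artifact type has an approved artifact). The status values `"failed"` and `"approved"` are hypothetical. The artifact type names `kickoff_manifest` and `seminal_dr` are taken from the schema comments in this document.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class AgentStatus:
    role_name: str
    status: str  # "pending" is the schema default; "running"/"failed" are assumed values


@dataclass
class Artifact:
    artifact_type: str
    status: str  # "draft" is the schema default; "approved" is an assumed value


def council_complete(
    agents: List[AgentStatus],
    artifacts: List[Artifact],
    required_types: Iterable[str],
) -> bool:
    """Assumed rule: no failed agents and every required artifact type approved."""
    if any(agent.status == "failed" for agent in agents):
        return False
    approved = {a.artifact_type for a in artifacts if a.status == "approved"}
    return set(required_types) <= approved


agents = [AgentStatus("senior-software-architect", "running")]
artifacts = [
    Artifact("kickoff_manifest", "approved"),
    Artifact("seminal_dr", "draft"),
]
print(council_complete(agents, artifacts, ["kickoff_manifest", "seminal_dr"]))
```

In this example the council is not yet complete, because the `seminal_dr` artifact is still a draft.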

**Database Tracking (Current)**:
```sql
-- Council progress tracking
CREATE TABLE council_agents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    council_id UUID REFERENCES councils(id),
    role_name VARCHAR(100) NOT NULL,
    service_id VARCHAR(255), -- Docker service ID
    status VARCHAR(50) DEFAULT 'pending',
    deployed_at TIMESTAMP,
    UNIQUE(council_id, role_name)
);

CREATE TABLE council_artifacts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    council_id UUID REFERENCES councils(id),
    artifact_type VARCHAR(50) NOT NULL,
    content TEXT,
    produced_by VARCHAR(255),
    status VARCHAR(50) DEFAULT 'draft',
    produced_at TIMESTAMP DEFAULT NOW()
);
```

### Future Components

#### 🤖 Enhanced Agent Manager (Planned)

**Purpose**: Manages autonomous agent capabilities, performance, and self-organization for ongoing teams.

**Future Responsibilities**:
- Agent capability self-assessment
- Dynamic team joining algorithms
- Performance tracking and optimization
- Cross-team agent coordination
- Predictive team formation

**Future Agent Self-Registration Protocol**:
```json
{
    "agent_id": "chorus-agent-001",
    "name": "Senior Software Architect",
    "current_role": "senior-software-architect",
    "specializations": ["microservices", "system-design", "scalability"],
    "council_history": [
        {"council_id": "marketplace-kickoff", "role": "architect", "rating": 4.8},
        {"council_id": "analytics-platform", "role": "architect", "rating": 4.9}
    ],
    "capabilities": {
        "architecture_design": 0.95,
        "technology_selection": 0.90,
        "team_leadership": 0.85,
        "consensus_building": 0.88
    },
    "availability": {
        "current_councils": 1,
        "max_concurrent": 3,
        "preferred_domains": ["fintech", "ecommerce", "enterprise"]
    },
    "learning_metrics": {
        "councils_completed": 47,
        "avg_artifact_quality": 4.7,
        "consensus_success_rate": 0.92,
        "stakeholder_satisfaction": 4.8
    }
}
```
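
One way a planned Agent Manager might use a registration record like this is to filter agents by availability and rank them by a capability score. The sketch below is only an illustration under assumptions: `has_capacity`, `best_agent`, the `min_score` threshold, and the second agent record are all hypothetical and not part of any existing WHOOSH API.

```python
from typing import Any, Dict, List, Optional


def has_capacity(agent: Dict[str, Any]) -> bool:
    """An agent can join another council if it is below its concurrency limit."""
    availability = agent["availability"]
    return availability["current_councils"] < availability["max_concurrent"]


def best_agent(
    agents: List[Dict[str, Any]], capability: str, min_score: float = 0.8
) -> Optional[str]:
    """Return the agent_id with the highest score for `capability`, or None."""
    candidates = [
        a
        for a in agents
        if has_capacity(a) and a["capabilities"].get(capability, 0.0) >= min_score
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda a: a["capabilities"][capability])["agent_id"]


agents = [
    {
        "agent_id": "chorus-agent-001",
        "capabilities": {"architecture_design": 0.95},
        "availability": {"current_councils": 1, "max_concurrent": 3},
    },
    {
        # Hypothetical second agent: higher score, but already at its limit
        "agent_id": "chorus-agent-002",
        "capabilities": {"architecture_design": 0.99},
        "availability": {"current_councils": 3, "max_concurrent": 3},
    },
]
print(best_agent(agents, "architecture_design"))
```

The second agent scores higher but is skipped because it has no free capacity, so the first agent is selected.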

**Current Council Health Monitoring**:
```go
// Council agent health check
type CouncilAgentHealth struct {
    AgentID        string    `json:"agent_id"`
    CouncilID      string    `json:"council_id"`
    Role           string    `json:"role"`
    ServiceID      string    `json:"service_id"`
    Status         string    `json:"status"`
    LastSeen       time.Time `json:"last_seen"`
    ArtifactsCount int       `json:"artifacts_produced"`
    P2PConnected   bool      `json:"p2p_connected"`
    ErrorMessage   *string   `json:"error_message,omitempty"`
}

// Future: Enhanced agent health with self-awareness
type AutonomousAgentHealth struct {
    CouncilAgentHealth
    SelfAssessment struct {
        TaskFit        float64 `json:"task_fit_confidence"`
        Workload       float64 `json:"current_workload_percent"`
        Collaboration  float64 `json:"team_collaboration_score"`
        LearningRate   float64 `json:"recent_learning_velocity"`
    } `json:"self_assessment"`
}
```


## Current Data Architecture

### 🗄️ Council Database Schema (Implemented)

**Current Tables**:
```sql
-- Council management (from migrations/005_add_council_tables.up.sql)
CREATE TABLE councils (
    id UUID PRIMARY KEY,
    project_name VARCHAR(255) NOT NULL,
    repository VARCHAR(500) NOT NULL,
    project_brief TEXT NOT NULL,
    constraints TEXT,
    tech_limits TEXT,
    compliance_notes TEXT,
    targets TEXT,
    status VARCHAR(50) NOT NULL DEFAULT 'forming',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    task_id UUID REFERENCES tasks(id)
);

-- Council agent tracking
CREATE TABLE council_agents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    council_id UUID NOT NULL REFERENCES councils(id),
    role_name VARCHAR(100) NOT NULL,
    agent_name VARCHAR(255) NOT NULL,
    deployed BOOLEAN NOT NULL DEFAULT false,
    service_id VARCHAR(255), -- Docker service ID
    status VARCHAR(50) NOT NULL DEFAULT 'pending',
    UNIQUE(council_id, role_name)
);

-- Council artifact production
CREATE TABLE council_artifacts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    council_id UUID NOT NULL REFERENCES councils(id),
    artifact_type VARCHAR(50) NOT NULL, -- kickoff_manifest, seminal_dr, etc.
    content TEXT,
    produced_by VARCHAR(255),
    status VARCHAR(50) NOT NULL DEFAULT 'draft',
    produced_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

-- Council decision tracking
CREATE TABLE council_decisions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    council_id UUID NOT NULL REFERENCES councils(id),
    decision_type VARCHAR(50) NOT NULL,
    decision_title VARCHAR(255) NOT NULL,
    options JSONB,
    chosen_option JSONB,
    votes JSONB,
    decided_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
```
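
The schema does not specify the layout of the `votes` JSONB column or how a decision is reached. As a sketch only, assuming `votes` maps each role name to the option it chose and that a strict majority is required, a tally could look like this (`tally` and the option `modular-monolith` are hypothetical):

```python
from collections import Counter
from typing import Dict, Optional


def tally(votes: Dict[str, str]) -> Optional[str]:
    """Return the option chosen by a strict majority of voters, or None."""
    if not votes:
        return None
    option, count = Counter(votes.values()).most_common(1)[0]
    # A strict majority means more than half of all votes cast
    return option if count * 2 > len(votes) else None


votes = {
    "senior-software-architect": "microservices-with-api-gateway",
    "database-engineer": "microservices-with-api-gateway",
    "security-expert": "modular-monolith",
}
print(tally(votes))
```

The winning option would be stored in `chosen_option`. A tie returns `None`, leaving the decision open.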

**Performance Indexes**:
```sql
-- Optimized for council operations
CREATE INDEX idx_councils_status ON councils(status);
CREATE INDEX idx_councils_repository ON councils(repository);
CREATE INDEX idx_council_agents_council_id ON council_agents(council_id);
CREATE INDEX idx_council_agents_deployed ON council_agents(deployed);
CREATE INDEX idx_council_artifacts_type ON council_artifacts(artifact_type);
```

### Future: Enhanced Team Database Schema

**Planned Extensions for Autonomous Teams**:
```sql
-- Team capability tracking (future)
CREATE TABLE agent_capabilities (
    agent_id UUID,
    capability_name VARCHAR(100),
    proficiency_score FLOAT,
    confidence_level FLOAT,
    last_updated TIMESTAMP
);

-- Team performance metrics (future)
CREATE TABLE team_performance (
    team_id UUID,
    project_id UUID,
    success_metrics JSONB,
    completion_time INTERVAL,
    quality_score FLOAT,
    stakeholder_satisfaction FLOAT
);
```

## Council Communication Architecture

### Current: P2P Council Network

**HMMM Protocol Integration**:
```typescript
interface CouncilMessage {
    messageId: string;
    councilId: string;
    fromRole: string;
    toRoles: string[]; // Broadcast or targeted
    messageType: 'reasoning' | 'decision' | 'artifact' | 'consensus';
    content: {
        thinking: string;     // HMMM reasoning chain
        evidence: any[];      // Supporting data
        recommendation?: any; // Proposed action/decision
        confidence: number;   // 0.0 to 1.0
    };
    ucxlAddress: string;     // UCXL addressing
    timestamp: string;
}

// Example council communication
const exampleMessage: CouncilMessage = {
    "messageId": "msg_council_001",
    "councilId": "marketplace-kickoff",
    "fromRole": "senior-software-architect",
    "toRoles": ["lead-design-director", "database-engineer"],
    "messageType": "reasoning",
    "content": {
        "thinking": "Given the multi-vendor requirements, microservices architecture provides necessary isolation...",
        "evidence": ["scalability-requirements", "vendor-isolation-needs"],
        "recommendation": "microservices-with-api-gateway",
        "confidence": 0.87
    },
    "ucxlAddress": "ucxl://senior-software-architect@marketplace:kickoff#architecture/",
    "timestamp": "2025-01-12T10:30:00Z"
}
```
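
A receiver may want to check an incoming message against the `CouncilMessage` shape before acting on it. The following is a minimal sketch of such a check, not part of the HMMM protocol itself. The function name and error strings are hypothetical, and only the field names and value ranges documented above are enforced.

```python
from typing import Any, Dict, List

MESSAGE_TYPES = {"reasoning", "decision", "artifact", "consensus"}
REQUIRED_FIELDS = (
    "messageId", "councilId", "fromRole", "toRoles",
    "messageType", "content", "ucxlAddress", "timestamp",
)


def validate_council_message(msg: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the message looks valid."""
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in msg]
    if errors:
        return errors
    if msg["messageType"] not in MESSAGE_TYPES:
        errors.append(f"unknown messageType: {msg['messageType']}")
    confidence = msg["content"].get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0.0 <= confidence <= 1.0:
        errors.append("content.confidence must be a number in [0.0, 1.0]")
    return errors


message = {
    "messageId": "msg_council_001",
    "councilId": "marketplace-kickoff",
    "fromRole": "senior-software-architect",
    "toRoles": ["lead-design-director", "database-engineer"],
    "messageType": "reasoning",
    "content": {"thinking": "...", "evidence": [], "confidence": 0.87},
    "ucxlAddress": "ucxl://senior-software-architect@marketplace:kickoff#architecture/",
    "timestamp": "2025-01-12T10:30:00Z",
}
print(validate_council_message(message))
```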

### Future: Enhanced Team Communication

**Autonomous Team Coordination**:
```typescript
interface TeamCoordinationMessage {
    teamId: string;
    phase: 'planning' | 'execution' | 'review' | 'integration';
    priority: 'low' | 'medium' | 'high' | 'critical';
    requiresConsensus: boolean;
    votingDeadline?: string;
    escalationPath?: string[];
}
```

### 📡 Event Streaming

**Event Bus Architecture**:
```python
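from dataclasses import dataclass
from datetime import datetime
from typing import Any, Callable, Dict, Optional

@dataclass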
class WHOOSHEvent:
    id: str
    type: str
    source: str
    timestamp: datetime
    data: Dict[str, Any]
    correlation_id: Optional[str] = None

class EventBus:
    async def publish(self, event: WHOOSHEvent) -> None:
        """Publish event to all subscribers"""
        
    async def subscribe(self, event_type: str, handler: Callable) -> str:
        """Subscribe to specific event types"""
        
    async def unsubscribe(self, subscription_id: str) -> None:
        """Remove subscription"""
```
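
The interface above leaves the delivery mechanism open. As a sketch only, an in-process implementation could keep subscriptions in a dictionary and await each matching handler on publish. `InMemoryEventBus` is a hypothetical name, handlers are assumed to be async callables, and a production bus would likely be backed by a broker rather than process memory.

```python
import asyncio
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Awaitable, Callable, Dict, List, Optional, Tuple


@dataclass
class WHOOSHEvent:
    id: str
    type: str
    source: str
    timestamp: datetime
    data: Dict[str, Any]
    correlation_id: Optional[str] = None


Handler = Callable[[WHOOSHEvent], Awaitable[None]]


class InMemoryEventBus:
    def __init__(self) -> None:
        # subscription_id -> (event_type, handler)
        self._subs: Dict[str, Tuple[str, Handler]] = {}

    async def publish(self, event: WHOOSHEvent) -> None:
        """Deliver the event to every handler subscribed to its type."""
        handlers = [h for etype, h in self._subs.values() if etype == event.type]
        if handlers:
            await asyncio.gather(*(h(event) for h in handlers))

    async def subscribe(self, event_type: str, handler: Handler) -> str:
        subscription_id = uuid.uuid4().hex
        self._subs[subscription_id] = (event_type, handler)
        return subscription_id

    async def unsubscribe(self, subscription_id: str) -> None:
        self._subs.pop(subscription_id, None)


async def demo() -> List[str]:
    bus = InMemoryEventBus()
    seen: List[str] = []

    async def on_created(event: WHOOSHEvent) -> None:
        seen.append(event.id)

    def make(event_id: str) -> WHOOSHEvent:
        return WHOOSHEvent(event_id, "task.created", "demo", datetime.now(timezone.utc), {})

    sub_id = await bus.subscribe("task.created", on_created)
    await bus.publish(make("evt-1"))   # delivered
    await bus.unsubscribe(sub_id)
    await bus.publish(make("evt-2"))   # no subscribers left, so dropped
    return seen


seen_ids = asyncio.run(demo())
print(seen_ids)
```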

**Event Types**:
```python
# Agent Events
AGENT_REGISTERED = "agent.registered"
AGENT_STATUS_CHANGED = "agent.status_changed"
AGENT_PERFORMANCE_UPDATE = "agent.performance_update"

# Task Events
TASK_CREATED = "task.created"
TASK_ASSIGNED = "task.assigned"
TASK_STARTED = "task.started"
TASK_COMPLETED = "task.completed"
TASK_FAILED = "task.failed"

# Workflow Events
WORKFLOW_EXECUTION_STARTED = "workflow.execution_started"
WORKFLOW_NODE_COMPLETED = "workflow.node_completed"
WORKFLOW_EXECUTION_COMPLETED = "workflow.execution_completed"

# System Events
SYSTEM_ALERT = "system.alert"
SYSTEM_MAINTENANCE = "system.maintenance"
```

## Security Architecture

### 🔒 Authentication & Authorization

**JWT Token Structure**:
```json
{
    "sub": "user_id",
    "iat": 1625097600,
    "exp": 1625184000,
    "roles": ["admin", "developer"],
    "permissions": [
        "workflows.create",
        "agents.manage",
        "executions.view"
    ],
    "tenant": "organization_id"
}
```

**Permission Matrix**:
```yaml
roles:
  admin:
    permissions: ["*"]
    description: "Full system access"
    
  developer:
    permissions:
      - "workflows.*"
      - "executions.*"
      - "agents.view"
      - "tasks.create"
    description: "Development and execution access"
    
  viewer:
    permissions:
      - "workflows.view"
      - "executions.view"
      - "agents.view"
    description: "Read-only access"
```
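
The matrix uses wildcard entries such as `workflows.*` and `*`. One plausible way to evaluate them, shown here as a sketch, is shell-style pattern matching. The glob semantics and the `is_allowed` helper are assumptions, since the document does not define how wildcards are resolved.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Mirrors the permission matrix above
ROLE_PERMISSIONS: Dict[str, List[str]] = {
    "admin": ["*"],
    "developer": ["workflows.*", "executions.*", "agents.view", "tasks.create"],
    "viewer": ["workflows.view", "executions.view", "agents.view"],
}


def is_allowed(roles: Iterable[str], permission: str) -> bool:
    """True if any role grants a pattern that matches the requested permission."""
    return any(
        fnmatchcase(permission, pattern)
        for role in roles
        for pattern in ROLE_PERMISSIONS.get(role, [])
    )


print(is_allowed(["developer"], "workflows.create"))
print(is_allowed(["developer"], "agents.manage"))
```

Roles not listed in the matrix grant nothing, so unknown roles are denied by default.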

### 🛡️ API Security

**Rate Limiting**:
```python
# Rate limits by endpoint and user role
RATE_LIMITS = {
    "api.workflows.create": {"admin": 100, "developer": 50, "viewer": 0},
    "api.executions.start": {"admin": 200, "developer": 100, "viewer": 0},
    "api.agents.register": {"admin": 10, "developer": 0, "viewer": 0},
}
```
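
The table does not state the time window the limits apply to. The sketch below assumes requests per 60-second fixed window, counted per user, and treats a limit of `0` (or an unlisted endpoint or role) as "always denied". `FixedWindowLimiter` and `WINDOW_SECONDS` are hypothetical names, and counters for old windows are never evicted in this simplified version.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple

RATE_LIMITS = {
    "api.workflows.create": {"admin": 100, "developer": 50, "viewer": 0},
    "api.executions.start": {"admin": 200, "developer": 100, "viewer": 0},
    "api.agents.register": {"admin": 10, "developer": 0, "viewer": 0},
}
WINDOW_SECONDS = 60.0  # assumption: the document does not specify the window


class FixedWindowLimiter:
    def __init__(
        self,
        limits: Dict[str, Dict[str, int]],
        window: float = WINDOW_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limits = limits
        self._window = window
        self._clock = clock
        # (endpoint, user_id, window index) -> requests seen in that window
        self._counts: Dict[Tuple[str, str, int], int] = defaultdict(int)

    def allow(self, endpoint: str, role: str, user_id: str) -> bool:
        limit = self._limits.get(endpoint, {}).get(role, 0)
        bucket = int(self._clock() // self._window)
        key = (endpoint, user_id, bucket)
        if self._counts[key] >= limit:
            return False
        self._counts[key] += 1
        return True


limiter = FixedWindowLimiter(RATE_LIMITS, clock=lambda: 0.0)  # frozen clock for the demo
results = [limiter.allow("api.agents.register", "admin", "u1") for _ in range(11)]
print(results.count(True))
```

With the clock frozen inside one window, the first ten registrations by an admin succeed and the eleventh is rejected.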

**Input Validation**:
```python
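from typing import Any, Dict, Optional

# Note: `validator` is the pydantic v1-style decorator
from pydantic import BaseModel, validator

class WorkflowCreateRequest(BaseModel):
    name: str
    description: Optional[str] = None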
    n8n_data: Dict[str, Any]
    
    @validator('name')
    def validate_name(cls, v):
        if len(v) < 3 or len(v) > 255:
            raise ValueError('Name must be 3-255 characters')
        return v
    
    @validator('n8n_data')
    def validate_n8n_data(cls, v):
        required_fields = ['nodes', 'connections']
        if not all(field in v for field in required_fields):
            raise ValueError('Invalid n8n workflow format')
        return v
```

## Deployment Architecture

### 🐳 Container Strategy

**Docker Compose Structure**:
```yaml
version: '3.8'
services:
  whoosh-coordinator:
    image: whoosh/coordinator:latest
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/whoosh
      - REDIS_URL=redis://redis:6379
    depends_on: [postgres, redis]
    
  whoosh-frontend:
    image: whoosh/frontend:latest
    environment:
      - API_URL=http://whoosh-coordinator:8000
    depends_on: [whoosh-coordinator]
    
  postgres:
    image: postgres:15
    environment:
      - POSTGRES_DB=whoosh
      - POSTGRES_USER=whoosh
      - POSTGRES_PASSWORD=${DB_PASSWORD}
    volumes:
      - postgres_data:/var/lib/postgresql/data
      
  redis:
    image: redis:7-alpine
    volumes:
      - redis_data:/data
      
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml
      
  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_SECURITY_ADMIN_PASSWORD=${GRAFANA_PASSWORD}
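    volumes:
      - grafana_data:/var/lib/grafana

# Named volumes referenced by the services above must be declared at the top level
volumes:
  postgres_data:
  redis_data:
  grafana_data: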
```

### 🌐 Network Architecture

**Production Network Topology**:
```
Internet
    ↓
[Traefik Load Balancer] (SSL Termination)
    ↓
[tengig Overlay Network]
    ↓
┌─────────────────────────────────────┐
│  WHOOSH Application Services         │
│  ├── Frontend (React)              │
│  ├── Backend API (FastAPI)         │
│  ├── WebSocket Gateway             │
│  └── Task Queue Workers            │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  Data Services                      │
│  ├── PostgreSQL (Primary DB)       │
│  ├── Redis (Cache + Sessions)      │
│  ├── InfluxDB (Metrics)            │
│  └── Prometheus (Monitoring)       │
└─────────────────────────────────────┘
    ↓
┌─────────────────────────────────────┐
│  AI Agent Network (2-node cluster) │
│  ├── WALNUT (192.168.1.27:11434)  │
│  │   └── ROCm (RX 9060 XT)        │
│  └── ACACIA (192.168.1.72:11434)  │
│      └── CUDA (RTX 2080 Super)    │
└─────────────────────────────────────┘
```

## Performance Considerations

### 🚀 Optimization Strategies

**Database Optimization**:
- Connection pooling with asyncpg
- Query optimization with proper indexing
- Time-series data partitioning for metrics
- Read replicas for analytics queries

**Caching Strategy**:
- Redis for session and temporary data
- Application-level caching for expensive computations
- CDN for static assets
- Database query result caching

**Concurrency Management**:
- AsyncIO for I/O-bound operations
- Connection pools for database and HTTP clients
- Semaphores for limiting concurrent agent requests
- Queue-based task processing
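
The semaphore approach to limiting concurrent agent requests can be illustrated with `asyncio.Semaphore`. This is a sketch under assumptions: `call_agent` stands in for a real HTTP request to an agent node, and the limit of 2 is an arbitrary example value.

```python
import asyncio
from typing import List, Tuple


async def call_agent(
    name: str, semaphore: asyncio.Semaphore, active: List[int], peak: List[int]
) -> str:
    """Simulate one agent request while tracking how many run at once."""
    async with semaphore:  # waits here when the limit is already reached
        active[0] += 1
        peak[0] = max(peak[0], active[0])
        await asyncio.sleep(0.01)  # stand-in for the real network call
        active[0] -= 1
    return f"{name}: ok"


async def run_all(names: List[str], limit: int) -> Tuple[List[str], int]:
    semaphore = asyncio.Semaphore(limit)
    active, peak = [0], [0]
    results = await asyncio.gather(
        *(call_agent(n, semaphore, active, peak) for n in names)
    )
    return list(results), peak[0]


results, peak = asyncio.run(run_all(["WALNUT", "ACACIA", "WALNUT", "ACACIA"], limit=2))
print(results, peak)
```

All four requests complete, but at most two are in flight at any moment.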

### 📊 Monitoring & Observability

**Key Metrics**:
```yaml
# Application Metrics
- whoosh_active_agents_total
- whoosh_task_queue_length
- whoosh_workflow_executions_total
- whoosh_api_request_duration_seconds
- whoosh_websocket_connections_active

# Infrastructure Metrics  
- whoosh_database_connections_active
- whoosh_redis_memory_usage_bytes
- whoosh_container_cpu_usage_percent
- whoosh_container_memory_usage_bytes

# Business Metrics
- whoosh_workflows_created_daily
- whoosh_execution_success_rate
- whoosh_agent_utilization_percent
- whoosh_average_task_completion_time
```

**Alerting Rules**:
```yaml
groups:
- name: whoosh.rules
  rules:
  - alert: HighErrorRate
    expr: rate(whoosh_api_errors_total[5m]) > 0.1
    for: 2m
    labels:
      severity: warning
    annotations:
      summary: "High error rate detected"
      
  - alert: AgentDown
    expr: whoosh_agent_health_status == 0
    for: 1m
    labels:
      severity: critical
    annotations:
      summary: "Agent {{ $labels.agent_id }} is down"
```

This architecture is the foundation for the unified WHOOSH platform. It combines practices from our existing distributed AI projects and is designed for scalability, maintainability, and observability.