WHOOSH Backend Architecture Documentation
Version: 0.1.1-debug
Last Updated: October 2025
Status: Beta (MVP + Council Formation)
Table of Contents
- System Overview
- Architecture Patterns
- Core Components
- Database Architecture
- API Layer
- External Service Integrations
- Orchestration & Deployment
- Configuration Management
- Security & Authentication
- Observability
- Development Workflow
System Overview
WHOOSH is an autonomous AI development team orchestration system built in Go. It monitors Gitea repositories for Design Brief issues, forms project kickoff councils, composes teams, and deploys CHORUS AI agents to Docker Swarm for autonomous development work.
Current Status
✅ Working:
- Gitea Design Brief detection + council composition
- Docker Swarm agent deployment with role-specific environment variables
- JWT authentication, rate limiting, OpenTelemetry hooks
- Repository monitoring and issue synchronization
- Team composition with heuristic-based analysis
🚧 Under Construction:
- API persistence (REST handlers return placeholder data while Postgres wiring is finished)
- Analysis ingestion (composer relies on heuristic classification; LLM analysis is logged but unimplemented)
- Deployment telemetry (results aren't persisted yet)
- Autonomous team joining and role balancing
Technology Stack
- Language: Go 1.22+ (toolchain go1.24.5)
- Web Framework: go-chi/chi/v5 (HTTP router)
- Database: PostgreSQL (pgx/v5 driver)
- Container Orchestration: Docker Swarm API
- Migrations: golang-migrate/migrate/v4
- Logging: zerolog (structured logging)
- Tracing: OpenTelemetry + Jaeger
- Authentication: JWT (golang-jwt/jwt/v5)
- External Services: Gitea API, BACKBEAT timing system, N8N workflows
Architecture Patterns
1. Layered Architecture
┌─────────────────────────────────────────┐
│ API Layer (server/) │ HTTP Handlers, Routing, Middleware
├─────────────────────────────────────────┤
│ Business Logic Layer │
│ ┌─────────────┬──────────────────────┐ │
│ │ Composer │ Orchestrator │ │ Team Formation, Agent Deployment
│ ├─────────────┼──────────────────────┤ │
│ │ Monitor │ Council │ │ Repository Sync, Council Formation
│ └─────────────┴──────────────────────┘ │
├─────────────────────────────────────────┤
│ Integration Layer │ Gitea, Docker, BACKBEAT, N8N
├─────────────────────────────────────────┤
│ Data Layer (database/) │ PostgreSQL Connection Pool
└─────────────────────────────────────────┘
2. Service-Oriented Design
Each internal package represents a distinct service with clear responsibilities:
- Composer: Task analysis and team composition
- Orchestrator: Container deployment and scaling
- Monitor: Repository monitoring and issue ingestion
- Council: Project kickoff council formation
- Gitea Client: Gitea API integration
- Agent Registry: Agent lifecycle management
3. Context-Driven Execution
All operations use Go context for:
- Request tracing (OpenTelemetry spans)
- Timeout management
- Graceful cancellation
- Propagation of request-scoped values
Core Components
1. Server (internal/server/)
Responsibilities:
- HTTP server lifecycle management
- Router configuration (chi)
- Middleware setup (CORS, auth, rate limiting, security headers)
- Health check endpoints
- API route registration
Key Files:
server.go: Main server struct, initialization, routing setup
Initialization Flow:
1. Load configuration from environment variables
2. Initialize database connection pool
3. Initialize external service clients (Gitea, Docker)
4. Create business logic services (composer, orchestrator, monitor)
5. Setup router with middleware
6. Register API routes
7. Start background services (monitor, P2P discovery, agent registry)
8. Start HTTP server
API Routes (v1):
- /api/v1/teams - Team management
- /api/v1/tasks - Task ingestion and management
- /api/v1/projects - Project management (Gitea repositories)
- /api/v1/agents - Agent registration and status
- /api/v1/repositories - Repository monitoring configuration
- /api/v1/councils - Council management and artifacts
- /api/v1/assignments - Agent assignment broker (if Docker enabled)
- /api/v1/scaling - Wave-based scaling API (if Docker enabled)
- /api/v1/slurp - SLURP proxy for UCXL content submission
- /api/v1/backbeat - BACKBEAT status monitoring
2. Monitor (internal/monitor/)
Responsibilities:
- Periodic repository synchronization (default: 5 minutes)
- Issue detection and ingestion from Gitea
- Design Brief detection for council formation
- Task creation and updates in database
- Triggering team composition or council formation
Key Features:
- Incremental sync using since parameter (after initial scan)
- Label-based filtering (e.g., bzzz-task, chorus-entrypoint)
- Support for multiple sync states: pending, initial_scan, active, error, disabled
- Automatic transition from initial scan to active when content found
Council Detection Logic:
func isProjectKickoffBrief(issue) bool {
// Must have "chorus-entrypoint" label
// Must have "Design Brief" in title
return hasChorusEntrypoint && containsDesignBrief
}
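The pseudocode above can be fleshed out as follows. This is a sketch under stated assumptions: Issue and Label are simplified, hypothetical stand-ins for the Gitea client types, and case-insensitive matching is assumed rather than confirmed by the source.

```go
package main

import (
	"fmt"
	"strings"
)

// Label and Issue are simplified, hypothetical stand-ins for the Gitea types.
type Label struct{ Name string }

type Issue struct {
	Title  string
	Labels []Label
}

// isProjectKickoffBrief reports whether an issue carries the
// "chorus-entrypoint" label and has "Design Brief" in its title.
// Case-insensitive comparison is an assumption of this sketch.
func isProjectKickoffBrief(issue Issue) bool {
	hasChorusEntrypoint := false
	for _, l := range issue.Labels {
		if strings.EqualFold(l.Name, "chorus-entrypoint") {
			hasChorusEntrypoint = true
			break
		}
	}
	containsDesignBrief := strings.Contains(strings.ToLower(issue.Title), "design brief")
	return hasChorusEntrypoint && containsDesignBrief
}

func main() {
	brief := Issue{Title: "Design Brief: billing service", Labels: []Label{{Name: "chorus-entrypoint"}}}
	task := Issue{Title: "Fix login bug", Labels: []Label{{Name: "bzzz-task"}}}
	fmt.Println(isProjectKickoffBrief(brief), isProjectKickoffBrief(task))
}
```

Both conditions must hold, so a labelled issue without "Design Brief" in the title is not treated as a kickoff brief.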
Sync Flow:
1. Get all monitored repositories (WHERE monitor_issues = true)
2. For each repository:
a. Fetch issues from Gitea API
b. Filter by CHORUS labels if enabled
c. Create or update task records
d. Check for Design Brief issues → trigger council formation
e. Check for bzzz-task issues → trigger team composition
f. Update repository sync timestamps
3. Log sync results and statistics
3. Composer (internal/composer/)
Responsibilities:
- Task classification (feature, bug fix, security, etc.)
- Complexity analysis and risk assessment
- Skill requirement extraction
- Team composition and agent matching
- Team persistence to database
Configuration:
type ComposerConfig struct {
ClassificationModel string // LLM model for classification
SkillAnalysisModel string // LLM model for skill analysis
MatchingModel string // LLM model for team matching
DefaultStrategy string // "minimal_viable"
MinTeamSize int // 1
MaxTeamSize int // 3
SkillMatchThreshold float64 // 0.6
AnalysisTimeoutSecs int // 30-60
FeatureFlags FeatureFlags
}
Feature Flags:
- EnableLLMClassification: Use LLM vs heuristics (default: false)
- EnableLLMSkillAnalysis: Use LLM vs heuristics (default: false)
- EnableLLMTeamMatching: Use LLM vs heuristics (default: false)
- EnableFailsafeFallback: Fallback to heuristics on LLM failure (default: true)
Analysis Pipeline:
TaskAnalysisInput
↓
1. classifyTask() → TaskClassification
- determineTaskType() [heuristic or LLM]
- estimateComplexity()
- identifyDomains()
↓
2. analyzeSkillRequirements() → SkillRequirements
- Map domains to skills
- Determine critical vs desirable
↓
3. getAvailableAgents() → []*Agent
↓
4. composeTeam() → TeamComposition
- selectRequiredRoles()
- matchAgentsToRoles()
- calculateConfidence()
↓
5. CreateTeam() → Team (persisted to DB)
Task Types:
- feature_development
- bug_fix
- refactoring
- security
- integration
- migration
- research
- optimization
- maintenance
4. Council (internal/council/)
Responsibilities:
- Project kickoff council formation
- Core agent selection (Product Manager, Engineering Lead, Quality Lead)
- Optional agent selection (Security, DevOps, UX)
- Council composition persistence
Council Composition:
type CouncilComposition struct {
CouncilID uuid.UUID
ProjectName string
CoreAgents []CouncilAgent // PM, Eng Lead, QA Lead
OptionalAgents []CouncilAgent // Security, DevOps, UX
Strategy string
Status string
}
Council Roles:
- Core Agents (always deployed):
  - Product Manager (PM)
  - Engineering Lead (eng-lead)
  - Quality Lead (qa-lead)
- Optional Agents (deployed based on project needs):
  - Security Lead (sec-lead)
  - DevOps Lead (devops-lead)
  - UX Lead (ux-lead)
5. Orchestrator (internal/orchestrator/)
Responsibilities:
- Docker Swarm service deployment
- Agent container configuration
- Resource allocation (CPU/memory limits)
- Volume mounting and network configuration
- Service scaling and health monitoring
Components:
SwarmManager (swarm_manager.go)
- Docker Swarm API client wrapper
- Service creation, scaling, removal
- Task monitoring and status tracking
Key Methods:
DeployAgent(config *AgentDeploymentConfig) (*swarm.Service, error)
ScaleService(serviceName string, replicas int) error
GetServiceStatus(serviceName string) (*ServiceStatus, error)
RemoveAgent(serviceID string) error
AgentDeployer (agent_deployer.go)
- Team agent deployment orchestration
- Council agent deployment orchestration
- Agent assignment to CHORUS containers
Deployment Flow:
DeploymentRequest
↓
1. For each agent in team/council:
a. selectAgentImage() → CHORUS image
b. buildAgentEnvironment() → env vars
c. buildAgentVolumes() → Docker socket + workspace
d. calculateResources() → CPU/memory limits
e. deploySingleAgent() → Swarm service
↓
2. recordDeployment() → Update database
3. updateTeamDeploymentStatus() → Track overall status
Agent Environment Variables:
CHORUS_AGENT_NAME=<role_name> # Maps to human-roles.yaml
CHORUS_TEAM_ID=<uuid>
CHORUS_TASK_ID=<uuid>
CHORUS_PROJECT=<repository>
CHORUS_TASK_TITLE=<title>
CHORUS_TASK_DESC=<description>
CHORUS_PRIORITY=<priority>
CHORUS_EXTERNAL_URL=<issue_url>
WHOOSH_COORDINATOR=true
WHOOSH_ENDPOINT=http://whoosh:8080
DOCKER_HOST=unix:///var/run/docker.sock
Resource Allocation:
ResourceLimits{
CPULimit: 1000000000, // 1 CPU core
MemoryLimit: 1073741824, // 1 GB RAM
CPURequest: 500000000, // 0.5 CPU core
MemoryRequest: 536870912, // 512 MB RAM
}
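The raw numbers above are nano-CPUs (10^9 per core) and bytes. A small sketch of the unit arithmetic makes the values easier to audit; the helper names coresToNanoCPUs and mibToBytes are hypothetical and only serve this illustration.

```go
package main

import "fmt"

const (
	nanoCPUsPerCore = 1_000_000_000 // 10^9 nano-CPUs equal one CPU core
	bytesPerMiB     = 1 << 20       // 1,048,576 bytes
)

// coresToNanoCPUs converts a fractional core count to nano-CPUs.
func coresToNanoCPUs(cores float64) int64 { return int64(cores * nanoCPUsPerCore) }

// mibToBytes converts mebibytes to bytes.
func mibToBytes(mib int64) int64 { return mib * bytesPerMiB }

func main() {
	// Limits: 1 core, 1024 MiB. Requests: 0.5 core, 512 MiB.
	fmt.Println(coresToNanoCPUs(1), mibToBytes(1024))
	fmt.Println(coresToNanoCPUs(0.5), mibToBytes(512))
}
```

The results match the four literals in the ResourceLimits block: 1000000000, 1073741824, 500000000, and 536870912.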
Scaling System (scaling_*.go)
- Wave-based scaling controller
- Bootstrap pool manager
- Assignment broker
- Health gates (KACHING, BACKBEAT, CHORUS)
- Metrics collector
Scaling Components:
- ScalingController: Coordinates scaling operations
- BootstrapPoolManager: Manages pre-warmed agent pool
- AssignmentBroker: Assigns tasks to available agents
- HealthGates: Checks system health before scaling
- ScalingMetricsCollector: Tracks scaling operation metrics
6. Gitea Client (internal/gitea/)
Responsibilities:
- Gitea API client with retry logic
- Issue listing and retrieval
- Repository information fetching
- Label management and creation
- Webhook payload parsing
Configuration Options:
type GITEAConfig struct {
BaseURL string // Gitea instance URL
Token string // API token
TokenFile string // Token from file
WebhookPath string // Webhook endpoint path
WebhookToken string // Webhook secret
EagerFilter bool // Pre-filter by labels at API level
FullRescan bool // Ignore since parameter for full rescan
DebugURLs bool // Log exact URLs
MaxRetries int // Retry attempts (default: 3)
RetryDelay time.Duration // Delay between retries (default: 2s)
}
Retry Logic:
- Automatic retry on 5xx errors and 429 (rate limiting)
- Configurable max retries and delay
- No retry on 4xx client errors
- Exponential backoff via configured delay
Issue Fetching:
func GetIssues(owner, repo string, opts IssueListOptions) ([]Issue, error)
- Supports state filtering (open/closed/all)
- Label filtering (eager at API or in-code)
- Since parameter for incremental sync
- Pagination support
Label Management:
func EnsureRequiredLabels(owner, repo string) error
- Creates standardized labels:
- bug, enhancement, duplicate, invalid, etc.
- bzzz-task (CHORUS task marker)
- chorus-entrypoint (Design Brief marker)
7. BACKBEAT Integration (internal/backbeat/)
Responsibilities:
- Integration with BACKBEAT timing system (NATS-based)
- Beat-synchronized status emission
- Search operation tracking
- Health monitoring
Key Concepts:
- Beat: Regular timing event (every 30 seconds at 2 BPM default)
- Downbeat: Bar start event (every 4 beats = 2 minutes)
- StatusClaim: Progress update emitted to NATS
Search Operation Phases:
PhaseStarted → PhaseIndexing → PhaseQuerying → PhaseRanking → PhaseCompleted/PhaseFailed
Integration Flow:
1. Start(ctx) → Connect to NATS cluster
2. OnBeat() → Emit status claims every beat
3. OnDownbeat() → Cleanup completed operations
4. StartSearch() → Register new search operation
5. UpdateSearchPhase() → Update operation progress
6. CompleteSearch() → Mark operation complete
8. Authentication & Security (internal/auth/)
Components:
Middleware (middleware.go)
- JWT token validation
- Service token authentication
- Admin role checking
- Request authentication
Methods:
Authenticate(next http.Handler) http.Handler // Generic auth
ServiceTokenRequired(next http.Handler) http.Handler // Service tokens only
AdminRequired(next http.Handler) http.Handler // Admin role required
Rate Limiter (ratelimit.go)
- IP-based rate limiting
- Configurable requests per time window
- In-memory storage with automatic cleanup
Default Configuration:
RateLimiter{
RequestsPerMinute: 100,
CleanupInterval: time.Minute,
}
9. Validation (internal/validation/)
Security Headers:
func SecurityHeaders(next http.Handler) http.Handler
- X-Content-Type-Options: nosniff
- X-Frame-Options: DENY
- X-XSS-Protection: 1; mode=block
- Content-Security-Policy: default-src 'self'
Input Validation:
- UUID validation
- Request body size limits
- Content-Type validation
10. Tracing (internal/tracing/)
OpenTelemetry Integration:
- Jaeger exporter for distributed tracing
- Span creation for key operations
- Context propagation across services
- Performance monitoring
Span Types:
StartSpan(ctx, "operation_name") → Generic span
StartMonitorSpan(ctx, "operation", "repository") → Repository monitoring
StartCouncilSpan(ctx, "operation", "council_id") → Council operations
StartDeploymentSpan(ctx, "operation", "resource_id") → Deployment operations
Configuration:
type OpenTelemetryConfig struct {
Enabled bool
ServiceName string // "whoosh"
ServiceVersion string // "1.0.0"
Environment string // "production"
JaegerEndpoint string // "http://localhost:14268/api/traces"
SampleRate float64 // 1.0 (100%)
}
Database Architecture
Schema Overview
Core Tables:
- teams - Team records
- team_roles - Role definitions (executor, coordinator, reviewer)
- team_assignments - Agent-to-role assignments
- agents - AI agent registry
- tasks - Task records from Gitea/external sources
- repositories - Monitored repository configurations
- repository_sync_logs - Sync operation history
- councils - Project kickoff council records
- council_agents - Council agent assignments
- council_artifacts - Council-generated artifacts
Key Relationships
repositories (1) ──→ (N) tasks
tasks (1) ──→ (1) teams (assigned_team_id)
tasks (1) ──→ (1) councils (via task_id)
teams (1) ──→ (N) team_assignments
team_assignments (N) ──→ (1) agents
team_assignments (N) ──→ (1) team_roles
councils (1) ──→ (N) council_agents
Migration System
Location: /migrations/*.sql
Migration Files:
- 001_init_schema.up.sql - Initial teams, agents, roles
- 002_add_tasks_table.up.sql - Task management
- 003_add_repositories_table.up.sql - Repository monitoring
- 004_enhance_task_team_integration.up.sql - Enhanced relationships
- 005_add_council_tables.up.sql - Council management
- 006_add_performance_indexes.up.sql - Query optimization
- 007_add_team_deployment_status.up.sql - Deployment tracking
Running Migrations:
# Automatic on startup (if AutoMigrate=true)
WHOOSH_DATABASE_AUTO_MIGRATE=true go run ./cmd/whoosh
# Manual via migrate CLI
migrate -database "postgres://..." -path ./migrations up
Connection Pooling
type DatabaseConfig struct {
MaxOpenConns int // 25 (default)
MaxIdleConns int // 5 (default)
MaxConnLifetime time.Duration // 1 hour
MaxConnIdleTime time.Duration // 30 minutes
}
Key Indexes
Performance Indexes:
-- Agent availability
idx_agents_status_last_seen ON agents(status, last_seen)
-- Repository lookups
idx_repositories_full_name_lookup ON repositories(full_name)
idx_repositories_last_issue_sync ON repositories(last_issue_sync)
-- Task lookups
idx_tasks_external_source_lookup ON tasks(external_id, source_type)
idx_tasks_repository_id ON tasks(repository_id)
idx_tasks_assigned_team_id ON tasks(assigned_team_id)
-- Team deployment
idx_teams_deployment_status ON teams(deployment_status)
API Layer
Request/Response Format
Standard Response:
{
"status": "success",
"data": { ... },
"message": "Operation completed successfully"
}
Error Response:
{
"status": "error",
"error": "Error message",
"details": { ... }
}
Authentication
JWT Token Format:
Authorization: Bearer <jwt_token>
Service Token Format:
Authorization: Bearer <service_token>
Key API Endpoints
Teams API
GET /api/v1/teams - List all teams (with pagination)
POST /api/v1/teams - Create new team (admin only)
GET /api/v1/teams/{teamID} - Get team details
PUT /api/v1/teams/{teamID}/status - Update team status (admin only)
POST /api/v1/teams/analyze - Analyze task for team composition
Tasks API
GET /api/v1/tasks - List all tasks
POST /api/v1/tasks/ingest - Ingest task from external source (service token)
GET /api/v1/tasks/{taskID} - Get task details
Projects API (Gitea Repositories)
GET /api/v1/projects - List all projects
POST /api/v1/projects - Create new project (admin only)
GET /api/v1/projects/{projectID} - Get project details
GET /api/v1/projects/{projectID}/tasks - List project tasks
POST /api/v1/projects/{projectID}/tasks/{taskNumber}/claim - Claim task
Repositories API
GET /api/v1/repositories - List monitored repositories
POST /api/v1/repositories - Add repository for monitoring (admin only)
GET /api/v1/repositories/{repoID} - Get repository details
PUT /api/v1/repositories/{repoID} - Update repository config (admin only)
POST /api/v1/repositories/{repoID}/sync - Trigger manual sync (admin only)
POST /api/v1/repositories/{repoID}/ensure-labels - Create standard labels (admin only)
GET /api/v1/repositories/{repoID}/logs - Get sync logs
Councils API
GET /api/v1/councils/{councilID} - Get council details
GET /api/v1/councils/{councilID}/artifacts - List council artifacts
POST /api/v1/councils/{councilID}/artifacts - Create artifact (admin only)
Agents API
GET /api/v1/agents - List all agents
POST /api/v1/agents/register - Register new agent
PUT /api/v1/agents/{agentID}/status - Update agent status
Scaling API (if Docker enabled)
GET /api/v1/scaling/status - Get scaling system status
POST /api/v1/scaling/scale-up - Manually trigger scale-up
POST /api/v1/scaling/scale-down - Manually trigger scale-down
GET /api/v1/scaling/metrics - Get scaling metrics
Health & Monitoring
GET /health - Basic health check
GET /health/ready - Readiness probe
GET /admin/health/details - Detailed health information
GET /api/v1/backbeat/status - BACKBEAT integration status
Webhook Endpoints
Gitea Webhook
POST /webhooks/gitea - Receive Gitea webhook events
Supported Events:
- issues - Issue opened/closed/edited
- issue_comment - Comment added
- push - Code pushed
- pull_request - PR opened/merged
Webhook Security:
- HMAC signature verification using webhook token
- X-Gitea-Signature header validation
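A sketch of the signature check, assuming the X-Gitea-Signature header carries the hex-encoded HMAC-SHA256 of the raw request body keyed with the webhook token. The function names sign and verifySignature are hypothetical; the secret shown is a placeholder.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of body under secret.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySignature checks a received header value against the request body.
// hmac.Equal compares in constant time to avoid leaking timing information.
func verifySignature(secret, body []byte, header string) bool {
	got, err := hex.DecodeString(header)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("webhook-secret") // placeholder value
	body := []byte(`{"action":"opened"}`)
	header := sign(secret, body)
	fmt.Println(verifySignature(secret, body, header))
	fmt.Println(verifySignature(secret, []byte("tampered"), header))
}
```

The check must run against the exact bytes received, before any JSON decoding, since re-encoding the payload can change whitespace or key order and invalidate the signature.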
External Service Integrations
1. Gitea Integration
Base URL: Configured via WHOOSH_GITEA_BASE_URL
Authentication: API token (from file or environment)
API Operations:
- List repositories
- Get repository details
- List issues (with filtering)
- Get issue details
- Create/manage labels
- Test connection
Webhook Integration:
- Receives issue events (create, update, close)
- Triggers team composition or council formation
- Updates task status in database
2. Docker Swarm Integration
Socket: Unix socket (/var/run/docker.sock) or TCP
Operations:
- Service creation (ServiceCreate)
- Service scaling (ServiceUpdate)
- Service inspection (ServiceInspectWithRaw)
- Task listing (TaskList)
- Service removal (ServiceRemove)
- Service logs (ServiceLogs)
Network: Agents deployed to chorus_default network by default
Image Registry: registry.home.deepblack.cloud (private registry)
Standard Image: docker.io/anthonyrawlins/chorus:backbeat-v2.0.1
3. BACKBEAT Integration
Protocol: NATS messaging
NATS URL: Configured via WHOOSH_BACKBEAT_NATS_URL
Operations:
- Beat synchronization (30-second intervals at 2 BPM)
- Status claim emission
- Health monitoring
- Task progress tracking
Health Indicators:
- Connected to NATS cluster
- Current beat index
- Measured BPM vs target tempo
- Tempo drift
- Reconnection count
- Active searches/operations
4. N8N Workflows
Base URL: https://n8n.home.deepblack.cloud
Integration Points:
- Gitea webhook → N8N → BZZZ task coordination
- WHOOSH events → N8N → External notifications
- Council formation → N8N → Project initialization workflows
5. SLURP (UCXL Content System)
Purpose: UCXL address-based artifact storage
API Endpoints:
- POST /api/v1/slurp/submit - Submit artifact to SLURP
- GET /api/v1/slurp/artifacts/{ucxlAddr} - Retrieve artifact
Use Cases:
- Decision records (BUBBLE integration)
- Council artifacts (project documentation)
- Compliance documentation
Configuration Management
Environment Variables
Database Configuration:
WHOOSH_DATABASE_HOST=localhost
WHOOSH_DATABASE_PORT=5432
WHOOSH_DATABASE_DB_NAME=whoosh
WHOOSH_DATABASE_USERNAME=whoosh
WHOOSH_DATABASE_PASSWORD=<password>
WHOOSH_DATABASE_PASSWORD_FILE=/secrets/db_password # Alternative
WHOOSH_DATABASE_SSL_MODE=disable
WHOOSH_DATABASE_AUTO_MIGRATE=true
WHOOSH_DATABASE_MAX_OPEN_CONNS=25
WHOOSH_DATABASE_MAX_IDLE_CONNS=5
Server Configuration:
WHOOSH_SERVER_LISTEN_ADDR=:8080
WHOOSH_SERVER_READ_TIMEOUT=30s
WHOOSH_SERVER_WRITE_TIMEOUT=30s
WHOOSH_SERVER_SHUTDOWN_TIMEOUT=30s
WHOOSH_SERVER_ALLOWED_ORIGINS=http://localhost:3000,http://localhost:8080
WHOOSH_SERVER_ALLOWED_ORIGINS_FILE=/secrets/allowed_origins # Alternative
Gitea Configuration:
WHOOSH_GITEA_BASE_URL=http://ironwood:3000
WHOOSH_GITEA_TOKEN=<token>
WHOOSH_GITEA_TOKEN_FILE=/secrets/gitea_token # Alternative
WHOOSH_GITEA_WEBHOOK_PATH=/webhooks/gitea
WHOOSH_GITEA_WEBHOOK_TOKEN=<secret>
WHOOSH_GITEA_WEBHOOK_TOKEN_FILE=/secrets/webhook_token # Alternative
WHOOSH_GITEA_EAGER_FILTER=true
WHOOSH_GITEA_FULL_RESCAN=false
WHOOSH_GITEA_DEBUG_URLS=false
WHOOSH_GITEA_MAX_RETRIES=3
WHOOSH_GITEA_RETRY_DELAY=2s
Authentication Configuration:
WHOOSH_AUTH_JWT_SECRET=<secret_min_32_chars>
WHOOSH_AUTH_JWT_SECRET_FILE=/secrets/jwt_secret # Alternative
WHOOSH_AUTH_SERVICE_TOKENS=token1,token2,token3
WHOOSH_AUTH_SERVICE_TOKENS_FILE=/secrets/service_tokens # Alternative
WHOOSH_AUTH_JWT_EXPIRY=24h
Logging Configuration:
WHOOSH_LOGGING_LEVEL=debug # debug, info, warn, error
WHOOSH_LOGGING_ENVIRONMENT=development # development, production
LOG_LEVEL=info # Alternative for zerolog
ENVIRONMENT=development # Enables pretty logging
Team Composer Configuration:
# LLM-based analysis (experimental, default: false)
WHOOSH_COMPOSER_ENABLE_LLM_CLASSIFICATION=false
WHOOSH_COMPOSER_ENABLE_LLM_SKILL_ANALYSIS=false
WHOOSH_COMPOSER_ENABLE_LLM_TEAM_MATCHING=false
# Analysis features
WHOOSH_COMPOSER_ENABLE_COMPLEXITY_ANALYSIS=true
WHOOSH_COMPOSER_ENABLE_RISK_ASSESSMENT=true
WHOOSH_COMPOSER_ENABLE_ALTERNATIVE_OPTIONS=false
# Debug and monitoring
WHOOSH_COMPOSER_ENABLE_ANALYSIS_LOGGING=true
WHOOSH_COMPOSER_ENABLE_PERFORMANCE_METRICS=true
WHOOSH_COMPOSER_ENABLE_FAILSAFE_FALLBACK=true
# LLM model configuration
WHOOSH_COMPOSER_CLASSIFICATION_MODEL=llama3.1:8b
WHOOSH_COMPOSER_SKILL_ANALYSIS_MODEL=llama3.1:8b
WHOOSH_COMPOSER_MATCHING_MODEL=llama3.1:8b
# Performance settings
WHOOSH_COMPOSER_ANALYSIS_TIMEOUT_SECS=60
WHOOSH_COMPOSER_SKILL_MATCH_THRESHOLD=0.6
BACKBEAT Configuration:
WHOOSH_BACKBEAT_ENABLED=true
WHOOSH_BACKBEAT_CLUSTER_ID=chorus-production
WHOOSH_BACKBEAT_AGENT_ID=whoosh
WHOOSH_BACKBEAT_NATS_URL=nats://backbeat-nats:4222
Docker Configuration:
WHOOSH_DOCKER_ENABLED=true
WHOOSH_DOCKER_HOST=unix:///var/run/docker.sock
OpenTelemetry Configuration:
WHOOSH_OPENTELEMETRY_ENABLED=true
WHOOSH_OPENTELEMETRY_SERVICE_NAME=whoosh
WHOOSH_OPENTELEMETRY_SERVICE_VERSION=1.0.0
WHOOSH_OPENTELEMETRY_ENVIRONMENT=production
WHOOSH_OPENTELEMETRY_JAEGER_ENDPOINT=http://localhost:14268/api/traces
WHOOSH_OPENTELEMETRY_SAMPLE_RATE=1.0
N8N Configuration:
WHOOSH_N8N_BASE_URL=https://n8n.home.deepblack.cloud
Configuration Loading
Priority Order:
- Environment variables
- Secret files (if *_FILE variant specified)
- Default values in code
Secret File Loading:
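// Example: JWT secret loading
if cfg.Auth.JWTSecretFile != "" {
	secret, err := readSecretFile(cfg.Auth.JWTSecretFile)
	if err != nil {
		// Surface unreadable secret files instead of silently using an empty secret
		return fmt.Errorf("read JWT secret file: %w", err)
	}
	cfg.Auth.JWTSecret = secret
}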
Validation:
func (c *Config) Validate() error {
	// Check required fields (database password, Gitea token, etc.)
	// Build database URL if not provided
	// Validate CORS origins
	// Ensure JWT secret meets minimum length
	// Validate service tokens present
	return nil // or the first validation error encountered
}
Security & Authentication
Authentication Mechanisms
1. JWT Authentication
- Used for user/admin API access
- Token expiry: 24 hours (configurable)
- Claims include: user_id, role, issued_at, expires_at
- Validated on protected endpoints via middleware
2. Service Token Authentication
- Used for service-to-service communication
- Static tokens configured via environment
- Required for task ingestion endpoints
- Validated via ServiceTokenRequired middleware
3. Admin Role Enforcement
- Admin-only endpoints protected via AdminRequired middleware
- Role claim must be "admin" in JWT
- Used for repository management, team creation, etc.
Security Headers
Applied to all responses:
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Content-Security-Policy: default-src 'self'
CORS Configuration
- Allowed origins: Configured via environment
- Allowed methods: GET, POST, PUT, DELETE, OPTIONS
- Credentials: Enabled
- Max age: 300 seconds
Rate Limiting
- Default: 100 requests per minute per IP
- In-memory storage with automatic cleanup
- Applied globally via middleware
Webhook Security
- Gitea webhooks: HMAC signature verification
- Token stored securely (from file or environment)
- Signature header: X-Gitea-Signature
Secret Management
Best Practices:
- Use *_FILE environment variables for secrets
- Mount secrets as files in Docker Swarm
- Never commit secrets to Git
- Rotate tokens regularly
Example Docker Secret:
secrets:
  gitea_token:
    file: /path/to/gitea_token.txt
services:
  whoosh:
    secrets:
      - gitea_token
    environment:
      WHOOSH_GITEA_TOKEN_FILE: /run/secrets/gitea_token
Observability
Logging
Structured Logging (zerolog):
log.Info().
Str("team_id", teamID).
Int("agent_count", count).
Dur("duration", duration).
Msg("Team deployed successfully")
Log Levels:
- debug: Detailed debugging information
- info: General information messages
- warn: Warning messages (recoverable errors)
- error: Error messages (operation failures)
Pretty Logging:
- Enabled in development mode
- Human-readable console output
- Colored output for log levels
Distributed Tracing
OpenTelemetry + Jaeger:
ctx, span := tracing.StartSpan(ctx, "operation_name")
defer span.End()
span.SetAttributes(
attribute.String("resource.id", id),
attribute.Int("resource.count", count),
)
// On error
tracing.SetSpanError(span, err)
Trace Propagation:
- Context passed through entire request lifecycle
- Spans created at key operations:
- HTTP request handling
- Database queries
- External API calls
- Docker operations
- Council/team operations
Jaeger UI:
- Access at: http://localhost:16686
- View traces by service, operation, duration
- Analyze performance bottlenecks
- Debug distributed operations
Health Checks
Basic Health Check (/health):
{
"status": "ok",
"service": "whoosh",
"version": "0.1.0-mvp",
"backbeat": {
"enabled": true,
"connected": true,
"current_beat": 12345
}
}
Readiness Check (/health/ready):
{
"status": "ready",
"database": "connected"
}
Detailed Health (/admin/health/details):
{
"service": "whoosh",
"version": "0.1.1-debug",
"timestamp": 1696118400,
"status": "healthy",
"components": {
"database": {
"status": "healthy",
"type": "postgresql",
"statistics": {
"max_conns": 25,
"acquired_conns": 3,
"idle_conns": 5
}
},
"gitea": {
"status": "healthy",
"endpoint": "http://ironwood:3000"
},
"backbeat": {
"status": "healthy",
"connected": true,
"current_tempo": 2
},
"docker_swarm": {
"status": "unknown",
"note": "Health check not implemented"
}
}
}
Metrics
Database Metrics:
- Connection pool statistics
- Active connections
- Idle connections
- Query duration
Deployment Metrics (via ScalingMetricsCollector):
- Wave execution count
- Agent deployment success/failure rate
- Average deployment duration
- Error rate
BACKBEAT Metrics:
- Current beat index
- Tempo (BPM)
- Tempo drift
- Reconnection count
- Active operations
Development Workflow
Running Locally
Prerequisites:
# Install Go 1.22+
go version
# Install PostgreSQL 14+
psql --version
# Install Docker (for Swarm testing)
docker version
Setup:
# 1. Clone repository
git clone https://gitea.chorus.services/tony/WHOOSH.git
cd WHOOSH
# 2. Copy environment configuration
cp .env.example .env
# Edit .env with local values
# 3. Start PostgreSQL (Docker example)
docker run -d \
--name whoosh-postgres \
-e POSTGRES_DB=whoosh \
-e POSTGRES_USER=whoosh \
-e POSTGRES_PASSWORD=whoosh \
-p 5432:5432 \
postgres:15
# 4. Run migrations
make migrate
# Or manual:
# migrate -database "postgres://whoosh:whoosh@localhost:5432/whoosh?sslmode=disable" -path ./migrations up
# 5. Run the server
go run ./cmd/whoosh
# Or with hot reload:
# air (requires cosmtrek/air)
Development Commands:
# Run with live reload
air
# Run tests
go test ./...
# Run specific package tests
go test ./internal/composer/...
# Format code
go fmt ./...
# Vet code
go vet ./...
# Build binary
go build -o bin/whoosh ./cmd/whoosh
# Check version
./bin/whoosh --version
Testing
Unit Tests:
// internal/composer/service_test.go
func TestDetermineTaskType(t *testing.T) {
service := NewService(nil, nil)
taskType := service.DetermineTaskType("Fix bug in login", "...")
assert.Equal(t, TaskTypeBugFix, taskType)
}
Integration Tests:
# Requires running database
go test -tags=integration ./internal/database/...
Database Setup for Tests:
# Create test database
createdb whoosh_test
# Run migrations
migrate -database "postgres://whoosh:whoosh@localhost:5432/whoosh_test?sslmode=disable" -path ./migrations up
Building for Production
Docker Build:
# Build binary
go build -o whoosh ./cmd/whoosh
# Build Docker image
docker build -t registry.home.deepblack.cloud/whoosh:v0.1.1 .
# Push to registry
docker push registry.home.deepblack.cloud/whoosh:v0.1.1
Docker Compose:
version: '3.8'
services:
  whoosh:
    image: registry.home.deepblack.cloud/whoosh:v0.1.1
    environment:
      WHOOSH_DATABASE_HOST: postgres
      WHOOSH_DATABASE_PORT: 5432
      WHOOSH_DATABASE_DB_NAME: whoosh
      WHOOSH_DATABASE_USERNAME: whoosh
      WHOOSH_DATABASE_PASSWORD: ${DATABASE_PASSWORD}
      WHOOSH_GITEA_BASE_URL: http://ironwood:3000
      WHOOSH_GITEA_TOKEN: ${GITEA_TOKEN}
      WHOOSH_AUTH_JWT_SECRET: ${JWT_SECRET}
    ports:
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - postgres
    networks:
      - chorus_default
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: whoosh
      POSTGRES_USER: whoosh
      POSTGRES_PASSWORD: ${DATABASE_PASSWORD}
    volumes:
      - whoosh_data:/var/lib/postgresql/data
    networks:
      - chorus_default
volumes:
  whoosh_data:
networks:
  chorus_default:
    external: true
Docker Swarm Deploy:
# Create secrets
echo "my_jwt_secret" | docker secret create whoosh_jwt_secret -
echo "my_gitea_token" | docker secret create whoosh_gitea_token -
# Deploy stack
docker stack deploy -c docker-compose.swarm.yml whoosh
# Check services
docker service ls
docker service ps whoosh_whoosh
# View logs
docker service logs whoosh_whoosh -f
Debugging
Enable Debug Logging:
export WHOOSH_LOGGING_LEVEL=debug
export LOG_LEVEL=debug
go run ./cmd/whoosh
Database Query Logging:
# Set pgx log level
export WHOOSH_DATABASE_LOG_LEVEL=trace
Gitea URL Debugging:
export WHOOSH_GITEA_DEBUG_URLS=true
Trace a Request:
# View in Jaeger UI
curl -H "X-Request-ID: test-request-123" http://localhost:8080/api/v1/teams
# Find trace in Jaeger
open http://localhost:16686
# Search for: service=whoosh, tags=request.id=test-request-123
Interactive Debugging (Delve):
# Install delve
go install github.com/go-delve/delve/cmd/dlv@latest
# Debug main
dlv debug ./cmd/whoosh
# Set breakpoint
(dlv) break internal/server/server.go:200
(dlv) continue
Appendix
Directory Structure
WHOOSH/
├── cmd/
│ ├── whoosh/ # Main application entry point
│ └── test-llm/ # LLM testing utility
├── internal/
│ ├── agents/ # Agent registry service
│ ├── auth/ # Authentication & authorization
│ ├── backbeat/ # BACKBEAT timing integration
│ ├── composer/ # Team composition service
│ ├── config/ # Configuration management
│ ├── council/ # Council formation service
│ ├── database/ # Database connection & migrations
│ ├── gitea/ # Gitea API client
│ ├── licensing/ # Enterprise licensing (stub)
│ ├── monitor/ # Repository monitoring service
│ ├── orchestrator/ # Docker Swarm orchestration
│ ├── p2p/ # P2P discovery service
│ ├── server/ # HTTP server & routing
│ ├── tasks/ # Task management service
│ ├── tracing/ # OpenTelemetry tracing
│ └── validation/ # Input validation & security
├── migrations/ # Database migration files
├── ui/ # Frontend assets (if any)
├── docs/ # Documentation
├── scripts/ # Utility scripts
├── requirements/ # Requirements documents
├── BACKBEAT-prototype/ # BACKBEAT SDK integration
├── go.mod # Go module definition
├── go.sum # Go module checksums
├── .env.example # Environment variable template
├── Dockerfile # Container build definition
└── README.md # Project README
Common Issues & Solutions
Issue: Database connection failed
Error: failed to ping database: dial tcp 127.0.0.1:5432: connect: connection refused
Solution:
1. Ensure PostgreSQL is running: systemctl status postgresql
2. Check connection parameters in .env
3. Verify firewall rules allow port 5432
4. Check PostgreSQL logs: journalctl -u postgresql
Issue: Gitea API connection failed
Error: connection test failed: API request failed with status 401
Solution:
1. Verify Gitea token is correct
2. Check token has required permissions (read:repository, write:issue)
3. Verify Gitea base URL is accessible
4. Test manually: curl -H "Authorization: token YOUR_TOKEN" http://ironwood:3000/api/v1/user
Issue: Docker Swarm deployment failed
Error: failed to deploy agent service: Error response from daemon: This node is not a swarm manager
Solution:
1. Initialize Docker Swarm: docker swarm init
2. Or join existing swarm: docker swarm join --token TOKEN MANAGER_IP:2377
3. Verify swarm status: docker info | grep Swarm
Issue: Migrations not running
Error: Database migration failed: Dirty database version 5
Solution:
1. Check migration status: migrate -database "..." -path ./migrations version
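2. After repairing whatever migration 5 left half-applied, clear the dirty flag: migrate -database "..." -path ./migrations force 5 (use force 4 instead if you rolled the partial changes back)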
3. Re-run migrations: migrate -database "..." -path ./migrations up
Performance Tuning
Database Connection Pool:
# Increase for high concurrency
WHOOSH_DATABASE_MAX_OPEN_CONNS=50
WHOOSH_DATABASE_MAX_IDLE_CONNS=10
HTTP Server Timeouts:
# Increase for long-running operations
WHOOSH_SERVER_READ_TIMEOUT=60s
WHOOSH_SERVER_WRITE_TIMEOUT=60s
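Values such as 60s follow Go's duration syntax, so they can be parsed with time.ParseDuration and applied to an http.Server. The following sketch illustrates this under stated assumptions: `durationOr` and `newServer` are hypothetical names, the 30-second fallback is invented for the example, and the port is taken from the compose file above.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"time"
)

// durationOr parses raw as a Go duration ("60s", "2m") and returns fallback
// when raw is empty, malformed, or not positive.
func durationOr(raw string, fallback time.Duration) time.Duration {
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return fallback
	}
	return d
}

// newServer builds an http.Server whose timeouts come from the
// WHOOSH_SERVER_* variables. The 30s defaults are assumptions.
func newServer(getenv func(string) string, handler http.Handler) *http.Server {
	return &http.Server{
		Addr:         ":8080",
		Handler:      handler,
		ReadTimeout:  durationOr(getenv("WHOOSH_SERVER_READ_TIMEOUT"), 30*time.Second),
		WriteTimeout: durationOr(getenv("WHOOSH_SERVER_WRITE_TIMEOUT"), 30*time.Second),
	}
}

func main() {
	srv := newServer(os.Getenv, http.NotFoundHandler())
	fmt.Println("read:", srv.ReadTimeout, "write:", srv.WriteTimeout)
}
```

Note that WriteTimeout covers the whole response, so streaming or long-polling endpoints need a value larger than their longest expected response.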
Rate Limiting:
// Adjust in server initialization
rateLimiter := auth.NewRateLimiter(200, time.Minute) // 200 req/min
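To make the two arguments concrete, here is a minimal fixed-window limiter with the same constructor shape. It is a sketch only: the real auth.NewRateLimiter may use a different algorithm, and the `Allow` method and per-key buckets are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket tracks how many requests one key has made in the current window.
type bucket struct {
	start time.Time
	count int
}

// RateLimiter is a hypothetical fixed-window limiter keyed by client
// identifier (for example, a remote IP). Buckets are never evicted here,
// so a production version would need periodic cleanup.
type RateLimiter struct {
	mu      sync.Mutex
	limit   int
	window  time.Duration
	buckets map[string]*bucket
}

// NewRateLimiter allows up to limit requests per key in each window.
func NewRateLimiter(limit int, window time.Duration) *RateLimiter {
	return &RateLimiter{limit: limit, window: window, buckets: make(map[string]*bucket)}
}

// Allow reports whether key may make another request right now.
func (rl *RateLimiter) Allow(key string) bool {
	rl.mu.Lock()
	defer rl.mu.Unlock()

	now := time.Now()
	b, ok := rl.buckets[key]
	if !ok || now.Sub(b.start) >= rl.window {
		// First request from this key, or the previous window has expired.
		b = &bucket{start: now}
		rl.buckets[key] = b
	}
	if b.count >= rl.limit {
		return false
	}
	b.count++
	return true
}

func main() {
	rl := NewRateLimiter(200, time.Minute) // 200 req/min, as in the snippet above
	fmt.Println(rl.Allow("10.0.0.1"))
}
```

A fixed window is the simplest scheme but permits bursts of up to twice the limit across a window boundary; a token bucket smooths that out at the cost of slightly more state.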
Composer Analysis Timeout:
# Reduce for faster failover to heuristics
WHOOSH_COMPOSER_ANALYSIS_TIMEOUT_SECS=30
Contributing
Code Style:
- Follow standard Go conventions
- Run go fmt before committing
- Use go vet to check for issues
- Add comments for exported functions
- Write tests for new features
Git Workflow:
# 1. Create feature branch
git checkout -b feature/my-feature
# 2. Make changes and commit
git add .
git commit -m "Add feature: description"
# 3. Push to Gitea
git push origin feature/my-feature
# 4. Create pull request via Gitea UI
# 5. Address review comments
# 6. Merge when approved
Database Migrations:
# Create new migration (-seq pads to 6 digits by default; -digits 3 matches the 008_ naming used below)
migrate create -ext sql -dir migrations -seq -digits 3 add_new_table
# Edit up and down files
# migrations/008_add_new_table.up.sql
# migrations/008_add_new_table.down.sql
# Test migration (apply, then roll back only the newest one)
migrate -database "postgres://..." -path ./migrations up
migrate -database "postgres://..." -path ./migrations down 1
References
- CHORUS Project: Autonomous AI agent system (depends on WHOOSH for orchestration)
- BACKBEAT: Cluster-wide timing and coordination system
- BZZZ: Distributed task system integration
- SLURP: UCXL content address system
- BUBBLE: Decision tracking and policy management
Related Documentation:
- /home/tony/chorus/CLAUDE.md - Project instructions
- /home/tony/chorus/GEMINI.md - Cluster context
- /home/tony/chorus/project-queues/active/WHOOSH/README.md - Quick start
- /home/tony/chorus/project-queues/active/WHOOSH/docs/progress/WHOOSH-roadmap.md - Development roadmap
- /home/tony/chorus/project-queues/active/WHOOSH/DEVELOPMENT_PLAN.md - Implementation plan
External Resources:
- Docker Swarm Documentation: https://docs.docker.com/engine/swarm/
- PostgreSQL Documentation: https://www.postgresql.org/docs/
- Go Documentation: https://go.dev/doc/
- OpenTelemetry Go: https://opentelemetry.io/docs/instrumentation/go/
Document Version: 1.0 Generated: October 2025 Maintained by: WHOOSH Development Team