WHOOSH Production Deployment Guide

This guide provides comprehensive instructions for deploying the WHOOSH Council Formation Engine in production environments using Docker Swarm orchestration.

📋 Prerequisites

Infrastructure Requirements

Docker Swarm Cluster

  • Docker Engine 20.10+ on all nodes
  • Docker Swarm mode initialized
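  • Minimum 3 nodes for high availability (1 manager, 2+ workers); a single manager remains a control-plane single point of failure, so run 3 managers if the Swarm control plane must also survive a node loss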
  • Shared storage for persistent volumes (NFS recommended)

Network Configuration

  • Overlay networks for service communication
  • External network access for Gitea integration
  • SSL/TLS certificates for HTTPS endpoints
  • DNS configuration for service discovery

Resource Requirements

WHOOSH Service (per replica):
  Memory: 256MB limit, 128MB reservation
  CPU: 0.5 cores limit, 0.25 cores reservation

PostgreSQL Database:
  Memory: 512MB limit, 256MB reservation  
  CPU: 1.0 cores limit, 0.5 cores reservation
  Storage: 10GB+ persistent volume
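
If these limits ever need to be applied to a service that is already running, rather than through the compose file, the Docker CLI has matching flags. The following is a minimal sketch, assuming the stack is deployed under the name WHOOSH (as later in this guide) so that the services are called WHOOSH_whoosh and WHOOSH_postgres. The helper name apply_limits is illustrative and not part of WHOOSH.

```shell
# apply_limits SERVICE MEM_LIMIT MEM_RESERVATION CPU_LIMIT CPU_RESERVATION
# Hypothetical helper: wraps `docker service update` with the resource flags.
apply_limits() {
  docker service update \
    --limit-memory "$2" --reserve-memory "$3" \
    --limit-cpu "$4" --reserve-cpu "$5" \
    "$1"
}

# Usage (values taken from the requirements above):
#   apply_limits WHOOSH_whoosh   256M 128M 0.5 0.25
#   apply_limits WHOOSH_postgres 512M 256M 1.0 0.5
```

Changes made this way are overwritten by the next `docker stack deploy`, so the compose file should carry the same values.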

External Dependencies

Required Services

  • Gitea Instance: Repository hosting and webhook integration
  • Traefik: Reverse proxy with SSL termination
  • BackBeat: Performance monitoring (optional but recommended)
  • NATS: Message bus for BackBeat integration

Network Connectivity

  • WHOOSH → Gitea (API access and webhook delivery)
  • WHOOSH → PostgreSQL (database connections)
  • WHOOSH → Docker Socket (agent deployment)
  • External → WHOOSH (webhook delivery and API access)

🔐 Security Setup

Docker Secrets Management

Create all required secrets before deployment:

# printf '%s' is used instead of echo so that no trailing newline is stored in the secret

# Database password
printf '%s' "your-secure-db-password" | docker secret create whoosh_db_password -

# Gitea API token (from Gitea settings)
printf '%s' "your-gitea-api-token" | docker secret create gitea_token -

# Webhook secret (same as configured in Gitea webhook)
printf '%s' "your-webhook-secret" | docker secret create whoosh_webhook_token -

# JWT secret (minimum 32 characters)
printf '%s' "your-strong-jwt-secret-minimum-32-chars" | docker secret create whoosh_jwt_secret -

# Service tokens (comma-separated)
printf '%s' "internal-service-token1,api-automation-token2" | docker secret create whoosh_service_tokens -
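
The placeholder values above should be replaced with high-entropy strings. One possible way to produce them is sketched below; the helper name gen_secret is hypothetical, and it assumes /dev/urandom and a `head` that supports `-c` are available on the host.

```shell
# gen_secret [BYTES]
# Prints BYTES random bytes (default 32) as lowercase hex, without a trailing newline.
gen_secret() {
  head -c "${1:-32}" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Usage:
#   gen_secret | docker secret create whoosh_jwt_secret -
#   gen_secret 24 | docker secret create whoosh_webhook_token -
```

The default of 32 bytes yields 64 hex characters, which satisfies the 32-character minimum for the JWT secret. A generated webhook secret must also be entered in the Gitea webhook settings, so capture it before piping it into Docker.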

Secret Validation

Verify secrets are created correctly:

# List all secrets used by WHOOSH (gitea_token has no "whoosh" prefix)
docker secret ls | grep -E "whoosh|gitea_token"

# Expected output:
# whoosh_db_password
# gitea_token
# whoosh_webhook_token  
# whoosh_jwt_secret
# whoosh_service_tokens
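
Checking the list by eye is easy to get wrong, so the comparison can be scripted. This is a sketch only: the function name missing_secrets is made up for illustration, and the required names are the five listed above.

```shell
REQUIRED_SECRETS="whoosh_db_password gitea_token whoosh_webhook_token whoosh_jwt_secret whoosh_service_tokens"

# missing_secrets
# Reads existing secret names on stdin (one per line) and prints every
# required name that is absent. Prints nothing when all are present.
missing_secrets() {
  existing="$(cat)"
  for name in $REQUIRED_SECRETS; do
    printf '%s\n' "$existing" | grep -qxF "$name" || printf '%s\n' "$name"
  done
}

# Usage:
#   docker secret ls --format '{{.Name}}' | missing_secrets
```

An empty result means every required secret exists; any printed name still has to be created before deployment.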

SSL/TLS Configuration

Traefik Integration (Recommended)

# In docker-compose.swarm.yml
labels:
  - traefik.enable=true
  - traefik.http.routers.whoosh.rule=Host(`whoosh.your-domain.com`)
  - traefik.http.routers.whoosh.tls=true
  - traefik.http.routers.whoosh.tls.certresolver=letsencryptresolver
  - traefik.http.services.whoosh.loadbalancer.server.port=8080

Manual TLS Configuration

# Environment variables for direct TLS
WHOOSH_TLS_ENABLED=true
WHOOSH_TLS_CERT_FILE=/run/secrets/tls_cert
WHOOSH_TLS_KEY_FILE=/run/secrets/tls_key
WHOOSH_TLS_MIN_VERSION=1.2

📦 Image Preparation

Production Image Build

# Clone the repository
git clone https://gitea.chorus.services/tony/WHOOSH.git
cd WHOOSH

# Build with production tags
export VERSION=$(git describe --tags --abbrev=0 2>/dev/null || echo "v1.0.0")
export COMMIT_HASH=$(git rev-parse --short HEAD)
export BUILD_DATE=$(date -u +"%Y-%m-%dT%H:%M:%SZ")

docker build \
  --build-arg VERSION="${VERSION}" \
  --build-arg COMMIT_HASH="${COMMIT_HASH}" \
  --build-arg BUILD_DATE="${BUILD_DATE}" \
  -t "anthonyrawlins/whoosh:${VERSION}" .

# Push to registry
docker push anthonyrawlins/whoosh:${VERSION}

Image Verification

# Verify image integrity
docker inspect anthonyrawlins/whoosh:${VERSION}

# Test image locally
docker run --rm \
  -e WHOOSH_DATABASE_URL=postgres://test:test@localhost/test \
  anthonyrawlins/whoosh:${VERSION} --health-check

🚀 Deployment Process

Step 1: Environment Preparation

Create Networks

# Create overlay networks
docker network create -d overlay --attachable=false whoosh-backend

# Verify external networks exist
docker network ls | grep -E "(tengig|CHORUS_chorus_net)"
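
Running `docker network create` a second time fails because the network already exists, which gets in the way of repeatable deployment scripts. A small guard like the following makes the step idempotent. It is a sketch; the function name ensure_overlay_network is illustrative, and it has to run on a Swarm manager node.

```shell
# ensure_overlay_network NAME
# Creates the overlay network only when `docker network inspect` cannot find it.
ensure_overlay_network() {
  if docker network inspect "$1" >/dev/null 2>&1; then
    echo "network $1 already exists"
  else
    docker network create -d overlay "$1"
  fi
}

# Usage:
#   ensure_overlay_network whoosh-backend
```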

Prepare Persistent Storage

# Create PostgreSQL data directory
sudo mkdir -p /rust/containers/WHOOSH/postgres
sudo chown -R 999:999 /rust/containers/WHOOSH/postgres

# Create prompts directory
sudo mkdir -p /rust/containers/WHOOSH/prompts
sudo chown -R nobody:nogroup /rust/containers/WHOOSH/prompts

Step 2: Configuration Review

Update docker-compose.swarm.yml for your environment:

# Key configuration points
services:
  whoosh:
    image: anthonyrawlins/whoosh:v1.0.0  # Use specific version
    environment:
      # Database
      WHOOSH_DATABASE_DB_HOST: postgres
      WHOOSH_DATABASE_DB_SSL_MODE: require  # Enable in production
      
      # Gitea integration
      WHOOSH_GITEA_BASE_URL: https://your-gitea.domain.com
      
      # Security
      WHOOSH_CORS_ALLOWED_ORIGINS: https://your-app.domain.com
      
      # Monitoring
      WHOOSH_BACKBEAT_ENABLED: "true"
      WHOOSH_BACKBEAT_NATS_URL: "nats://your-nats:4222"
    
    # Update Traefik labels
    deploy:
      labels:
        - traefik.http.routers.whoosh.rule=Host(`your-whoosh.domain.com`)

Step 3: Production Deployment

# Deploy to Docker Swarm
docker stack deploy -c docker-compose.swarm.yml WHOOSH

# Verify deployment
docker stack services WHOOSH
docker stack ps WHOOSH
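
`docker stack deploy` returns before the tasks are actually running, so automated pipelines usually need to wait for the replica count to converge. The sketch below polls the Replicas column of `docker service ls`. The function names and the POLL_INTERVAL variable are assumptions made for this example.

```shell
# replicas_ready REPLICAS
# Succeeds when a Replicas string such as "2/2" shows all desired tasks running.
replicas_ready() {
  counts="${1%% *}"        # drop any trailing note, e.g. "(max 1 per node)"
  running="${counts%%/*}"
  desired="${counts##*/}"
  [ -n "$running" ] && [ "$running" = "$desired" ] && [ "$desired" != "0" ]
}

# wait_for_service SERVICE [ATTEMPTS]
# Polls until SERVICE reports all replicas running, or gives up.
wait_for_service() {
  service="$1"; attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    replicas="$(docker service ls --filter "name=$service" --format '{{.Replicas}}' | head -n 1)"
    if replicas_ready "$replicas"; then
      echo "$service is ready ($replicas)"
      return 0
    fi
    i=$((i + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "$service did not become ready" >&2
  return 1
}

# Usage:
#   wait_for_service WHOOSH_whoosh && wait_for_service WHOOSH_postgres
```

The name filter matches by prefix, so the first matching row is used; if several services share a prefix, pass a more specific name.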

Step 4: Health Verification

# Check service health
curl -f http://localhost:8800/health || echo "Health check failed"

# Check detailed health (requires authentication)
curl -H "Authorization: Bearer ${JWT_TOKEN}" \
  https://your-whoosh.domain.com/admin/health/details

# Verify database connectivity (run on the node hosting the PostgreSQL task)
docker exec -it $(docker ps --filter name=WHOOSH_postgres -q) \
  psql -U whoosh -d whoosh -c "SELECT version();"
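
A single `curl` immediately after deployment can fail simply because the service is still starting. The sketch below retries the health endpoint a few times before reporting failure. The function name wait_for_health and the RETRY_DELAY variable are illustrative, not part of WHOOSH.

```shell
# wait_for_health URL [ATTEMPTS]
# Polls URL until curl reports an HTTP success status, or gives up after ATTEMPTS tries.
wait_for_health() {
  url="$1"; attempts="${2:-12}"
  n=1
  while [ "$n" -le "$attempts" ]; do
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "healthy after $n attempt(s)"
      return 0
    fi
    n=$((n + 1))
    sleep "${RETRY_DELAY:-5}"
  done
  echo "health check failed after $attempts attempts" >&2
  return 1
}

# Usage:
#   wait_for_health http://localhost:8800/health
```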

📊 Post-Deployment Configuration

Gitea Webhook Setup

Configure Repository Webhooks

  1. Navigate to repository settings in Gitea
  2. Add new webhook:
    • Target URL: https://your-whoosh.domain.com/webhooks/gitea
    • HTTP Method: POST
    • POST Content Type: application/json
    • Secret: Use same value as whoosh_webhook_token secret
    • Trigger On: Issues, Issue Comments
    • Branch Filter: Leave empty for all branches

Test Webhook Delivery

# Create test issue with chorus-entrypoint label
# Check WHOOSH logs for webhook processing
docker service logs WHOOSH_whoosh

Repository Registration

Register repositories for monitoring:

# Get JWT token (implement your auth mechanism)
JWT_TOKEN="your-admin-jwt-token"

# Register repository
curl -X POST https://your-whoosh.domain.com/api/v1/repositories \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "full_name": "username/repository",
    "gitea_id": 123,
    "description": "Project repository"
  }'

Council Configuration

Role Configuration

Ensure role definitions are available:

# Copy role definitions to prompts directory
sudo cp human-roles.yaml /rust/containers/WHOOSH/prompts/
sudo chown nobody:nogroup /rust/containers/WHOOSH/prompts/human-roles.yaml

Agent Image Configuration

# In deployment configuration
environment:
  WHOOSH_AGENT_IMAGE: anthonyrawlins/chorus:latest
  WHOOSH_AGENT_MEMORY_LIMIT: 2048m
  WHOOSH_AGENT_CPU_LIMIT: 1.0

🔍 Monitoring & Observability

Health Monitoring

Endpoint Monitoring

# Basic health check
curl -f https://your-whoosh.domain.com/health

# Detailed health (authenticated)
curl -H "Authorization: Bearer ${JWT_TOKEN}" \
  https://your-whoosh.domain.com/admin/health/details

Expected Health Response

{
  "status": "healthy",
  "timestamp": "2025-09-12T10:00:00Z",
  "components": {
    "database": "healthy",
    "gitea": "healthy", 
    "docker": "healthy",
    "backbeat": "healthy"
  },
  "version": "v1.0.0"
}

Metrics Collection

Prometheus Metrics

# Metrics endpoint (unauthenticated)
curl https://your-whoosh.domain.com/metrics

# Key metrics to monitor:
# - whoosh_http_requests_total
# - whoosh_council_formations_total
# - whoosh_agent_deployments_total
# - whoosh_webhook_requests_total

Log Management

Structured Logging

# View logs with correlation (--raw drops the task prefix so jq receives plain JSON)
docker service logs --raw -f WHOOSH_whoosh | jq .

# Filter by correlation ID
docker service logs --raw WHOOSH_whoosh | jq 'select(.request_id == "specific-id")'

# Monitor security events (filters warn- and error-level entries)
docker service logs --raw WHOOSH_whoosh | jq 'select(.level == "warn" or .level == "error")'

Distributed Tracing

OpenTelemetry Integration

# Add to environment configuration
WHOOSH_OTEL_ENABLED: "true"
WHOOSH_OTEL_SERVICE_NAME: "whoosh"
WHOOSH_OTEL_ENDPOINT: "http://jaeger:14268/api/traces"
WHOOSH_OTEL_SAMPLER_RATIO: "1.0"

📋 Maintenance Procedures

Regular Maintenance Tasks

Weekly Tasks

  • Review security logs and failed authentication attempts
  • Check disk space usage for PostgreSQL data
  • Verify backup integrity
  • Review and update security alert monitoring rules

Monthly Tasks

  • Rotate JWT secrets and service tokens
  • Review and update dependency versions
  • Performance analysis and optimization review
  • Capacity planning assessment
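
Docker secrets are immutable, so the monthly rotation cannot overwrite a secret in place. The usual pattern is to create a new secret under a new name and swap it into the service while keeping the file name inside the container unchanged. The sketch below follows that pattern. The function name rotate_secret and the `_v2` suffix are illustrative, and it assumes WHOOSH reads the secret from the default /run/secrets/<name> path.

```shell
# rotate_secret SERVICE OLD_SECRET NEW_SECRET
# Reads the new value on stdin, stores it as NEW_SECRET, then swaps it into
# SERVICE so the container still sees it under /run/secrets/OLD_SECRET.
rotate_secret() {
  docker secret create "$3" - || return 1
  docker service update \
    --secret-rm "$2" \
    --secret-add "source=$3,target=$2" \
    "$1"
}

# Usage:
#   printf '%s' "new-jwt-secret-value" | \
#     rotate_secret WHOOSH_whoosh whoosh_jwt_secret whoosh_jwt_secret_v2
```

The compose file has to be updated to reference the new secret name as well; otherwise the next `docker stack deploy` restores the old one.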

Quarterly Tasks

  • Full security audit and penetration testing
  • Disaster recovery procedure testing
  • Documentation updates and accuracy review
  • Performance benchmarking and optimization

Update Procedures

Rolling Update Process

# 1. Build new image
docker build -t anthonyrawlins/whoosh:v1.1.0 .
docker push anthonyrawlins/whoosh:v1.1.0

# 2. Update compose file
sed -i 's|anthonyrawlins/whoosh:v1\.0\.0|anthonyrawlins/whoosh:v1.1.0|' docker-compose.swarm.yml

# 3. Deploy update (rolling update)
docker stack deploy -c docker-compose.swarm.yml WHOOSH

# 4. Monitor rollout
docker service ps WHOOSH_whoosh
docker service logs -f WHOOSH_whoosh
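
How the rollout behaves (how many tasks are replaced at once, and what happens on failure) is controlled by the service's update settings. As a sketch of an alternative to editing the compose file, the same image change can be applied directly with explicit update flags. The function name rolling_update and the chosen values (one task at a time, 10 second delay, automatic rollback) are assumptions for illustration, not WHOOSH defaults.

```shell
# rolling_update SERVICE IMAGE
# Replaces one task at a time and rolls back automatically if an updated task fails.
rolling_update() {
  docker service update \
    --image "$2" \
    --update-parallelism 1 \
    --update-delay 10s \
    --update-failure-action rollback \
    "$1"
}

# Usage:
#   rolling_update WHOOSH_whoosh anthonyrawlins/whoosh:v1.1.0
```

Settings changed with `docker service update` are replaced by whatever the compose file specifies on the next `docker stack deploy`, so the compose file should still be updated to match.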

Rollback Procedures

# Quick rollback to previous version
docker service update --image anthonyrawlins/whoosh:v1.0.0 WHOOSH_whoosh

# Or update compose file and redeploy
git checkout HEAD~1 -- docker-compose.swarm.yml
docker stack deploy -c docker-compose.swarm.yml WHOOSH

Backup Procedures

Database Backup

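# Automated daily backup (run on the node hosting the PostgreSQL task;
# Swarm names task containers WHOOSH_postgres.<slot>.<task-id>, so look up the ID)
docker exec $(docker ps --filter name=WHOOSH_postgres -q) pg_dump \
  -U whoosh -d whoosh --no-password \
  > /backups/whoosh-$(date +%Y%m%d).sql

# Restore from backup
docker exec -i $(docker ps --filter name=WHOOSH_postgres -q) \
  psql -U whoosh -d whoosh < /backups/whoosh-20250912.sql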
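
A backup file that exists is not necessarily complete: if `pg_dump` is interrupted, the shell redirect still leaves a partial file behind. A cheap sanity check is to look for the header and footer comments that pg_dump writes in its plain SQL format. The sketch below does this; the function name verify_backup is hypothetical, and the check is not a substitute for a real restore test.

```shell
# verify_backup FILE
# Succeeds only if FILE is non-empty and carries pg_dump's plain-format header and footer.
verify_backup() {
  [ -s "$1" ] || { echo "empty or missing: $1" >&2; return 1; }
  head -n 10 "$1" | grep -q -e '-- PostgreSQL database dump' \
    || { echo "missing pg_dump header: $1" >&2; return 1; }
  tail -n 10 "$1" | grep -q -e '-- PostgreSQL database dump complete' \
    || { echo "missing pg_dump footer (truncated?): $1" >&2; return 1; }
}

# Usage:
#   verify_backup "/backups/whoosh-$(date +%Y%m%d).sql" || echo "backup needs attention"
```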

Configuration Backup

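# Record secret names only (Docker cannot export secret values;
# keep the values themselves in separate encrypted storage)
docker secret ls --format '{{.Name}}' | grep -E "whoosh|gitea_token" > whoosh-secrets-list.txt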

# Backup configuration files
tar -czf whoosh-config-$(date +%Y%m%d).tar.gz \
  docker-compose.swarm.yml \
  /rust/containers/WHOOSH/prompts/

🚨 Troubleshooting

Common Issues

Service Won't Start

# Check service status
docker service ps WHOOSH_whoosh

# Check logs for errors
docker service logs --tail 50 WHOOSH_whoosh

# Common fixes:
# 1. Verify secrets exist and are accessible
# 2. Check network connectivity to dependencies
# 3. Verify volume mounts and permissions
# 4. Check resource constraints and limits

Database Connection Issues

# Test database connectivity (run on the node hosting the PostgreSQL task)
docker exec -it $(docker ps --filter name=WHOOSH_postgres -q) psql -U whoosh -d whoosh -c "\l"

# Check database logs
docker service logs WHOOSH_postgres

# Verify connection parameters
docker service inspect WHOOSH_whoosh | jq .Spec.TaskTemplate.ContainerSpec.Env

Webhook Delivery Failures

# Check webhook logs
docker service logs WHOOSH_whoosh | grep webhook

# Test webhook endpoint manually
curl -X POST https://your-whoosh.domain.com/webhooks/gitea \
  -H "Content-Type: application/json" \
  -H "X-Gitea-Signature: sha256=..." \
  -d '{"test": "payload"}'

# Verify webhook secret configuration
# Ensure Gitea webhook secret matches whoosh_webhook_token
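
A manual test only exercises the signature check if the header carries a real signature for the exact payload being sent. Gitea's documentation describes X-Gitea-Signature as the hex-encoded HMAC-SHA256 of the request body, keyed with the webhook secret. The sketch below computes that value with openssl. The function name sign_payload and the WEBHOOK_SECRET variable are illustrative, and whether WHOOSH also expects the `sha256=` prefix shown above is not stated in this guide, so treat the header format as an assumption to confirm.

```shell
# sign_payload SECRET PAYLOAD
# Prints the hex-encoded HMAC-SHA256 of PAYLOAD keyed with SECRET.
sign_payload() {
  printf '%s' "$2" | openssl dgst -sha256 -hmac "$1" | sed 's/^.*= *//'
}

# Usage:
#   payload='{"test": "payload"}'
#   curl -X POST https://your-whoosh.domain.com/webhooks/gitea \
#     -H "Content-Type: application/json" \
#     -H "X-Gitea-Signature: $(sign_payload "$WEBHOOK_SECRET" "$payload")" \
#     -d "$payload"
```

The signature covers the body byte for byte, so the string passed to sign_payload must be identical to the one sent with `-d`. Passing the secret as a command-line argument exposes it in the process list, which is acceptable for a one-off test but not for routine automation.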

Agent Deployment Issues

# Check Docker socket access (run on the node hosting the WHOOSH task)
docker exec -it $(docker ps --filter name=WHOOSH_whoosh -q | head -n 1) ls -la /var/run/docker.sock

# Check agent deployment logs
docker service logs WHOOSH_whoosh | grep "agent deployment"

# Verify agent image availability
docker pull anthonyrawlins/chorus:latest

Performance Issues

High Memory Usage

# Check memory usage
docker stats --no-stream

# Adjust resource limits
docker service update --limit-memory 512m WHOOSH_whoosh

# Review connection pool settings
# Adjust WHOOSH_DB_MAX_OPEN_CONNS and WHOOSH_DB_MAX_IDLE_CONNS

Slow Database Queries

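# Enable query logging in PostgreSQL (verbose; revert after debugging).
# ALTER SYSTEM only takes effect once the configuration is reloaded.
docker exec -it $(docker ps --filter name=WHOOSH_postgres -q) \
  psql -U whoosh -d whoosh -c "ALTER SYSTEM SET log_statement = 'all';" -c "SELECT pg_reload_conf();"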

# Review slow queries and add indexes as needed
# Check migrations/006_add_performance_indexes.up.sql

Security Issues

Authentication Failures

# Check authentication logs
docker service logs WHOOSH_whoosh | grep -i "auth\|jwt"

# Verify JWT secret integrity
# Rotate JWT secret if compromised

# Check rate limiting
docker service logs WHOOSH_whoosh | grep "rate limit"

CORS Issues

# Verify CORS configuration
curl -I -X OPTIONS \
  -H "Origin: https://your-app.domain.com" \
  -H "Access-Control-Request-Method: GET" \
  https://your-whoosh.domain.com/api/v1/councils

# Update CORS origins
docker service update \
  --env-add WHOOSH_CORS_ALLOWED_ORIGINS=https://new-domain.com \
  WHOOSH_whoosh

📚 Production Checklist

Pre-Deployment Checklist

  • All secrets created and verified
  • Network configuration tested
  • External dependencies accessible
  • SSL/TLS certificates valid
  • Resource limits configured appropriately
  • Backup procedures tested
  • Monitoring and alerting configured
  • Security configuration reviewed
  • Performance benchmarks established

Post-Deployment Checklist

  • Health endpoints responding correctly
  • Webhook delivery working from Gitea
  • Authentication and authorization working
  • Agent deployment functioning
  • Database migrations completed successfully
  • Metrics and tracing data flowing
  • Backup procedures validated
  • Security scans passed
  • Documentation updated with environment-specific details

Production Readiness Checklist

  • High availability configuration (multiple replicas)
  • Automated failover tested
  • Disaster recovery procedures documented
  • Performance monitoring and alerting active
  • Security monitoring and incident response ready
  • Staff training completed on operational procedures
  • Change management procedures defined
  • Compliance requirements validated

Deployment Status: Ready for Production
Supported Platforms: Docker Swarm, Kubernetes (with adaptations)
Security Level: Enterprise-Grade
High Availability: Supported

For additional deployment support, refer to the Configuration Guide and Security Policy.