Updated project files and configuration
- Added/updated .gitignore file
- Fixed remote URL configuration
- Updated project structure and files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
docs/DEPLOYMENT.md
# WHOOSH Production Deployment Guide

This guide describes how to deploy the WHOOSH Council Formation Engine to production on Docker Swarm.

## 📋 Prerequisites

### Infrastructure Requirements

**Docker Swarm Cluster**
- Docker Engine 20.10+ on all nodes
- Docker Swarm mode initialized
- Minimum 3 nodes for service availability (1 manager, 2+ workers); a single manager is still a single point of failure for the Swarm control plane, so run 3 managers if the cluster must survive a manager outage
- Shared storage for persistent volumes (NFS recommended)

**Network Configuration**
- Overlay networks for service communication
- External network access for Gitea integration
- SSL/TLS certificates for HTTPS endpoints
- DNS configuration for service discovery

**Resource Requirements**
```yaml
WHOOSH Service (per replica):
  Memory: 256MB limit, 128MB reservation
  CPU: 0.5 cores limit, 0.25 cores reservation

PostgreSQL Database:
  Memory: 512MB limit, 256MB reservation
  CPU: 1.0 cores limit, 0.5 cores reservation
  Storage: 10GB+ persistent volume
```

### External Dependencies

**Required Services**
- **Gitea Instance**: Repository hosting and webhook integration
- **Traefik**: Reverse proxy with SSL termination
- **BackBeat**: Performance monitoring (optional but recommended)
- **NATS**: Message bus for BackBeat integration (needed only when BackBeat is enabled)

**Network Connectivity**
- WHOOSH → Gitea (API access)
- WHOOSH → PostgreSQL (database connections)
- WHOOSH → Docker Socket (agent deployment)
- External → WHOOSH (Gitea webhook delivery and API access)

## 🔐 Security Setup

### Docker Secrets Management

Create all required secrets before deployment:

```bash
# printf '%s' is used instead of echo so no trailing newline is stored in the secret

# Database password
printf '%s' "your-secure-db-password" | docker secret create whoosh_db_password -

# Gitea API token (from Gitea settings)
printf '%s' "your-gitea-api-token" | docker secret create gitea_token -

# Webhook secret (same as configured in Gitea webhook)
printf '%s' "your-webhook-secret" | docker secret create whoosh_webhook_token -

# JWT secret (minimum 32 characters)
printf '%s' "your-strong-jwt-secret-minimum-32-chars" | docker secret create whoosh_jwt_secret -

# Service tokens (comma-separated)
printf '%s' "internal-service-token1,api-automation-token2" | docker secret create whoosh_service_tokens -
```

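The placeholder values above should be replaced with strong random strings. A minimal sketch of generating one and checking its length before it is stored, assuming `openssl` is installed on the manager node; the helper names `gen_secret` and `check_secret_length` are illustrative and not part of WHOOSH:

```bash
# Hypothetical helpers for this guide; not part of WHOOSH.

# Print N random bytes as a hex string (2*N characters, no trailing newline).
gen_secret() {
  local bytes=${1:-32}
  openssl rand -hex "$bytes" | tr -d '\n'
}

# Succeed only if VALUE is at least MIN characters long (default 32).
check_secret_length() {
  local value=$1 min=${2:-32}
  [ "${#value}" -ge "$min" ]
}

# Example (needs a Swarm manager, so it is left commented out):
# secret=$(gen_secret 32)
# check_secret_length "$secret" 32 &&
#   printf '%s' "$secret" | docker secret create whoosh_jwt_secret -
```

Hex output keeps the secret free of characters that need shell quoting, at the cost of a longer string for the same entropy.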
### Secret Validation

Verify secrets are created correctly:

```bash
# List the secret names used by WHOOSH
# (gitea_token does not contain "whoosh", so it is matched explicitly)
docker secret ls --format '{{.Name}}' | grep -E 'whoosh|gitea_token'

# Expected names (order may differ):
# whoosh_db_password
# gitea_token
# whoosh_webhook_token
# whoosh_jwt_secret
# whoosh_service_tokens
```

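Comparing the listing against the expected names by eye is easy to get wrong. A small sketch that reports which of the five secrets from this guide are missing; the function name `missing_secrets` is hypothetical:

```bash
# Print every required secret name that is absent from the list on stdin.
# The required names are the five created in this guide.
missing_secrets() {
  local existing name
  existing=$(cat)
  for name in whoosh_db_password gitea_token whoosh_webhook_token \
              whoosh_jwt_secret whoosh_service_tokens; do
    # -x matches whole lines only, so a prefix of a longer name does not count.
    printf '%s\n' "$existing" | grep -qx -- "$name" || printf '%s\n' "$name"
  done
}

# Example (needs a Swarm manager):
# docker secret ls --format '{{.Name}}' | missing_secrets
```

Empty output means all five secrets exist; this checks names only, since Docker does not expose secret values after creation.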
### SSL/TLS Configuration

**Traefik Integration** (Recommended)
```yaml
# In docker-compose.swarm.yml
labels:
  - traefik.enable=true
  - traefik.http.routers.whoosh.rule=Host(`whoosh.your-domain.com`)
  - traefik.http.routers.whoosh.tls=true
  - traefik.http.routers.whoosh.tls.certresolver=letsencryptresolver
  - traefik.http.services.whoosh.loadbalancer.server.port=8080
```

**Manual TLS Configuration**
```bash
# Environment variables for direct TLS
WHOOSH_TLS_ENABLED=true
WHOOSH_TLS_CERT_FILE=/run/secrets/tls_cert
WHOOSH_TLS_KEY_FILE=/run/secrets/tls_key
WHOOSH_TLS_MIN_VERSION=1.2
```

## 📦 Image Preparation

### Production Image Build

```bash
# Clone the repository
git clone https://gitea.chorus.services/tony/WHOOSH.git
cd WHOOSH

# Build with production tags (falls back to v1.0.0 when the repository has no tags)
export VERSION=$(git describe --tags --abbrev=0 2>/dev/null || echo "v1.0.0")
export COMMIT_HASH=$(git rev-parse --short HEAD)
export BUILD_DATE=$(date -u +"%Y-%m-%dT%H:%M:%SZ")

docker build \
  --build-arg VERSION="${VERSION}" \
  --build-arg COMMIT_HASH="${COMMIT_HASH}" \
  --build-arg BUILD_DATE="${BUILD_DATE}" \
  -t "anthonyrawlins/whoosh:${VERSION}" .

# Push to registry
docker push "anthonyrawlins/whoosh:${VERSION}"
```

### Image Verification

```bash
# Verify image integrity
docker inspect anthonyrawlins/whoosh:${VERSION}

# Test image locally
docker run --rm \
  -e WHOOSH_DATABASE_URL=postgres://test:test@localhost/test \
  anthonyrawlins/whoosh:${VERSION} --health-check
```

## 🚀 Deployment Process

### Step 1: Environment Preparation

**Create Networks**
```bash
# Create overlay networks
docker network create -d overlay --attachable=false whoosh-backend

# Verify external networks exist
docker network ls | grep -E "(tengig|CHORUS_chorus_net)"
```

**Prepare Persistent Storage**
```bash
# Create PostgreSQL data directory
sudo mkdir -p /rust/containers/WHOOSH/postgres
sudo chown -R 999:999 /rust/containers/WHOOSH/postgres

# Create prompts directory
sudo mkdir -p /rust/containers/WHOOSH/prompts
sudo chown -R nobody:nogroup /rust/containers/WHOOSH/prompts
```

### Step 2: Configuration Review

Update `docker-compose.swarm.yml` for your environment:

```yaml
# Key configuration points
services:
  whoosh:
    image: anthonyrawlins/whoosh:v1.0.0  # Use specific version
    environment:
      # Database
      WHOOSH_DATABASE_DB_HOST: postgres
      WHOOSH_DATABASE_DB_SSL_MODE: require  # Enable in production

      # Gitea integration
      WHOOSH_GITEA_BASE_URL: https://your-gitea.domain.com

      # Security
      WHOOSH_CORS_ALLOWED_ORIGINS: https://your-app.domain.com

      # Monitoring
      WHOOSH_BACKBEAT_ENABLED: "true"
      WHOOSH_BACKBEAT_NATS_URL: "nats://your-nats:4222"

    # Update Traefik labels
    deploy:
      labels:
        - traefik.http.routers.whoosh.rule=Host(`your-whoosh.domain.com`)
```

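Before deploying, it can help to confirm that none of the guide's placeholder hosts are left in the compose file. A rough sketch, assuming every placeholder follows the `your-...` pattern used in this guide; the function name `find_placeholders` is hypothetical:

```bash
# Print (with line numbers) every line of FILE that still contains a
# "your-" placeholder. Exit status 0 means placeholders were found,
# 1 means none were found.
find_placeholders() {
  grep -n -- 'your-' "$1"
}

# Example:
# if find_placeholders docker-compose.swarm.yml; then
#   echo "Replace the placeholders above before deploying" >&2
# fi
```

The pattern is deliberately crude and will also flag any legitimate value that happens to contain `your-`.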
### Step 3: Production Deployment

```bash
# Deploy to Docker Swarm
docker stack deploy -c docker-compose.swarm.yml WHOOSH

# Verify deployment
docker stack services WHOOSH
docker stack ps WHOOSH
```

### Step 4: Health Verification

```bash
# Check service health
curl -f http://localhost:8800/health || echo "Health check failed"

# Check detailed health (requires authentication)
curl -H "Authorization: Bearer ${JWT_TOKEN}" \
  https://your-whoosh.domain.com/admin/health/details

# Verify database connectivity (run on the node hosting the postgres task)
docker exec -it "$(docker ps --filter name=WHOOSH_postgres -q)" \
  psql -U whoosh -d whoosh -c "SELECT version();"
```

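Right after `docker stack deploy`, the health endpoint may not answer until the service has started, so a single `curl` can fail spuriously. A minimal retry sketch; the function name `wait_for` and the attempt and delay values in the example are assumptions, not WHOOSH defaults:

```bash
# Retry a command until it succeeds or the attempt budget is exhausted.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
  local attempts=$1 delay=$2 i
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the final failure.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example: poll the health endpoint for up to about one minute.
# wait_for 12 5 curl -fsS http://localhost:8800/health
```

The command is passed as separate arguments rather than a string, which avoids `eval` and keeps quoting intact.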
## 📊 Post-Deployment Configuration

### Gitea Webhook Setup

**Configure Repository Webhooks**
1. Navigate to repository settings in Gitea
2. Add new webhook:
   - **Target URL**: `https://your-whoosh.domain.com/webhooks/gitea`
   - **HTTP Method**: `POST`
   - **POST Content Type**: `application/json`
   - **Secret**: Use same value as `whoosh_webhook_token` secret
   - **Trigger On**: Issues, Issue Comments
   - **Branch Filter**: Leave empty for all branches

**Test Webhook Delivery**
```bash
# Create test issue with chorus-entrypoint label
# Check WHOOSH logs for webhook processing
docker service logs WHOOSH_whoosh
```

### Repository Registration

Register repositories for monitoring:

```bash
# Get JWT token (implement your auth mechanism)
JWT_TOKEN="your-admin-jwt-token"

# Register repository
curl -X POST https://your-whoosh.domain.com/api/v1/repositories \
  -H "Authorization: Bearer ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "full_name": "username/repository",
    "gitea_id": 123,
    "description": "Project repository"
  }'
```

### Council Configuration

**Role Configuration**
Ensure role definitions are available:
```bash
# Copy role definitions to prompts directory
sudo cp human-roles.yaml /rust/containers/WHOOSH/prompts/
sudo chown nobody:nogroup /rust/containers/WHOOSH/prompts/human-roles.yaml
```

**Agent Image Configuration**
```yaml
# In deployment configuration
environment:
  WHOOSH_AGENT_IMAGE: anthonyrawlins/chorus:latest
  WHOOSH_AGENT_MEMORY_LIMIT: 2048m
  WHOOSH_AGENT_CPU_LIMIT: "1.0"  # Quoted so YAML passes it through as the string 1.0
```

## 🔍 Monitoring & Observability

### Health Monitoring

**Endpoint Monitoring**
```bash
# Basic health check
curl -f https://your-whoosh.domain.com/health

# Detailed health (authenticated)
curl -H "Authorization: Bearer ${JWT_TOKEN}" \
  https://your-whoosh.domain.com/admin/health/details
```

**Expected Health Response**
```json
{
  "status": "healthy",
  "timestamp": "2025-09-12T10:00:00Z",
  "components": {
    "database": "healthy",
    "gitea": "healthy",
    "docker": "healthy",
    "backbeat": "healthy"
  },
  "version": "v1.0.0"
}
```

### Metrics Collection

**Prometheus Metrics**
```bash
# Metrics endpoint (unauthenticated)
curl https://your-whoosh.domain.com/metrics

# Key metrics to monitor:
# - whoosh_http_requests_total
# - whoosh_council_formations_total
# - whoosh_agent_deployments_total
# - whoosh_webhook_requests_total
```

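For a quick spot check without a Prometheus server, the counters listed above can be totalled straight from the `/metrics` output. A sketch using `awk`, assuming the standard Prometheus text format without trailing timestamps (so the sample value is the last field); the function name `sum_metric` is hypothetical:

```bash
# Sum all samples of one metric from Prometheus text-format input on stdin.
# Assumes samples carry no trailing timestamp, so the value is the last field.
sum_metric() {
  awk -v name="$1" '
    # Accept "name value" and "name{labels} value"; comment lines and
    # longer metric names that merely share the prefix are skipped.
    index($0, name) == 1 && (substr($0, length(name) + 1, 1) ~ /[{ ]/) { sum += $NF }
    END { print sum + 0 }
  '
}

# Example:
# curl -fsS https://your-whoosh.domain.com/metrics | sum_metric whoosh_http_requests_total
```

The result is a raw cumulative total since process start; rates and alerting are better handled by Prometheus itself.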
### Log Management

**Structured Logging**
```bash
# --raw drops the "task@node |" prefix that docker adds to each line,
# so jq receives plain JSON

# View logs with correlation
docker service logs --raw -f WHOOSH_whoosh | jq .

# Filter by correlation ID
docker service logs --raw WHOOSH_whoosh | jq 'select(.request_id == "specific-id")'

# Monitor security events
docker service logs --raw WHOOSH_whoosh | jq 'select(.level == "warn" or .level == "error")'
```

### Distributed Tracing

**OpenTelemetry Integration**
```yaml
# Add to environment configuration
WHOOSH_OTEL_ENABLED: "true"
WHOOSH_OTEL_SERVICE_NAME: "whoosh"
WHOOSH_OTEL_ENDPOINT: "http://jaeger:14268/api/traces"
WHOOSH_OTEL_SAMPLER_RATIO: "1.0"
```

## 📋 Maintenance Procedures

### Regular Maintenance Tasks

**Weekly Tasks**
- Review security logs and failed authentication attempts
- Check disk space usage for PostgreSQL data
- Verify backup integrity
- Review and update security alert rules

**Monthly Tasks**
- Rotate JWT secrets and service tokens
- Review and update dependency versions
- Performance analysis and optimization review
- Capacity planning assessment

**Quarterly Tasks**
- Full security audit and penetration testing
- Disaster recovery procedure testing
- Documentation updates and accuracy review
- Performance benchmarking and optimization

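The monthly rotation task needs some care because Docker secrets are immutable: a secret cannot be edited, only replaced by a new one under a different name. A sketch of one way to do this with `docker service update`, assuming WHOOSH reads the secret from a file named after the `target` value under `/run/secrets/`; the timestamp suffix and both function names are illustrative:

```bash
# Build a versioned secret name such as "whoosh_jwt_secret_20250912T100000".
# The suffix scheme is an assumption made for this sketch.
versioned_secret_name() {
  printf '%s_%s' "$1" "$(date -u +%Y%m%dT%H%M%S)"
}

# Rotate one secret on one service. The new value is read from stdin.
# Usage: rotate_secret SERVICE OLD_SECRET_NAME TARGET_FILE_NAME
rotate_secret() {
  local service=$1 old=$2 target=$3 new
  new=$(versioned_secret_name "$target") || return 1
  docker secret create "$new" - || return 1
  # Remove and add in one update so the mounted file name stays the same.
  docker service update \
    --secret-rm "$old" \
    --secret-add "source=${new},target=${target}" \
    "$service"
}

# Example (needs a Swarm manager):
# openssl rand -hex 32 | tr -d '\n' |
#   rotate_secret WHOOSH_whoosh whoosh_jwt_secret whoosh_jwt_secret
```

The compose file would also need to reference the new secret name, otherwise the next `docker stack deploy` reverts the service to the old one. Rotating the JWT secret presumably invalidates tokens signed with the previous value.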
### Update Procedures

**Rolling Update Process**
```bash
# 1. Build new image
docker build -t anthonyrawlins/whoosh:v1.1.0 .
docker push anthonyrawlins/whoosh:v1.1.0

# 2. Update compose file (dots escaped so they match literally)
sed -i 's|anthonyrawlins/whoosh:v1\.0\.0|anthonyrawlins/whoosh:v1.1.0|' docker-compose.swarm.yml

# 3. Deploy update (rolling update)
docker stack deploy -c docker-compose.swarm.yml WHOOSH

# 4. Monitor rollout
docker service ps WHOOSH_whoosh
docker service logs -f WHOOSH_whoosh
```

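Step 2 hard-codes both the old and the new tag. A sketch of a reusable helper that rewrites whatever tag is currently set for a given image, without relying on `sed -i` (whose syntax differs between GNU and BSD); the function name `set_image_tag` is hypothetical:

```bash
# Replace the tag on "image: IMAGE:<anything>" lines in a compose file.
# Usage: set_image_tag FILE IMAGE NEW_TAG
set_image_tag() {
  local file=$1 image=$2 tag=$3 escaped tmp status=0
  # Escape regex metacharacters so dots and slashes in IMAGE match literally.
  escaped=$(printf '%s' "$image" | sed 's|[.[\*^$/]|\\&|g')
  tmp=$(mktemp) || return 1
  # Write to a temp file first, then overwrite in place to keep permissions.
  sed "s/\(image:[[:space:]]*${escaped}\):[^[:space:]]*/\1:${tag}/" "$file" > "$tmp" \
    && cat "$tmp" > "$file" || status=1
  rm -f "$tmp"
  return "$status"
}

# Example:
# set_image_tag docker-compose.swarm.yml anthonyrawlins/whoosh v1.1.0
```

Lines for other images, such as the PostgreSQL service, are left untouched because the image name must match exactly.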
**Rollback Procedures**
```bash
# Quick rollback to previous version
docker service update --image anthonyrawlins/whoosh:v1.0.0 WHOOSH_whoosh

# Or restore the previous compose file and redeploy
git checkout HEAD~1 -- docker-compose.swarm.yml
docker stack deploy -c docker-compose.swarm.yml WHOOSH
```

### Backup Procedures

**Database Backup**
```bash
# Resolve the PostgreSQL container ID (run on the node hosting the task).
# Swarm appends a task suffix to container names, so the bare service
# name WHOOSH_postgres does not work with docker exec.
PG_CONTAINER=$(docker ps --filter name=WHOOSH_postgres -q)

# Automated daily backup
docker exec "$PG_CONTAINER" pg_dump \
  -U whoosh -d whoosh --no-password \
  > /backups/whoosh-$(date +%Y%m%d).sql

# Restore from backup
docker exec -i "$PG_CONTAINER" psql -U whoosh -d whoosh \
  < /backups/whoosh-20250912.sql
```

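Because the dump is written through a shell redirect, a failed `pg_dump` can still leave an empty or truncated file behind. A small sketch that checks for the completion marker `pg_dump` writes at the end of a plain-format dump; the function name `verify_backup` is hypothetical, and the check says nothing about whether the dump restores cleanly:

```bash
# Succeed only if FILE is non-empty and its last lines contain the
# completion marker written at the end of a plain-format pg_dump.
verify_backup() {
  local file=$1
  [ -s "$file" ] || return 1
  tail -n 10 -- "$file" | grep -q 'PostgreSQL database dump complete'
}

# Example:
# verify_backup "/backups/whoosh-$(date +%Y%m%d).sql" || echo "Backup looks incomplete" >&2
```

A periodic test restore into a scratch database remains the only reliable proof that a backup is usable.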
**Configuration Backup**
```bash
# Record which secrets exist. Docker cannot export secret values, so keep
# the values themselves in separate encrypted storage.
docker secret ls --filter name=whoosh --filter name=gitea_token > whoosh-secrets-list.txt

# Backup configuration files
tar -czf whoosh-config-$(date +%Y%m%d).tar.gz \
  docker-compose.swarm.yml \
  /rust/containers/WHOOSH/prompts/
```

## 🚨 Troubleshooting

### Common Issues

**Service Won't Start**
```bash
# Check service status (--no-trunc shows the full error message)
docker service ps --no-trunc WHOOSH_whoosh

# Check logs for errors
docker service logs --tail 50 WHOOSH_whoosh

# Common fixes:
# 1. Verify secrets exist and are accessible
# 2. Check network connectivity to dependencies
# 3. Verify volume mounts and permissions
# 4. Check resource constraints and limits
```

**Database Connection Issues**
```bash
# Test database connectivity (run on the node hosting the postgres task)
docker exec -it "$(docker ps --filter name=WHOOSH_postgres -q)" \
  psql -U whoosh -d whoosh -c "\l"

# Check database logs
docker service logs WHOOSH_postgres

# Verify connection parameters
docker service inspect WHOOSH_whoosh | jq '.[0].Spec.TaskTemplate.ContainerSpec.Env'
```

**Webhook Delivery Failures**
```bash
# Check webhook logs
docker service logs WHOOSH_whoosh | grep webhook

# Test webhook endpoint manually
curl -X POST https://your-whoosh.domain.com/webhooks/gitea \
  -H "Content-Type: application/json" \
  -H "X-Gitea-Signature: sha256=..." \
  -d '{"test": "payload"}'

# Verify webhook secret configuration
# Ensure Gitea webhook secret matches whoosh_webhook_token
```

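The manual test above only passes signature validation if the header carries a real HMAC of the request body. A sketch of computing it with `openssl`, assuming WHOOSH validates an HMAC-SHA256 of the raw body keyed with the webhook secret; the function name `hmac_sha256_hex` is hypothetical. Gitea itself sends the bare hex digest in `X-Gitea-Signature`; whether WHOOSH also accepts the `sha256=` prefix shown above is not confirmed here.

```bash
# Print the hex HMAC-SHA256 of stdin, keyed with the given secret.
hmac_sha256_hex() {
  # openssl prefixes the digest with a label such as "(stdin)= ";
  # the digest itself is always the last field.
  openssl dgst -sha256 -hmac "$1" | awk '{print $NF}'
}

# Example (placeholder URL and secret):
# payload='{"test": "payload"}'
# sig=$(printf '%s' "$payload" | hmac_sha256_hex "your-webhook-secret")
# curl -X POST https://your-whoosh.domain.com/webhooks/gitea \
#   -H "Content-Type: application/json" \
#   -H "X-Gitea-Signature: ${sig}" \
#   -d "$payload"
```

The payload is signed with `printf '%s'` so that the signed bytes match exactly what `curl -d` sends, with no trailing newline.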
**Agent Deployment Issues**
```bash
# Check Docker socket access (run on a node hosting a WHOOSH task)
docker exec -it "$(docker ps --filter name=WHOOSH_whoosh -q | head -n 1)" \
  ls -la /var/run/docker.sock

# Check agent deployment logs
docker service logs WHOOSH_whoosh | grep "agent deployment"

# Verify agent image availability
docker pull anthonyrawlins/chorus:latest
```

### Performance Issues

**High Memory Usage**
```bash
# Check memory usage
docker stats --no-stream

# Adjust resource limits
docker service update --limit-memory 512m WHOOSH_whoosh

# Review connection pool settings
# Adjust WHOOSH_DB_MAX_OPEN_CONNS and WHOOSH_DB_MAX_IDLE_CONNS
```

**Slow Database Queries**
```bash
# Enable query logging in PostgreSQL, then reload the configuration so it
# takes effect (log_statement = 'all' is verbose; revert it after diagnosis)
docker exec -it "$(docker ps --filter name=WHOOSH_postgres -q)" \
  psql -U whoosh -d whoosh \
  -c "ALTER SYSTEM SET log_statement = 'all';" \
  -c "SELECT pg_reload_conf();"

# Review slow queries and add indexes as needed
# Check migrations/006_add_performance_indexes.up.sql
```

### Security Issues

**Authentication Failures**
```bash
# Check authentication logs
docker service logs WHOOSH_whoosh | grep -iE "auth|jwt"

# Verify JWT secret integrity
# Rotate JWT secret if compromised

# Check rate limiting
docker service logs WHOOSH_whoosh | grep "rate limit"
```

**CORS Issues**
```bash
# Verify CORS configuration
curl -I -X OPTIONS \
  -H "Origin: https://your-app.domain.com" \
  -H "Access-Control-Request-Method: GET" \
  https://your-whoosh.domain.com/api/v1/councils

# Update CORS origins
docker service update \
  --env-add WHOOSH_CORS_ALLOWED_ORIGINS=https://new-domain.com \
  WHOOSH_whoosh
```

## 📚 Production Checklist

### Pre-Deployment Checklist

- [ ] All secrets created and verified
- [ ] Network configuration tested
- [ ] External dependencies accessible
- [ ] SSL/TLS certificates valid
- [ ] Resource limits configured appropriately
- [ ] Backup procedures tested
- [ ] Monitoring and alerting configured
- [ ] Security configuration reviewed
- [ ] Performance benchmarks established

### Post-Deployment Checklist

- [ ] Health endpoints responding correctly
- [ ] Webhook delivery working from Gitea
- [ ] Authentication and authorization working
- [ ] Agent deployment functioning
- [ ] Database migrations completed successfully
- [ ] Metrics and tracing data flowing
- [ ] Backup procedures validated
- [ ] Security scans passed
- [ ] Documentation updated with environment-specific details

### Production Readiness Checklist

- [ ] High availability configuration (multiple replicas)
- [ ] Automated failover tested
- [ ] Disaster recovery procedures documented
- [ ] Performance monitoring and alerting active
- [ ] Security monitoring and incident response ready
- [ ] Staff training completed on operational procedures
- [ ] Change management procedures defined
- [ ] Compliance requirements validated

---

**Deployment Status**: Ready for Production ✅
**Supported Platforms**: Docker Swarm, Kubernetes (with adaptations)
**Security Level**: Enterprise-Grade
**High Availability**: Supported

For additional deployment support, refer to the [Configuration Guide](CONFIGURATION.md) and [Security Policy](../SECURITY.md).