BZZZ Operations Guide
Version 2.0 - Phase 2B Edition
Deployment, monitoring, and maintenance procedures
Quick Reference
- Docker Deployment - Containerized deployment
- Production Setup - Production-ready configuration
- Monitoring - Metrics and alerting
- Maintenance - Routine maintenance tasks
- Troubleshooting - Common issues and solutions
Docker Deployment
Single Node Development
# Clone repository
git clone https://github.com/anthonyrawlins/bzzz.git
cd bzzz
# Build Docker image
docker build -t bzzz:latest .
# Run single node
docker run -d \
  --name bzzz-node \
  -p 8080:8080 \
  -p 4001:4001 \
  -v "$(pwd)/config:/app/config" \
  -v bzzz-data:/app/data \
  bzzz:latest
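A container started with docker run -d can take a few seconds before its API answers. The sketch below is one way to wait for it: a small generic retry helper (the name retry is hypothetical, not part of BZZZ) that re-runs any command until it succeeds or the attempts run out.

```shell
# retry: run a command until it succeeds or attempts run out.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
# (hypothetical helper for illustration, not shipped with BZZZ)
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example, assuming the single node above is published on port 8080:
# retry 30 2 curl -fsS -o /dev/null http://localhost:8080/health
```

With the example values this polls the health endpoint for up to about a minute and returns non-zero if the node never becomes healthy, which makes it usable as a gate in deployment scripts.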
Docker Compose Cluster
# docker-compose.yml
version: '3.8'
services:
  bzzz-node-1:
    build: .
    ports:
      - "8080:8080"
      - "4001:4001"
    environment:
      - BZZZ_NODE_ID=node-1
      - BZZZ_ROLE=backend_developer
    volumes:
      - ./config:/app/config
      - bzzz-data-1:/app/data
    networks:
      - bzzz-network
  bzzz-node-2:
    build: .
    ports:
      - "8081:8080"
      - "4002:4001"
    environment:
      - BZZZ_NODE_ID=node-2
      - BZZZ_ROLE=senior_software_architect
      - BZZZ_BOOTSTRAP_PEERS=/dns/bzzz-node-1/tcp/4001
    volumes:
      - ./config:/app/config
      - bzzz-data-2:/app/data
    networks:
      - bzzz-network
    depends_on:
      - bzzz-node-1
networks:
  bzzz-network:
    driver: bridge
volumes:
  bzzz-data-1:
  bzzz-data-2:
Docker Swarm Production
# docker-compose.swarm.yml
version: '3.8'
services:
  bzzz:
    image: bzzz:latest
    deploy:
      replicas: 3
      placement:
        constraints:
          - node.role == worker
        preferences:
          - spread: node.id
      resources:
        limits:
          memory: 512M
          cpus: '1.0'
        reservations:
          memory: 256M
          cpus: '0.5'
    ports:
      - "8080:8080"
    environment:
      - BZZZ_CLUSTER_MODE=true
    networks:
      - bzzz-overlay
    volumes:
      - bzzz-config:/app/config
      - bzzz-data:/app/data
networks:
  bzzz-overlay:
    driver: overlay
    # Overlay encryption is a driver option, not a top-level network key
    driver_opts:
      encrypted: "true"
volumes:
  bzzz-config:
    external: true
  bzzz-data:
    external: true
Production Configuration
Environment Variables
# Core configuration
export BZZZ_NODE_ID="production-node-01"
export BZZZ_AGENT_ID="prod-agent-backend"
export BZZZ_ROLE="backend_developer"
# Network configuration
export BZZZ_API_HOST="0.0.0.0"
export BZZZ_API_PORT="8080"
export BZZZ_P2P_PORT="4001"
# Security configuration
export BZZZ_ADMIN_KEY_SHARES="5"
export BZZZ_ADMIN_KEY_THRESHOLD="3"
# Performance tuning
export BZZZ_DHT_CACHE_SIZE="1000"
export BZZZ_DHT_REPLICATION_FACTOR="3"
export BZZZ_MAX_CONNECTIONS="500"
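The two security variables describe a Shamir split: the admin key is divided into BZZZ_ADMIN_KEY_SHARES pieces, and any BZZZ_ADMIN_KEY_THRESHOLD of them can rebuild it. A misconfigured pair (for example a threshold larger than the share count) would make the key unrecoverable, so it may be worth checking the values before starting a node. The function below is a minimal sketch of such a check. The names is_uint and validate_shamir are hypothetical, and the "threshold of at least 2" rule is an assumed policy, not something BZZZ is known to enforce.

```shell
# is_uint: succeed if the argument is a non-empty string of digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# validate_shamir: check that a shares/threshold pair is usable.
# Usage: validate_shamir <shares> <threshold>
# (hypothetical helper; the minimum threshold of 2 is an assumed policy)
validate_shamir() {
  local shares="$1" threshold="$2"
  if ! is_uint "$shares" || ! is_uint "$threshold"; then
    echo "shares and threshold must be non-negative integers" >&2
    return 1
  fi
  if [ "$threshold" -lt 2 ]; then
    echo "threshold must be at least 2, otherwise one share reveals the key" >&2
    return 1
  fi
  if [ "$threshold" -gt "$shares" ]; then
    echo "threshold ($threshold) exceeds shares ($shares): key could never be rebuilt" >&2
    return 1
  fi
  echo "ok: any $threshold of $shares shares rebuild the key; up to $((shares - threshold)) may be lost"
}

# Example with the values exported above:
# validate_shamir "$BZZZ_ADMIN_KEY_SHARES" "$BZZZ_ADMIN_KEY_THRESHOLD" || exit 1
```

With 5 shares and a threshold of 3, the cluster tolerates the loss of two share holders, while any two colluding holders still cannot reconstruct the key.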
Production config.yaml
node:
  id: "${BZZZ_NODE_ID}"
  data_dir: "/app/data"
agent:
  id: "${BZZZ_AGENT_ID}"
  role: "${BZZZ_ROLE}"
  max_tasks: 10
api:
  host: "${BZZZ_API_HOST}"
  port: ${BZZZ_API_PORT}
  cors_enabled: false
  rate_limit: 1000
  timeout: "30s"
p2p:
  port: ${BZZZ_P2P_PORT}
  bootstrap_peers:
    - "/dns/bootstrap-1.bzzz.network/tcp/4001"
    - "/dns/bootstrap-2.bzzz.network/tcp/4001"
  max_connections: ${BZZZ_MAX_CONNECTIONS}
dht:
  cache_size: ${BZZZ_DHT_CACHE_SIZE}
  cache_ttl: "1h"
  replication_factor: ${BZZZ_DHT_REPLICATION_FACTOR}
security:
  admin_election_timeout: "30s"
  heartbeat_interval: "5s"
  shamir_shares: ${BZZZ_ADMIN_KEY_SHARES}
  shamir_threshold: ${BZZZ_ADMIN_KEY_THRESHOLD}
logging:
  level: "info"
  format: "json"
  file: "/app/logs/bzzz.log"
  max_size: "100MB"
  max_files: 10
Monitoring & Observability
Health Check Endpoints
# Basic health check
curl http://localhost:8080/health
# Detailed status
curl http://localhost:8080/api/agent/status
# DHT metrics
curl http://localhost:8080/api/dht/metrics
Prometheus Metrics
Add to prometheus.yml:
scrape_configs:
  - job_name: 'bzzz'
    static_configs:
      - targets: ['localhost:8080']
    metrics_path: '/metrics'
    scrape_interval: 15s
Grafana Dashboard
Import the BZZZ dashboard from monitoring/grafana-dashboard.json.
Key metrics to monitor:
- Decision throughput - Decisions published per minute
- DHT performance - Storage/retrieval latency
- P2P connectivity - Connected peers count
- Memory usage - Go runtime metrics
- Election events - Admin election frequency
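For a quick spot check of one of these metrics without opening Grafana, the /metrics endpoint can be read directly. The sketch below pulls a single unlabelled value out of the Prometheus text format. The helper name metric_value is hypothetical, and so is the metric name bzzz_connected_peers used in the example; check the real /metrics output for the names BZZZ actually exports.

```shell
# metric_value: print the value of an unlabelled metric from Prometheus
# text-format exposition read on stdin. Exits non-zero if it is absent.
# Usage: curl -s http://localhost:8080/metrics | metric_value <metric-name>
# (hypothetical helper for illustration)
metric_value() {
  awk -v name="$1" '
    /^#/ { next }                            # skip HELP and TYPE lines
    $1 == name { print $2; found = 1; exit } # first exact name match wins
    END { exit (found ? 0 : 1) }
  '
}

# Example (bzzz_connected_peers is an assumed metric name):
# peers=$(curl -s http://localhost:8080/metrics | metric_value bzzz_connected_peers) \
#   || echo "metric not exported"
```

Metrics that carry labels appear as name{label="..."} in the exposition and will not match this exact-name comparison; they are better queried through Prometheus itself.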
Log Aggregation
ELK Stack Configuration
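# filebeat.yml
filebeat.inputs:
  - type: log
    enabled: true
    paths:
      - /app/logs/bzzz.log
    json.keys_under_root: true
    json.add_error_key: true
output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  index: "bzzz-%{+yyyy.MM.dd}"
# Required by Filebeat when the default index name is overridden
setup.template.name: "bzzz"
setup.template.pattern: "bzzz-*"
logging.level: info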
Structured Logging Query Examples
# Find admin election events from the last hour
{
  "query": {
    "bool": {
      "must": [
        {"match": {"level": "info"}},
        {"match": {"component": "election"}},
        {"range": {"timestamp": {"gte": "now-1h"}}}
      ]
    }
  }
}
# Find encryption errors
{
  "query": {
    "bool": {
      "must": [
        {"match": {"level": "error"}},
        {"match": {"component": "crypto"}}
      ]
    }
  }
}
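Queries like these can be sent from the command line to the Elasticsearch _search endpoint. The sketch below builds the level/component query body and shows, in a comment, how it might be posted. The function name build_query is hypothetical, and the host and index pattern are taken from the Filebeat configuration above on the assumption that it is deployed as written.

```shell
# build_query: print a bool/must query matching a log level and a component.
# Usage: build_query <level> <component>
# (hypothetical helper; arguments are inserted verbatim, so they must not
# contain quotes or backslashes)
build_query() {
  printf '{"query":{"bool":{"must":[{"match":{"level":"%s"}},{"match":{"component":"%s"}}]}}}\n' "$1" "$2"
}

# Example, assuming Elasticsearch is reachable at elasticsearch:9200 and
# the logs are stored in bzzz-* indices:
# build_query error crypto |
#   curl -s -H 'Content-Type: application/json' \
#     'http://elasticsearch:9200/bzzz-*/_search?size=20' --data-binary @-
```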
Maintenance Procedures
Regular Maintenance Tasks
Daily Checks
#!/bin/bash
# daily-check.sh
echo "BZZZ Daily Health Check - $(date)"
# Check service status
echo "=== Service Status ==="
docker ps | grep bzzz
# Check API health
echo "=== API Health ==="
curl -s http://localhost:8080/health | jq .
# Check peer connectivity
echo "=== Peer Status ==="
curl -s http://localhost:8080/api/agent/peers | jq '.connected_peers | length'
# Check recent errors
echo "=== Recent Errors ==="
# docker logs replays the container's stderr on stderr, so merge it before filtering
docker logs --since=24h bzzz-node 2>&1 | grep -i error | tail -5
echo "Daily check completed"
Weekly Tasks
#!/bin/bash
# weekly-maintenance.sh
echo "BZZZ Weekly Maintenance - $(date)"
# Rotate logs
docker exec bzzz-node logrotate /app/config/logrotate.conf
# Check disk usage
echo "=== Disk Usage ==="
docker exec bzzz-node df -h /app/data
# DHT metrics review
echo "=== DHT Metrics ==="
curl -s http://localhost:8080/api/dht/metrics | jq '.stored_items, .cache_hit_rate'
# Database cleanup (if needed)
docker exec bzzz-node /app/scripts/cleanup-old-data.sh
echo "Weekly maintenance completed"
Monthly Tasks
#!/bin/bash
# monthly-maintenance.sh
echo "BZZZ Monthly Maintenance - $(date)"
# Full backup
./backup-bzzz-data.sh
# Performance review
echo "=== Performance Metrics ==="
curl -s http://localhost:8080/api/debug/status | jq '.performance'
# Security audit
echo "=== Security Check ==="
./scripts/security-audit.sh
# Update dependencies (if needed)
echo "=== Dependency Check ==="
docker exec bzzz-node go list -m -u all
echo "Monthly maintenance completed"
Backup Procedures
Data Backup Script
#!/bin/bash
# backup-bzzz-data.sh
BACKUP_DIR="/backup/bzzz"
DATE=$(date +%Y%m%d_%H%M%S)
NODE_ID=$(docker exec bzzz-node cat /app/config/node_id)
echo "Starting backup for node: $NODE_ID"
# Create backup directory
mkdir -p "$BACKUP_DIR/$DATE"
# Backup configuration
docker cp bzzz-node:/app/config "$BACKUP_DIR/$DATE/config"
# Backup data directory
docker cp bzzz-node:/app/data "$BACKUP_DIR/$DATE/data"
# Backup logs
docker cp bzzz-node:/app/logs "$BACKUP_DIR/$DATE/logs"
# Create manifest
cat > "$BACKUP_DIR/$DATE/manifest.json" << EOF
{
  "node_id": "$NODE_ID",
  "backup_date": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "version": "2.0",
  "components": ["config", "data", "logs"]
}
EOF
# Compress backup (abort if cd fails; remove the staging directory only after tar succeeds)
cd "$BACKUP_DIR" || exit 1
tar -czf "bzzz-backup-$NODE_ID-$DATE.tar.gz" "$DATE" && rm -rf "$DATE"
echo "Backup completed: bzzz-backup-$NODE_ID-$DATE.tar.gz"
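Because every run of the backup script leaves a new archive behind, the backup directory grows without bound unless old archives are pruned. The function below is a minimal sketch of a retention step that keeps only the newest archives. The name prune_backups is hypothetical, and it assumes every archive in the directory comes from one node, so that the YYYYmmdd_HHMMSS stamp in the file name sorts chronologically.

```shell
# prune_backups: keep the newest <keep> archives in <dir>, delete the rest.
# Usage: prune_backups <dir> <keep>
# (hypothetical helper; assumes archives from a single node, so that file
# names sort chronologically by their embedded timestamp)
prune_backups() {
  local dir="$1" keep="$2"
  # List archives newest first, skip the first <keep>, remove the remainder
  find "$dir" -maxdepth 1 -type f -name 'bzzz-backup-*.tar.gz' |
    sort -r |
    tail -n +"$((keep + 1))" |
    while IFS= read -r old; do
      rm -f -- "$old"
    done
}

# Example: keep the twelve most recent archives
# prune_backups /backup/bzzz 12
```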
Restore Procedure
#!/bin/bash
# restore-bzzz-data.sh
BACKUP_FILE="$1"
if [ -z "$BACKUP_FILE" ]; then
  echo "Usage: $0 <backup-file.tar.gz>"
  exit 1
fi
echo "Restoring from: $BACKUP_FILE"
# Stop service
docker stop bzzz-node
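# Extract backup into a fresh temporary directory so stale /tmp contents cannot be picked up
RESTORE_TMP=$(mktemp -d)
tar -xzf "$BACKUP_FILE" -C "$RESTORE_TMP"
# Find extracted directory
BACKUP_DIR=$(find "$RESTORE_TMP" -mindepth 1 -maxdepth 1 -type d | head -1)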
# Restore configuration
docker cp "$BACKUP_DIR/config" bzzz-node:/app/
# Restore data
docker cp "$BACKUP_DIR/data" bzzz-node:/app/
# Start service
docker start bzzz-node
echo "Restore completed. Check service status."
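The restore script stops the node before it looks inside the archive, so a corrupt or incomplete backup is only discovered once the service is already down. A pre-flight check along the following lines could be run first. The name verify_backup is hypothetical, and the expected layout (a single top-level directory holding manifest.json, config and data) is inferred from the backup script above.

```shell
# verify_backup: succeed only if the archive is readable and contains
# manifest.json plus config and data entries under its top-level directory.
# Usage: verify_backup <backup-file.tar.gz>
# (hypothetical helper; layout inferred from backup-bzzz-data.sh)
verify_backup() {
  local archive="$1" listing part
  # tar -tzf lists entries without extracting and fails on a corrupt archive
  listing=$(tar -tzf "$archive" 2>/dev/null) || return 1
  for part in manifest.json config data; do
    # Match "<stamp>/<part>" at end of line or followed by "/"
    printf '%s\n' "$listing" | grep -Eq "^[^/]+/$part(/|\$)" || return 1
  done
}

# Example:
# verify_backup "$BACKUP_FILE" || { echo "Backup is unusable, aborting"; exit 1; }
```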
Troubleshooting
Common Issues
Service Won't Start
# Check logs
docker logs bzzz-node
# Check configuration
docker exec bzzz-node /app/bzzz --config /app/config/config.yaml --validate
# Check permissions
docker exec bzzz-node ls -la /app/data
High Memory Usage
# Check Go memory stats
curl http://localhost:8080/api/debug/status | jq '.memory'
# Check DHT cache size
curl http://localhost:8080/api/dht/metrics | jq '.cache_size'
# Restart with memory limit
docker update --memory=512m bzzz-node
docker restart bzzz-node
Peer Connectivity Issues
# Check P2P status
curl http://localhost:8080/api/agent/peers
# Check network connectivity
docker exec bzzz-node netstat -an | grep 4001
# Check firewall rules
sudo ufw status | grep 4001
# Test bootstrap peers
docker exec bzzz-node ping -c 3 bootstrap-1.bzzz.network
DHT Storage Problems
# Check DHT metrics
curl http://localhost:8080/api/dht/metrics
# Clear DHT cache
curl -X POST http://localhost:8080/api/debug/clear-cache
# Check disk space
docker exec bzzz-node df -h /app/data
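The disk space check can be turned into a pass/fail test for use in scripts or cron jobs. The sketch below parses df output and compares the usage against a limit. The function names usage_pct and disk_ok are hypothetical, and the example uses df -P rather than df -h, because the POSIX format keeps each filesystem on one line with the usage percentage in the fifth column.

```shell
# usage_pct: read `df -P` output on stdin and print the usage percentage
# of the first listed filesystem as a bare integer.
# (hypothetical helper for illustration)
usage_pct() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_ok: succeed if the usage read from stdin is below <limit> percent.
# Usage: df -P <path> | disk_ok <limit>
disk_ok() {
  local limit="$1" pct
  pct=$(usage_pct)
  [ -n "$pct" ] && [ "$pct" -lt "$limit" ]
}

# Example, assuming the container name used throughout this guide:
# docker exec bzzz-node df -P /app/data | disk_ok 90 \
#   || echo "WARNING: /app/data is at or above 90% usage"
```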
Performance Tuning
High Load Optimization
# config.yaml adjustments for high load
dht:
  cache_size: 10000        # Increase cache
  cache_ttl: "30m"         # Shorter TTL for fresher data
  replication_factor: 5    # Higher replication
p2p:
  max_connections: 1000    # More connections
api:
  rate_limit: 5000         # Higher rate limit
  timeout: "60s"           # Longer timeout
Low Resource Optimization
# config.yaml adjustments for resource-constrained environments
dht:
  cache_size: 100          # Smaller cache
  cache_ttl: "2h"          # Longer TTL
  replication_factor: 2    # Lower replication
p2p:
  max_connections: 50      # Fewer connections
logging:
  level: "warn"            # Less verbose logging
Security Hardening
Production Security Checklist
- Change default ports
- Enable TLS for API endpoints
- Configure firewall rules
- Set up log monitoring
- Enable audit logging
- Rotate Age keys regularly
- Monitor for unusual admin elections
- Implement rate limiting
- Use non-root Docker user
- Apply security updates regularly
Network Security
# Firewall configuration
sudo ufw allow 22 # SSH
sudo ufw allow 8080/tcp # BZZZ API
sudo ufw allow 4001/tcp # P2P networking
sudo ufw enable
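# Docker security
# --read-only makes the root filesystem immutable, so writable paths
# (for example /app/data, and /app/logs when file logging is on) need their own mounts
docker run --security-opt no-new-privileges \
  --read-only \
  --tmpfs /tmp:rw,noexec,nosuid,size=1g \
  -v bzzz-data:/app/data \
  bzzz:latest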
Cross-References
- User Manual - Basic usage and configuration
- Developer Guide - Development and testing procedures
- Architecture Documentation - System design and deployment patterns
- Technical Report - Performance characteristics and scaling
- Security Documentation - Security best practices
BZZZ Operations Guide v2.0 - Production deployment and maintenance procedures for Phase 2B unified architecture.