Major updates and improvements to BZZZ system

- Updated configuration and deployment files
- Improved system architecture and components
- Enhanced documentation and testing
- Fixed various issues and added new features

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
anthonyrawlins
2025-09-17 18:06:57 +10:00
parent 4e6140de03
commit f5f96ba505
71 changed files with 664 additions and 3823 deletions

View File

@@ -0,0 +1,326 @@
# BZZZ Installation System
A comprehensive one-command installation system for BZZZ distributed AI coordination platform, similar to Ollama's approach.
## Overview
The BZZZ installation system provides:
- **One-command installation**: `curl -fsSL https://chorus.services/install.sh | sh`
- **Automated system detection**: Hardware, OS, and network configuration
- **GPU-aware setup**: Detects NVIDIA/AMD GPUs and recommends Parallama for multi-GPU systems
- **Web-based configuration**: Beautiful React-based setup wizard
- **Production-ready deployment**: Systemd services, monitoring, and security
## Installation Architecture
### Phase 1: System Detection & Installation
1. **System Requirements Check**
- OS compatibility (Ubuntu, Debian, CentOS, RHEL, Fedora)
- Architecture support (amd64, arm64, armv7)
- Minimum resources (2GB RAM, 10GB disk)
2. **Hardware Detection**
- CPU cores and model
- Available memory
- Storage capacity
- GPU configuration (NVIDIA/AMD)
- Network interfaces
3. **Dependency Installation**
- Docker and Docker Compose
- System utilities (curl, wget, jq, etc.)
- GPU drivers (if applicable)
4. **AI Model Platform Choice**
- **Parallama (Recommended for Multi-GPU)**: Our multi-GPU fork of Ollama
- **Standard Ollama**: Traditional single-GPU Ollama
- **Skip**: Configure later via web UI
### Phase 2: BZZZ Installation
1. **Binary Installation**
- Download architecture-specific binaries
- Install to `/opt/bzzz/`
- Create symlinks in `/usr/local/bin/`
2. **System Setup**
- Create `bzzz` system user
- Setup directories (`/etc/bzzz`, `/var/log/bzzz`, `/var/lib/bzzz`)
- Configure permissions
3. **Service Installation**
- Systemd service files for BZZZ Go service and MCP server
- Automatic startup configuration
- Log rotation setup
### Phase 3: Web-Based Configuration
1. **Configuration Server**
- Starts BZZZ service with minimal config
- Launches React-based configuration UI
- Accessible at `http://[node-ip]:8080/setup`
2. **8-Step Configuration Wizard**
- System Detection & Validation
- Network Configuration
- Security Setup
- AI Integration
- Resource Allocation
- Service Deployment
- Cluster Formation
- Testing & Validation
## Required User Information
### 1. Cluster Infrastructure
- **Network Configuration**
- Subnet IP range (auto-detected, user can override)
- Primary network interface selection
- Port assignments (BZZZ: 8080, MCP: 3000, WebUI: 8080)
- Firewall configuration preferences
### 2. Security Settings
- **SSH Key Management**
- Generate new SSH keys
- Upload existing keys
- SSH username and port
- Key distribution to cluster nodes
- **Authentication**
- TLS/SSL certificate setup
- Authentication method (token, OAuth2, LDAP)
- Security policy configuration
### 3. AI Integration
- **OpenAI Configuration**
- API key (secure input with validation)
- Default model selection (GPT-5)
- Cost limits (daily/monthly)
- Usage monitoring preferences
- **Local AI Models**
- Ollama/Parallama endpoint configuration
- Model distribution strategy
- GPU allocation for Parallama
- Automatic model pulling
### 4. Resource Management
- **Hardware Allocation**
- CPU cores allocation
- Memory limits per service
- Storage paths and quotas
- GPU assignment (for Parallama)
- **Service Configuration**
- Container resource limits
- Auto-scaling policies
- Monitoring and alerting
- Backup and recovery
### 5. Cluster Topology
- **Node Roles**
- Coordinator vs Worker designation
- High availability setup
- Load balancing configuration
- Failover preferences
## Installation Flow
### Command Execution
```bash
curl -fsSL https://chorus.services/install.sh | sh
```
### Interactive Prompts
1. **GPU Detection Response**
```
🚀 Multi-GPU Setup Detected (4 NVIDIA GPUs)
Parallama is RECOMMENDED for optimal multi-GPU performance!
Options:
1. Install Parallama (recommended for GPU setups)
2. Install standard Ollama
3. Skip Ollama installation (configure later)
```
2. **Installation Progress**
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
🔥 BZZZ Distributed AI Coordination Platform
Installer v1.0
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
[INFO] Detected OS: Ubuntu 22.04
[INFO] Detected architecture: amd64
[SUCCESS] System requirements check passed
[INFO] Detected 4 NVIDIA GPU(s)
[SUCCESS] Dependencies installed successfully
[SUCCESS] Parallama installed successfully
[SUCCESS] BZZZ binaries installed successfully
[SUCCESS] Configuration server started
```
3. **Completion Message**
```
🚀 Next Steps:
1. Complete your cluster configuration:
👉 Open: http://192.168.1.100:8080/setup
2. Useful commands:
• Check status: bzzz status
• View logs: sudo journalctl -u bzzz -f
• Start/Stop: sudo systemctl [start|stop] bzzz
📚 Docs: https://docs.chorus.services/bzzz
💬 Support: https://discord.gg/chorus-services
```
### Web Configuration Flow
#### Step 1: System Detection
- Display detected hardware configuration
- Show GPU setup and capabilities
- Validate software requirements
- System readiness check
#### Step 2: Network Configuration
- Network interface selection
- Subnet configuration
- Port assignment
- Firewall rule setup
- Connectivity testing
#### Step 3: Security Setup
- SSH key generation/upload
- TLS certificate configuration
- Authentication method selection
- Security policy setup
#### Step 4: AI Integration
- OpenAI API key configuration
- Model preferences and costs
- Ollama/Parallama setup
- Local model management
#### Step 5: Resource Allocation
- CPU/Memory allocation sliders
- Storage path configuration
- GPU assignment (Parallama)
- Resource monitoring setup
#### Step 6: Service Deployment
- Service configuration review
- Container deployment
- Health check setup
- Monitoring configuration
#### Step 7: Cluster Formation
- Create new cluster or join existing
- Network discovery
- Node role assignment
- Cluster validation
#### Step 8: Testing & Validation
- Connectivity tests
- AI model verification
- Performance benchmarks
- Configuration validation
## Files Structure
```
/home/tony/chorus/project-queues/active/BZZZ/install/
├── install.sh # Main installation script
├── config-ui/ # React configuration interface
│ ├── package.json # Dependencies and scripts
│ ├── next.config.js # Next.js configuration
│ ├── tailwind.config.js # Tailwind CSS config
│ ├── tsconfig.json # TypeScript config
│ ├── postcss.config.js # PostCSS config
│ └── app/ # Next.js app directory
│ ├── globals.css # Global styles
│ ├── layout.tsx # Root layout
│ ├── page.tsx # Home page (redirects to setup)
│ └── setup/
│ ├── page.tsx # Main setup wizard
│ └── components/ # Setup step components
│ ├── SystemDetection.tsx
│ ├── NetworkConfiguration.tsx
│ ├── SecuritySetup.tsx
│ ├── AIConfiguration.tsx
│ ├── ResourceAllocation.tsx
│ ├── ServiceDeployment.tsx
│ ├── ClusterFormation.tsx
│ └── TestingValidation.tsx
├── requirements.md # Detailed requirements
└── INSTALLATION_SYSTEM.md # This document
```
## Key Features
### 1. Intelligent GPU Detection
- Automatic detection of NVIDIA/AMD GPUs
- Multi-GPU topology analysis
- Recommends Parallama for multi-GPU setups
- Fallback to standard Ollama for single GPU
- CPU-only mode support
### 2. Comprehensive System Validation
- Hardware requirements checking
- Software dependency validation
- Network connectivity testing
- Security configuration verification
### 3. Production-Ready Setup
- Systemd service integration
- Proper user/permission management
- Log rotation and monitoring
- Security best practices
- Automatic startup configuration
### 4. Beautiful User Experience
- Modern React-based interface
- Progressive setup wizard
- Real-time validation feedback
- Mobile-responsive design
- Comprehensive help and documentation
### 5. Enterprise Features
- SSH key distribution
- TLS/SSL configuration
- LDAP/AD integration support
- Cost management and monitoring
- Multi-node cluster orchestration
## Next Implementation Steps
1. **Backend API Development**
- Go-based configuration API
- System detection endpoints
- Configuration validation
- Service management
2. **Enhanced Components**
- Complete all setup step components
- Real-time validation
- Progress tracking
- Error handling
3. **Cluster Management**
- Node discovery protocols
- Automated SSH setup
- Service distribution
- Health monitoring
4. **Security Hardening**
- Certificate management
- Secure key distribution
- Network encryption
- Access control
5. **Testing & Validation**
- Integration test suite
- Performance benchmarking
- Security auditing
- User acceptance testing
This installation system provides a seamless, professional-grade setup experience that rivals major infrastructure platforms while specifically optimizing for AI workloads and multi-GPU configurations.