# BZZZ Human Agent Portal (HAP) - Phase 1 Technical Specification

**Version**: 1.0
**Author**: Senior Software Architect
**Date**: 2025-08-29

## Executive Summary

This specification defines the technical architecture for splitting the BZZZ autonomous agent system, currently a monolithic single binary, into a dual-binary system: one binary for autonomous agents (`bzzz-agent`) and one for human agent portals (`bzzz-hap`). Both binaries keep all existing functionality and share the existing P2P infrastructure.

---

## 1. Current Architecture Analysis

### 1.1 Existing Monolithic Structure
```
BZZZ/
├── main.go                 # Single entry point (1,663 lines)
├── pkg/                    # 14 major subsystems
│   ├── agentid/            # Agent identity and crypto
│   ├── config/             # Configuration management
│   ├── crypto/             # Age encryption, Shamir shares
│   ├── dht/                # Distributed hash table
│   ├── election/           # Admin election system
│   ├── health/             # Health monitoring
│   ├── slurp/              # Task coordination (7 submodules)
│   ├── ucxi/               # Context resolution server
│   ├── ucxl/               # Universal Context eXchange Language
│   └── [9 other subsystems]
├── p2p/                    # libp2p networking
├── pubsub/                 # HMMM collaborative messaging
├── api/                    # HTTP API server
└── coordinator/            # Task coordination
```

### 1.2 Key Shared Infrastructure Components
- **P2P Mesh**: libp2p with mDNS discovery
- **Agent Identity**: Cryptographic agent records with role-based access
- **HMMM Messaging**: Collaborative reasoning protocol integration
- **DHT Storage**: Distributed storage with Age encryption
- **UCXL System**: Context resolution and addressing
- **SLURP Coordination**: Task distribution and leadership election
- **Configuration**: YAML-based role definitions and capabilities

---

## 2. Multi-Binary Architecture Design

### 2.1 Target Structure
```
BZZZ/
├── cmd/
│   ├── agent/
│   │   └── main.go         # Autonomous agent binary entry point
│   └── hap/
│       └── main.go         # Human agent portal binary entry point
├── internal/
│   ├── common/
│   │   └── runtime/        # Shared initialization and runtime components
│   │       ├── agent.go    # Agent identity and role initialization
│   │       ├── config.go   # Configuration loading and validation
│   │       ├── p2p.go      # P2P node initialization
│   │       ├── services.go # Core service initialization
│   │       ├── storage.go  # DHT and encrypted storage setup
│   │       └── shutdown.go # Graceful shutdown management
│   ├── agent/              # Autonomous agent specific code
│   │   ├── runner.go       # Agent execution loop
│   │   └── handlers.go     # Autonomous task handlers
│   └── hap/                # Human agent portal specific code
│       ├── terminal/       # Terminal interface
│       ├── forms/          # Message composition templates
│       ├── context/        # UCXL browsing interface
│       └── prompts/        # Human interaction prompts
├── pkg/                    # Unchanged - shared libraries
└── [existing directories]  # Unchanged
```

### 2.2 Build System Enhancement
```makefile
# Updated Makefile targets
build-agent: build-ui embed-ui
	CGO_ENABLED=0 go build -ldflags="-s -w" -o $(BUILD_DIR)/bzzz-agent ./cmd/agent

build-hap: build-ui embed-ui
	CGO_ENABLED=0 go build -ldflags="-s -w" -o $(BUILD_DIR)/bzzz-hap ./cmd/hap

build: build-agent build-hap
```

---

## 3. Shared Runtime Architecture

### 3.1 Runtime Initialization Pipeline

```go
// internal/common/runtime/services.go
type RuntimeServices struct {
	Config            *config.Config
	Node              *p2p.Node
	PubSub            *pubsub.PubSub
	DHT               *dht.LibP2PDHT
	EncryptedStorage  *dht.EncryptedDHTStorage
	ElectionManager   *election.ElectionManager
	HealthManager     *health.Manager
	ShutdownManager   *shutdown.Manager
	DecisionPublisher *ucxl.DecisionPublisher
	UCXIServer        *ucxi.Server
	HTTPServer        *api.HTTPServer
	Logger            logging.Logger
}

type RuntimeConfig struct {
	ConfigPath      string
	BinaryType      BinaryType // Agent or HAP
	EnableSetupMode bool
	CustomPorts     PortConfig
}

type BinaryType int

const (
	BinaryTypeAgent BinaryType = iota
	BinaryTypeHAP
)
```

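Later sections call `id.BinaryType.String()` (section 8.1) and format a `BinaryType` with `%s` (section 12.2), but the enum above defines no `String` method. A minimal sketch of one follows. The names `"agent"` and `"hap"` are an assumption, chosen to match the `binary_type` values mentioned in section 6.1.

```go
package main

import "fmt"

// BinaryType mirrors the enum from section 3.1.
type BinaryType int

const (
	BinaryTypeAgent BinaryType = iota
	BinaryTypeHAP
)

// String implements fmt.Stringer. The names "agent" and "hap" are assumed
// to match the binary_type values used in the YAML configuration.
func (b BinaryType) String() string {
	switch b {
	case BinaryTypeAgent:
		return "agent"
	case BinaryTypeHAP:
		return "hap"
	default:
		return fmt.Sprintf("unknown(%d)", int(b))
	}
}

func main() {
	fmt.Printf("%s %v\n", BinaryTypeAgent, BinaryTypeHAP) // prints "agent hap"
}
```

With this method in place, `%s` and `%v` both print the readable name instead of a raw integer.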
### 3.2 Core Runtime Interface

```go
// internal/common/runtime/runtime.go
type Runtime interface {
	Initialize(ctx context.Context, cfg RuntimeConfig) (*RuntimeServices, error)
	Start(ctx context.Context, services *RuntimeServices) error
	Stop(ctx context.Context, services *RuntimeServices) error
	GetHealthStatus() health.Status
}

type StandardRuntime struct {
	services *RuntimeServices
	logger   logging.Logger
}

func NewRuntime(logger logging.Logger) Runtime {
	return &StandardRuntime{logger: logger}
}
```

### 3.3 Initialization Sequence

```go
// internal/common/runtime/services.go
func (r *StandardRuntime) Initialize(ctx context.Context, cfg RuntimeConfig) (*RuntimeServices, error) {
	services := &RuntimeServices{}

	// Phase 1: Configuration
	// services is already a pointer, so it is passed as-is like in the later phases.
	if err := r.initializeConfig(cfg.ConfigPath, services); err != nil {
		return nil, fmt.Errorf("config initialization failed: %w", err)
	}

	// Phase 2: P2P Infrastructure
	if err := r.initializeP2P(ctx, services); err != nil {
		return nil, fmt.Errorf("P2P initialization failed: %w", err)
	}

	// Phase 3: Core Services
	if err := r.initializeCoreServices(ctx, services); err != nil {
		return nil, fmt.Errorf("core services initialization failed: %w", err)
	}

	// Phase 4: Binary-specific configuration
	if err := r.applyBinarySpecificConfig(cfg.BinaryType, services); err != nil {
		return nil, fmt.Errorf("binary-specific config failed: %w", err)
	}

	// Phase 5: Health and Monitoring
	if err := r.initializeMonitoring(services); err != nil {
		return nil, fmt.Errorf("monitoring initialization failed: %w", err)
	}

	return services, nil
}
```

---

## 4. Binary-Specific Implementations

### 4.1 Autonomous Agent Binary (`cmd/agent/main.go`)

```go
package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"syscall"

	"chorus.services/bzzz/internal/agent"
	"chorus.services/bzzz/internal/common/runtime"
	"chorus.services/bzzz/pkg/logging"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	logger := logging.NewStandardLogger("bzzz-agent")
	rt := runtime.NewRuntime(logger)

	// Initialize shared runtime
	runtimeConfig := runtime.RuntimeConfig{
		ConfigPath:      getConfigPath(),
		BinaryType:      runtime.BinaryTypeAgent,
		EnableSetupMode: needsSetup(),
	}

	services, err := rt.Initialize(ctx, runtimeConfig)
	if err != nil {
		log.Fatalf("Failed to initialize runtime: %v", err)
	}

	// Start shared services
	if err := rt.Start(ctx, services); err != nil {
		log.Fatalf("Failed to start runtime: %v", err)
	}

	// Initialize agent-specific components
	agentRunner := agent.NewRunner(services, logger)
	if err := agentRunner.Start(ctx); err != nil {
		log.Fatalf("Failed to start agent runner: %v", err)
	}

	logger.Info("🤖 BZZZ Autonomous Agent started successfully")
	logger.Info("📍 Node ID: %s", services.Node.ID().ShortString())
	logger.Info("🎯 Agent ID: %s", services.Config.Agent.ID)

	// Wait for shutdown signals
	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
	<-sigChan

	logger.Info("🛑 Shutting down agent...")
	if err := rt.Stop(ctx, services); err != nil {
		logger.Error("Shutdown error: %v", err)
	}
}
```

### 4.2 Human Agent Portal Binary (`cmd/hap/main.go`)

```go
package main

import (
	"context"
	"log"
	"os"
	"os/signal"
	"syscall"

	"chorus.services/bzzz/internal/common/runtime"
	"chorus.services/bzzz/internal/hap"
	"chorus.services/bzzz/pkg/logging"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	logger := logging.NewStandardLogger("bzzz-hap")
	rt := runtime.NewRuntime(logger)

	// Initialize shared runtime
	runtimeConfig := runtime.RuntimeConfig{
		ConfigPath:      getConfigPath(),
		BinaryType:      runtime.BinaryTypeHAP,
		EnableSetupMode: needsSetup(),
		CustomPorts: runtime.PortConfig{
			HTTPPort:   8090, // Different from agent to avoid conflicts
			HealthPort: 8091,
		},
	}

	services, err := rt.Initialize(ctx, runtimeConfig)
	if err != nil {
		log.Fatalf("Failed to initialize runtime: %v", err)
	}

	// Start shared services
	if err := rt.Start(ctx, services); err != nil {
		log.Fatalf("Failed to start runtime: %v", err)
	}

	// Initialize HAP-specific components
	hapInterface := hap.NewTerminalInterface(services, logger)
	if err := hapInterface.Start(ctx); err != nil {
		log.Fatalf("Failed to start HAP interface: %v", err)
	}

	logger.Info("👤 BZZZ Human Agent Portal started successfully")
	logger.Info("📍 Node ID: %s", services.Node.ID().ShortString())
	logger.Info("🎯 Agent ID: %s", services.Config.Agent.ID)
	logger.Info("💬 Terminal interface ready for human interaction")

	// Wait for shutdown signals
	sigChan := make(chan os.Signal, 1)
	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
	<-sigChan

	logger.Info("🛑 Shutting down HAP...")
	if err := rt.Stop(ctx, services); err != nil {
		logger.Error("Shutdown error: %v", err)
	}
}
```

---

## 5. Interface Contracts and API Boundaries

### 5.1 Runtime Service Interface

```go
// internal/common/runtime/interfaces.go
type RuntimeService interface {
	Name() string
	Initialize(ctx context.Context, config *config.Config) error
	Start(ctx context.Context) error
	Stop(ctx context.Context) error
	IsHealthy() bool
	Dependencies() []string
}

type ServiceManager interface {
	Register(service RuntimeService)
	Start(ctx context.Context) error
	Stop(ctx context.Context) error
	GetService(name string) RuntimeService
	GetHealthStatus() map[string]bool
}
```

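The `Dependencies()` method implies that a `ServiceManager` starts services in dependency order. The specification does not say how that order is computed. A minimal sketch, assuming a depth-first topological sort over service names, could look like this. The `service` struct is a reduced stand-in for `RuntimeService`, and `startOrder` is a hypothetical helper.

```go
package main

import "fmt"

// service is a trimmed-down stand-in for RuntimeService: only the two
// methods needed for ordering are kept.
type service struct {
	name string
	deps []string
}

func (s service) Name() string           { return s.name }
func (s service) Dependencies() []string { return s.deps }

// startOrder returns service names so that every service appears after all
// of its dependencies (depth-first topological sort). It returns an error
// for unknown dependencies and for cycles.
func startOrder(services []service) ([]string, error) {
	byName := make(map[string]service, len(services))
	for _, s := range services {
		byName[s.Name()] = s
	}

	const (
		unvisited = iota
		visiting
		finished
	)
	state := make(map[string]int, len(services))
	var order []string

	var visit func(name string) error
	visit = func(name string) error {
		switch state[name] {
		case finished:
			return nil
		case visiting:
			return fmt.Errorf("dependency cycle involving %q", name)
		}
		s, ok := byName[name]
		if !ok {
			return fmt.Errorf("unknown dependency %q", name)
		}
		state[name] = visiting
		for _, dep := range s.Dependencies() {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[name] = finished
		order = append(order, name)
		return nil
	}

	for _, s := range services {
		if err := visit(s.Name()); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	order, err := startOrder([]service{
		{name: "pubsub", deps: []string{"p2p"}},
		{name: "dht", deps: []string{"p2p"}},
		{name: "p2p", deps: []string{"config"}},
		{name: "config"},
	})
	fmt.Println(order, err) // [config p2p pubsub dht] <nil>
}
```

Stopping services in the reverse of this order would shut down dependents before the services they rely on.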
### 5.2 Binary-Specific Execution Interface

```go
// internal/common/runtime/execution.go
type ExecutionMode interface {
	Run(ctx context.Context, services *RuntimeServices) error
	Stop(ctx context.Context) error
	GetType() BinaryType
}

// Agent implementation
type AgentExecution struct {
	services *RuntimeServices
	runner   *agent.Runner
}

// HAP implementation
type HAPExecution struct {
	services *RuntimeServices
	terminal *hap.TerminalInterface // "interface" is a reserved word in Go
}
```

### 5.3 Shared P2P Participation

Both binaries participate identically in the P2P mesh:

```go
// Binary-agnostic P2P participation
type P2PParticipant interface {
	JoinMesh(ctx context.Context) error
	PublishMessage(topic string, data interface{}) error
	SubscribeToTopic(topic string, handler MessageHandler) error
	GetPeerID() string
	GetConnectedPeers() []string
}
```

---

## 6. Configuration Strategy

### 6.1 Shared Configuration Structure

```go
// pkg/config/binary_config.go
type BinarySpecificConfig struct {
	BinaryType   string            `yaml:"binary_type"` // "agent" or "hap"
	Ports        PortConfiguration `yaml:"ports"`
	Interface    InterfaceConfig   `yaml:"interface"`
	Capabilities []string          `yaml:"capabilities"`
}

type PortConfiguration struct {
	HTTPPort    int `yaml:"http_port"`
	HealthPort  int `yaml:"health_port"`
	UCXIPort    int `yaml:"ucxi_port"`
	AdminUIPort int `yaml:"admin_ui_port,omitempty"`
}

type InterfaceConfig struct {
	Mode               string `yaml:"mode"` // "terminal", "web", "headless"
	AutoStartInterface bool   `yaml:"auto_start_interface"`
	MessageTemplates   string `yaml:"message_templates_path,omitempty"`
	PromptLibrary      string `yaml:"prompt_library_path,omitempty"`
}
```

### 6.2 Configuration Validation

```go
// internal/common/runtime/config.go
type ConfigValidator struct {
	binaryType BinaryType
}

func (v *ConfigValidator) ValidateForBinary(cfg *config.Config) error {
	// Common validation
	if err := v.validateCommonConfig(cfg); err != nil {
		return fmt.Errorf("common config validation failed: %w", err)
	}

	// Binary-specific validation
	switch v.binaryType {
	case BinaryTypeAgent:
		return v.validateAgentConfig(cfg)
	case BinaryTypeHAP:
		return v.validateHAPConfig(cfg)
	default:
		return fmt.Errorf("unknown binary type: %v", v.binaryType)
	}
}
```

---

## 7. Data Flow Architecture

### 7.1 Message Flow Between Binaries

```mermaid
graph TB
    subgraph "Autonomous Agent (bzzz-agent)"
        AA[Agent Runner]
        AT[Task Processor]
        AM[Message Handler]
    end

    subgraph "Human Agent Portal (bzzz-hap)"
        HI[Terminal Interface]
        HF[Form Templates]
        HP[Prompt Engine]
    end

    subgraph "Shared P2P Infrastructure"
        PS[PubSub System]
        DHT[Distributed Storage]
        EL[Election System]
    end

    AA --> PS
    AT --> DHT
    AM --> PS

    HI --> PS
    HF --> DHT
    HP --> PS

    PS --> AA
    PS --> HI
    DHT --> AT
    DHT --> HF
```

### 7.2 Shared State Management

```go
// internal/common/runtime/state.go
type SharedState struct {
	ActiveTasks    map[string]*TaskInfo
	PeerRegistry   map[string]*PeerInfo
	ElectionState  *ElectionInfo
	ConfigSnapshot *config.Config
	HealthStatus   *SystemHealth
	mutex          sync.RWMutex
}

func (s *SharedState) UpdateTaskState(taskID string, state TaskState) error {
	s.mutex.Lock()
	defer s.mutex.Unlock()

	if task, exists := s.ActiveTasks[taskID]; exists {
		task.State = state
		task.LastUpdated = time.Now()
		return nil
	}
	return fmt.Errorf("task not found: %s", taskID)
}
```

---

## 8. Security and Access Control

### 8.1 Shared Cryptographic Identity

Both binaries use identical agent identity and cryptographic systems:

```go
// pkg/agentid/shared_identity.go
type SharedAgentIdentity struct {
	AgentID      string
	PrivateKey   crypto.PrivateKey
	PublicKey    crypto.PublicKey
	Role         string
	Capabilities []string
	BinaryType   BinaryType // Added to distinguish binary type in P2P
}

func (id *SharedAgentIdentity) SignMessage(message []byte) ([]byte, error) {
	// Identical signing for both binaries
	return crypto.Sign(id.PrivateKey, message)
}

// AgentRecord is referenced unqualified because this file lives in package agentid.
func (id *SharedAgentIdentity) CreateAgentRecord() (*AgentRecord, error) {
	return &AgentRecord{
		ID:           id.AgentID,
		PublicKey:    id.PublicKey,
		Role:         id.Role,
		Capabilities: id.Capabilities,
		BinaryType:   id.BinaryType.String(), // New field for P2P identification
		Timestamp:    time.Now(),
	}, nil
}
```

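The `crypto.Sign` call above belongs to the project's own crypto package, whose signature scheme is not stated in this specification. As an illustration only, the sketch below shows the sign-and-verify round trip using Ed25519 from the Go standard library. The choice of Ed25519, the reduced `identity` struct, and the agent ID `agent-001` are all assumptions.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"fmt"
)

// identity is a reduced stand-in for SharedAgentIdentity.
type identity struct {
	AgentID    string
	PrivateKey ed25519.PrivateKey
	PublicKey  ed25519.PublicKey
}

// newIdentity generates a fresh Ed25519 key pair for the given agent ID.
func newIdentity(agentID string) (*identity, error) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		return nil, err
	}
	return &identity{AgentID: agentID, PrivateKey: priv, PublicKey: pub}, nil
}

// SignMessage signs the raw message bytes; the same call works for either binary.
func (id *identity) SignMessage(message []byte) []byte {
	return ed25519.Sign(id.PrivateKey, message)
}

// verify checks a signature against a peer's public key.
func verify(pub ed25519.PublicKey, message, sig []byte) bool {
	return ed25519.Verify(pub, message, sig)
}

func main() {
	id, err := newIdentity("agent-001") // hypothetical agent ID
	if err != nil {
		panic(err)
	}
	msg := []byte("test collaboration message")
	sig := id.SignMessage(msg)
	fmt.Println(verify(id.PublicKey, msg, sig))                // true
	fmt.Println(verify(id.PublicKey, []byte("tampered"), sig)) // false
}
```

Because verification needs only the public key from the agent record, a peer can check messages from either binary type the same way.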
### 8.2 Role-Based Access Control

```go
// pkg/crypto/rbac.go
type RoleBasedAccess struct {
	agentRole    string
	binaryType   BinaryType
	capabilities []string
}

func (r *RoleBasedAccess) CanAccessResource(resource string, operation string) bool {
	// Both binaries use same RBAC rules
	return r.checkPermission(resource, operation, r.agentRole, r.capabilities)
}

func (r *RoleBasedAccess) GetEncryptionRecipients(contentType string) ([]string, error) {
	// Same encryption recipients for both binaries.
	// Called unqualified because this file lives in package crypto.
	return GetRecipientsForRole(r.agentRole, contentType)
}
```

---

## 9. Error Handling and Resilience

### 9.1 Shared Error Handling Strategy

```go
// internal/common/runtime/errors.go
type RuntimeError struct {
	Code        ErrorCode
	Message     string
	BinaryType  BinaryType
	ServiceName string
	Timestamp   time.Time
	Cause       error
}

type ErrorCode int

const (
	ErrConfigInvalid ErrorCode = iota
	ErrP2PInitFailed
	ErrDHTUnavailable
	ErrElectionFailed
	ErrServiceStartFailed
)

func NewRuntimeError(code ErrorCode, service string, binType BinaryType, msg string, cause error) *RuntimeError {
	return &RuntimeError{
		Code:        code,
		Message:     msg,
		BinaryType:  binType,
		ServiceName: service,
		Timestamp:   time.Now(),
		Cause:       cause,
	}
}
```

### 9.2 Circuit Breaker Pattern

```go
// internal/common/runtime/resilience.go
type ServiceCircuitBreaker struct {
	serviceName     string
	failureCount    int
	lastFailureTime time.Time
	state           CircuitState
	maxFailures     int
	timeout         time.Duration
}

func (cb *ServiceCircuitBreaker) Call(operation func() error) error {
	if cb.state == CircuitOpen {
		if time.Since(cb.lastFailureTime) > cb.timeout {
			cb.state = CircuitHalfOpen
		} else {
			return fmt.Errorf("circuit breaker open for service: %s", cb.serviceName)
		}
	}

	err := operation()
	if err != nil {
		cb.recordFailure()
		return err
	}

	cb.recordSuccess()
	return nil
}
```

---

## 10. Testing Strategy

### 10.1 Shared Runtime Testing

```go
// internal/common/runtime/runtime_test.go
func TestRuntimeInitialization(t *testing.T) {
	tests := []struct {
		name       string
		binaryType BinaryType
		configPath string
		wantError  bool
	}{
		{
			name:       "Agent runtime initialization",
			binaryType: BinaryTypeAgent,
			configPath: "testdata/agent_config.yaml",
			wantError:  false,
		},
		{
			name:       "HAP runtime initialization",
			binaryType: BinaryTypeHAP,
			configPath: "testdata/hap_config.yaml",
			wantError:  false,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			ctx := context.Background()
			logger := logging.NewTestLogger(t)
			runtime := NewRuntime(logger)

			cfg := RuntimeConfig{
				ConfigPath: tt.configPath,
				BinaryType: tt.binaryType,
			}

			services, err := runtime.Initialize(ctx, cfg)

			if tt.wantError && err == nil {
				t.Error("expected error but got none")
			}
			if !tt.wantError && err != nil {
				t.Errorf("unexpected error: %v", err)
			}
			if !tt.wantError && services == nil {
				t.Error("expected services but got nil")
			}
		})
	}
}
```

### 10.2 Integration Testing

```go
// test/integration/dual_binary_test.go
func TestDualBinaryP2PInteraction(t *testing.T) {
	// Start agent binary
	agentCtx, agentCancel := context.WithCancel(context.Background())
	defer agentCancel()

	agentRuntime := startTestAgent(t, agentCtx, "testdata/agent_config.yaml")
	defer agentRuntime.Shutdown()

	// Start HAP binary
	hapCtx, hapCancel := context.WithCancel(context.Background())
	defer hapCancel()

	hapRuntime := startTestHAP(t, hapCtx, "testdata/hap_config.yaml")
	defer hapRuntime.Shutdown()

	// Wait for P2P mesh formation
	waitForPeerConnection(t, agentRuntime, hapRuntime, 10*time.Second)

	// Test message exchange
	testMessage := "test collaboration message"
	err := hapRuntime.SendMessage("coordination", testMessage)
	assert.NoError(t, err)

	// Verify agent receives message
	receivedMsg := waitForMessage(t, agentRuntime, 5*time.Second)
	assert.Equal(t, testMessage, receivedMsg)
}
```

---

## 11. Deployment Strategy

### 11.1 Docker Multi-Stage Build

```dockerfile
# Dockerfile.multi-stage
FROM golang:1.21-alpine AS builder
WORKDIR /app

# Download modules first so this layer stays cached until go.mod or go.sum changes
COPY go.mod go.sum ./
RUN go mod download
COPY . .

# Build both binaries
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o bzzz-agent ./cmd/agent
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o bzzz-hap ./cmd/hap

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/

# Copy both binaries
COPY --from=builder /app/bzzz-agent .
COPY --from=builder /app/bzzz-hap .

# Default to agent mode, can be overridden
CMD ["./bzzz-agent"]
```

### 11.2 Kubernetes Deployment

```yaml
# deployments/kubernetes/agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bzzz-agent
spec:
  replicas: 3
  selector:
    matchLabels:
      app: bzzz-agent
  template:
    metadata:
      labels:
        app: bzzz-agent
        binary-type: agent
    spec:
      containers:
      - name: bzzz-agent
        image: bzzz:latest
        command: ["./bzzz-agent"]
        ports:
        - containerPort: 8080
        - containerPort: 8081
        env:
        - name: BZZZ_CONFIG_PATH
          value: "/config/agent-config.yaml"
        volumeMounts:
        - name: config
          mountPath: /config
      volumes:
      - name: config
        configMap:
          name: bzzz-agent-config

---
# deployments/kubernetes/hap-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bzzz-hap
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bzzz-hap
  template:
    metadata:
      labels:
        app: bzzz-hap
        binary-type: hap
    spec:
      containers:
      - name: bzzz-hap
        image: bzzz:latest
        command: ["./bzzz-hap"]
        ports:
        - containerPort: 8090
        - containerPort: 8091
        env:
        - name: BZZZ_CONFIG_PATH
          value: "/config/hap-config.yaml"
        volumeMounts:
        - name: config
          mountPath: /config
      volumes:
      - name: config
        configMap:
          name: bzzz-hap-config
```

---

## 12. Risk Analysis and Mitigation

### 12.1 Identified Architectural Risks

| Risk Category | Risk Description | Impact | Probability | Mitigation Strategy |
|---------------|------------------|--------|-------------|---------------------|
| **Configuration Drift** | Agent and HAP configs diverge, causing P2P incompatibility | High | Medium | Shared config validation, integration tests |
| **Port Conflicts** | Both binaries try to use same ports when co-deployed | Medium | High | Binary-specific default ports, config validation |
| **Shared State Race Conditions** | Concurrent access to DHT/PubSub from both binaries | High | Medium | Proper locking, message deduplication |
| **P2P Identity Collision** | Same agent ID used by both binaries simultaneously | High | Low | Startup checks, unique binary identifiers |
| **Resource Contention** | Memory/CPU competition when both binaries run on same node | Medium | Medium | Resource monitoring, circuit breakers |
| **Deployment Complexity** | Increased operational complexity with two binaries | Medium | High | Docker multi-stage builds, K8s deployments |

### 12.2 Specific Mitigation Strategies

#### Configuration Validation
```go
// internal/common/runtime/validation.go
func ValidateMultiBinaryDeployment(agentConfig, hapConfig *config.Config) error {
	validators := []func(*config.Config, *config.Config) error{
		validateP2PCompatibility,
		validatePortAssignments,
		validateAgentIdentities,
		validateEncryptionKeys,
	}

	for _, validator := range validators {
		if err := validator(agentConfig, hapConfig); err != nil {
			return err
		}
	}
	return nil
}
```

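The validators listed above are named but not defined. As an illustration of what `validatePortAssignments` might check, the sketch below rejects out-of-range ports and any port claimed twice across the two binaries. For brevity it takes two `PortConfiguration` values directly instead of `*config.Config`, and the `named` helper is hypothetical. The example ports are the ones that appear elsewhere in this document (8080/8081 for the agent, 8090/8091 for HAP).

```go
package main

import "fmt"

// PortConfiguration repeats the struct from section 6.1 without YAML tags.
type PortConfiguration struct {
	HTTPPort    int
	HealthPort  int
	UCXIPort    int
	AdminUIPort int // 0 means "not configured"
}

// named returns the configured (non-zero) ports keyed by a descriptive label.
func (p PortConfiguration) named(prefix string) map[string]int {
	all := map[string]int{
		prefix + ".http_port":     p.HTTPPort,
		prefix + ".health_port":   p.HealthPort,
		prefix + ".ucxi_port":     p.UCXIPort,
		prefix + ".admin_ui_port": p.AdminUIPort,
	}
	for k, v := range all {
		if v == 0 {
			delete(all, k)
		}
	}
	return all
}

// validatePortAssignments rejects out-of-range ports and any port number
// used more than once across the two binaries.
func validatePortAssignments(agent, hap PortConfiguration) error {
	owner := make(map[int]string)
	for _, ports := range []map[string]int{agent.named("agent"), hap.named("hap")} {
		for label, port := range ports {
			if port < 1 || port > 65535 {
				return fmt.Errorf("%s: port %d out of range", label, port)
			}
			if prev, taken := owner[port]; taken {
				return fmt.Errorf("port %d assigned to both %s and %s", port, prev, label)
			}
			owner[port] = label
		}
	}
	return nil
}

func main() {
	agent := PortConfiguration{HTTPPort: 8080, HealthPort: 8081}
	hap := PortConfiguration{HTTPPort: 8090, HealthPort: 8091}
	fmt.Println(validatePortAssignments(agent, hap)) // <nil>

	hap.HTTPPort = 8080 // collide with the agent's HTTP port
	fmt.Println(validatePortAssignments(agent, hap) != nil) // true
}
```

This check only matters when both binaries share a network namespace, such as co-deployment on one host.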
#### Startup Collision Detection
```go
// internal/common/runtime/collision.go
func CheckForRunningInstance(agentID string, binaryType BinaryType) error {
	// %v formats BinaryType correctly whether or not it has a String method.
	lockFile := fmt.Sprintf("/tmp/bzzz-%s-%v.lock", agentID, binaryType)

	// O_EXCL makes "create only if absent" atomic, which avoids the race
	// between a separate existence check and the write.
	f, err := os.OpenFile(lockFile, os.O_CREATE|os.O_EXCL|os.O_WRONLY, 0o644)
	if err != nil {
		if os.IsExist(err) {
			return fmt.Errorf("instance already running: %v %s", binaryType, agentID)
		}
		return err
	}
	defer f.Close()

	// Record the PID in the lock file
	_, err = fmt.Fprintf(f, "%d", os.Getpid())
	return err
}
```

---

## 13. Performance Considerations

### 13.1 Memory Usage Optimization

```go
// internal/common/runtime/optimization.go
type ResourceOptimizer struct {
	binaryType  BinaryType
	maxMemoryMB int64
	gcPercent   int
}

func (o *ResourceOptimizer) OptimizeForBinary() {
	switch o.binaryType {
	case BinaryTypeAgent:
		// Agent needs more memory for task processing
		debug.SetGCPercent(100)
		debug.SetMemoryLimit(o.maxMemoryMB * 1024 * 1024)
	case BinaryTypeHAP:
		// HAP can be more memory conservative
		debug.SetGCPercent(50)
		debug.SetMemoryLimit((o.maxMemoryMB / 2) * 1024 * 1024)
	}
}
```

### 13.2 P2P Message Optimization

```go
// internal/common/runtime/p2p_optimization.go
func OptimizePubSubForBinary(ps *pubsub.PubSub, binaryType BinaryType) {
	switch binaryType {
	case BinaryTypeAgent:
		// Agents need fast task coordination
		ps.SetMessageTimeout(5 * time.Second)
		ps.SetMaxMessageSize(1024 * 1024) // 1MB
	case BinaryTypeHAP:
		// HAP can tolerate slower human-paced interaction
		ps.SetMessageTimeout(30 * time.Second)
		ps.SetMaxMessageSize(512 * 1024) // 512KB
	}
}
```

---

## 14. Success Metrics and Validation

### 14.1 Phase 1 Success Criteria

| Criteria | Measurement | Target | Validation Method |
|----------|-------------|--------|-------------------|
| **Build Success** | Both binaries compile without errors | 100% | CI/CD pipeline |
| **Runtime Compatibility** | Agent maintains existing functionality | 100% | Regression test suite |
| **P2P Interoperability** | Both binaries join same mesh | 100% | Integration tests |
| **Resource Isolation** | No port/resource conflicts | 100% | Co-deployment tests |
| **Configuration Validation** | Invalid configs rejected | 100% | Unit tests |
| **Graceful Shutdown** | Clean shutdown under load | 100% | Stress tests |

### 14.2 Performance Benchmarks

```go
// test/benchmarks/runtime_bench_test.go
func BenchmarkAgentStartup(b *testing.B) {
	for i := 0; i < b.N; i++ {
		ctx := context.Background()
		runtime := NewRuntime(logging.NewNullLogger())

		start := time.Now()
		_, err := runtime.Initialize(ctx, RuntimeConfig{
			BinaryType: BinaryTypeAgent,
			ConfigPath: "testdata/agent_config.yaml",
		})
		duration := time.Since(start)

		if err != nil {
			b.Fatalf("initialization failed: %v", err)
		}

		// Target: < 5 seconds startup
		if duration > 5*time.Second {
			b.Errorf("startup too slow: %v", duration)
		}
	}
}
```

---

## 15. Implementation Roadmap

### 15.1 Development Phases

#### Phase 1.1: Infrastructure Setup (Week 1)
- [ ] Create `cmd/` directory structure
- [ ] Create `internal/common/runtime/` package structure
- [ ] Move existing `main.go` to `cmd/agent/main.go`
- [ ] Update Makefile for dual-binary builds
- [ ] Basic smoke tests for both binaries

#### Phase 1.2: Runtime Extraction (Week 2)
- [ ] Extract shared initialization logic to `runtime/services.go`
- [ ] Extract configuration loading to `runtime/config.go`
- [ ] Extract P2P initialization to `runtime/p2p.go`
- [ ] Extract health monitoring to `runtime/monitoring.go`
- [ ] Comprehensive unit tests for runtime package

#### Phase 1.3: HAP Binary Implementation (Week 3)
- [ ] Implement `cmd/hap/main.go`
- [ ] Create stub HAP interface in `internal/hap/`
- [ ] Implement basic terminal interaction
- [ ] P2P mesh participation tests
- [ ] Message send/receive validation

#### Phase 1.4: Integration & Validation (Week 4)
- [ ] Dual-binary integration tests
- [ ] Performance benchmarking
- [ ] Resource conflict validation
- [ ] Documentation updates
- [ ] Deployment guide creation

### 15.2 Testing Strategy

```bash
# Phase 1 Testing Commands
make test-unit          # Unit tests for all packages
make test-integration   # Integration tests between binaries
make test-performance   # Performance benchmarks
make test-deployment    # Docker/K8s deployment tests
make test-regression    # Ensure existing functionality unchanged
```

---

## 16. Documentation Requirements

### 16.1 Developer Documentation

- **Architecture Overview**: Updated system architecture diagrams
- **API Reference**: Runtime service interfaces and contracts
- **Configuration Guide**: Binary-specific configuration examples
- **Testing Guide**: How to test dual-binary scenarios
- **Troubleshooting**: Common issues and solutions

### 16.2 Operations Documentation

- **Deployment Guide**: Docker and Kubernetes deployment patterns
- **Monitoring Setup**: Health check endpoints and metrics
- **Performance Tuning**: Resource optimization recommendations
- **Security Configuration**: Role-based access control setup

---

## 17. Conclusion

This technical specification provides a blueprint for transforming BZZZ from a monolithic single-binary system into a dual-binary architecture that supports both autonomous agents and human agent portals. The design maintains all existing functionality while enabling new human interaction capabilities through a shared runtime infrastructure.

**Key Benefits**:
- **Zero Regression**: Autonomous agents retain 100% of existing functionality
- **Shared Infrastructure**: Maximum code reuse and consistency
- **Operational Flexibility**: Deploy agents and HAP independently
- **Future Extensibility**: Architecture supports additional binary types

**Implementation Priority**: This specification focuses on Phase 1 structural reorganization, which is marked as HIGH PRIORITY in the HAP Action Plan. Successful completion of Phase 1 will provide a solid foundation for subsequent phases that add more sophisticated human interaction features.

The architecture balances complexity with maintainability, keeping the dual-binary system operationally manageable while providing the flexibility needed for human-agent collaboration in the BZZZ ecosystem.