Major security, observability, and configuration improvements:
## Security Hardening
- Implemented configurable CORS (no more wildcards)
- Added comprehensive auth middleware for admin endpoints
- Enhanced webhook HMAC validation
- Added input validation and rate limiting
- Added security headers and a Content Security Policy (CSP)
## Configuration Management
- Made N8N webhook URL configurable (WHOOSH_N8N_BASE_URL)
- Replaced all hardcoded endpoints with environment variables
- Added feature flags for LLM vs heuristic composition
- Gitea fetch hardening with EAGER_FILTER and FULL_RESCAN options
## API Completeness
- Implemented GetCouncilComposition function
- Added GET /api/v1/councils/{id} endpoint
- Council artifacts API (POST/GET /api/v1/councils/{id}/artifacts)
- /admin/health/details endpoint with component status
- Database lookup for repository URLs (no hardcoded fallbacks)
## Observability & Performance
- Added OpenTelemetry distributed tracing with goal/pulse correlation
- Performance optimization database indexes
- Comprehensive health monitoring
- Enhanced logging and error handling
## Infrastructure
- Production-ready P2P discovery (replaces mock implementation)
- Removed unused Redis configuration
- Enhanced Docker Swarm integration
- Added migration files for performance indexes
## Code Quality
- Comprehensive input validation
- Graceful error handling and failsafe fallbacks
- Backwards compatibility maintained
- Following security best practices
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
110 lines
3.5 KiB
Go
// Package sdk provides the BACKBEAT Go SDK for enabling CHORUS services
// to become BACKBEAT-aware with beat synchronization and status emission.
//
// The BACKBEAT SDK enables services to:
// - Subscribe to cluster-wide beat events with jitter tolerance
// - Emit status claims with automatic metadata population
// - Use beat budgets for timeout management
// - Operate in local degradation mode when the pulse service is unavailable
// - Integrate comprehensive observability and health reporting
//
// # Quick Start
//
//	config := sdk.DefaultConfig()
//	config.ClusterID = "chorus-dev"
//	config.AgentID = "my-service"
//	config.NATSUrl = "nats://localhost:4222"
//
//	client := sdk.NewClient(config)
//
//	client.OnBeat(func(beat sdk.BeatFrame) {
//		// Called every beat
//		client.EmitStatusClaim(sdk.StatusClaim{
//			State:    "executing",
//			Progress: 0.5,
//			Notes:    "Processing data",
//		})
//	})
//
//	ctx := context.Background()
//	client.Start(ctx)
//	defer client.Stop()
//
// # Beat Subscription
//
// Register callbacks for beat and downbeat events:
//
//	client.OnBeat(func(beat sdk.BeatFrame) {
//		// Called every beat (~1-4 times per second depending on tempo)
//		fmt.Printf("Beat %d\n", beat.BeatIndex)
//	})
//
//	client.OnDownbeat(func(beat sdk.BeatFrame) {
//		// Called at the start of each bar (every 4 beats typically)
//		fmt.Printf("Bar started: %s\n", beat.WindowID)
//	})
//
// # Status Emission
//
// Emit status claims to report current state and progress:
//
//	err := client.EmitStatusClaim(sdk.StatusClaim{
//		State:     "executing", // executing|planning|waiting|review|done|failed
//		BeatsLeft: 10,          // estimated beats remaining
//		Progress:  0.75,        // progress ratio (0.0-1.0)
//		Notes:     "Processing batch 5/10",
//	})
//
// # Beat Budgets
//
// Execute functions with beat-based timeouts:
//
//	err := client.WithBeatBudget(10, func() error {
//		// This function has 10 beats to complete
//		return performLongRunningTask()
//	})
//
//	if err != nil {
//		// Handle timeout or task error
//		log.Printf("Task failed or exceeded budget: %v", err)
//	}
//
// # Health and Observability
//
// Monitor client health and metrics:
//
//	health := client.Health()
//	fmt.Printf("Connected: %v\n", health.Connected)
//	fmt.Printf("Last Beat: %d\n", health.LastBeat)
//	fmt.Printf("Reconnects: %d\n", health.ReconnectCount)
//
// # Local Degradation
//
// The SDK automatically handles network issues by entering local degradation mode:
// - Generates synthetic beats when the pulse service is unavailable
// - Uses fallback timing to maintain callback schedules
// - Automatically recovers when the pulse service returns
// - Provides seamless operation during network partitions
//
// # Security
//
// The SDK implements BACKBEAT security requirements:
// - Ed25519 signing of all status claims when a key is provided
// - Required x-window-id and x-hlc headers
// - Agent identification for proper message routing
//
// # Performance
//
// Designed for production use with:
// - Beat callback latency target ≤5ms
// - Timer drift ≤1% over 1 hour without leader
// - Goroutine-safe concurrent operations
// - Bounded memory usage for metrics and errors
//
// # Examples
//
// See the examples subdirectory for complete usage patterns:
// - examples/simple_agent.go: Basic integration
// - examples/task_processor.go: Beat budget usage
// - examples/service_monitor.go: Health monitoring
package sdk