Implement initial scan logic and council formation for WHOOSH project kickoffs
- Replace incremental sync with full scan for new repositories
- Add initial_scan status to bypass Since parameter filtering
- Implement council formation detection for Design Brief issues
- Add version display to WHOOSH UI header for debugging
- Fix Docker token authentication with trailing newline removal
- Add comprehensive council orchestration with Docker Swarm integration
- Include BACKBEAT prototype integration for distributed timing
- Support council-specific agent roles and deployment strategies
- Transition repositories to active status after content discovery

Key architectural improvements:

- Full scan approach for new project detection vs incremental sync
- Council formation triggered by chorus-entrypoint labeled Design Briefs
- Proper token handling and authentication for Gitea API calls
- Support for both initial discovery and ongoing task monitoring

This enables autonomous project kickoff workflows where Design Brief issues automatically trigger formation of specialized agent councils for new projects.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
---

**New file:** `BACKBEAT-prototype/contracts/README.md` (366 lines)
# BACKBEAT Contracts Package

[CI](https://github.com/chorus-services/backbeat/actions/workflows/contracts.yml) · [Schemas](schemas/) · [License](LICENSE)

The authoritative contract definitions and validation tools for BACKBEAT distributed orchestration across the CHORUS 2.0.0 ecosystem.

## 🎯 Overview

BACKBEAT provides synchronized distributed execution through three core message interfaces:

- **INT-A (BeatFrame)**: 🥁 Rhythm coordination from Pulse → All Services
- **INT-B (StatusClaim)**: 📊 Agent status reporting from Agents → Reverb
- **INT-C (BarReport)**: 📈 Periodic summaries from Reverb → All Services

This contracts package ensures all CHORUS 2.0.0 projects can reliably integrate with BACKBEAT through:

✅ **JSON Schema Validation** - Semver-versioned schemas for all interfaces
✅ **Conformance Testing** - Comprehensive test suites with valid/invalid examples
✅ **CI Integration** - Drop-in validation for any CI pipeline
✅ **Documentation** - Complete integration guides and best practices

## 🚀 Quick Start

### 1. Validate Your Messages

```bash
# Clone the contracts repository
git clone https://github.com/chorus-services/backbeat.git
cd backbeat/contracts

# Build the validation tool
cd tests/integration && make build

# Validate your BACKBEAT messages
./backbeat-validate --schemas ../../schemas --dir /path/to/your/messages --exit-code
```

### 2. Add to CI Pipeline

#### GitHub Actions

```yaml
- name: Validate BACKBEAT Contracts
  run: |
    git clone https://github.com/chorus-services/backbeat.git
    cd backbeat/contracts/tests/integration
    make build
    ./backbeat-validate --schemas ../../schemas --dir ${{ github.workspace }}/messages --exit-code
```

#### GitLab CI

```yaml
validate-backbeat:
  script:
    - git clone https://github.com/chorus-services/backbeat.git
    - cd backbeat/contracts/tests/integration && make build
    - ./backbeat-validate --schemas ../../schemas --dir messages --exit-code
```

### 3. Integrate with Your Project

Add to your `Makefile`:

```makefile
validate-backbeat:
	@git clone https://github.com/chorus-services/backbeat.git .backbeat 2>/dev/null || true
	@cd .backbeat/contracts/tests/integration && make build
	@.backbeat/contracts/tests/integration/backbeat-validate --schemas .backbeat/contracts/schemas --dir messages --exit-code
```

## 📁 Package Structure

```
contracts/
├── schemas/                          # JSON Schema definitions
│   ├── beatframe-v1.schema.json      # INT-A: Pulse → All Services
│   ├── statusclaim-v1.schema.json    # INT-B: Agents → Reverb
│   └── barreport-v1.schema.json      # INT-C: Reverb → All Services
├── tests/
│   ├── conformance_test.go           # Go conformance test suite
│   ├── examples/                     # Valid/invalid message examples
│   │   ├── beatframe-valid.json
│   │   ├── beatframe-invalid.json
│   │   ├── statusclaim-valid.json
│   │   ├── statusclaim-invalid.json
│   │   ├── barreport-valid.json
│   │   └── barreport-invalid.json
│   └── integration/                  # CI integration helpers
│       ├── validator.go              # Message validation library
│       ├── ci_helper.go              # CI integration utilities
│       ├── cmd/backbeat-validate/    # CLI validation tool
│       └── Makefile                  # Build and test automation
├── docs/
│   ├── integration-guide.md          # How to BACKBEAT-enable services
│   ├── schema-evolution.md           # Versioning and compatibility
│   └── tempo-guide.md                # Beat timing recommendations
└── README.md                         # This file
```

## 🔧 Core Interfaces

### INT-A: BeatFrame (Pulse → All Services)

Synchronization messages broadcast every beat:

```json
{
  "type": "backbeat.beatframe.v1",
  "cluster_id": "chorus-prod",
  "beat_index": 1337,
  "downbeat": false,
  "phase": "execute",
  "hlc": "7ffd:0001:abcd",
  "deadline_at": "2025-09-05T12:30:00Z",
  "tempo_bpm": 2.0,
  "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
}
```

**Key Fields:**
- `beat_index`: Monotonic counter since cluster start
- `phase`: `"plan"`, `"execute"`, or `"review"`
- `tempo_bpm`: Current beats per minute (default: 2.0 = 30-second beats)
- `deadline_at`: When this phase must complete
### INT-B: StatusClaim (Agents → Reverb)

Agent status reports during beat execution:

```json
{
  "type": "backbeat.statusclaim.v1",
  "agent_id": "search-indexer:worker-03",
  "task_id": "index-batch:20250905-120",
  "beat_index": 1337,
  "state": "executing",
  "beats_left": 3,
  "progress": 0.65,
  "notes": "processing batch 120/200",
  "hlc": "7ffd:0001:beef"
}
```

**Key Fields:**
- `state`: `"idle"`, `"planning"`, `"executing"`, `"reviewing"`, `"completed"`, `"failed"`, `"blocked"`, `"helping"`
- `beats_left`: Estimated beats to completion
- `progress`: Completion percentage (0.0 - 1.0)

### INT-C: BarReport (Reverb → All Services)

Periodic cluster health summaries:

```json
{
  "type": "backbeat.barreport.v1",
  "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
  "from_beat": 240,
  "to_beat": 359,
  "agents_reporting": 978,
  "on_time_reviews": 942,
  "help_promises_fulfilled": 87,
  "secret_rotations_ok": true,
  "tempo_drift_ms": 7.3,
  "issues": []
}
```

**Key Fields:**
- `agents_reporting`: Total active agents in window
- `on_time_reviews`: Agents completing review phase on time
- `tempo_drift_ms`: Timing drift (positive = behind, negative = ahead)

## 🛠️ Usage Examples

### Validate Single Message

```bash
# Validate from file
./backbeat-validate --schemas ../schemas --file message.json

# Validate from stdin
echo '{"type":"backbeat.beatframe.v1",...}' | ./backbeat-validate --schemas ../schemas --message -

# Get JSON output for programmatic use
./backbeat-validate --schemas ../schemas --file message.json --json
```

### Validate Directory

```bash
# Validate all JSON files in directory
./backbeat-validate --schemas ../schemas --dir messages/

# Quiet mode (only errors)
./backbeat-validate --schemas ../schemas --dir messages/ --quiet

# Exit with error code on validation failures
./backbeat-validate --schemas ../schemas --dir messages/ --exit-code
```

### Go Integration

```go
import (
	"log"

	"github.com/chorus-services/backbeat/contracts/tests/integration"
)

// Create validator
validator, err := integration.NewMessageValidator("./schemas")
if err != nil {
	log.Fatal(err)
}

// Validate message
result, err := validator.ValidateMessageString(`{"type":"backbeat.beatframe.v1",...}`)
if err != nil {
	log.Fatal(err)
}

if !result.Valid {
	log.Printf("validation failed: %v", result.Errors)
}
```

## 📊 Tempo Recommendations

| Use Case | Tempo (BPM) | Beat Duration | Example Services |
|----------|-------------|---------------|------------------|
| **Development** | 0.1 - 0.5 | 2-10 minutes | Testing, debugging |
| **Batch Processing** | 0.5 - 2.0 | 30s - 2 minutes | ETL, data warehouses |
| **Standard Services** | 2.0 - 10.0 | 6-30 seconds | APIs, web apps |
| **Responsive Apps** | 10.0 - 60.0 | 1-6 seconds | Dashboards, monitoring |
| **High-Frequency** | 60+ | <1 second | Trading, IoT processing |

**Default**: 2.0 BPM (30-second beats) works well for most CHORUS services.
## 📋 Integration Checklist

- [ ] **Message Validation**: Add schema validation to your CI pipeline
- [ ] **BeatFrame Handler**: Implement INT-A message consumption
- [ ] **StatusClaim Publisher**: Implement INT-B message publishing (if you have agents)
- [ ] **BarReport Consumer**: Implement INT-C message consumption (optional)
- [ ] **Tempo Selection**: Choose appropriate BPM for your workload
- [ ] **Error Handling**: Handle validation failures and timing issues
- [ ] **Monitoring**: Track beat processing latency and deadline misses
- [ ] **Load Testing**: Verify performance at production tempo

## 🔄 Schema Versioning

Schemas follow [Semantic Versioning](https://semver.org/):

- **MAJOR** (1.0.0 → 2.0.0): Breaking changes requiring code updates
- **MINOR** (1.0.0 → 1.1.0): Backward-compatible additions
- **PATCH** (1.0.0 → 1.0.1): Documentation and example updates

Current versions:

- **BeatFrame**: v1.0.0 (`backbeat.beatframe.v1`)
- **StatusClaim**: v1.0.0 (`backbeat.statusclaim.v1`)
- **BarReport**: v1.0.0 (`backbeat.barreport.v1`)

See [schema-evolution.md](docs/schema-evolution.md) for migration strategies.

## 🧪 Running Tests

```bash
# Run all tests
make test

# Test schemas are valid JSON
make test-schemas

# Test example messages
make test-examples

# Run Go integration tests
make test-integration

# Validate built-in examples
make validate-examples
```

## 🏗️ Building

```bash
# Build CLI validation tool
make build

# Install Go dependencies
make deps

# Format code
make fmt

# Run linter
make lint

# Generate CI configuration examples
make examples
```

## 📚 Documentation

- **[Integration Guide](docs/integration-guide.md)**: Complete guide for CHORUS 2.0.0 projects
- **[Schema Evolution](docs/schema-evolution.md)**: Versioning and compatibility management
- **[Tempo Guide](docs/tempo-guide.md)**: Beat timing and performance optimization

## 🤝 Contributing

1. **Fork** this repository
2. **Create** a feature branch: `git checkout -b feature/amazing-feature`
3. **Add** tests for your changes
4. **Run** `make test` to ensure everything passes
5. **Commit** your changes: `git commit -m 'Add amazing feature'`
6. **Push** to the branch: `git push origin feature/amazing-feature`
7. **Open** a Pull Request

### Schema Changes

- **Minor changes** (new optional fields): Create a PR with the updated schema
- **Major changes** (breaking): Discuss in an issue first and follow the migration process
- **All changes**: Update examples and tests accordingly

## 🔍 Troubleshooting

### Common Validation Errors

| Error | Cause | Fix |
|-------|-------|-----|
| `type field is required` | Missing `type` field | Add correct message type |
| `hlc must match pattern` | Invalid HLC format | Use `XXXX:XXXX:XXXX` hex format |
| `window_id must be exactly 32 hex characters` | Wrong window ID | Use 32-character hex string |
| `phase must be one of: plan, execute, review` | Invalid phase | Use exact phase names |
| `tempo_bpm must be at least 0.1` | Tempo too low | Use tempo ≥ 0.1 BPM |

### Performance Issues

- **Beat processing too slow**: Reduce tempo or optimize code
- **High CPU usage**: Consider lower tempo or horizontal scaling
- **Network saturation**: Reduce message frequency or size
- **Memory leaks**: Ensure proper cleanup in beat handlers

### Getting Help

- **Issues**: [GitHub Issues](https://github.com/chorus-services/backbeat/issues)
- **Discussions**: [GitHub Discussions](https://github.com/chorus-services/backbeat/discussions)
- **Documentation**: Check the [docs/](docs/) directory
- **Examples**: See [tests/examples/](tests/examples/) for message samples

## 📜 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🎵 About BACKBEAT

BACKBEAT provides the rhythmic heartbeat that synchronizes distributed systems across CHORUS 2.0.0. Just as musicians use a metronome to stay in time, BACKBEAT keeps your services coordinated and responsive.

**Key Benefits:**
- 🎯 **Predictable Timing**: Know exactly when coordination happens
- 🔄 **Graceful Coordination**: Services sync without tight coupling
- 📊 **Health Visibility**: Real-time insight into cluster performance
- 🛡️ **Fault Tolerance**: Detect and recover from failures quickly
- ⚡ **Scalable**: Works from development (0.1 BPM) to high-frequency (1000+ BPM)

---

**Made with ❤️ by the CHORUS 2.0.0 team**

*"In rhythm there is coordination, in coordination there is reliability."*
---

**New file:** `BACKBEAT-prototype/contracts/docs/integration-guide.md` (436 lines)
# BACKBEAT Integration Guide for CHORUS 2.0.0 Projects

This guide explains how to integrate BACKBEAT contract validation into your CHORUS 2.0.0 project for guaranteed compatibility with the distributed orchestration system.

## Overview

BACKBEAT provides three core interfaces for coordinated distributed execution:

- **INT-A (BeatFrame)**: Rhythm coordination from Pulse service to all agents
- **INT-B (StatusClaim)**: Agent status reporting to Reverb service
- **INT-C (BarReport)**: Periodic summary reports from Reverb to all services

All messages must conform to the published JSON schemas to ensure reliable operation across the CHORUS ecosystem.

## Quick Start

### 1. Add Contract Validation to Your CI Pipeline

#### GitHub Actions

```yaml
name: BACKBEAT Contract Validation

on: [push, pull_request]

jobs:
  validate-backbeat:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Checkout BACKBEAT contracts
        uses: actions/checkout@v4
        with:
          repository: 'chorus-services/backbeat'
          path: 'backbeat-contracts'

      - name: Set up Go
        uses: actions/setup-go@v4
        with:
          go-version: '1.22'

      - name: Validate BACKBEAT messages
        run: |
          cd backbeat-contracts/contracts/tests/integration
          make build
          ./backbeat-validate \
            --schemas ../../schemas \
            --dir ../../../your-messages-directory \
            --exit-code
```

#### GitLab CI

```yaml
validate-backbeat:
  stage: test
  image: golang:1.22
  before_script:
    - git clone https://github.com/chorus-services/backbeat.git /tmp/backbeat
    - cd /tmp/backbeat/contracts/tests/integration && make build
  script:
    - /tmp/backbeat/contracts/tests/integration/backbeat-validate
      --schemas /tmp/backbeat/contracts/schemas
      --dir $CI_PROJECT_DIR/messages
      --exit-code
```

### 2. Project Makefile Integration

Add to your project's `Makefile`:

```makefile
# BACKBEAT contract validation
BACKBEAT_REPO = https://github.com/chorus-services/backbeat.git
BACKBEAT_DIR = .backbeat-contracts

$(BACKBEAT_DIR):
	git clone $(BACKBEAT_REPO) $(BACKBEAT_DIR)

validate-backbeat: $(BACKBEAT_DIR)
	cd $(BACKBEAT_DIR)/contracts/tests/integration && make build
	$(BACKBEAT_DIR)/contracts/tests/integration/backbeat-validate \
		--schemas $(BACKBEAT_DIR)/contracts/schemas \
		--dir messages \
		--exit-code

.PHONY: validate-backbeat
```

## Message Implementation

### Implementing BeatFrame Consumer (INT-A)

Your service should subscribe to beat frames from the Pulse service and respond appropriately:

```go
// Example Go implementation
type BeatFrameHandler struct {
	currentBeat int64
	phase       string
}

func (h *BeatFrameHandler) HandleBeatFrame(frame BeatFrame) {
	// Validate the beat frame
	if err := validateBeatFrame(frame); err != nil {
		log.Printf("invalid beat frame: %v", err)
		return
	}

	// Update internal state
	h.currentBeat = frame.BeatIndex
	h.phase = frame.Phase

	// Execute phase-appropriate actions
	switch frame.Phase {
	case "plan":
		h.planPhase(frame)
	case "execute":
		h.executePhase(frame)
	case "review":
		h.reviewPhase(frame)
	}
}

func validateBeatFrame(frame BeatFrame) error {
	if frame.Type != "backbeat.beatframe.v1" {
		return fmt.Errorf("invalid message type: %s", frame.Type)
	}
	if frame.TempoBPM < 0.1 || frame.TempoBPM > 1000 {
		return fmt.Errorf("invalid tempo: %f", frame.TempoBPM)
	}
	// Add more validation as needed
	return nil
}
```

### Implementing StatusClaim Publisher (INT-B)

Your agents should publish status claims to the Reverb service:

```go
func (agent *Agent) PublishStatusClaim(beatIndex int64, state string) error {
	claim := StatusClaim{
		Type:      "backbeat.statusclaim.v1",
		AgentID:   agent.ID,
		BeatIndex: beatIndex,
		State:     state,
		HLC:       agent.generateHLC(),
		Progress:  agent.calculateProgress(),
		Notes:     agent.getCurrentStatus(),
	}

	// Validate before sending
	if err := validateStatusClaim(claim); err != nil {
		return fmt.Errorf("invalid status claim: %w", err)
	}

	return agent.publisher.Publish("backbeat.statusclaims", claim)
}

func validateStatusClaim(claim StatusClaim) error {
	validStates := []string{"idle", "planning", "executing", "reviewing", "completed", "failed", "blocked", "helping"}
	for _, valid := range validStates {
		if claim.State == valid {
			return nil
		}
	}
	return fmt.Errorf("invalid state: %s", claim.State)
}
```

### Implementing BarReport Consumer (INT-C)

Services should consume bar reports for cluster health awareness:

```go
func (service *Service) HandleBarReport(report BarReport) {
	// Validate the bar report
	if err := validateBarReport(report); err != nil {
		log.Printf("invalid bar report: %v", err)
		return
	}

	// Update cluster health metrics
	service.updateClusterHealth(report)

	// React to issues
	if len(report.Issues) > 0 {
		service.handleClusterIssues(report.Issues)
	}

	// Store performance metrics
	service.storePerformanceMetrics(report.Performance)
}

func (service *Service) updateClusterHealth(report BarReport) {
	service.clusterMetrics.AgentsReporting = report.AgentsReporting
	service.clusterMetrics.OnTimeRate = float64(report.OnTimeReviews) / float64(report.AgentsReporting)
	service.clusterMetrics.TempoDrift = report.TempoDriftMS
	service.clusterMetrics.SecretRotationsOK = report.SecretRotationsOK
}
```

## Message Format Requirements

### Common Patterns

All BACKBEAT messages share these patterns:

1. **Type Field**: Must exactly match the schema constant
2. **HLC Timestamps**: Format `XXXX:XXXX:XXXX` (hex digits)
3. **Beat Indices**: Monotonically increasing integers ≥ 0
4. **Window IDs**: 32-character hexadecimal strings
5. **Agent IDs**: Pattern `service:instance` or `agent:identifier`
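These field formats can be pre-checked with simple regular expressions before running full schema validation. A sketch under the formats listed above; the variable names are illustrative, and the lowercase-hex and `service:instance` character rules are assumptions inferred from the examples rather than the schemas themselves:

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns derived from the common field formats above: HLC timestamps are
// three colon-separated groups of four hex digits, window IDs are 32 hex
// characters, and agent IDs follow a "service:instance" shape.
var (
	hlcPattern      = regexp.MustCompile(`^[0-9a-f]{4}:[0-9a-f]{4}:[0-9a-f]{4}$`)
	windowIDPattern = regexp.MustCompile(`^[0-9a-f]{32}$`)
	agentIDPattern  = regexp.MustCompile(`^[a-z0-9-]+:[a-z0-9-]+$`)
)

func main() {
	fmt.Println(hlcPattern.MatchString("7ffd:0001:abcd"))                        // true
	fmt.Println(windowIDPattern.MatchString("7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5")) // true
	fmt.Println(agentIDPattern.MatchString("search-indexer:worker-03"))          // true
	fmt.Println(hlcPattern.MatchString("not-an-hlc"))                            // false
}
```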
### Validation Best Practices

1. **Always validate messages before processing**
2. **Use schema validation in tests**
3. **Handle validation errors gracefully**
4. **Log validation failures for debugging**

Example validation function:

```go
func ValidateMessage(messageBytes []byte, expectedType string) error {
	// Parse and check type
	var msg map[string]interface{}
	if err := json.Unmarshal(messageBytes, &msg); err != nil {
		return fmt.Errorf("invalid JSON: %w", err)
	}

	msgType, ok := msg["type"].(string)
	if !ok || msgType != expectedType {
		return fmt.Errorf("expected type %s, got %s", expectedType, msgType)
	}

	// Use schema validation
	return validateWithSchema(messageBytes, expectedType)
}
```

## Tempo and Timing Considerations

### Understanding Tempo

- **Default Tempo**: 2 BPM (30-second beats)
- **Minimum Tempo**: 0.1 BPM (10-minute beats for batch processing)
- **Maximum Tempo**: 1000 BPM (60ms beats for high-frequency trading)

### Phase Timing

Each beat consists of three phases with equal time allocation:

```
Beat Duration  = 60 / TempoBPM seconds
Phase Duration = Beat Duration / 3

Plan Phase:    [0, Beat Duration / 3)
Execute Phase: [Beat Duration / 3, 2 * Beat Duration / 3)
Review Phase:  [2 * Beat Duration / 3, Beat Duration)
```
### Implementation Guidelines

1. **Respect Deadlines**: Always complete phase work before `deadline_at`
2. **Handle Tempo Changes**: Pulse may adjust tempo based on cluster performance
3. **Plan for Latency**: Factor in network and processing delays
4. **Implement Backpressure**: Report when unable to keep up with tempo

## Error Handling

### Schema Validation Failures

```go
func HandleInvalidMessage(err error, messageBytes []byte) {
	log.Printf("schema validation failed: %v", err)
	log.Printf("invalid message: %s", string(messageBytes))

	// Send to dead letter queue or error handler
	errorHandler.HandleInvalidMessage(messageBytes, err)

	// Update metrics
	metrics.InvalidMessageCounter.Inc()
}
```

### Network and Timing Issues

```go
func (agent *Agent) HandleMissedBeat(expectedBeat int64) {
	// Report missed beat
	claim := StatusClaim{
		Type:      "backbeat.statusclaim.v1",
		AgentID:   agent.ID,
		BeatIndex: expectedBeat,
		State:     "blocked",
		Notes:     "missed beat due to network issues",
		HLC:       agent.generateHLC(),
	}
	agent.publisher.Publish("backbeat.statusclaims", claim)

	// Try to catch up
	agent.attemptResynchronization()
}
```

## Testing Your Integration

### Unit Tests

```go
func TestBeatFrameValidation(t *testing.T) {
	validFrame := BeatFrame{
		Type:       "backbeat.beatframe.v1",
		ClusterID:  "test",
		BeatIndex:  100,
		Downbeat:   false,
		Phase:      "execute",
		HLC:        "7ffd:0001:abcd",
		DeadlineAt: time.Now().Add(30 * time.Second),
		TempoBPM:   2.0,
		WindowID:   "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
	}

	err := validateBeatFrame(validFrame)
	assert.NoError(t, err)
}
```

### Integration Tests

Use the BACKBEAT validation tools:

```bash
# Test your message files
backbeat-validate --schemas /path/to/backbeat/schemas --dir messages/

# Test individual messages
echo '{"type":"backbeat.beatframe.v1",...}' | backbeat-validate --schemas /path/to/backbeat/schemas --message -
```

### Load Testing

Consider tempo and message volume in your load tests:

```go
func TestHighTempoHandling(t *testing.T) {
	// Simulate 10 BPM (6-second beats)
	tempo := 10.0
	beatInterval := time.Duration(60/tempo) * time.Second

	for i := 0; i < 100; i++ {
		frame := generateBeatFrame(i, tempo)
		handler.HandleBeatFrame(frame)
		time.Sleep(beatInterval)
	}

	// Verify no beats were dropped
	assert.Equal(t, 100, handler.processedBeats)
}
```

## Production Deployment

### Monitoring

Monitor these key metrics:

1. **Message Validation Rate**: Percentage of valid messages received
2. **Beat Processing Latency**: Time to process each beat phase
3. **Missed Beat Count**: Number of beats that couldn't be processed on time
4. **Schema Version Compatibility**: Ensure all services use compatible versions

### Alerting

Set up alerts for:

- Schema validation failures > 1%
- Beat processing latency > 90% of phase duration
- Missed beats > 5% in any 10-minute window
- HLC timestamp drift > 5 seconds

### Gradual Rollout

1. **Validate in CI**: Ensure all messages pass schema validation
2. **Deploy to dev**: Test with low tempo (0.5 BPM)
3. **Staging validation**: Use production-like tempo and load
4. **Canary deployment**: Roll out to a small percentage of production traffic
5. **Full production**: Monitor closely and be ready to roll back

## Troubleshooting

### Common Issues

1. **Wrong Message Type**: Ensure the `type` field exactly matches the schema
2. **HLC Format**: Must be `XXXX:XXXX:XXXX` format with hex digits
3. **Window ID Length**: Must be exactly 32 hex characters
4. **Enum Values**: States, phases, and severities must match the schema exactly
5. **Numeric Ranges**: Check min/max constraints (tempo, beat_index, etc.)

### Debug Tools

```bash
# Validate specific message
backbeat-validate --schemas ./schemas --message '{"type":"backbeat.beatframe.v1",...}'

# Get detailed validation errors
backbeat-validate --schemas ./schemas --file message.json --json

# Validate entire directory with detailed output
backbeat-validate --schemas ./schemas --dir messages/ --json > validation-report.json
```

## Schema Evolution

See [schema-evolution.md](schema-evolution.md) for details on:

- Semantic versioning for schemas
- Backward compatibility requirements
- Migration strategies for schema updates
- Version compatibility matrix

## Performance Guidelines

See [tempo-guide.md](tempo-guide.md) for details on:

- Choosing appropriate tempo for your workload
- Optimizing beat processing performance
- Handling tempo changes gracefully
- Resource utilization best practices

## Support

- **Documentation**: This contracts package contains the authoritative reference
- **Examples**: See `contracts/tests/examples/` for valid/invalid message samples
- **Issues**: Report integration problems to the BACKBEAT team
- **Updates**: Monitor the contracts repository for schema updates
---

**New file:** `BACKBEAT-prototype/contracts/docs/schema-evolution.md` (507 lines)
# BACKBEAT Schema Evolution and Versioning

This document defines how BACKBEAT message schemas evolve over time while maintaining compatibility across the CHORUS 2.0.0 ecosystem.

## Versioning Strategy

### Semantic Versioning for Schemas

BACKBEAT schemas follow semantic versioning (SemVer) with CHORUS-specific interpretations:

- **MAJOR** (`X.0.0`): Breaking changes that require code updates
- **MINOR** (`X.Y.0`): Backward-compatible additions (new optional fields, enum values)
- **PATCH** (`X.Y.Z`): Documentation updates, constraint clarifications, examples

### Schema Identification

Each schema includes version information:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://chorus.services/schemas/backbeat/beatframe/v1.2.0",
  "title": "BACKBEAT BeatFrame (INT-A)",
  "version": "1.2.0"
}
```

### Message Type Versioning

Message types embed version information:

- `backbeat.beatframe.v1` → Schema version 1.x.x
- `backbeat.beatframe.v2` → Schema version 2.x.x

Only **major** version changes require new message type identifiers.

## Compatibility Matrix

### Current Schema Versions

| Interface | Schema Version | Message Type | Status |
|-----------|----------------|--------------|--------|
| INT-A (BeatFrame) | 1.0.0 | `backbeat.beatframe.v1` | Active |
| INT-B (StatusClaim) | 1.0.0 | `backbeat.statusclaim.v1` | Active |
| INT-C (BarReport) | 1.0.0 | `backbeat.barreport.v1` | Active |

### Version Compatibility Rules

1. **Minor/Patch Updates**: All v1.x.x schemas are compatible with `backbeat.*.v1` messages
2. **Major Updates**: Require a new message type (e.g., `backbeat.beatframe.v2`)
3. **Transition Period**: Both old and new versions are supported during migration
4. **Deprecation**: 6-month notice before removing support for old major versions
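Because only major versions change the message type identifier, consumers can route messages on the trailing version tag alone. A minimal sketch (the helper name is illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// majorVersion extracts the trailing major-version tag from a BACKBEAT
// message type such as "backbeat.beatframe.v1".
func majorVersion(messageType string) string {
	parts := strings.Split(messageType, ".")
	return parts[len(parts)-1]
}

func main() {
	fmt.Println(majorVersion("backbeat.beatframe.v1")) // v1
	fmt.Println(majorVersion("backbeat.beatframe.v2")) // v2
}
```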

## Change Categories

### Minor Version Changes (Backward Compatible)

These changes increment the minor version (1.0.0 → 1.1.0):

#### 1. Adding Optional Fields

```json
// Before (v1.0.0)
{
  "required": ["type", "cluster_id", "beat_index"],
  "properties": {
    "type": {...},
    "cluster_id": {...},
    "beat_index": {...}
  }
}

// After (v1.1.0) - adds optional field
{
  "required": ["type", "cluster_id", "beat_index"],
  "properties": {
    "type": {...},
    "cluster_id": {...},
    "beat_index": {...},
    "priority": {
      "type": "integer",
      "minimum": 1,
      "maximum": 10,
      "description": "Optional processing priority (1=low, 10=high)"
    }
  }
}
```

#### 2. Adding Enum Values

```json
// Before (v1.0.0)
{
  "properties": {
    "phase": {
      "enum": ["plan", "execute", "review"]
    }
  }
}

// After (v1.1.0) - adds new phase
{
  "properties": {
    "phase": {
      "enum": ["plan", "execute", "review", "cleanup"]
    }
  }
}
```

#### 3. Relaxing Constraints

```json
// Before (v1.0.0)
{
  "properties": {
    "notes": {
      "type": "string",
      "maxLength": 256
    }
  }
}

// After (v1.1.0) - allows longer notes
{
  "properties": {
    "notes": {
      "type": "string",
      "maxLength": 512
    }
  }
}
```

#### 4. Adding Properties to Objects

```json
// Before (v1.0.0)
{
  "properties": {
    "metadata": {
      "type": "object",
      "properties": {
        "version": {"type": "string"}
      }
    }
  }
}

// After (v1.1.0) - adds new metadata field
{
  "properties": {
    "metadata": {
      "type": "object",
      "properties": {
        "version": {"type": "string"},
        "source": {"type": "string"}
      }
    }
  }
}
```

### Major Version Changes (Breaking)

These changes increment the major version (1.x.x → 2.0.0):

#### 1. Removing Required Fields

```json
// v1.x.x
{
  "required": ["type", "cluster_id", "beat_index", "deprecated_field"]
}

// v2.0.0
{
  "required": ["type", "cluster_id", "beat_index"]
}
```

#### 2. Changing Field Types

```json
// v1.x.x
{
  "properties": {
    "beat_index": {"type": "integer"}
  }
}

// v2.0.0
{
  "properties": {
    "beat_index": {"type": "string"}
  }
}
```

#### 3. Removing Enum Values

```json
// v1.x.x
{
  "properties": {
    "state": {
      "enum": ["idle", "executing", "deprecated_state"]
    }
  }
}

// v2.0.0
{
  "properties": {
    "state": {
      "enum": ["idle", "executing"]
    }
  }
}
```

#### 4. Tightening Constraints

```json
// v1.x.x
{
  "properties": {
    "agent_id": {
      "type": "string",
      "maxLength": 256
    }
  }
}

// v2.0.0
{
  "properties": {
    "agent_id": {
      "type": "string",
      "maxLength": 128
    }
  }
}
```

### Patch Version Changes (Non-Breaking)

These changes increment the patch version (1.0.0 → 1.0.1):

1. **Documentation updates**
2. **Example additions**
3. **Description clarifications**
4. **Comment additions**

## Migration Strategies

### Minor Version Migration

Services automatically benefit from minor version updates:

```go
// This code works with both v1.0.0 and v1.1.0
func handleBeatFrame(frame BeatFrame) {
    // Core fields always present
    log.Printf("Beat %d in phase %s", frame.BeatIndex, frame.Phase)

    // New optional fields checked safely
    if frame.Priority != nil {
        log.Printf("Priority: %d", *frame.Priority)
    }
}
```

### Major Version Migration

Requires explicit handling of both versions during transition:

```go
func handleMessage(messageBytes []byte) error {
    var msgType struct {
        Type string `json:"type"`
    }

    if err := json.Unmarshal(messageBytes, &msgType); err != nil {
        return err
    }

    switch msgType.Type {
    case "backbeat.beatframe.v1":
        return handleBeatFrameV1(messageBytes)
    case "backbeat.beatframe.v2":
        return handleBeatFrameV2(messageBytes)
    default:
        return fmt.Errorf("unsupported message type: %s", msgType.Type)
    }
}
```

### Gradual Migration Process

1. **Preparation Phase** (Months 1-2)
   - Announce upcoming major version change
   - Publish v2.0.0 schemas alongside v1.x.x
   - Update documentation and examples
   - Provide migration tools and guides

2. **Dual Support Phase** (Months 3-4)
   - Services support both v1 and v2 message types
   - New services prefer v2 messages
   - Monitoring tracks v1 vs v2 usage

3. **Migration Phase** (Months 5-6)
   - All services updated to send v2 messages
   - Services still accept v1 for backward compatibility
   - Warnings logged for v1 message reception

4. **Cleanup Phase** (Month 7+)
   - Drop support for v1 messages
   - Remove v1 handling code
   - Update schemas to mark v1 as deprecated

## Implementation Guidelines

### Schema Development

1. **Start Conservative**: Begin with strict constraints, relax later if needed
2. **Plan for Growth**: Design extensible structures with optional metadata objects
3. **Document Thoroughly**: Include clear descriptions and examples
4. **Test Extensively**: Validate with real-world data before releasing

### Version Detection

Services should detect schema versions:

```go
type SchemaInfo struct {
    Version     string `json:"version"`
    MessageType string `json:"message_type"`
    IsSupported bool   `json:"is_supported"`
}

func detectSchemaVersion(messageType string) SchemaInfo {
    switch messageType {
    case "backbeat.beatframe.v1":
        return SchemaInfo{
            Version:     "1.x.x",
            MessageType: messageType,
            IsSupported: true,
        }
    case "backbeat.beatframe.v2":
        return SchemaInfo{
            Version:     "2.x.x",
            MessageType: messageType,
            IsSupported: true,
        }
    default:
        return SchemaInfo{
            MessageType: messageType,
            IsSupported: false,
        }
    }
}
```

### Validation Strategy

```go
func validateWithVersionFallback(messageBytes []byte) error {
    // Try latest version first
    if err := validateV2(messageBytes); err == nil {
        return nil
    }

    // Fall back to previous version
    if err := validateV1(messageBytes); err == nil {
        log.Warn("Received v1 message, consider upgrading sender")
        return nil
    }

    return fmt.Errorf("message does not match any supported schema version")
}
```

## Testing Schema Evolution

### Compatibility Tests

```go
func TestSchemaBackwardCompatibility(t *testing.T) {
    // Test that v1.1.0 accepts all valid v1.0.0 messages
    v100Messages := loadTestMessages("v1.0.0")
    v110Schema := loadSchema("beatframe-v1.1.0.schema.json")

    for _, msg := range v100Messages {
        err := validateAgainstSchema(msg, v110Schema)
        assert.NoError(t, err, "v1.1.0 should accept v1.0.0 messages")
    }
}

func TestSchemaForwardCompatibility(t *testing.T) {
    // Test that v1.0.0 code gracefully handles v1.1.0 messages
    v110Message := loadTestMessage("beatframe-v1.1.0-with-new-fields.json")

    var beatFrame BeatFrameV1
    err := json.Unmarshal(v110Message, &beatFrame)
    assert.NoError(t, err, "v1.0.0 struct should parse v1.1.0 messages")

    // Core fields should be populated
    assert.NotEmpty(t, beatFrame.Type)
    assert.NotEmpty(t, beatFrame.ClusterID)
}
```

### Migration Tests

```go
func TestDualVersionSupport(t *testing.T) {
    handler := NewMessageHandler()

    v1Message := generateBeatFrameV1()
    v2Message := generateBeatFrameV2()

    // Both versions should be handled correctly
    err1 := handler.HandleMessage(v1Message)
    err2 := handler.HandleMessage(v2Message)

    assert.NoError(t, err1)
    assert.NoError(t, err2)
}
```

## Deprecation Process

### Marking Deprecated Features

```json
{
  "properties": {
    "legacy_field": {
      "type": "string",
      "description": "DEPRECATED: Use new_field instead. Will be removed in v2.0.0",
      "deprecated": true
    },
    "new_field": {
      "type": "string",
      "description": "Replacement for legacy_field"
    }
  }
}
```

### Communication Timeline

1. **6 months before**: Announce deprecation in release notes
2. **3 months before**: Add deprecation warnings to schemas
3. **1 month before**: Final migration reminder
4. **Release day**: Remove deprecated features

### Tooling Support

```bash
# Check for deprecated schema usage
backbeat-validate --schemas ./schemas --dir messages/ --check-deprecated

# Migration helper
backbeat-migrate --from v1 --to v2 --dir messages/
```

## Best Practices

### For Schema Authors

1. **Communicate Early**: Announce changes well in advance
2. **Provide Tools**: Create migration utilities and documentation
3. **Monitor Usage**: Track which versions are being used
4. **Be Conservative**: Prefer minor over major version changes

### For Service Developers

1. **Stay Updated**: Subscribe to schema change notifications
2. **Plan for Migration**: Build version handling into your services
3. **Test Thoroughly**: Validate against multiple schema versions
4. **Monitor Compatibility**: Alert on unsupported message versions

### For Operations Teams

1. **Version Tracking**: Monitor which schema versions are active
2. **Migration Planning**: Coordinate major version migrations
3. **Rollback Capability**: Be prepared to revert if migrations fail
4. **Performance Impact**: Monitor schema validation performance

## Future Considerations

### Planned Enhancements

1. **Schema Registry**: Centralized schema version management
2. **Auto-Migration**: Tools to automatically update message formats
3. **Version Negotiation**: Services negotiate supported versions
4. **Schema Analytics**: Usage metrics and compatibility reporting

### Long-term Vision

- **Continuous Evolution**: Schemas evolve without breaking existing services
- **Zero-Downtime Updates**: Schema changes deploy without service interruption
- **Automated Testing**: CI/CD pipelines validate schema compatibility
- **Self-Healing**: Services automatically adapt to schema changes
610
BACKBEAT-prototype/contracts/docs/tempo-guide.md
Normal file
@@ -0,0 +1,610 @@
# BACKBEAT Tempo Guide: Beat Timing and Performance Recommendations

This guide provides comprehensive recommendations for choosing tempo settings, implementing beat processing, and achieving optimal performance in BACKBEAT-enabled CHORUS 2.0.0 services.

## Understanding BACKBEAT Tempo

### Tempo Basics

BACKBEAT tempo is measured in **Beats Per Minute (BPM)**, similar to musical tempo:

- **1 BPM** = 60-second beats (good for batch processing)
- **2 BPM** = 30-second beats (**default**, good for most services)
- **4 BPM** = 15-second beats (good for responsive services)
- **60 BPM** = 1-second beats (good for high-frequency operations)

### Beat Structure

Each beat consists of three equal phases:

```
Beat Duration  = 60 / TempoBPM seconds
Phase Duration = Beat Duration / 3

┌─────────────┬─────────────┬─────────────┐
│    PLAN     │   EXECUTE   │   REVIEW    │
│   Phase 1   │   Phase 2   │   Phase 3   │
└─────────────┴─────────────┴─────────────┘
│←────────── Beat Duration ──────────────→│
```

### Tempo Ranges and Use Cases

| Tempo Range | Beat Duration | Use Cases | Examples |
|-------------|---------------|-----------|----------|
| 0.1 - 0.5 BPM | 2-10 minutes | Large batch jobs, ETL | Data warehouse loads, ML training |
| 0.5 - 2 BPM | 30s - 2 minutes | Standard operations | API services, web apps |
| 2 - 10 BPM | 6-30 seconds | Responsive services | Real-time dashboards, monitoring |
| 10 - 60 BPM | 1-6 seconds | High-frequency | Trading systems, IoT data processing |
| 60+ BPM | <1 second | Ultra-high-frequency | Hardware control, real-time gaming |

## Choosing the Right Tempo

### Workload Analysis

Before selecting a tempo, analyze your workload characteristics:

1. **Task Duration**: How long do typical operations take?
2. **Coordination Needs**: How often do services need to synchronize?
3. **Resource Requirements**: How much CPU/memory/I/O does work consume?
4. **Latency Tolerance**: How quickly must the system respond to changes?
5. **Error Recovery**: How quickly should the system detect and recover from failures?

### Tempo Selection Guidelines

#### Rule 1: Task Duration Constraint
```
Recommended Tempo ≤ 60 / (Average Task Duration × 3)
```

**Example**: If tasks take 5 seconds on average:
- Maximum recommended tempo = 60 / (5 × 3) = 4 BPM
- Use 2-4 BPM for safe operation

#### Rule 2: Coordination Frequency
```
Coordination Tempo = 60 / Desired Sync Interval
```

**Example**: If services should sync every 2 minutes:
- Recommended tempo = 60 / 120 = 0.5 BPM

#### Rule 3: Resource Utilization
```
Sustainable Tempo = 60 / (Task Duration + Recovery Time)
```

**Example**: 10s tasks with 5s recovery time:
- Maximum sustainable tempo = 60 / (10 + 5) = 4 BPM
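Taken together, the three rules bound the usable tempo; a sketch of combining them (the `recommendedTempo` helper and its parameter names are ours, not a BACKBEAT API):

```go
package main

import (
	"fmt"
	"math"
)

// recommendedTempo applies the three rules above and returns the most
// restrictive (lowest) tempo in BPM. All inputs are in seconds.
func recommendedTempo(avgTaskSec, recoverySec, syncIntervalSec float64) float64 {
	rule1 := 60 / (avgTaskSec * 3)           // task duration constraint
	rule2 := 60 / syncIntervalSec            // coordination frequency
	rule3 := 60 / (avgTaskSec + recoverySec) // resource utilization
	return math.Min(rule1, math.Min(rule2, rule3))
}

func main() {
	// 5s tasks, 5s recovery, sync every 30s:
	// rule1 = 4 BPM, rule2 = 2 BPM, rule3 = 6 BPM → 2 BPM wins.
	fmt.Println(recommendedTempo(5, 5, 30)) // prints 2
}
```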

### Common Tempo Patterns

#### Development/Testing: 0.1-0.5 BPM
```json
{
  "tempo_bpm": 0.2,
  "beat_duration": "5 minutes",
  "use_case": "Development and debugging",
  "advantages": ["Easy to observe", "Time to investigate issues"],
  "disadvantages": ["Slow feedback", "Not production realistic"]
}
```

#### Standard Services: 1-4 BPM
```json
{
  "tempo_bpm": 2.0,
  "beat_duration": "30 seconds",
  "use_case": "Most production services",
  "advantages": ["Good balance", "Reasonable coordination", "Error recovery"],
  "disadvantages": ["May be slow for real-time needs"]
}
```

#### Responsive Applications: 4-20 BPM
```json
{
  "tempo_bpm": 10.0,
  "beat_duration": "6 seconds",
  "use_case": "Interactive applications",
  "advantages": ["Quick response", "Fast error detection"],
  "disadvantages": ["Higher overhead", "More network traffic"]
}
```

#### High-Frequency Systems: 20+ BPM
```json
{
  "tempo_bpm": 60.0,
  "beat_duration": "1 second",
  "use_case": "Real-time trading, IoT",
  "advantages": ["Ultra-responsive", "Immediate coordination"],
  "disadvantages": ["High resource usage", "Network intensive"]
}
```

## Implementation Guidelines

### Beat Processing Architecture

#### Single-Threaded Processing
Best for low-to-medium tempo (≤10 BPM):

```go
type BeatProcessor struct {
    currentBeat int64
    phase       string
    workQueue   chan Task
}

func (p *BeatProcessor) ProcessBeat(frame BeatFrame) {
    // Update state
    p.currentBeat = frame.BeatIndex
    p.phase = frame.Phase

    // Process phase synchronously
    switch frame.Phase {
    case "plan":
        p.planPhase(frame)
    case "execute":
        p.executePhase(frame)
    case "review":
        p.reviewPhase(frame)
    }

    // Report status before deadline
    p.reportStatus(frame.BeatIndex, "completed")
}
```

#### Pipelined Processing
Best for high tempo (>10 BPM):

```go
type PipelinedProcessor struct {
    planQueue    chan BeatFrame
    executeQueue chan BeatFrame
    reviewQueue  chan BeatFrame
}

func (p *PipelinedProcessor) Start() {
    // Separate goroutines for each phase
    go p.planWorker()
    go p.executeWorker()
    go p.reviewWorker()
}

func (p *PipelinedProcessor) ProcessBeat(frame BeatFrame) {
    switch frame.Phase {
    case "plan":
        p.planQueue <- frame
    case "execute":
        p.executeQueue <- frame
    case "review":
        p.reviewQueue <- frame
    }
}
```

### Timing Implementation

#### Deadline Management

```go
func (p *BeatProcessor) executeWithDeadline(frame BeatFrame, work func() error) error {
    // Calculate remaining time
    remainingTime := time.Until(frame.DeadlineAt)

    // Create timeout context
    ctx, cancel := context.WithTimeout(context.Background(), remainingTime)
    defer cancel()

    // Execute with timeout
    done := make(chan error, 1)
    go func() {
        done <- work()
    }()

    select {
    case err := <-done:
        return err
    case <-ctx.Done():
        return fmt.Errorf("work timed out after %v", remainingTime)
    }
}
```

#### Adaptive Processing

```go
type AdaptiveProcessor struct {
    processingTimes   []time.Duration
    targetUtilization float64 // 0.8 = use 80% of available time
}

func (p *AdaptiveProcessor) shouldProcessWork(frame BeatFrame) bool {
    // Calculate phase time available
    phaseTime := time.Duration(60/frame.TempoBPM*1000/3) * time.Millisecond

    // Estimate processing time based on history
    avgProcessingTime := p.calculateAverage()

    // Only process if we have enough time
    requiredTime := time.Duration(float64(avgProcessingTime) / p.targetUtilization)
    return phaseTime >= requiredTime
}
```

### Performance Optimization

#### Batch Processing within Beats

```go
func (p *BeatProcessor) executePhase(frame BeatFrame) error {
    // Calculate optimal batch size based on tempo
    phaseDuration := time.Duration(60/frame.TempoBPM*1000/3) * time.Millisecond
    targetTime := time.Duration(float64(phaseDuration) * 0.8) // Use 80% of time

    // Process work in batches
    batchSize := p.calculateOptimalBatchSize(targetTime)

    for p.hasWork() && time.Until(frame.DeadlineAt) > time.Second {
        batch := p.getWorkBatch(batchSize)
        if err := p.processBatch(batch); err != nil {
            return err
        }
    }

    return nil
}
```

#### Caching and Pre-computation

```go
type SmartProcessor struct {
    cache       map[string]interface{}
    precomputed map[int64]interface{} // Keyed by beat index
}

func (p *SmartProcessor) planPhase(frame BeatFrame) {
    // Pre-compute work for future beats during plan phase
    nextBeat := frame.BeatIndex + 1
    if _, exists := p.precomputed[nextBeat]; !exists {
        p.precomputed[nextBeat] = p.precomputeWork(nextBeat)
    }

    // Cache frequently accessed data
    p.cacheRelevantData(frame)
}

func (p *SmartProcessor) executePhase(frame BeatFrame) error {
    // Use pre-computed results if available
    if precomputed, exists := p.precomputed[frame.BeatIndex]; exists {
        return p.usePrecomputedWork(precomputed)
    }

    // Fall back to real-time computation
    return p.computeWork(frame)
}
```

## Performance Monitoring

### Key Metrics

Track these metrics for tempo optimization:

```go
type TempoMetrics struct {
    // Timing metrics
    BeatProcessingLatency time.Duration // How long beats take to process
    PhaseCompletionRate   float64       // % of phases completed on time
    DeadlineMissRate      float64       // % of deadlines missed

    // Resource metrics
    CPUUtilization    float64 // CPU usage during beats
    MemoryUtilization float64 // Memory usage
    NetworkBandwidth  int64   // Bytes/sec for BACKBEAT messages

    // Throughput metrics
    TasksPerBeat   int     // Work completed per beat
    BeatsPerSecond float64 // Effective beat processing rate
    TempoDriftMS   float64 // How far behind/ahead we're running
}
```

### Performance Alerts

```go
func (m *TempoMetrics) checkAlerts() []Alert {
    var alerts []Alert

    // Beat processing taking too long
    if m.BeatProcessingLatency > time.Duration(float64(m.phaseDuration())*0.9) {
        alerts = append(alerts, Alert{
            Level:          "warning",
            Message:        "Beat processing approaching deadline",
            Recommendation: "Consider reducing tempo or optimizing processing",
        })
    }

    // Missing too many deadlines
    if m.DeadlineMissRate > 0.05 { // 5%
        alerts = append(alerts, Alert{
            Level:          "critical",
            Message:        "High deadline miss rate",
            Recommendation: "Reduce tempo immediately or scale resources",
        })
    }

    // Resource exhaustion
    if m.CPUUtilization > 0.9 {
        alerts = append(alerts, Alert{
            Level:          "warning",
            Message:        "High CPU utilization",
            Recommendation: "Scale up or reduce workload per beat",
        })
    }

    return alerts
}
```

### Adaptive Tempo Adjustment

```go
type TempoController struct {
    currentTempo   float64
    targetLatency  time.Duration
    adjustmentRate float64 // How aggressively to adjust
}

func (tc *TempoController) adjustTempo(metrics TempoMetrics) float64 {
    // Calculate desired tempo based on performance
    if metrics.DeadlineMissRate > 0.02 { // 2% miss rate
        // Slow down
        tc.currentTempo *= (1.0 - tc.adjustmentRate)
    } else if metrics.PhaseCompletionRate > 0.95 && metrics.CPUUtilization < 0.7 {
        // Speed up
        tc.currentTempo *= (1.0 + tc.adjustmentRate)
    }

    // Apply constraints
    tc.currentTempo = math.Max(0.1, tc.currentTempo)  // Minimum 0.1 BPM
    tc.currentTempo = math.Min(1000, tc.currentTempo) // Maximum 1000 BPM

    return tc.currentTempo
}
```

## Load Testing and Capacity Planning

### Beat Load Testing

```go
func TestBeatProcessingUnderLoad(t *testing.T) {
    processor := NewBeatProcessor()
    tempo := 10.0 // 10 BPM = 6-second beats
    beatInterval := time.Duration(60/tempo) * time.Second

    // Simulate sustained load
    for i := 0; i < 1000; i++ {
        frame := generateBeatFrame(i, tempo)

        start := time.Now()
        err := processor.ProcessBeat(frame)
        duration := time.Since(start)

        // Verify processing completed within phase duration
        phaseDuration := beatInterval / 3
        assert.Less(t, duration, phaseDuration)
        assert.NoError(t, err)

        // Wait for next beat
        time.Sleep(beatInterval)
    }
}
```

### Capacity Planning

```go
type CapacityPlanner struct {
    maxTempo            float64
    resourceLimits      ResourceLimits
    taskCharacteristics TaskProfile
}

func (cp *CapacityPlanner) calculateMaxTempo() float64 {
    // Based on CPU capacity
    cpuConstrainedTempo := 60.0 / (cp.taskCharacteristics.CPUTime * 3)

    // Based on memory capacity
    memConstrainedTempo := cp.resourceLimits.Memory / cp.taskCharacteristics.MemoryPerBeat

    // Based on I/O capacity
    ioConstrainedTempo := cp.resourceLimits.IOPS / cp.taskCharacteristics.IOPerBeat

    // Take the minimum (most restrictive constraint)
    return math.Min(cpuConstrainedTempo, math.Min(memConstrainedTempo, ioConstrainedTempo))
}
```

## Common Patterns and Anti-Patterns

### ✅ Good Patterns

#### Progressive Backoff
```go
func (p *Processor) handleOverload() {
    if p.metrics.DeadlineMissRate > 0.1 {
        // Temporarily reduce work per beat
        p.workPerBeat *= 0.8
        log.Warn("Reducing work per beat due to overload")
    }
}
```

#### Graceful Degradation
```go
func (p *Processor) executePhase(frame BeatFrame) error {
    timeRemaining := time.Until(frame.DeadlineAt)

    if timeRemaining < p.minimumTime {
        // Skip non-essential work
        return p.executeEssentialOnly(frame)
    }

    return p.executeFullWorkload(frame)
}
```

#### Work Prioritization
```go
func (p *Processor) planPhase(frame BeatFrame) {
    // Sort work by priority and deadline
    work := p.getAvailableWork()
    sort.Sort(ByPriorityAndDeadline(work))

    // Plan only what can be completed in time
    plannedWork := p.selectWorkForTempo(work, frame.TempoBPM)
    p.scheduleWork(plannedWork)
}
```

### ❌ Anti-Patterns

#### Blocking I/O in Beat Processing
```go
// DON'T: Synchronous I/O can cause deadline misses
func badExecutePhase(frame BeatFrame) error {
    data := fetchFromDatabase() // Blocking call!
    return processData(data)
}

// DO: Use async I/O with timeouts
func goodExecutePhase(frame BeatFrame) error {
    ctx, cancel := context.WithDeadline(context.Background(), frame.DeadlineAt)
    defer cancel()

    data, err := fetchFromDatabaseAsync(ctx)
    if err != nil {
        return err
    }
    return processData(data)
}
```

#### Ignoring Tempo Changes
```go
// DON'T: Assume tempo is constant
func badBeatHandler(frame BeatFrame) {
    // Hard-coded timing assumptions
    time.Sleep(10 * time.Second) // Fails if tempo > 6 BPM!
}

// DO: Adapt to current tempo
func goodBeatHandler(frame BeatFrame) {
    phaseDuration := time.Duration(60/frame.TempoBPM*1000/3) * time.Millisecond
    maxWorkTime := time.Duration(float64(phaseDuration) * 0.8)

    // Adapt work to available time
    ctx, cancel := context.WithTimeout(context.Background(), maxWorkTime)
    defer cancel()

    doWork(ctx)
}
```

#### Unbounded Work Queues
```go
// DON'T: Let work queues grow infinitely
type BadProcessor struct {
    workQueue chan Task // Unbounded queue
}

// DO: Use bounded queues with backpressure
type GoodProcessor struct {
    workQueue chan Task // Bounded queue
    metrics   *TempoMetrics
}

func (p *GoodProcessor) addWork(task Task) error {
    select {
    case p.workQueue <- task:
        return nil
    default:
        p.metrics.WorkRejectedCount++
        return ErrQueueFull
    }
}
```

## Troubleshooting Performance Issues

### Diagnostic Checklist

1. **Beat Processing Time**: Are beats completing within phase deadlines?
2. **Resource Utilization**: Is CPU/memory/I/O being over-utilized?
3. **Network Latency**: Are BACKBEAT messages arriving late?
4. **Work Distribution**: Is work evenly distributed across beats?
5. **Error Rates**: Are errors causing processing delays?

### Performance Tuning Steps

1. **Measure Current Performance**
   ```bash
   # Monitor beat processing metrics
   kubectl logs deployment/my-service | grep "beat_processing_time"

   # Check resource utilization
   kubectl top pods
   ```

2. **Identify Bottlenecks**
   ```go
   func profileBeatProcessing(frame BeatFrame) {
       defer func(start time.Time) {
           log.Infof("Beat %d phase %s took %v",
               frame.BeatIndex, frame.Phase, time.Since(start))
       }(time.Now())

       // Your beat processing code here
   }
   ```

3. **Optimize Critical Paths**
   - Cache frequently accessed data
   - Use connection pooling
   - Implement circuit breakers
   - Add request timeouts

4. **Scale Resources**
   - Increase CPU/memory limits
   - Add more replicas
   - Use faster storage
   - Optimize network configuration

5. **Adjust Tempo**
   - Reduce tempo if overloaded
   - Increase tempo if under-utilized
   - Consider tempo auto-scaling

## Future Enhancements

### Planned Features

1. **Dynamic Tempo Scaling**: Automatic tempo adjustment based on load
2. **Beat Prediction**: ML-based prediction of optimal tempo
3. **Resource-Aware Scheduling**: Beat scheduling based on resource availability
4. **Cross-Service Tempo Negotiation**: Services negotiate optimal cluster tempo

### Experimental Features

1. **Hierarchical Beats**: Different tempo for different service types
2. **Beat Priorities**: Critical beats get processing preference
3. **Temporal Load Balancing**: Distribute work across beat phases
4. **Beat Replay**: Replay missed beats during low-load periods

Understanding and implementing these tempo guidelines will ensure your BACKBEAT-enabled services operate efficiently and reliably across the full range of CHORUS 2.0.0 workloads.
267
BACKBEAT-prototype/contracts/schemas/barreport-v1.schema.json
Normal file
@@ -0,0 +1,267 @@
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://chorus.services/schemas/backbeat/barreport/v1.0.0",
  "title": "BACKBEAT BarReport (INT-C)",
  "description": "Periodic report from Reverb service summarizing agent activity over a bar (120 beats)",
  "version": "1.0.0",
  "type": "object",
  "required": [
    "type",
    "window_id",
    "from_beat",
    "to_beat",
    "agents_reporting",
    "on_time_reviews",
    "help_promises_fulfilled",
    "secret_rotations_ok",
    "tempo_drift_ms"
  ],
  "additionalProperties": false,
  "properties": {
    "type": {
      "type": "string",
      "const": "backbeat.barreport.v1",
      "description": "Message type identifier for BarReport v1"
    },
    "window_id": {
      "type": "string",
      "pattern": "^[0-9a-fA-F]{32}$",
      "description": "Unique identifier for this reporting window"
    },
    "from_beat": {
      "type": "integer",
      "minimum": 0,
      "maximum": 9223372036854775807,
      "description": "Starting beat index for this report (inclusive)"
    },
    "to_beat": {
      "type": "integer",
      "minimum": 0,
      "maximum": 9223372036854775807,
      "description": "Ending beat index for this report (inclusive)"
    },
    "agents_reporting": {
      "type": "integer",
      "minimum": 0,
      "description": "Total number of unique agents that sent status claims during this window"
    },
    "on_time_reviews": {
      "type": "integer",
      "minimum": 0,
      "description": "Number of agents that completed review phase within deadline"
    },
    "help_promises_fulfilled": {
      "type": "integer",
      "minimum": 0,
      "description": "Number of successful help/collaboration completions"
    },
    "secret_rotations_ok": {
      "type": "boolean",
      "description": "True if all required credential rotations completed successfully"
    },
    "tempo_drift_ms": {
      "type": "number",
      "description": "Average timing drift in milliseconds (positive = running behind, negative = ahead)"
    },
    "issues": {
      "type": "array",
      "maxItems": 100,
      "description": "List of significant issues or anomalies detected during this window",
      "items": {
        "type": "object",
        "required": ["severity", "category", "count"],
        "additionalProperties": false,
        "properties": {
          "severity": {
            "type": "string",
            "enum": ["info", "warning", "error", "critical"],
            "description": "Issue severity level"
          },
          "category": {
            "type": "string",
            "enum": [
              "timing",
              "failed_tasks",
              "missing_agents",
              "resource_exhaustion",
              "network_partition",
              "credential_failure",
              "data_corruption",
              "unknown"
            ],
            "description": "Issue category for automated handling"
          },
          "count": {
            "type": "integer",
            "minimum": 1,
            "description": "Number of occurrences of this issue type"
          },
          "description": {
            "type": "string",
            "maxLength": 512,
            "description": "Human-readable description of the issue"
          },
          "affected_agents": {
            "type": "array",
            "maxItems": 50,
            "description": "List of agent IDs affected by this issue",
            "items": {
              "type": "string",
              "pattern": "^[a-zA-Z0-9_:-]+$",
              "maxLength": 128
            }
          },
          "first_seen_beat": {
            "type": "integer",
            "minimum": 0,
            "description": "Beat index when this issue was first detected"
          },
          "last_seen_beat": {
            "type": "integer",
            "minimum": 0,
            "description": "Beat index when this issue was last seen"
          }
        }
      }
    },
    "performance": {
      "type": "object",
      "description": "Performance metrics for this reporting window",
      "additionalProperties": false,
      "properties": {
        "avg_response_time_ms": {
          "type": "number",
          "minimum": 0,
          "description": "Average response time for status claims in milliseconds"
        },
        "p95_response_time_ms": {
          "type": "number",
          "minimum": 0,
          "description": "95th percentile response time for status claims"
        },
        "total_tasks_completed": {
          "type": "integer",
          "minimum": 0,
          "description": "Total number of tasks completed during this window"
        },
        "total_tasks_failed": {
          "type": "integer",
          "minimum": 0,
          "description": "Total number of tasks that failed during this window"
        },
        "peak_concurrent_agents": {
          "type": "integer",
          "minimum": 0,
          "description": "Maximum number of agents active simultaneously"
        },
        "network_bytes_transferred": {
          "type": "integer",
          "minimum": 0,
          "description": "Total network bytes transferred by all agents"
        }
      }
    },
    "health_indicators": {
      "type": "object",
      "description": "Cluster health indicators",
      "additionalProperties": false,
      "properties": {
        "cluster_sync_score": {
          "type": "number",
          "minimum": 0.0,
          "maximum": 1.0,
          "description": "How well synchronized the cluster is (1.0 = perfect sync)"
        },
        "resource_utilization": {
          "type": "number",
          "minimum": 0.0,
          "maximum": 1.0,
          "description": "Average resource utilization across all agents"
        },
        "collaboration_efficiency": {
          "type": "number",
          "minimum": 0.0,
          "maximum": 1.0,
          "description": "How effectively agents are helping each other"
        },
        "error_rate": {
          "type": "number",
          "minimum": 0.0,
          "maximum": 1.0,
          "description": "Proportion of beats that had errors"
        }
      }
    },
    "metadata": {
      "type": "object",
      "description": "Optional metadata for extensions and debugging",
      "additionalProperties": true,
      "properties": {
        "reverb_version": {
          "type": "string",
          "description": "Version of the Reverb service generating this report"
        },
        "report_generation_time_ms": {
          "type": "number",
          "minimum": 0,
          "description": "Time taken to generate this report"
        },
        "next_window_id": {
          "type": "string",
          "pattern": "^[0-9a-fA-F]{32}$",
          "description": "Window ID for the next reporting period"
        }
      }
    }
  },
  "examples": [
    {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 240,
      "to_beat": 359,
      "agents_reporting": 978,
      "on_time_reviews": 942,
      "help_promises_fulfilled": 87,
      "secret_rotations_ok": true,
      "tempo_drift_ms": 7.3,
      "issues": [
        {
          "severity": "warning",
          "category": "timing",
          "count": 12,
          "description": "Some agents consistently reporting 50ms+ late",
          "affected_agents": ["worker:batch-03", "indexer:shard-7"],
          "first_seen_beat": 245,
          "last_seen_beat": 358
        }
      ],
      "performance": {
        "avg_response_time_ms": 45.2,
        "p95_response_time_ms": 125.7,
        "total_tasks_completed": 15678,
        "total_tasks_failed": 23,
        "peak_concurrent_agents": 1203,
        "network_bytes_transferred": 67890123
      },
      "health_indicators": {
        "cluster_sync_score": 0.94,
        "resource_utilization": 0.67,
        "collaboration_efficiency": 0.89,
        "error_rate": 0.001
      }
    },
    {
      "type": "backbeat.barreport.v1",
      "window_id": "a1b2c3d4e5f6789012345678901234ab",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1,
      "issues": []
    }
  ]
}
121
BACKBEAT-prototype/contracts/schemas/beatframe-v1.schema.json
Normal file
@@ -0,0 +1,121 @@
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://chorus.services/schemas/backbeat/beatframe/v1.0.0",
  "title": "BACKBEAT BeatFrame (INT-A)",
  "description": "Beat synchronization message broadcast from Pulse service to all BACKBEAT-enabled services",
  "version": "1.0.0",
  "type": "object",
  "required": [
    "type",
    "cluster_id",
    "beat_index",
    "downbeat",
    "phase",
    "hlc",
    "deadline_at",
    "tempo_bpm",
    "window_id"
  ],
  "additionalProperties": false,
  "properties": {
    "type": {
      "type": "string",
      "const": "backbeat.beatframe.v1",
      "description": "Message type identifier for BeatFrame v1"
    },
    "cluster_id": {
      "type": "string",
      "pattern": "^[a-zA-Z0-9_-]+$",
      "minLength": 1,
      "maxLength": 64,
      "description": "Unique identifier for the BACKBEAT cluster"
    },
    "beat_index": {
      "type": "integer",
      "minimum": 0,
      "maximum": 9223372036854775807,
      "description": "Monotonically increasing beat counter since cluster start"
    },
    "downbeat": {
      "type": "boolean",
      "description": "True if this is the first beat of a new bar (every 120 beats by default)"
    },
    "phase": {
      "type": "string",
      "enum": ["plan", "execute", "review"],
      "description": "Current phase within the beat cycle"
    },
    "hlc": {
      "type": "string",
      "pattern": "^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$",
      "description": "Hybrid Logical Clock timestamp for causal ordering (format: wall:logical:node)"
    },
    "deadline_at": {
      "type": "string",
      "format": "date-time",
      "description": "ISO 8601 timestamp when this beat phase must complete"
    },
    "tempo_bpm": {
      "type": "number",
      "minimum": 0.1,
      "maximum": 1000,
      "multipleOf": 0.1,
      "description": "Current tempo in beats per minute (default: 2.0 for 30-second beats)"
    },
    "window_id": {
      "type": "string",
      "pattern": "^[0-9a-fA-F]{32}$",
      "description": "Unique identifier for the current reporting window (changes every bar)"
    },
    "metadata": {
      "type": "object",
      "description": "Optional metadata for extensions and debugging",
      "additionalProperties": true,
      "properties": {
        "pulse_version": {
          "type": "string",
          "description": "Version of the Pulse service generating this beat"
        },
        "cluster_health": {
          "type": "string",
          "enum": ["healthy", "degraded", "critical"],
          "description": "Overall cluster health status"
        },
        "expected_agents": {
          "type": "integer",
          "minimum": 0,
          "description": "Number of agents expected to participate in this beat"
        }
      }
    }
  },
  "examples": [
    {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "chorus-prod",
      "beat_index": 1337,
      "downbeat": false,
      "phase": "execute",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:30:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "metadata": {
        "pulse_version": "1.2.3",
        "cluster_health": "healthy",
        "expected_agents": 150
      }
    },
    {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "dev-cluster",
      "beat_index": 0,
      "downbeat": true,
      "phase": "plan",
      "hlc": "0001:0000:cafe",
      "deadline_at": "2025-09-05T12:00:30Z",
      "tempo_bpm": 4.0,
      "window_id": "a1b2c3d4e5f6789012345678901234ab"
    }
  ]
}
181
BACKBEAT-prototype/contracts/schemas/statusclaim-v1.schema.json
Normal file
@@ -0,0 +1,181 @@
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://chorus.services/schemas/backbeat/statusclaim/v1.0.0",
  "title": "BACKBEAT StatusClaim (INT-B)",
  "description": "Status update message sent from agents to Reverb service during beat execution",
  "version": "1.0.0",
  "type": "object",
  "required": [
    "type",
    "agent_id",
    "beat_index",
    "state",
    "hlc"
  ],
  "additionalProperties": false,
  "properties": {
    "type": {
      "type": "string",
      "const": "backbeat.statusclaim.v1",
      "description": "Message type identifier for StatusClaim v1"
    },
    "agent_id": {
      "type": "string",
      "pattern": "^[a-zA-Z0-9_:-]+$",
      "minLength": 1,
      "maxLength": 128,
      "description": "Unique identifier for the reporting agent (format: service:instance or agent:id)"
    },
    "task_id": {
      "type": "string",
      "pattern": "^[a-zA-Z0-9_:-]+$",
      "minLength": 1,
      "maxLength": 128,
      "description": "Optional task identifier if agent is working on a specific task"
    },
    "beat_index": {
      "type": "integer",
      "minimum": 0,
      "maximum": 9223372036854775807,
      "description": "Beat index this status claim refers to (must match current or recent BeatFrame)"
    },
    "state": {
      "type": "string",
      "enum": [
        "idle",
        "planning",
        "executing",
        "reviewing",
        "completed",
        "failed",
        "blocked",
        "helping"
      ],
      "description": "Current state of the agent"
    },
    "beats_left": {
      "type": "integer",
      "minimum": 0,
      "maximum": 1000,
      "description": "Estimated number of beats needed to complete current work (0 = done this beat)"
    },
    "progress": {
      "type": "number",
      "minimum": 0.0,
      "maximum": 1.0,
      "description": "Progress percentage for current task/phase (0.0 = not started, 1.0 = complete)"
    },
    "notes": {
      "type": "string",
      "maxLength": 256,
      "description": "Brief human-readable status description or error message"
    },
    "hlc": {
      "type": "string",
      "pattern": "^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$",
      "description": "Hybrid Logical Clock timestamp from the agent"
    },
    "resources": {
      "type": "object",
      "description": "Optional resource utilization information",
      "additionalProperties": false,
      "properties": {
        "cpu_percent": {
          "type": "number",
          "minimum": 0.0,
          "maximum": 100.0,
          "description": "CPU utilization percentage"
        },
        "memory_mb": {
          "type": "integer",
          "minimum": 0,
          "description": "Memory usage in megabytes"
        },
        "disk_io_ops": {
          "type": "integer",
          "minimum": 0,
          "description": "Disk I/O operations since last beat"
        },
        "network_kb": {
          "type": "integer",
          "minimum": 0,
          "description": "Network traffic in kilobytes since last beat"
        }
      }
    },
    "dependencies": {
      "type": "array",
      "maxItems": 50,
      "description": "List of agent IDs this agent is waiting on or helping",
      "items": {
        "type": "string",
        "pattern": "^[a-zA-Z0-9_:-]+$",
        "maxLength": 128
      }
    },
    "metadata": {
      "type": "object",
      "description": "Optional metadata for extensions and debugging",
      "additionalProperties": true,
      "properties": {
        "agent_version": {
          "type": "string",
          "description": "Version of the agent software"
        },
        "error_code": {
          "type": "string",
          "description": "Structured error code if state is 'failed'"
        },
        "retry_count": {
          "type": "integer",
          "minimum": 0,
          "description": "Number of retries attempted for current task"
        }
      }
    }
  },
  "examples": [
    {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "search-indexer:worker-03",
      "task_id": "index-batch:20250905-120",
      "beat_index": 1337,
      "state": "executing",
      "beats_left": 3,
      "progress": 0.65,
      "notes": "processing batch 120/200",
      "hlc": "7ffd:0001:beef",
      "resources": {
        "cpu_percent": 85.0,
        "memory_mb": 2048,
        "disk_io_ops": 1250,
        "network_kb": 512
      }
    },
    {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "agent:backup-runner",
      "beat_index": 1338,
      "state": "failed",
      "beats_left": 0,
      "progress": 0.0,
      "notes": "connection timeout to storage backend",
      "hlc": "7ffe:0002:dead",
      "metadata": {
        "agent_version": "2.1.0",
        "error_code": "STORAGE_TIMEOUT",
        "retry_count": 3
      }
    },
    {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "ml-trainer:gpu-node-1",
      "beat_index": 1336,
      "state": "helping",
      "progress": 1.0,
      "notes": "completed own work, assisting node-2 with large model",
      "hlc": "7ffc:0005:cafe",
      "dependencies": ["ml-trainer:gpu-node-2"]
    }
  ]
}
533
BACKBEAT-prototype/contracts/tests/conformance_test.go
Normal file
@@ -0,0 +1,533 @@
package tests
|
||||
|
||||
import (
|
||||
"encoding/json"
|
||||
"fmt"
|
||||
"os"
|
||||
"path/filepath"
|
||||
"strings"
|
||||
"testing"
|
||||
"time"
|
||||
|
||||
"github.com/xeipuuv/gojsonschema"
|
||||
)
|
||||
|
||||
// MessageTypes defines the three core BACKBEAT interfaces
|
||||
const (
|
||||
BeatFrameType = "backbeat.beatframe.v1"
|
||||
StatusClaimType = "backbeat.statusclaim.v1"
|
||||
BarReportType = "backbeat.barreport.v1"
|
||||
)
|
||||
|
||||
// BeatFrame represents INT-A: Pulse → All Services
|
||||
type BeatFrame struct {
|
||||
Type string `json:"type"`
|
||||
ClusterID string `json:"cluster_id"`
|
||||
BeatIndex int64 `json:"beat_index"`
|
||||
Downbeat bool `json:"downbeat"`
|
||||
Phase string `json:"phase"`
|
||||
HLC string `json:"hlc"`
|
||||
DeadlineAt time.Time `json:"deadline_at"`
|
||||
TempoBPM float64 `json:"tempo_bpm"`
|
||||
WindowID string `json:"window_id"`
|
||||
Metadata map[string]interface{} `json:"metadata,omitempty"`
|
||||
}
|
||||
|
||||
// StatusClaim represents INT-B: Agents → Reverb
|
||||
type StatusClaim struct {
|
||||
Type string `json:"type"`
|
||||
AgentID string `json:"agent_id"`
|
||||
TaskID string `json:"task_id,omitempty"`
|
||||
BeatIndex int64 `json:"beat_index"`
|
||||
State string `json:"state"`
|
||||
BeatsLeft int `json:"beats_left,omitempty"`
|
||||
Progress float64 `json:"progress,omitempty"`
|
||||
Notes string `json:"notes,omitempty"`
|
||||
HLC string `json:"hlc"`
|
||||
Resources map[string]interface{} `json:"resources,omitempty"`
|
||||
Dependencies []string `json:"dependencies,omitempty"`
|
||||
Metadata map[string]interface{} `json:"metadata,omitempty"`
|
||||
}
|
||||
|
||||
// BarReport represents INT-C: Reverb → All Services
|
||||
type BarReport struct {
|
||||
Type string `json:"type"`
|
||||
WindowID string `json:"window_id"`
|
||||
FromBeat int64 `json:"from_beat"`
|
||||
ToBeat int64 `json:"to_beat"`
|
||||
AgentsReporting int `json:"agents_reporting"`
|
||||
OnTimeReviews int `json:"on_time_reviews"`
|
||||
HelpPromisesFulfilled int `json:"help_promises_fulfilled"`
|
||||
SecretRotationsOK bool `json:"secret_rotations_ok"`
|
||||
TempoDriftMS float64 `json:"tempo_drift_ms"`
|
||||
Issues []map[string]interface{} `json:"issues,omitempty"`
|
||||
Performance map[string]interface{} `json:"performance,omitempty"`
|
||||
HealthIndicators map[string]interface{} `json:"health_indicators,omitempty"`
|
||||
Metadata map[string]interface{} `json:"metadata,omitempty"`
|
||||
}
|
||||
|
||||
// TestSchemaValidation tests that all JSON schemas are valid and messages conform
|
||||
func TestSchemaValidation(t *testing.T) {
|
||||
schemaDir := "../schemas"
|
||||
|
||||
tests := []struct {
|
||||
name string
|
||||
schemaFile string
|
||||
validMsgs []interface{}
|
||||
invalidMsgs []map[string]interface{}
|
||||
}{
|
||||
{
|
||||
name: "BeatFrame Schema Validation",
|
||||
schemaFile: "beatframe-v1.schema.json",
|
||||
validMsgs: []interface{}{
|
||||
BeatFrame{
|
||||
Type: BeatFrameType,
|
||||
ClusterID: "test-cluster",
|
||||
BeatIndex: 100,
|
||||
Downbeat: false,
|
||||
Phase: "execute",
|
||||
HLC: "7ffd:0001:abcd",
|
||||
DeadlineAt: time.Now().Add(30 * time.Second),
|
||||
TempoBPM: 2.0,
|
||||
WindowID: "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
|
||||
},
|
||||
BeatFrame{
|
||||
Type: BeatFrameType,
|
||||
ClusterID: "prod",
|
||||
BeatIndex: 0,
|
||||
Downbeat: true,
|
||||
Phase: "plan",
|
||||
HLC: "0001:0000:cafe",
|
||||
DeadlineAt: time.Now().Add(15 * time.Second),
|
||||
TempoBPM: 4.0,
|
||||
WindowID: "a1b2c3d4e5f6789012345678901234ab",
|
||||
Metadata: map[string]interface{}{
|
||||
"pulse_version": "1.0.0",
|
||||
"cluster_health": "healthy",
|
||||
},
|
||||
},
|
||||
},
|
||||
invalidMsgs: []map[string]interface{}{
|
||||
// Missing required fields
|
||||
{
|
||||
"type": BeatFrameType,
|
||||
"cluster_id": "test",
|
||||
// missing beat_index, downbeat, phase, etc.
|
||||
},
|
||||
// Invalid phase
|
||||
{
|
||||
"type": BeatFrameType,
|
||||
"cluster_id": "test",
|
||||
"beat_index": 0,
|
||||
"downbeat": false,
|
||||
"phase": "invalid_phase",
|
||||
"hlc": "7ffd:0001:abcd",
|
||||
"deadline_at": "2025-09-05T12:00:00Z",
|
||||
"tempo_bpm": 2.0,
|
||||
"window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
|
||||
},
|
||||
// Invalid HLC format
|
||||
{
|
||||
"type": BeatFrameType,
|
||||
"cluster_id": "test",
|
||||
"beat_index": 0,
|
||||
"downbeat": false,
|
||||
"phase": "plan",
|
||||
"hlc": "invalid-hlc-format",
|
||||
"deadline_at": "2025-09-05T12:00:00Z",
|
||||
"tempo_bpm": 2.0,
|
||||
"window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
name: "StatusClaim Schema Validation",
|
||||
schemaFile: "statusclaim-v1.schema.json",
|
||||
validMsgs: []interface{}{
|
||||
StatusClaim{
|
||||
Type: StatusClaimType,
|
||||
AgentID: "worker:test-01",
|
||||
TaskID: "task:123",
|
||||
BeatIndex: 100,
|
||||
State: "executing",
|
||||
BeatsLeft: 3,
|
||||
Progress: 0.5,
|
||||
Notes: "processing batch",
|
||||
HLC: "7ffd:0001:beef",
|
||||
},
|
||||
StatusClaim{
|
||||
Type: StatusClaimType,
|
||||
AgentID: "agent:backup",
|
||||
BeatIndex: 101,
|
||||
State: "idle",
|
||||
HLC: "7ffe:0002:dead",
|
||||
Resources: map[string]interface{}{
|
||||
"cpu_percent": 25.0,
|
||||
"memory_mb": 512,
|
||||
},
|
||||
},
|
||||
},
|
||||
invalidMsgs: []map[string]interface{}{
|
||||
// Missing required fields
|
||||
{
|
||||
"type": StatusClaimType,
|
||||
"agent_id": "test",
|
||||
// missing beat_index, state, hlc
|
||||
},
|
||||
// Invalid state
|
||||
{
|
||||
"type": StatusClaimType,
|
||||
"agent_id": "test",
|
||||
"beat_index": 0,
|
||||
"state": "invalid_state",
|
||||
"hlc": "7ffd:0001:abcd",
|
||||
},
|
||||
// Negative progress
|
||||
{
|
||||
"type": StatusClaimType,
|
||||
"agent_id": "test",
|
||||
"beat_index": 0,
|
||||
"state": "executing",
|
||||
"progress": -0.1,
|
||||
"hlc": "7ffd:0001:abcd",
|
||||
},
|
||||
},
|
||||
},
|
||||
{
|
||||
name: "BarReport Schema Validation",
|
||||
schemaFile: "barreport-v1.schema.json",
|
||||
validMsgs: []interface{}{
|
||||
BarReport{
|
||||
Type: BarReportType,
|
||||
WindowID: "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
|
||||
FromBeat: 0,
|
||||
ToBeat: 119,
|
||||
AgentsReporting: 150,
|
||||
OnTimeReviews: 147,
|
||||
HelpPromisesFulfilled: 12,
|
||||
SecretRotationsOK: true,
|
||||
TempoDriftMS: -2.1,
|
||||
},
|
||||
BarReport{
|
||||
Type: BarReportType,
|
||||
WindowID: "a1b2c3d4e5f6789012345678901234ab",
|
||||
FromBeat: 120,
|
||||
ToBeat: 239,
|
||||
AgentsReporting: 200,
|
||||
OnTimeReviews: 195,
|
||||
HelpPromisesFulfilled: 25,
|
||||
SecretRotationsOK: false,
|
||||
TempoDriftMS: 15.7,
|
||||
Issues: []map[string]interface{}{
|
||||
{
|
||||
"severity": "warning",
|
||||
"category": "timing",
|
||||
"count": 5,
|
||||
"description": "Some agents running late",
|
||||
},
|
||||
},
|
||||
},
|
||||
},
|
||||
invalidMsgs: []map[string]interface{}{
|
||||
// Missing required fields
|
||||
{
|
||||
"type": BarReportType,
|
||||
"window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
|
||||
// missing from_beat, to_beat, etc.
|
||||
},
|
||||
// Invalid window_id format
|
||||
{
|
||||
"type": BarReportType,
|
||||
"window_id": "invalid-window-id",
|
||||
"from_beat": 0,
|
||||
"to_beat": 119,
|
||||
"agents_reporting": 150,
|
||||
"on_time_reviews": 147,
|
||||
"help_promises_fulfilled": 12,
|
||||
"secret_rotations_ok": true,
|
||||
"tempo_drift_ms": 0.0,
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
// Load schema
|
||||
schemaPath := filepath.Join(schemaDir, tt.schemaFile)
|
||||
schemaLoader := gojsonschema.NewReferenceLoader("file://" + schemaPath)
|
||||
|
||||
// Test valid messages
|
||||
for i, validMsg := range tt.validMsgs {
|
||||
t.Run(fmt.Sprintf("Valid_%d", i), func(t *testing.T) {
|
||||
msgBytes, err := json.Marshal(validMsg)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to marshal valid message: %v", err)
|
||||
}
|
||||
|
||||
docLoader := gojsonschema.NewBytesLoader(msgBytes)
|
||||
result, err := gojsonschema.Validate(schemaLoader, docLoader)
|
||||
if err != nil {
|
||||
t.Fatalf("Schema validation failed: %v", err)
|
||||
}
|
||||
|
||||
if !result.Valid() {
|
||||
t.Errorf("Valid message failed validation: %v", result.Errors())
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
// Test invalid messages
|
||||
for i, invalidMsg := range tt.invalidMsgs {
|
||||
t.Run(fmt.Sprintf("Invalid_%d", i), func(t *testing.T) {
|
||||
msgBytes, err := json.Marshal(invalidMsg)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to marshal invalid message: %v", err)
|
||||
}
|
||||
|
||||
docLoader := gojsonschema.NewBytesLoader(msgBytes)
|
||||
result, err := gojsonschema.Validate(schemaLoader, docLoader)
|
||||
if err != nil {
|
||||
t.Fatalf("Schema validation failed: %v", err)
|
||||
}
|
||||
|
||||
if result.Valid() {
|
||||
t.Errorf("Invalid message passed validation when it should have failed")
|
||||
}
|
||||
})
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// TestMessageParsing tests that messages can be correctly parsed from JSON
|
||||
func TestMessageParsing(t *testing.T) {
|
||||
tests := []struct {
|
||||
name string
|
||||
jsonStr string
|
||||
expected interface{}
|
||||
}{
|
||||
{
|
||||
name: "Parse BeatFrame",
|
||||
jsonStr: `{
|
||||
"type": "backbeat.beatframe.v1",
|
||||
"cluster_id": "test",
|
||||
"beat_index": 123,
|
||||
"downbeat": true,
|
||||
"phase": "review",
|
||||
"hlc": "7ffd:0001:abcd",
|
||||
"deadline_at": "2025-09-05T12:00:00Z",
|
||||
"tempo_bpm": 2.5,
|
||||
"window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
|
||||
}`,
|
||||
expected: BeatFrame{
|
||||
Type: BeatFrameType,
|
||||
ClusterID: "test",
|
||||
BeatIndex: 123,
|
||||
Downbeat: true,
|
||||
Phase: "review",
|
||||
HLC: "7ffd:0001:abcd",
|
||||
TempoBPM: 2.5,
|
||||
WindowID: "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
|
||||
},
|
||||
},
|
||||
{
|
||||
name: "Parse StatusClaim",
|
||||
jsonStr: `{
|
||||
"type": "backbeat.statusclaim.v1",
|
||||
"agent_id": "worker:01",
|
||||
"beat_index": 456,
|
||||
"state": "completed",
|
||||
"progress": 1.0,
|
||||
"hlc": "7ffe:0002:beef"
|
||||
}`,
|
||||
expected: StatusClaim{
|
||||
Type: StatusClaimType,
|
||||
AgentID: "worker:01",
|
||||
BeatIndex: 456,
|
||||
State: "completed",
|
||||
Progress: 1.0,
|
||||
HLC: "7ffe:0002:beef",
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
for _, tt := range tests {
|
||||
t.Run(tt.name, func(t *testing.T) {
|
||||
switch expected := tt.expected.(type) {
|
||||
case BeatFrame:
|
||||
var parsed BeatFrame
|
||||
err := json.Unmarshal([]byte(tt.jsonStr), &parsed)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse BeatFrame: %v", err)
|
||||
}
|
||||
|
||||
if parsed.Type != expected.Type ||
|
||||
parsed.ClusterID != expected.ClusterID ||
|
||||
parsed.BeatIndex != expected.BeatIndex {
|
||||
t.Errorf("Parsed BeatFrame doesn't match expected")
|
||||
}
|
||||
|
||||
case StatusClaim:
|
||||
var parsed StatusClaim
|
||||
err := json.Unmarshal([]byte(tt.jsonStr), &parsed)
|
||||
if err != nil {
|
||||
t.Fatalf("Failed to parse StatusClaim: %v", err)
|
||||
}
|
||||
|
||||
if parsed.Type != expected.Type ||
|
||||
parsed.AgentID != expected.AgentID ||
|
||||
parsed.State != expected.State {
|
||||
t.Errorf("Parsed StatusClaim doesn't match expected")
|
||||
}
|
||||
}
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
// TestHLCValidation tests Hybrid Logical Clock format validation
|
||||
func TestHLCValidation(t *testing.T) {
|
||||
validHLCs := []string{
|
||||
"0000:0000:0000",
|
||||
"7ffd:0001:abcd",
|
||||
"FFFF:FFFF:FFFF",
|
||||
"1234:5678:90ab",
|
||||
}
|
||||
|
||||
invalidHLCs := []string{
|
||||
"invalid",
|
||||
"7ffd:0001", // too short
|
||||
"7ffd:0001:abcd:ef", // too long
|
||||
"gggg:0001:abcd", // invalid hex
|
||||
"7ffd:0001:abcdz", // invalid hex
|
||||
}
|
||||
|
||||
for _, hlc := range validHLCs {
|
||||
t.Run(fmt.Sprintf("Valid_%s", hlc), func(t *testing.T) {
|
||||
if !isValidHLC(hlc) {
|
||||
t.Errorf("Valid HLC %s was rejected", hlc)
|
||||
}
|
||||
})
|
||||
}
|
||||
|
||||
for _, hlc := range invalidHLCs {
|
||||
t.Run(fmt.Sprintf("Invalid_%s", hlc), func(t *testing.T) {
|
||||
if isValidHLC(hlc) {
|
||||
				t.Errorf("Invalid HLC %s was accepted", hlc)
			}
		})
	}
}

// TestWindowIDValidation tests window ID format validation
func TestWindowIDValidation(t *testing.T) {
	validWindowIDs := []string{
		"7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
		"a1b2c3d4e5f6789012345678901234ab",
		"00000000000000000000000000000000",
		"FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF",
	}

	invalidWindowIDs := []string{
		"invalid",
		"7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d",   // too short
		"7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d55", // too long
		"7e9b0e6c4c9a4e59b7f2d9a3c1b2e4g5",  // invalid hex
	}

	for _, windowID := range validWindowIDs {
		t.Run(fmt.Sprintf("Valid_%s", windowID), func(t *testing.T) {
			if !isValidWindowID(windowID) {
				t.Errorf("Valid window ID %s was rejected", windowID)
			}
		})
	}

	for _, windowID := range invalidWindowIDs {
		t.Run(fmt.Sprintf("Invalid_%s", windowID), func(t *testing.T) {
			if isValidWindowID(windowID) {
				t.Errorf("Invalid window ID %s was accepted", windowID)
			}
		})
	}
}

// Helper functions for validation
func isValidHLC(hlc string) bool {
	parts := strings.Split(hlc, ":")
	if len(parts) != 3 {
		return false
	}

	for _, part := range parts {
		if len(part) != 4 {
			return false
		}
		for _, char := range part {
			if !((char >= '0' && char <= '9') || (char >= 'a' && char <= 'f') || (char >= 'A' && char <= 'F')) {
				return false
			}
		}
	}
	return true
}

func isValidWindowID(windowID string) bool {
	if len(windowID) != 32 {
		return false
	}

	for _, char := range windowID {
		if !((char >= '0' && char <= '9') || (char >= 'a' && char <= 'f') || (char >= 'A' && char <= 'F')) {
			return false
		}
	}
	return true
}

// BenchmarkSchemaValidation benchmarks schema validation performance
func BenchmarkSchemaValidation(b *testing.B) {
	schemaDir := "../schemas"
	schemaPath := filepath.Join(schemaDir, "beatframe-v1.schema.json")
	schemaLoader := gojsonschema.NewReferenceLoader("file://" + schemaPath)

	beatFrame := BeatFrame{
		Type:       BeatFrameType,
		ClusterID:  "benchmark",
		BeatIndex:  1000,
		Downbeat:   false,
		Phase:      "execute",
		HLC:        "7ffd:0001:abcd",
		DeadlineAt: time.Now().Add(30 * time.Second),
		TempoBPM:   2.0,
		WindowID:   "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
	}

	msgBytes, _ := json.Marshal(beatFrame)
	docLoader := gojsonschema.NewBytesLoader(msgBytes)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		result, err := gojsonschema.Validate(schemaLoader, docLoader)
		if err != nil || !result.Valid() {
			b.Fatal("Validation failed")
		}
	}
}

// TestSchemaFilesExist verifies that all required schema files exist
func TestSchemaFilesExist(t *testing.T) {
	schemaDir := "../schemas"
	requiredSchemas := []string{
		"beatframe-v1.schema.json",
		"statusclaim-v1.schema.json",
		"barreport-v1.schema.json",
	}

	for _, schema := range requiredSchemas {
		schemaPath := filepath.Join(schemaDir, schema)
		if _, err := os.Stat(schemaPath); os.IsNotExist(err) {
			t.Errorf("Required schema file %s does not exist", schemaPath)
		}
	}
}
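As a side note (not part of the committed diff): the two format helpers above walk the strings character by character, but they are equivalent to the anchored patterns the JSON Schemas use. A minimal standalone sketch of that equivalence, assuming only the HLC and window-ID rules exercised by the tests:

```go
package main

import (
	"fmt"
	"regexp"
)

// Regexp equivalents of the character-walking validators in the test file.
// Both must agree with the patterns used by the contract schemas:
//   hlc:       ^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$
//   window_id: ^[0-9a-fA-F]{32}$
var (
	hlcRe      = regexp.MustCompile(`^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$`)
	windowIDRe = regexp.MustCompile(`^[0-9a-fA-F]{32}$`)
)

func isValidHLC(hlc string) bool           { return hlcRe.MatchString(hlc) }
func isValidWindowID(windowID string) bool { return windowIDRe.MatchString(windowID) }

func main() {
	fmt.Println(isValidHLC("7ffd:0001:abcd"))                        // true
	fmt.Println(isValidHLC("7ffd:0001"))                             // false: only two segments
	fmt.Println(isValidWindowID("7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5")) // true
	fmt.Println(isValidWindowID("7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d"))  // false: 31 characters
}
```

Either form works; the regexp version has the advantage that it can be copied verbatim from the schema files, so the Go validators cannot drift from the published patterns.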
@@ -0,0 +1,275 @@
[
  {
    "description": "Missing required field 'from_beat'",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1
    },
    "expected_errors": ["from_beat is required"]
  },
  {
    "description": "Missing required field 'agents_reporting'",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1
    },
    "expected_errors": ["agents_reporting is required"]
  },
  {
    "description": "Invalid window_id format (too short)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1
    },
    "expected_errors": ["window_id must be exactly 32 hex characters"]
  },
  {
    "description": "Invalid window_id format (non-hex characters)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4g5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1
    },
    "expected_errors": ["window_id must match pattern ^[0-9a-fA-F]{32}$"]
  },
  {
    "description": "Negative from_beat",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": -1,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1
    },
    "expected_errors": ["from_beat must be >= 0"]
  },
  {
    "description": "Negative agents_reporting",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": -1,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1
    },
    "expected_errors": ["agents_reporting must be >= 0"]
  },
  {
    "description": "Negative on_time_reviews",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": -1,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1
    },
    "expected_errors": ["on_time_reviews must be >= 0"]
  },
  {
    "description": "Too many issues (over 100)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1,
      "issues": []
    },
    "note": "This would need 101 issues to properly test, generating dynamically in actual test"
  },
  {
    "description": "Issue with invalid severity",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1,
      "issues": [
        {
          "severity": "invalid_severity",
          "category": "timing",
          "count": 1,
          "description": "Some issue"
        }
      ]
    },
    "expected_errors": ["issue.severity must be one of: info, warning, error, critical"]
  },
  {
    "description": "Issue with invalid category",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1,
      "issues": [
        {
          "severity": "warning",
          "category": "invalid_category",
          "count": 1,
          "description": "Some issue"
        }
      ]
    },
    "expected_errors": ["issue.category must be one of: timing, failed_tasks, missing_agents, resource_exhaustion, network_partition, credential_failure, data_corruption, unknown"]
  },
  {
    "description": "Issue with zero count",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1,
      "issues": [
        {
          "severity": "warning",
          "category": "timing",
          "count": 0,
          "description": "Some issue"
        }
      ]
    },
    "expected_errors": ["issue.count must be >= 1"]
  },
  {
    "description": "Issue with description too long (over 512 chars)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1,
      "issues": [
        {
          "severity": "warning",
          "category": "timing",
          "count": 1,
          "description": "This is a very long description that exceeds the maximum allowed length of 512 characters for issue descriptions in BACKBEAT BarReport messages. This constraint is in place to prevent excessively large messages and ensure that issue descriptions remain concise and actionable. The system should reject this message because the description field contains more than 512 characters and violates the schema validation rules that have been carefully designed to maintain message size limits and system performance characteristics."
        }
      ]
    },
    "expected_errors": ["issue.description must be at most 512 characters"]
  },
  {
    "description": "Issue with too many affected agents (over 50)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1,
      "issues": [
        {
          "severity": "warning",
          "category": "timing",
          "count": 1,
          "description": "Too many affected agents",
          "affected_agents": [
            "agent1", "agent2", "agent3", "agent4", "agent5", "agent6", "agent7", "agent8", "agent9", "agent10",
            "agent11", "agent12", "agent13", "agent14", "agent15", "agent16", "agent17", "agent18", "agent19", "agent20",
            "agent21", "agent22", "agent23", "agent24", "agent25", "agent26", "agent27", "agent28", "agent29", "agent30",
            "agent31", "agent32", "agent33", "agent34", "agent35", "agent36", "agent37", "agent38", "agent39", "agent40",
            "agent41", "agent42", "agent43", "agent44", "agent45", "agent46", "agent47", "agent48", "agent49", "agent50",
            "agent51"
          ]
        }
      ]
    },
    "expected_errors": ["issue.affected_agents must have at most 50 items"]
  },
  {
    "description": "Wrong message type",
    "message": {
      "type": "backbeat.wrongtype.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1
    },
    "expected_errors": ["type must be 'backbeat.barreport.v1'"]
  },
  {
    "description": "Extra unknown properties (should fail with additionalProperties: false)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 150,
      "on_time_reviews": 147,
      "help_promises_fulfilled": 12,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -2.1,
      "unknown_field": "should not be allowed"
    },
    "expected_errors": ["Additional property unknown_field is not allowed"]
  }
]
190
BACKBEAT-prototype/contracts/tests/examples/barreport-valid.json
Normal file
@@ -0,0 +1,190 @@
[
  {
    "description": "Healthy cluster with good performance",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "from_beat": 240,
      "to_beat": 359,
      "agents_reporting": 978,
      "on_time_reviews": 942,
      "help_promises_fulfilled": 87,
      "secret_rotations_ok": true,
      "tempo_drift_ms": 7.3,
      "issues": [
        {
          "severity": "warning",
          "category": "timing",
          "count": 12,
          "description": "Some agents consistently reporting 50ms+ late",
          "affected_agents": ["worker:batch-03", "indexer:shard-7"],
          "first_seen_beat": 245,
          "last_seen_beat": 358
        }
      ],
      "performance": {
        "avg_response_time_ms": 45.2,
        "p95_response_time_ms": 125.7,
        "total_tasks_completed": 15678,
        "total_tasks_failed": 23,
        "peak_concurrent_agents": 1203,
        "network_bytes_transferred": 67890123
      },
      "health_indicators": {
        "cluster_sync_score": 0.94,
        "resource_utilization": 0.67,
        "collaboration_efficiency": 0.89,
        "error_rate": 0.001
      }
    }
  },
  {
    "description": "Small development cluster with perfect sync",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "a1b2c3d4e5f6789012345678901234ab",
      "from_beat": 0,
      "to_beat": 119,
      "agents_reporting": 5,
      "on_time_reviews": 5,
      "help_promises_fulfilled": 2,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -0.1,
      "issues": []
    }
  },
  {
    "description": "Cluster with multiple serious issues",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "fedcba9876543210fedcba9876543210",
      "from_beat": 1200,
      "to_beat": 1319,
      "agents_reporting": 450,
      "on_time_reviews": 380,
      "help_promises_fulfilled": 15,
      "secret_rotations_ok": false,
      "tempo_drift_ms": 125.7,
      "issues": [
        {
          "severity": "critical",
          "category": "credential_failure",
          "count": 3,
          "description": "Failed to rotate database credentials",
          "affected_agents": ["db-manager:primary", "backup:secondary"],
          "first_seen_beat": 1205,
          "last_seen_beat": 1318
        },
        {
          "severity": "error",
          "category": "network_partition",
          "count": 1,
          "description": "Lost connection to east coast data center",
          "affected_agents": ["worker:east-01", "worker:east-02", "worker:east-03"],
          "first_seen_beat": 1210,
          "last_seen_beat": 1319
        },
        {
          "severity": "warning",
          "category": "resource_exhaustion",
          "count": 45,
          "description": "High memory usage detected",
          "affected_agents": ["ml-trainer:gpu-01"],
          "first_seen_beat": 1200,
          "last_seen_beat": 1315
        }
      ],
      "performance": {
        "avg_response_time_ms": 180.5,
        "p95_response_time_ms": 450.0,
        "total_tasks_completed": 5432,
        "total_tasks_failed": 123,
        "peak_concurrent_agents": 487,
        "network_bytes_transferred": 23456789
      },
      "health_indicators": {
        "cluster_sync_score": 0.72,
        "resource_utilization": 0.95,
        "collaboration_efficiency": 0.45,
        "error_rate": 0.022
      }
    }
  },
  {
    "description": "High-frequency cluster report (8 BPM tempo)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "0123456789abcdef0123456789abcdef",
      "from_beat": 960,
      "to_beat": 1079,
      "agents_reporting": 2000,
      "on_time_reviews": 1985,
      "help_promises_fulfilled": 156,
      "secret_rotations_ok": true,
      "tempo_drift_ms": 3.2,
      "issues": [
        {
          "severity": "info",
          "category": "timing",
          "count": 15,
          "description": "Minor timing variations detected",
          "first_seen_beat": 965,
          "last_seen_beat": 1078
        }
      ],
      "performance": {
        "avg_response_time_ms": 25.1,
        "p95_response_time_ms": 67.3,
        "total_tasks_completed": 45678,
        "total_tasks_failed": 12,
        "peak_concurrent_agents": 2100,
        "network_bytes_transferred": 123456789
      },
      "health_indicators": {
        "cluster_sync_score": 0.98,
        "resource_utilization": 0.78,
        "collaboration_efficiency": 0.92,
        "error_rate": 0.0003
      },
      "metadata": {
        "reverb_version": "1.3.0",
        "report_generation_time_ms": 45.7,
        "next_window_id": "fedcba0987654321fedcba0987654321"
      }
    }
  },
  {
    "description": "Minimal valid bar report (only required fields)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "11112222333344445555666677778888",
      "from_beat": 600,
      "to_beat": 719,
      "agents_reporting": 1,
      "on_time_reviews": 1,
      "help_promises_fulfilled": 0,
      "secret_rotations_ok": true,
      "tempo_drift_ms": 0.0
    }
  },
  {
    "description": "Empty issues array (valid)",
    "message": {
      "type": "backbeat.barreport.v1",
      "window_id": "9999aaaa0000bbbb1111cccc2222dddd",
      "from_beat": 480,
      "to_beat": 599,
      "agents_reporting": 100,
      "on_time_reviews": 98,
      "help_promises_fulfilled": 25,
      "secret_rotations_ok": true,
      "tempo_drift_ms": -1.5,
      "issues": [],
      "performance": {
        "avg_response_time_ms": 50.0,
        "total_tasks_completed": 1000,
        "total_tasks_failed": 2
      }
    }
  }
]
@@ -0,0 +1,152 @@
[
  {
    "description": "Missing required field 'beat_index'",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "downbeat": false,
      "phase": "execute",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    },
    "expected_errors": ["beat_index is required"]
  },
  {
    "description": "Invalid phase value",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "beat_index": 0,
      "downbeat": false,
      "phase": "invalid_phase",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    },
    "expected_errors": ["phase must be one of: plan, execute, review"]
  },
  {
    "description": "Invalid HLC format (wrong number of segments)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "beat_index": 0,
      "downbeat": false,
      "phase": "plan",
      "hlc": "7ffd:0001",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    },
    "expected_errors": ["hlc must match pattern ^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$"]
  },
  {
    "description": "Invalid HLC format (non-hex characters)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "beat_index": 0,
      "downbeat": false,
      "phase": "plan",
      "hlc": "gggg:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    },
    "expected_errors": ["hlc must match pattern ^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$"]
  },
  {
    "description": "Invalid window_id format (too short)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "beat_index": 0,
      "downbeat": false,
      "phase": "plan",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d"
    },
    "expected_errors": ["window_id must be exactly 32 hex characters"]
  },
  {
    "description": "Invalid tempo_bpm (too low)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "beat_index": 0,
      "downbeat": false,
      "phase": "plan",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 0.05,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    },
    "expected_errors": ["tempo_bpm must be at least 0.1"]
  },
  {
    "description": "Invalid tempo_bpm (too high)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "beat_index": 0,
      "downbeat": false,
      "phase": "plan",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 1001.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    },
    "expected_errors": ["tempo_bpm must be at most 1000"]
  },
  {
    "description": "Invalid beat_index (negative)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "beat_index": -1,
      "downbeat": false,
      "phase": "plan",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    },
    "expected_errors": ["beat_index must be >= 0"]
  },
  {
    "description": "Wrong message type",
    "message": {
      "type": "backbeat.wrongtype.v1",
      "cluster_id": "test",
      "beat_index": 0,
      "downbeat": false,
      "phase": "plan",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    },
    "expected_errors": ["type must be 'backbeat.beatframe.v1'"]
  },
  {
    "description": "Extra unknown properties (should fail with additionalProperties: false)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "test",
      "beat_index": 0,
      "downbeat": false,
      "phase": "plan",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:00:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
      "unknown_field": "should not be allowed"
    },
    "expected_errors": ["Additional property unknown_field is not allowed"]
  }
]
@@ -0,0 +1,82 @@
[
  {
    "description": "Standard beat frame during execute phase",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "chorus-prod",
      "beat_index": 1337,
      "downbeat": false,
      "phase": "execute",
      "hlc": "7ffd:0001:abcd",
      "deadline_at": "2025-09-05T12:30:00Z",
      "tempo_bpm": 2.0,
      "window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5"
    }
  },
  {
    "description": "Downbeat starting new bar in plan phase",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "dev-cluster",
      "beat_index": 0,
      "downbeat": true,
      "phase": "plan",
      "hlc": "0001:0000:cafe",
      "deadline_at": "2025-09-05T12:00:30Z",
      "tempo_bpm": 4.0,
      "window_id": "a1b2c3d4e5f6789012345678901234ab"
    }
  },
  {
    "description": "High-frequency beat with metadata",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "fast-cluster",
      "beat_index": 999999,
      "downbeat": false,
      "phase": "review",
      "hlc": "abcd:ef01:2345",
      "deadline_at": "2025-09-05T12:00:07.5Z",
      "tempo_bpm": 8.0,
      "window_id": "fedcba9876543210fedcba9876543210",
      "metadata": {
        "pulse_version": "1.2.3",
        "cluster_health": "healthy",
        "expected_agents": 150
      }
    }
  },
  {
    "description": "Low-frequency beat (1 BPM = 60 second beats)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "slow-batch",
      "beat_index": 42,
      "downbeat": true,
      "phase": "plan",
      "hlc": "FFFF:FFFF:FFFF",
      "deadline_at": "2025-09-05T13:00:00Z",
      "tempo_bpm": 1.0,
      "window_id": "0123456789abcdef0123456789abcdef",
      "metadata": {
        "pulse_version": "2.0.0",
        "cluster_health": "degraded",
        "expected_agents": 5
      }
    }
  },
  {
    "description": "Minimal valid beat frame (no optional fields)",
    "message": {
      "type": "backbeat.beatframe.v1",
      "cluster_id": "minimal",
      "beat_index": 1,
      "downbeat": false,
      "phase": "execute",
      "hlc": "0000:0001:0002",
      "deadline_at": "2025-09-05T12:01:00Z",
      "tempo_bpm": 2.0,
      "window_id": "1234567890abcdef1234567890abcdef"
    }
  }
]
@@ -0,0 +1,189 @@
[
  {
    "description": "Missing required field 'beat_index'",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "state": "executing",
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["beat_index is required"]
  },
  {
    "description": "Missing required field 'state'",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["state is required"]
  },
  {
    "description": "Missing required field 'hlc'",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing"
    },
    "expected_errors": ["hlc is required"]
  },
  {
    "description": "Invalid state value",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "invalid_state",
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["state must be one of: idle, planning, executing, reviewing, completed, failed, blocked, helping"]
  },
  {
    "description": "Invalid progress value (negative)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing",
      "progress": -0.1,
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["progress must be between 0.0 and 1.0"]
  },
  {
    "description": "Invalid progress value (greater than 1.0)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing",
      "progress": 1.1,
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["progress must be between 0.0 and 1.0"]
  },
  {
    "description": "Invalid beats_left (negative)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing",
      "beats_left": -1,
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["beats_left must be >= 0"]
  },
  {
    "description": "Invalid beats_left (too high)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing",
      "beats_left": 1001,
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["beats_left must be <= 1000"]
  },
  {
    "description": "Invalid beat_index (negative)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": -1,
      "state": "executing",
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["beat_index must be >= 0"]
  },
  {
    "description": "Invalid HLC format",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing",
      "hlc": "invalid-hlc"
    },
    "expected_errors": ["hlc must match pattern ^[0-9a-fA-F]{4}:[0-9a-fA-F]{4}:[0-9a-fA-F]{4}$"]
  },
  {
    "description": "Notes too long (over 256 characters)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing",
      "notes": "This is a very long notes field that exceeds the maximum allowed length of 256 characters. This should fail validation because it contains too much text and violates the maxLength constraint that was set to keep status messages concise and prevent excessive message sizes in the BACKBEAT system.",
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["notes must be at most 256 characters"]
  },
  {
    "description": "Too many dependencies (over 50)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "blocked",
      "hlc": "7ffd:0001:abcd",
      "dependencies": [
        "dep1", "dep2", "dep3", "dep4", "dep5", "dep6", "dep7", "dep8", "dep9", "dep10",
        "dep11", "dep12", "dep13", "dep14", "dep15", "dep16", "dep17", "dep18", "dep19", "dep20",
        "dep21", "dep22", "dep23", "dep24", "dep25", "dep26", "dep27", "dep28", "dep29", "dep30",
        "dep31", "dep32", "dep33", "dep34", "dep35", "dep36", "dep37", "dep38", "dep39", "dep40",
        "dep41", "dep42", "dep43", "dep44", "dep45", "dep46", "dep47", "dep48", "dep49", "dep50",
        "dep51"
      ]
    },
    "expected_errors": ["dependencies must have at most 50 items"]
  },
  {
    "description": "Invalid agent_id format (empty)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "",
      "beat_index": 100,
      "state": "executing",
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["agent_id must be at least 1 character"]
  },
  {
    "description": "Agent_id too long (over 128 characters)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "this_is_a_very_long_agent_id_that_exceeds_the_maximum_allowed_length_of_128_characters_and_should_fail_validation_because_it_is_too_long_for_the_system_to_handle_properly",
      "beat_index": 100,
      "state": "executing",
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["agent_id must be at most 128 characters"]
  },
  {
    "description": "Wrong message type",
    "message": {
      "type": "backbeat.wrongtype.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing",
      "hlc": "7ffd:0001:abcd"
    },
    "expected_errors": ["type must be 'backbeat.statusclaim.v1'"]
  },
  {
    "description": "Extra unknown properties (should fail with additionalProperties: false)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "test:agent",
      "beat_index": 100,
      "state": "executing",
      "hlc": "7ffd:0001:abcd",
      "unknown_field": "should not be allowed"
    },
    "expected_errors": ["Additional property unknown_field is not allowed"]
  }
]
@@ -0,0 +1,135 @@
[
  {
    "description": "Worker executing a batch processing task",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "search-indexer:worker-03",
      "task_id": "index-batch:20250905-120",
      "beat_index": 1337,
      "state": "executing",
      "beats_left": 3,
      "progress": 0.65,
      "notes": "processing batch 120/200",
      "hlc": "7ffd:0001:beef",
      "resources": {
        "cpu_percent": 85.0,
        "memory_mb": 2048,
        "disk_io_ops": 1250,
        "network_kb": 512
      }
    }
  },
  {
    "description": "Failed backup agent with error details",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "agent:backup-runner",
      "beat_index": 1338,
      "state": "failed",
      "beats_left": 0,
      "progress": 0.0,
      "notes": "connection timeout to storage backend",
      "hlc": "7ffe:0002:dead",
      "metadata": {
        "agent_version": "2.1.0",
        "error_code": "STORAGE_TIMEOUT",
        "retry_count": 3
      }
    }
  },
  {
    "description": "ML trainer helping another node",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "ml-trainer:gpu-node-1",
      "beat_index": 1336,
      "state": "helping",
      "progress": 1.0,
      "notes": "completed own work, assisting node-2 with large model",
      "hlc": "7ffc:0005:cafe",
      "dependencies": ["ml-trainer:gpu-node-2"]
    }
  },
  {
    "description": "Idle agent waiting for work",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "worker:standby-01",
      "beat_index": 1339,
      "state": "idle",
      "progress": 0.0,
      "hlc": "8000:0000:1111"
    }
  },
  {
    "description": "Agent in planning phase",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "coordinator:main",
      "task_id": "deploy:v2.1.0",
      "beat_index": 1340,
      "state": "planning",
      "beats_left": 5,
      "progress": 0.2,
      "notes": "analyzing dependency graph",
      "hlc": "8001:0001:2222",
      "resources": {
        "cpu_percent": 15.0,
        "memory_mb": 512
      }
    }
  },
  {
    "description": "Reviewing agent with completed task",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "quality-checker:auto",
      "task_id": "validate:batch-45",
      "beat_index": 1341,
      "state": "reviewing",
      "beats_left": 1,
      "progress": 0.9,
      "notes": "final verification of output quality",
      "hlc": "8002:0002:3333"
    }
  },
  {
    "description": "Completed agent ready for next task",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "processor:fast-01",
      "task_id": "process:item-567",
      "beat_index": 1342,
      "state": "completed",
      "beats_left": 0,
      "progress": 1.0,
      "notes": "item processed successfully",
      "hlc": "8003:0003:4444"
    }
  },
  {
    "description": "Blocked agent waiting for external dependency",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "data-loader:external",
      "task_id": "load:dataset-789",
      "beat_index": 1343,
      "state": "blocked",
      "beats_left": 10,
      "progress": 0.1,
      "notes": "waiting for external API rate limit reset",
      "hlc": "8004:0004:5555",
      "dependencies": ["external-api:rate-limiter"]
    }
  },
  {
    "description": "Minimal valid status claim (only required fields)",
    "message": {
      "type": "backbeat.statusclaim.v1",
      "agent_id": "simple:agent",
      "beat_index": 1344,
      "state": "idle",
      "hlc": "8005:0005:6666"
    }
  }
]
206
BACKBEAT-prototype/contracts/tests/integration/Makefile
Normal file
@@ -0,0 +1,206 @@
# BACKBEAT Contracts CI Integration Makefile

# Variables
SCHEMA_DIR = ../../schemas
EXAMPLES_DIR = ../examples
CLI_TOOL = ./cmd/backbeat-validate
BINARY_NAME = backbeat-validate

# Default target
.PHONY: all
all: build test

# Build the CLI validation tool
.PHONY: build
build:
	@echo "Building BACKBEAT validation CLI tool..."
	go build -o $(BINARY_NAME) $(CLI_TOOL)

# Run all tests
.PHONY: test
test: test-schemas test-examples test-integration

# Test schema files are valid
.PHONY: test-schemas
test-schemas:
	@echo "Testing JSON schema files..."
	@for schema in $(SCHEMA_DIR)/*.schema.json; do \
		echo "Validating schema: $$schema"; \
		python3 -c "import json; json.load(open('$$schema'))" || exit 1; \
	done

# Test all example files
.PHONY: test-examples
test-examples: build
	@echo "Testing example messages..."
	./$(BINARY_NAME) --schemas $(SCHEMA_DIR) --dir $(EXAMPLES_DIR)

# Run Go integration tests
.PHONY: test-integration
test-integration:
	@echo "Running Go integration tests..."
	go test -v ./...

# Validate built-in examples
.PHONY: validate-examples
validate-examples: build
	@echo "Validating built-in examples..."
	./$(BINARY_NAME) --schemas $(SCHEMA_DIR) --examples

# Validate a specific directory (for CI use)
.PHONY: validate-dir
validate-dir: build
	@if [ -z "$(DIR)" ]; then \
		echo "Usage: make validate-dir DIR=/path/to/messages"; \
		exit 1; \
	fi
	./$(BINARY_NAME) --schemas $(SCHEMA_DIR) --dir $(DIR) --exit-code

# Validate a specific file (for CI use)
.PHONY: validate-file
validate-file: build
	@if [ -z "$(FILE)" ]; then \
		echo "Usage: make validate-file FILE=/path/to/message.json"; \
		exit 1; \
	fi
	./$(BINARY_NAME) --schemas $(SCHEMA_DIR) --file $(FILE) --exit-code

# Clean build artifacts
.PHONY: clean
clean:
	rm -f $(BINARY_NAME)

# Install dependencies
.PHONY: deps
deps:
	go mod tidy
	go mod download

# Format Go code
.PHONY: fmt
fmt:
	go fmt ./...

# Run static analysis
.PHONY: lint
lint:
	go vet ./...

# Generate CI configuration examples
.PHONY: examples
examples: generate-github-actions generate-gitlab-ci generate-makefile-example

# Generate GitHub Actions workflow
.PHONY: generate-github-actions
generate-github-actions:
	@echo "Generating GitHub Actions workflow..."
	@mkdir -p ci-examples
	@cat > ci-examples/github-actions.yml << 'EOF'\
	name: BACKBEAT Contract Validation\
	\
	on:\
	  push:\
	    branches: [ main, develop ]\
	  pull_request:\
	    branches: [ main ]\
	\
	jobs:\
	  validate-backbeat-messages:\
	    runs-on: ubuntu-latest\
	\
	    steps:\
	    - uses: actions/checkout@v4\
	      with:\
	        repository: 'chorus-services/backbeat'\
	        path: 'backbeat-contracts'\
	\
	    - uses: actions/checkout@v4\
	      with:\
	        path: 'current-repo'\
	\
	    - name: Set up Go\
	      uses: actions/setup-go@v4\
	      with:\
	        go-version: '1.22'\
	\
	    - name: Build BACKBEAT validator\
	      run: |\
	        cd backbeat-contracts/contracts/tests/integration\
	        make build\
	\
	    - name: Validate BACKBEAT messages\
	      run: |\
	        cd backbeat-contracts/contracts/tests/integration\
	        ./backbeat-validate \\\
	          --schemas ../../schemas \\\
	          --dir ../../../current-repo/path/to/messages \\\
	          --exit-code\
	EOF

# Generate GitLab CI configuration
.PHONY: generate-gitlab-ci
generate-gitlab-ci:
	@echo "Generating GitLab CI configuration..."
	@mkdir -p ci-examples
	@cat > ci-examples/gitlab-ci.yml << 'EOF'\
	validate-backbeat-contracts:\
	  stage: test\
	  image: golang:1.22\
	\
	  before_script:\
	    - git clone https://github.com/chorus-services/backbeat.git /tmp/backbeat\
	    - cd /tmp/backbeat/contracts/tests/integration\
	    - make deps build\
	\
	  script:\
	    - /tmp/backbeat/contracts/tests/integration/backbeat-validate \\\
	        --schemas /tmp/backbeat/contracts/schemas \\\
	        --dir $$CI_PROJECT_DIR/path/to/messages \\\
	        --exit-code\
	\
	  only:\
	    - merge_requests\
	    - main\
	    - develop\
	EOF

# Generate example Makefile for downstream projects
.PHONY: generate-makefile-example
generate-makefile-example:
	@echo "Generating example Makefile for downstream projects..."
	@mkdir -p ci-examples
	@echo "# Example Makefile for BACKBEAT contract validation" > ci-examples/downstream-makefile
	@echo "" >> ci-examples/downstream-makefile
	@echo "BACKBEAT_REPO = https://github.com/chorus-services/backbeat.git" >> ci-examples/downstream-makefile
	@echo "BACKBEAT_DIR = .backbeat-contracts" >> ci-examples/downstream-makefile
	@echo "" >> ci-examples/downstream-makefile
	@echo "validate-backbeat:" >> ci-examples/downstream-makefile
	@echo "	git clone \$$(BACKBEAT_REPO) \$$(BACKBEAT_DIR) 2>/dev/null || true" >> ci-examples/downstream-makefile
	@echo "	cd \$$(BACKBEAT_DIR)/contracts/tests/integration && make build" >> ci-examples/downstream-makefile
	@echo "	\$$(BACKBEAT_DIR)/contracts/tests/integration/backbeat-validate --schemas \$$(BACKBEAT_DIR)/contracts/schemas --dir messages --exit-code" >> ci-examples/downstream-makefile

# Help target
.PHONY: help
help:
	@echo "BACKBEAT Contracts CI Integration Makefile"
	@echo ""
	@echo "Available targets:"
	@echo "  all                      - Build and test everything"
	@echo "  build                    - Build the CLI validation tool"
	@echo "  test                     - Run all tests"
	@echo "  test-schemas             - Validate JSON schema files"
	@echo "  test-examples            - Test example message files"
	@echo "  test-integration         - Run Go integration tests"
	@echo "  validate-examples        - Validate built-in examples"
	@echo "  validate-dir DIR=path    - Validate messages in directory"
	@echo "  validate-file FILE=path  - Validate single message file"
	@echo "  clean                    - Clean build artifacts"
	@echo "  deps                     - Install Go dependencies"
	@echo "  fmt                      - Format Go code"
	@echo "  lint                     - Run static analysis"
	@echo "  examples                 - Generate CI configuration examples"
	@echo "  help                     - Show this help message"
	@echo ""
	@echo "Examples:"
	@echo "  make validate-dir DIR=../../../examples"
	@echo "  make validate-file FILE=../../../examples/beatframe-valid.json"
279
BACKBEAT-prototype/contracts/tests/integration/ci_helper.go
Normal file
@@ -0,0 +1,279 @@
// Package integration provides CI helper functions for BACKBEAT contract testing
package integration

import (
	"encoding/json"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// CIHelper provides utilities for continuous integration testing
type CIHelper struct {
	validator *MessageValidator
}

// NewCIHelper creates a new CI helper with a message validator
func NewCIHelper(schemaDir string) (*CIHelper, error) {
	validator, err := NewMessageValidator(schemaDir)
	if err != nil {
		return nil, fmt.Errorf("failed to create validator: %w", err)
	}

	return &CIHelper{
		validator: validator,
	}, nil
}

// ValidateDirectory validates all JSON files in a directory against BACKBEAT schemas
func (ci *CIHelper) ValidateDirectory(dir string) (*DirectoryValidationResult, error) {
	result := &DirectoryValidationResult{
		Directory: dir,
		Files:     make(map[string]*FileValidationResult),
	}

	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}

		// Skip non-JSON files
		if d.IsDir() || !strings.HasSuffix(strings.ToLower(path), ".json") {
			return nil
		}

		fileResult, validateErr := ci.validateFile(path)
		if validateErr != nil {
			result.Errors = append(result.Errors, fmt.Sprintf("Failed to validate %s: %v", path, validateErr))
		} else {
			relPath, _ := filepath.Rel(dir, path)
			result.Files[relPath] = fileResult
			result.TotalFiles++
			if fileResult.AllValid {
				result.ValidFiles++
			} else {
				result.InvalidFiles++
			}
		}

		return nil
	})

	if err != nil {
		return nil, fmt.Errorf("failed to walk directory: %w", err)
	}

	// Guard against division by zero when the directory contains no JSON files
	if result.TotalFiles > 0 {
		result.ValidationRate = float64(result.ValidFiles) / float64(result.TotalFiles)
	}
	return result, nil
}

// validateFile validates a single JSON file
func (ci *CIHelper) validateFile(filePath string) (*FileValidationResult, error) {
	data, err := os.ReadFile(filePath)
	if err != nil {
		return nil, fmt.Errorf("failed to read file: %w", err)
	}

	result := &FileValidationResult{
		FilePath: filePath,
		AllValid: true,
	}

	// Try to parse as single message first
	var singleMessage map[string]interface{}
	if err := json.Unmarshal(data, &singleMessage); err == nil {
		if msgType, hasType := singleMessage["type"].(string); hasType && ci.validator.IsMessageTypeSupported(msgType) {
			// Single BACKBEAT message
			validationResult, validateErr := ci.validator.ValidateMessage(data)
			if validateErr != nil {
				return nil, validateErr
			}
			result.Messages = []*ValidationResult{validationResult}
			result.AllValid = validationResult.Valid
			return result, nil
		}
	}

	// Try to parse as array of messages
	var messageArray []map[string]interface{}
	if err := json.Unmarshal(data, &messageArray); err == nil {
		for i, msg := range messageArray {
			msgBytes, marshalErr := json.Marshal(msg)
			if marshalErr != nil {
				result.Errors = append(result.Errors, fmt.Sprintf("Message %d: failed to marshal: %v", i, marshalErr))
				result.AllValid = false
				continue
			}

			validationResult, validateErr := ci.validator.ValidateMessage(msgBytes)
			if validateErr != nil {
				result.Errors = append(result.Errors, fmt.Sprintf("Message %d: validation error: %v", i, validateErr))
				result.AllValid = false
				continue
			}

			result.Messages = append(result.Messages, validationResult)
			if !validationResult.Valid {
				result.AllValid = false
			}
		}
		return result, nil
	}

	// Try to parse as examples format (array with description and message fields)
	var examples []ExampleMessage
	if err := json.Unmarshal(data, &examples); err == nil {
		for i, example := range examples {
			msgBytes, marshalErr := json.Marshal(example.Message)
			if marshalErr != nil {
				result.Errors = append(result.Errors, fmt.Sprintf("Example %d (%s): failed to marshal: %v", i, example.Description, marshalErr))
				result.AllValid = false
				continue
			}

			validationResult, validateErr := ci.validator.ValidateMessage(msgBytes)
			if validateErr != nil {
				result.Errors = append(result.Errors, fmt.Sprintf("Example %d (%s): validation error: %v", i, example.Description, validateErr))
				result.AllValid = false
				continue
			}

			result.Messages = append(result.Messages, validationResult)
			if !validationResult.Valid {
				result.AllValid = false
			}
		}
		return result, nil
	}

	return nil, fmt.Errorf("file does not contain valid JSON message format")
}

// ExampleMessage represents a message example with description
type ExampleMessage struct {
	Description string                 `json:"description"`
	Message     map[string]interface{} `json:"message"`
}

// DirectoryValidationResult contains results for validating a directory
type DirectoryValidationResult struct {
	Directory      string                           `json:"directory"`
	TotalFiles     int                              `json:"total_files"`
	ValidFiles     int                              `json:"valid_files"`
	InvalidFiles   int                              `json:"invalid_files"`
	ValidationRate float64                          `json:"validation_rate"`
	Files          map[string]*FileValidationResult `json:"files"`
	Errors         []string                         `json:"errors,omitempty"`
}

// FileValidationResult contains results for validating a single file
type FileValidationResult struct {
	FilePath string              `json:"file_path"`
	AllValid bool                `json:"all_valid"`
	Messages []*ValidationResult `json:"messages"`
	Errors   []string            `json:"errors,omitempty"`
}

// GenerateCIReport generates a formatted report suitable for CI systems
func (ci *CIHelper) GenerateCIReport(result *DirectoryValidationResult) string {
	var sb strings.Builder

	sb.WriteString("BACKBEAT Contract Validation Report\n")
	sb.WriteString("===================================\n\n")

	sb.WriteString(fmt.Sprintf("Directory: %s\n", result.Directory))
	sb.WriteString(fmt.Sprintf("Total Files: %d\n", result.TotalFiles))
	sb.WriteString(fmt.Sprintf("Valid Files: %d\n", result.ValidFiles))
	sb.WriteString(fmt.Sprintf("Invalid Files: %d\n", result.InvalidFiles))
	sb.WriteString(fmt.Sprintf("Validation Rate: %.2f%%\n\n", result.ValidationRate*100))

	if len(result.Errors) > 0 {
		sb.WriteString("Directory-level Errors:\n")
		for _, err := range result.Errors {
			sb.WriteString(fmt.Sprintf("  - %s\n", err))
		}
		sb.WriteString("\n")
	}

	// Group files by validation status
	validFiles := make([]string, 0)
	invalidFiles := make([]string, 0)

	for filePath, fileResult := range result.Files {
		if fileResult.AllValid {
			validFiles = append(validFiles, filePath)
		} else {
			invalidFiles = append(invalidFiles, filePath)
		}
	}

	if len(validFiles) > 0 {
		sb.WriteString("Valid Files:\n")
		for _, file := range validFiles {
			sb.WriteString(fmt.Sprintf("  ✓ %s\n", file))
		}
		sb.WriteString("\n")
	}

	if len(invalidFiles) > 0 {
		sb.WriteString("Invalid Files:\n")
		for _, file := range invalidFiles {
			fileResult := result.Files[file]
			sb.WriteString(fmt.Sprintf("  ✗ %s\n", file))

			for _, err := range fileResult.Errors {
				sb.WriteString(fmt.Sprintf("    - %s\n", err))
			}

			for i, msg := range fileResult.Messages {
				if !msg.Valid {
					sb.WriteString(fmt.Sprintf("    Message %d (%s):\n", i+1, msg.MessageType))
					for _, valErr := range msg.Errors {
						sb.WriteString(fmt.Sprintf("      - %s: %s\n", valErr.Field, valErr.Message))
					}
				}
			}
			sb.WriteString("\n")
		}
	}

	return sb.String()
}

// ExitWithStatus exits the program with appropriate status code for CI
func (ci *CIHelper) ExitWithStatus(result *DirectoryValidationResult) {
	if result.InvalidFiles > 0 || len(result.Errors) > 0 {
		fmt.Fprint(os.Stderr, ci.GenerateCIReport(result))
		os.Exit(1)
	} else {
		fmt.Print(ci.GenerateCIReport(result))
		os.Exit(0)
	}
}

// ValidateExamples validates the built-in example messages
func (ci *CIHelper) ValidateExamples() ([]*ValidationResult, error) {
	examples := ExampleMessages()
	results := make([]*ValidationResult, 0, len(examples))

	for name, example := range examples {
		result, err := ci.validator.ValidateStruct(example)
		if err != nil {
			return nil, fmt.Errorf("failed to validate example %s: %w", name, err)
		}
		results = append(results, result)
	}

	return results, nil
}

// GetSchemaInfo returns information about loaded schemas
func (ci *CIHelper) GetSchemaInfo() map[string]string {
	info := make(map[string]string)
	for _, msgType := range ci.validator.GetSupportedMessageTypes() {
		info[msgType] = getSchemaVersion(msgType)
	}
	return info
}
@@ -0,0 +1,184 @@
// Command backbeat-validate provides CLI validation of BACKBEAT messages for CI integration
package main

import (
	"encoding/json"
	"flag"
	"fmt"
	"os"
	"path/filepath"
	"strings"

	"github.com/chorus-services/backbeat/contracts/tests/integration"
)

func main() {
	var (
		schemaDir    = flag.String("schemas", "", "Path to BACKBEAT schema directory (required)")
		validateDir  = flag.String("dir", "", "Directory to validate (optional)")
		validateFile = flag.String("file", "", "Single file to validate (optional)")
		messageJSON  = flag.String("message", "", "JSON message to validate (optional)")
		examples     = flag.Bool("examples", false, "Validate built-in examples")
		quiet        = flag.Bool("quiet", false, "Only output errors")
		jsonOutput   = flag.Bool("json", false, "Output results as JSON")
		exitCode     = flag.Bool("exit-code", true, "Exit with non-zero code on validation failures")
	)
	flag.Parse()

	if *schemaDir == "" {
		fmt.Fprintf(os.Stderr, "Error: --schemas parameter is required\n")
		flag.Usage()
		os.Exit(1)
	}

	// Create CI helper
	helper, err := integration.NewCIHelper(*schemaDir)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error creating validator: %v\n", err)
		os.Exit(1)
	}

	// Determine what to validate
	switch {
	case *examples:
		validateExamples(helper, *quiet, *jsonOutput, *exitCode)
	case *validateDir != "":
		validateDirectory(helper, *validateDir, *quiet, *jsonOutput, *exitCode)
	case *validateFile != "":
		validateSingleFile(helper, *validateFile, *quiet, *jsonOutput, *exitCode)
	case *messageJSON != "":
		validateMessage(helper, *messageJSON, *quiet, *jsonOutput, *exitCode)
	default:
		fmt.Fprintf(os.Stderr, "Error: must specify one of --dir, --file, --message, or --examples\n")
		flag.Usage()
		os.Exit(1)
	}
}

func validateExamples(helper *integration.CIHelper, quiet, jsonOutput, exitOnError bool) {
	results, err := helper.ValidateExamples()
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error validating examples: %v\n", err)
		os.Exit(1)
	}

	invalidCount := 0
	for _, result := range results {
		if !result.Valid {
			invalidCount++
		}

		if !quiet || !result.Valid {
			if jsonOutput {
				jsonBytes, _ := json.MarshalIndent(result, "", "  ")
				fmt.Println(string(jsonBytes))
			} else {
				fmt.Print(integration.PrettyPrintValidationResult(result))
				fmt.Println(strings.Repeat("-", 50))
			}
		}
	}

	if !quiet {
		fmt.Printf("\nSummary: %d total, %d valid, %d invalid\n", len(results), len(results)-invalidCount, invalidCount)
	}

	if exitOnError && invalidCount > 0 {
		os.Exit(1)
	}
}

func validateDirectory(helper *integration.CIHelper, dir string, quiet, jsonOutput, exitOnError bool) {
	result, err := helper.ValidateDirectory(dir)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error validating directory: %v\n", err)
		os.Exit(1)
	}

	if jsonOutput {
		jsonBytes, _ := json.MarshalIndent(result, "", "  ")
		fmt.Println(string(jsonBytes))
	} else if !quiet {
		fmt.Print(helper.GenerateCIReport(result))
	}

	if exitOnError && (result.InvalidFiles > 0 || len(result.Errors) > 0) {
		if quiet {
			fmt.Fprintf(os.Stderr, "Validation failed: %d invalid files, %d errors\n", result.InvalidFiles, len(result.Errors))
		}
		os.Exit(1)
	}
}

func validateSingleFile(helper *integration.CIHelper, filePath string, quiet, jsonOutput, exitOnError bool) {
	// Validate the file's containing directory, then pick out this file's result
	tmpDir := filepath.Dir(filePath)
	result, err := helper.ValidateDirectory(tmpDir)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error validating file: %v\n", err)
		os.Exit(1)
	}

	// Filter results to just this file
	fileName := filepath.Base(filePath)
	fileResult, exists := result.Files[fileName]
	if !exists {
		fmt.Fprintf(os.Stderr, "File was not validated (may not contain BACKBEAT messages)\n")
		os.Exit(1)
	}

	if jsonOutput {
		jsonBytes, _ := json.MarshalIndent(fileResult, "", "  ")
		fmt.Println(string(jsonBytes))
	} else if !quiet {
		fmt.Printf("File: %s\n", fileName)
		fmt.Printf("Valid: %t\n", fileResult.AllValid)

		if len(fileResult.Errors) > 0 {
			fmt.Println("Errors:")
			for _, err := range fileResult.Errors {
				fmt.Printf("  - %s\n", err)
			}
		}

		for i, msg := range fileResult.Messages {
			fmt.Printf("\nMessage %d:\n", i+1)
			fmt.Print(integration.PrettyPrintValidationResult(msg))
		}
	}

	if exitOnError && !fileResult.AllValid {
		if quiet {
			fmt.Fprintf(os.Stderr, "Validation failed\n")
		}
		os.Exit(1)
	}
}

func validateMessage(helper *integration.CIHelper, messageJSON string, quiet, jsonOutput, exitOnError bool) {
	validator, err := integration.NewMessageValidator(flag.Lookup("schemas").Value.String())
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error creating validator: %v\n", err)
		os.Exit(1)
	}

	result, err := validator.ValidateMessageString(messageJSON)
	if err != nil {
		fmt.Fprintf(os.Stderr, "Error validating message: %v\n", err)
		os.Exit(1)
	}

	if jsonOutput {
		jsonBytes, _ := json.MarshalIndent(result, "", "  ")
		fmt.Println(string(jsonBytes))
	} else if !quiet {
		fmt.Print(integration.PrettyPrintValidationResult(result))
	}

	if exitOnError && !result.Valid {
		if quiet {
			fmt.Fprintf(os.Stderr, "Validation failed\n")
		}
		os.Exit(1)
	}
}
283
BACKBEAT-prototype/contracts/tests/integration/validator.go
Normal file
@@ -0,0 +1,283 @@
// Package integration provides CI validation helpers for BACKBEAT conformance testing
package integration

import (
	"encoding/json"
	"fmt"
	"path/filepath"
	"strings"

	"github.com/xeipuuv/gojsonschema"
)

// MessageValidator provides validation for BACKBEAT messages against JSON schemas
type MessageValidator struct {
	schemaLoaders map[string]gojsonschema.JSONLoader
}

// MessageType constants for the three core BACKBEAT interfaces
const (
	BeatFrameType   = "backbeat.beatframe.v1"
	StatusClaimType = "backbeat.statusclaim.v1"
	BarReportType   = "backbeat.barreport.v1"
)

// ValidationError represents a validation failure with context
type ValidationError struct {
	MessageType string   `json:"message_type"`
	Field       string   `json:"field"`
	Value       string   `json:"value"`
	Message     string   `json:"message"`
	Errors      []string `json:"errors"`
}

func (ve ValidationError) Error() string {
	return fmt.Sprintf("validation failed for %s: %s", ve.MessageType, strings.Join(ve.Errors, "; "))
}

// ValidationResult contains the outcome of message validation
type ValidationResult struct {
	Valid         bool              `json:"valid"`
	MessageType   string            `json:"message_type"`
	Errors        []ValidationError `json:"errors,omitempty"`
	SchemaVersion string            `json:"schema_version"`
}

// NewMessageValidator creates a new validator with schema loaders
func NewMessageValidator(schemaDir string) (*MessageValidator, error) {
	validator := &MessageValidator{
		schemaLoaders: make(map[string]gojsonschema.JSONLoader),
	}

	// Load all schema files
	schemas := map[string]string{
		BeatFrameType:   "beatframe-v1.schema.json",
		StatusClaimType: "statusclaim-v1.schema.json",
		BarReportType:   "barreport-v1.schema.json",
	}

	for msgType, schemaFile := range schemas {
		schemaPath := filepath.Join(schemaDir, schemaFile)
		loader := gojsonschema.NewReferenceLoader("file://" + schemaPath)
		validator.schemaLoaders[msgType] = loader
	}

	return validator, nil
}

// ValidateMessage validates a JSON message against the appropriate BACKBEAT schema
func (v *MessageValidator) ValidateMessage(messageJSON []byte) (*ValidationResult, error) {
	// Parse message to determine type
	var msgMap map[string]interface{}
	if err := json.Unmarshal(messageJSON, &msgMap); err != nil {
		return nil, fmt.Errorf("failed to parse JSON: %w", err)
	}

	msgType, ok := msgMap["type"].(string)
	if !ok {
		return &ValidationResult{
			Valid:       false,
			MessageType: "unknown",
			Errors: []ValidationError{
				{
					Field:   "type",
					Message: "message type field is missing or not a string",
					Errors:  []string{"type field is required and must be a string"},
				},
			},
		}, nil
	}

	// Get appropriate schema loader
	schemaLoader, exists := v.schemaLoaders[msgType]
	if !exists {
		return &ValidationResult{
			Valid:       false,
			MessageType: msgType,
			Errors: []ValidationError{
				{
					Field:   "type",
					Value:   msgType,
					Message: fmt.Sprintf("unsupported message type: %s", msgType),
					Errors:  []string{fmt.Sprintf("message type %s is not supported by BACKBEAT contracts", msgType)},
				},
			},
		}, nil
	}

	// Validate against schema
	docLoader := gojsonschema.NewBytesLoader(messageJSON)
	result, err := gojsonschema.Validate(schemaLoader, docLoader)
	if err != nil {
		return nil, fmt.Errorf("schema validation failed: %w", err)
	}

	validationResult := &ValidationResult{
		Valid:         result.Valid(),
		MessageType:   msgType,
		SchemaVersion: getSchemaVersion(msgType),
	}

	if !result.Valid() {
		for _, desc := range result.Errors() {
			validationResult.Errors = append(validationResult.Errors, ValidationError{
				MessageType: msgType,
				Field:       desc.Field(),
				Value:       fmt.Sprintf("%v", desc.Value()),
				Message:     desc.Description(),
				Errors:      []string{desc.String()},
			})
		}
	}

	return validationResult, nil
}

// ValidateMessageString validates a JSON message string
func (v *MessageValidator) ValidateMessageString(messageJSON string) (*ValidationResult, error) {
	return v.ValidateMessage([]byte(messageJSON))
}

// ValidateStruct validates a Go struct by marshaling to JSON first
func (v *MessageValidator) ValidateStruct(message interface{}) (*ValidationResult, error) {
	jsonBytes, err := json.Marshal(message)
	if err != nil {
		return nil, fmt.Errorf("failed to marshal struct to JSON: %w", err)
	}
	return v.ValidateMessage(jsonBytes)
}

// BatchValidate validates multiple messages and returns aggregated results
func (v *MessageValidator) BatchValidate(messages [][]byte) ([]*ValidationResult, error) {
	results := make([]*ValidationResult, len(messages))
	for i, msg := range messages {
		result, err := v.ValidateMessage(msg)
		if err != nil {
			return nil, fmt.Errorf("failed to validate message %d: %w", i, err)
		}
		results[i] = result
	}
	return results, nil
}

// GetSupportedMessageTypes returns the list of supported BACKBEAT message types
func (v *MessageValidator) GetSupportedMessageTypes() []string {
	types := make([]string, 0, len(v.schemaLoaders))
	for msgType := range v.schemaLoaders {
		types = append(types, msgType)
	}
	return types
}

// IsMessageTypeSupported checks if a message type is supported
func (v *MessageValidator) IsMessageTypeSupported(msgType string) bool {
	_, exists := v.schemaLoaders[msgType]
	return exists
}

// getSchemaVersion returns the version for a given message type
func getSchemaVersion(msgType string) string {
	versions := map[string]string{
		BeatFrameType:   "1.0.0",
		StatusClaimType: "1.0.0",
		BarReportType:   "1.0.0",
	}
	return versions[msgType]
}

// ValidationStats provides summary statistics for batch validation
type ValidationStats struct {
	TotalMessages   int            `json:"total_messages"`
	ValidMessages   int            `json:"valid_messages"`
	InvalidMessages int            `json:"invalid_messages"`
	MessageTypes    map[string]int `json:"message_types"`
	ErrorSummary    map[string]int `json:"error_summary"`
	ValidationRate  float64        `json:"validation_rate"`
}

// GetValidationStats computes statistics from validation results
func GetValidationStats(results []*ValidationResult) *ValidationStats {
	stats := &ValidationStats{
		TotalMessages: len(results),
		MessageTypes:  make(map[string]int),
		ErrorSummary:  make(map[string]int),
	}

	for _, result := range results {
		// Count message types
		stats.MessageTypes[result.MessageType]++

		if result.Valid {
			stats.ValidMessages++
		} else {
			stats.InvalidMessages++
			// Aggregate error types
			for _, err := range result.Errors {
				stats.ErrorSummary[err.Field]++
			}
		}
	}

	if stats.TotalMessages > 0 {
		stats.ValidationRate = float64(stats.ValidMessages) / float64(stats.TotalMessages)
|
||||
}
|
||||
|
||||
return stats
|
||||
}
|
||||
|
||||
// ExampleMessages provides sample messages for testing and documentation
|
||||
func ExampleMessages() map[string]interface{} {
|
||||
return map[string]interface{}{
|
||||
"beatframe_minimal": map[string]interface{}{
|
||||
"type": BeatFrameType,
|
||||
"cluster_id": "test-cluster",
|
||||
"beat_index": 0,
|
||||
"downbeat": true,
|
||||
"phase": "plan",
|
||||
"hlc": "0001:0000:cafe",
|
||||
"deadline_at": "2025-09-05T12:00:30Z",
|
||||
"tempo_bpm": 2.0,
|
||||
"window_id": "a1b2c3d4e5f6789012345678901234ab",
|
||||
},
|
||||
"statusclaim_minimal": map[string]interface{}{
|
||||
"type": StatusClaimType,
|
||||
"agent_id": "test:agent",
|
||||
"beat_index": 100,
|
||||
"state": "idle",
|
||||
"hlc": "7ffd:0001:abcd",
|
||||
},
|
||||
"barreport_minimal": map[string]interface{}{
|
||||
"type": BarReportType,
|
||||
"window_id": "7e9b0e6c4c9a4e59b7f2d9a3c1b2e4d5",
|
||||
"from_beat": 0,
|
||||
"to_beat": 119,
|
||||
"agents_reporting": 1,
|
||||
"on_time_reviews": 1,
|
||||
"help_promises_fulfilled": 0,
|
||||
"secret_rotations_ok": true,
|
||||
"tempo_drift_ms": 0.0,
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
// PrettyPrintValidationResult formats validation results for human reading
|
||||
func PrettyPrintValidationResult(result *ValidationResult) string {
|
||||
var sb strings.Builder
|
||||
|
||||
sb.WriteString(fmt.Sprintf("Message Type: %s\n", result.MessageType))
|
||||
sb.WriteString(fmt.Sprintf("Schema Version: %s\n", result.SchemaVersion))
|
||||
sb.WriteString(fmt.Sprintf("Valid: %t\n", result.Valid))
|
||||
|
||||
if !result.Valid && len(result.Errors) > 0 {
|
||||
sb.WriteString("\nValidation Errors:\n")
|
||||
for i, err := range result.Errors {
|
||||
sb.WriteString(fmt.Sprintf(" %d. Field: %s\n", i+1, err.Field))
|
||||
if err.Value != "" {
|
||||
sb.WriteString(fmt.Sprintf(" Value: %s\n", err.Value))
|
||||
}
|
||||
sb.WriteString(fmt.Sprintf(" Error: %s\n", err.Message))
|
||||
}
|
||||
}
|
||||
|
||||
return sb.String()
|
||||
}