Files
anthonyrawlins 26e4ef7d8b feat: Implement complete CHORUS leader election system
Major milestone: CHORUS leader election is now fully functional!

## Key Features Implemented:

### 🗳️ Leader Election Core
- Fixed root cause: nodes now trigger elections when no admin exists
- Added randomized election delays to prevent simultaneous elections
- Implemented concurrent election prevention (only one election at a time)
- Added proper election state management and transitions

### 📡 Admin Discovery System
- Enhanced discovery requests with "WHOAMI" debug messages
- Fixed discovery responses to properly include current leader ID
- Added comprehensive discovery request/response logging
- Implemented admin confirmation from multiple sources

### 🔧 Configuration Improvements
- Increased discovery timeout from 3s to 15s for better reliability
- Added proper Docker Hub image deployment workflow
- Updated build process to use correct chorus-agent binary (not deprecated chorus)
- Added static compilation flags for Alpine Linux compatibility

### 🐛 Critical Fixes
- Fixed build process confusion between chorus vs chorus-agent binaries
- Added missing admin_election capability to enable leader elections
- Corrected discovery logic to handle zero admin responses
- Enhanced debugging with detailed state and timing information

## Current Operational Status:
 Admin Election: Working with proper consensus
 Heartbeat System: 15-second intervals from elected admin
 Discovery Protocol: Nodes can find and confirm current admin
 P2P Connectivity: 5+ connected peers with libp2p
 SLURP Functionality: Enabled on admin nodes
 BACKBEAT Integration: Tempo synchronization working
 Container Health: All health checks passing

## Technical Details:
- Election uses weighted scoring based on uptime, capabilities, and resources
- Randomized delays prevent election storms (30-45s wait periods)
- Discovery responses include current leader ID for network-wide consensus
- State management prevents multiple concurrent elections
- Enhanced logging provides full visibility into election process

🎉 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-09-23 13:06:53 +10:00

3.8 KiB

gobreaker

GoDoc

gobreaker implements the Circuit Breaker pattern in Go.

Installation

go get github.com/sony/gobreaker

Usage

The struct CircuitBreaker is a state machine to prevent sending requests that are likely to fail. The function NewCircuitBreaker creates a new CircuitBreaker.

func NewCircuitBreaker(st Settings) *CircuitBreaker

You can configure CircuitBreaker by the struct Settings:

type Settings struct {
	Name          string
	MaxRequests   uint32
	Interval      time.Duration
	Timeout       time.Duration
	ReadyToTrip   func(counts Counts) bool
	OnStateChange func(name string, from State, to State)
	IsSuccessful  func(err error) bool
}
  • Name is the name of the CircuitBreaker.

  • MaxRequests is the maximum number of requests allowed to pass through when the CircuitBreaker is half-open. If MaxRequests is 0, CircuitBreaker allows only 1 request.

  • Interval is the cyclic period of the closed state for CircuitBreaker to clear the internal Counts, described later in this section. If Interval is 0, CircuitBreaker doesn't clear the internal Counts during the closed state.

  • Timeout is the period of the open state, after which the state of CircuitBreaker becomes half-open. If Timeout is 0, the timeout value of CircuitBreaker is set to 60 seconds.

  • ReadyToTrip is called with a copy of Counts whenever a request fails in the closed state. If ReadyToTrip returns true, CircuitBreaker will be placed into the open state. If ReadyToTrip is nil, default ReadyToTrip is used. Default ReadyToTrip returns true when the number of consecutive failures is more than 5.

  • OnStateChange is called whenever the state of CircuitBreaker changes.

  • IsSuccessful is called with the error returned from a request. If IsSuccessful returns true, the error is counted as a success. Otherwise the error is counted as a failure. If IsSuccessful is nil, default IsSuccessful is used, which returns false for all non-nil errors.

The struct Counts holds the numbers of requests and their successes/failures:

type Counts struct {
	Requests             uint32
	TotalSuccesses       uint32
	TotalFailures        uint32
	ConsecutiveSuccesses uint32
	ConsecutiveFailures  uint32
}

CircuitBreaker clears the internal Counts either on the change of the state or at the closed-state intervals. Counts ignores the results of the requests sent before clearing.

CircuitBreaker can wrap any function to send a request:

func (cb *CircuitBreaker) Execute(req func() (interface{}, error)) (interface{}, error)

The method Execute runs the given request if CircuitBreaker accepts it. Execute returns an error instantly if CircuitBreaker rejects the request. Otherwise, Execute returns the result of the request. If a panic occurs in the request, CircuitBreaker handles it as an error and causes the same panic again.

Example

var cb *breaker.CircuitBreaker

func Get(url string) ([]byte, error) {
	body, err := cb.Execute(func() (interface{}, error) {
		resp, err := http.Get(url)
		if err != nil {
			return nil, err
		}

		defer resp.Body.Close()
		body, err := ioutil.ReadAll(resp.Body)
		if err != nil {
			return nil, err
		}

		return body, nil
	})
	if err != nil {
		return nil, err
	}

	return body.([]byte), nil
}

See example for details.

License

The MIT License (MIT)

See LICENSE for details.