# Backbeat Protocol — Pulse/Reverb (v0.1) > **Purpose:** Give CHORUS a shared, lightweight rhythm so multi‑agent, p2p work doesn’t deadlock or drift. Standardise expectations (plan/work/review), exit conditions, promises, and time‑bounded collaboration across CHORUS, HMMM, SLURP, SHHH, UCXL, WHOOSH, and COOEE. --- ## 1) Rationale - **Problem:** In pub/sub meshes, agents can wait indefinitely for help/context; there’s no universal cadence for planning, execution, or re‑evaluation. - **Principle:** Use **coarse, explicit tempo** (beats/bars) for policy alignment; not for hard real‑time sync. Must be **partition‑tolerant**, **observable**, and **cheap**. - **Design:** Human‑readable **beats/bars/phrases** for policy, **Hybrid Logical Clocks (HLC)** for mergeable ordering. --- ## 2) Core Concepts - **Tempo (BPM):** Beats per minute (e.g., 6–30 BPM). Cluster‑level default; task classes may suggest hints. - **Beat:** Base epoch (e.g., 4 s @ 15 BPM). - **Bar:** Group of beats (e.g., 8). **Downbeat** (beat 1) is a soft barrier (checkpoints, secret rotation). - **Phrase:** A sequence of bars that maps to a work cycle: **plan → work → review**. - **Score (per task):** Declarative allocation of beats across phases + wait budgets + retries. --- ## 3) Roles & Components - **Pulse:** Cluster tempo broadcaster. Publishes `BeatFrame` each beat; single elected leader (Raft/etcd), followers can degrade to local. - **Reverb:** Aggregator/rollup. Ingests `StatusClaim`s and emits per‑bar `BarReport`s, plus hints for adaptive tempo. - **Agents (CHORUS workers, HMMM collaborators, SLURP, etc.):** Consume beats, enforce **Score**, publish `StatusClaim`s. - **SHHH:** Rotates short‑lived secrets **on downbeats** (per‑bar keys). - **COOEE/DHT:** Transport for topics `backbeat://cluster/{id}` and per‑project status lanes. ### Implementation Snapshot (2025-10) - **Pulse service (`cmd/pulse`)** – Encapsulates Raft leader election (`internal/backbeat/leader.go`), Hybrid Logical Clock maintenance (`internal/backbeat/hlc.go`), degradation control (`internal/backbeat/degradation.go`), and beat publishing over NATS. It also exposes an admin HTTP surface and collects tempo/drift metrics via `internal/backbeat/metrics.go`. - **Reverb service (`cmd/reverb`)** – Subscribes to pulse beats and agent status subjects, aggregates `StatusClaim`s into rolling windows, and emits `BarReport`s on downbeats. Readiness, health, and Prometheus endpoints report claim throughput, aggregation latency, and NATs failures. - **Go SDK (`pkg/sdk`)** – Provides clients for beat callbacks, status emission, and health reporting with retry/circuit breaker hooks. CHORUS (`project-queues/active/CHORUS/internal/backbeat/integration.go`) and WHOOSH (`project-queues/active/WHOOSH/internal/backbeat/integration.go`) embed the SDK to align runtime operations with cluster tempo. - **Inter-module telemetry** – CHORUS maps P2P lifecycle operations (elections, DHT bootstrap, council delivery) into BACKBEAT status claims, while WHOOSH emits search/composer activity. This keeps Reverb windows authoritative for council health and informs SLURP/BUBBLE provenance. - **Observability bundle** – Monitoring assets (`monitoring/`, `prometheus.yml`) plus service metrics export drift, tempo adjustments, Raft state, and window KPIs, meeting BACKBEAT-PER-001/002/003 targets and enabling WHOOSH scaling gates to react to rhythm degradation. --- ## 4) Wire Model ### 4.1 BeatFrame (Pulse → all) ```json { "cluster_id": "chorus-aus-01", "tempo_bpm": 15, "beat_ms": 4000, "bar_len_beats": 8, "bar": 1287, "beat": 3, "phase": "work", "hlc": "2025-09-03T02:12:27.183Z+1287:3+17", "policy_hash": "sha256:...", "deadline_at": "2025-09-03T02:12:31.183Z" } ``` ### 4.2 StatusClaim (agents → Reverb) ```json { "agent_id": "chorus-192-168-1-27", "task_id": "ucxl://...", "bar": 1287, "beat": 3, "state": "planning|executing|waiting|review|done|failed", "wait_for": ["hmmm://thread/abc"], "beats_left": 2, "progress": 0.42, "notes": "awaiting summarised artifact from peer", "hlc": "..." } ``` ### 4.3 HelpPromise (HMMM → requester) ```json { "thread_id": "hmmm://thread/abc", "promise_beats": 2, "confidence": 0.7, "fail_after_beats": 3, "on_fail": "fallback-plan-A" } ``` ### 4.4 BarReport (Reverb → observability) - Per‑bar rollup: task counts by state, overruns, broken promises, queue depth, utilisation hints, suggested tempo/phase tweak. --- ## 5) Score Spec (YAML) ```yaml score: tempo: 15 # bpm hint; cluster policy can override bar_len: 8 # beats per bar phases: plan: 2 # beats work: 4 review: 2 wait_budget: help: 2 # max beats to wait for HMMM replies across the phrase io: 1 # max beats to wait for I/O retry: max_phrases: 2 backoff: geometric # plan/work/review shrink each retry escalation: on_wait_exhausted: ["emit:needs-attention", "fallback:coarse-answer"] on_overrun: ["checkpoint", "defer:next-phrase"] ``` > **Rule:** Agents must not exceed phase beat allocations. If `help` budget is exhausted, **exit cleanly** with degraded but auditable output. --- ## 6) Agent Loop (sketch) ```text on BeatFrame(bf): if new bar and beat==1: rotate_ephemeral_keys(); checkpoint(); phase = score.phase_for(bf.beat) switch phase: PLAN: if not planned: do_planning_until(phase_end) WORK: if need_help and !help_promised: request_help_with_promise() if waiting_for_help: if wait_beats > score.wait_budget.help: exit_with_fallback() else continue_work_on_alternative_path() else do_work_step() REVIEW: run_tests_and_summarise(); publish StatusClaim(state=done|failed) enforce_cutoffs_at_phase_boundaries() ``` --- ## 7) Adaptive Tempo Controller (ATC) - **Inputs:** Queue depth per role, GPU/CPU util (WHOOSH), overrun frequency, broken promises. - **Policy:** Adjust `tempo_bpm` and/or redistribute phase beats **between bars only** (PI‑style control, hysteresis ±10%). - **Guardrails:** ≤1 beat change per minute; freeze during incidents. --- ## 8) Exit Conditions & Deadlock Prevention - **Wait budgets** are hard ceilings. Missing `HelpPromise` by end‑of‑bar triggers `on_wait_exhausted`. - **Locks & leases** expire at bar boundaries unless renewed with `beats_left`. - **Promises** include `promise_beats` and `fail_after_beats` so callers can plan. - **Idempotent checkpoints** at downbeats enable safe retries/resumptions. --- ## 9) Integration Points - **CHORUS (workers):** Consume `BeatFrame`; enforce `Score`; publish `StatusClaim` each beat/change. - **HMMM (collab):** Replies carry `HelpPromise`; threads auto‑close if `fail_after_beats` elapses. - **SLURP (curation):** Batch ingest windows tied to review beats; produce bar‑stamped artefacts. - **SHHH (secrets):** Rotate per bar; credentials scoped to ``. - **UCXL:** Attach tempo metadata to deliverables: `{bar, beat, hlc}`; optional address suffix `;bar=1287#beat=8`. - **WHOOSH:** Expose utilisation to ATC; enforce resource leases in beat units. - **COOEE/DHT:** Topics: `backbeat://cluster/{id}`, `status://{project}`, `promise://hmmm`. --- ## 10) Failure Modes & Degraded Operation - **No Pulse leader:** Agents derive a **median‑of‑pulses** from available Purses; if none, use local monotonic clock (jitter ok) and **freeze tempo changes**. - **Partitions:** Keep counting beats locally (HLC ensures mergeable order). Reverb reconciles by HLC and bar on heal. - **Drift:** Tempo changes only on downbeats; publish `policy_hash` so agents detect misconfig. --- ## 11) Config Examples ### 11.1 Cluster Tempo Policy ```yaml cluster_id: chorus-aus-01 initial_bpm: 12 bar_len_beats: 8 phases: [plan, work, review] limits: max_bpm: 24 min_bpm: 6 adaptation: enable: true hysteresis_pct: 10 change_per_minute: 1_beat observability: emit_bar_reports: true ``` ### 11.2 Task Score (attached to UCXL deliverable) ```yaml ucxl: ucxl://proj:any/*/task/graph_ingest score: tempo: 15 bar_len: 8 phases: {plan: 2, work: 4, review: 2} wait_budget: {help: 2, io: 1} retry: {max_phrases: 2, backoff: geometric} escalation: on_wait_exhausted: ["emit:needs-attention", "fallback:coarse-answer"] ``` --- ## 12) Observability - **Per‑bar dashboards:** state counts, overruns, broken promises, tempo changes, queue depth, utilisation. - **Trace stamps:** Every artifact/event carries `{bar, beat, hlc}` for forensic replay. - **Alarms:** `promise_miss_rate`, `overrun_rate`, `no_status_claims`. --- ## 13) Security - Rotate **ephemeral keys on downbeats**; scope to project/role when possible. - Bar‑stamped tokens reduce blast radius; revoke at bar+N. --- ## 14) Economics & Budgeting — Beats as Unit of Cost ### 14.1 Beat Unit (BU) - **Definition:** 1 BU = one cluster beat interval (`beat_ms`). It’s the atomic scheduling & accounting quantum. ### 14.2 Resource Primitives (WHOOSH‑measured) - `cpu_sec`, `gpu_sec[class]`, `accel_sec[class]`, `mem_gbs` (GB·s), `disk_io_mb`, `net_egress_mb`, `storage_gbh`. ### 14.3 Budget & Costing ```yaml budget: max_bu: N_total phase_caps: { plan: Np, work: Nw, review: Nr } wait_caps: { help: Nh, io: Ni } hard_end: bar+K charge_to: ucxl://acct/... ``` Cost per phrase: ``` Total = Σ(beats_used * role_rate_bu) + Σ_class(gpu_sec[class] * rate_gpu_sec[class]) + cpu_sec*rate_cpu_sec + mem_gbs*rate_mem_gbs + disk_io_mb*rate_io_mb + net_egress_mb*rate_egress_mb + storage_gbh*rate_storage_gbh ``` ### 14.4 KPIs - **TNT** (tempo‑normalised throughput), **BPD** (beats per deliverable), **WR** (wait ratio), **η** (efficiency), **PMR** (promise miss rate), **CPD** (cost per deliverable), **TTFU** (time to first useful). --- ## 15) Tokenless Accounting (Hybrid CPU/GPU, On‑prem + Cloud) - **No tokens.** Price **beats + measured resources**; ignore model‑token counts. - **Device classes:** price per GPU/accelerator class (A100, 4090, MI300X, TPU…). - **Rates:** on‑prem from TCO / duty‑cycle seconds; cloud from per‑second list prices. Bind via config. - **Beat‑scoped caps:** per‑BU ceilings on resource primitives to contain spend regardless of hardware skew. - **Calibration (planning‑only):** per‑family normalisers if you want **Effective Compute Units** for planning; **billing remains raw seconds**. --- ## 16) MVP Bring‑up Plan 1. **Pulse**: static BPM, broadcast `BeatFrame` over COOEE. 2. **Agents**: publish `StatusClaim`; enforce `wait_budget` & `HelpPromise`. 3. **Reverb**: roll up to `BarReport`; surface early KPIs. 4. **SHHH**: rotate credentials on downbeats. 5. **ATC**: enable adaptation after telemetry. --- ## 17) Open Questions - Per‑role tempi vs one cluster tempo? - Fixed `bar_len` vs dynamic redistribution of phase beats? - Score UI: YAML + visual “score sheet” editor? --- ### Naming (on brand) - **Backbeat Protocol** — **Pulse** (broadcaster) + **Reverb** (rollup & reports). Musical, expressive; conveys ripples from each downbeat. # Backbeat — Relative Beats Addendum (UCXL ^^/~~) **Why this addendum?** We’re removing dependence on ever‑increasing `bar`/`beat` counters. All coordination is expressed **relative to NOW** in **beats**, aligned with UCXL temporal markers `^^` (future) and `~~` (past). ## A) Wire Model Adjustments ### BeatFrame (Pulse → all) **Replace** prior fields `{bar, beat}` with: ```json { "cluster_id": "...", "tempo_bpm": 15, "beat_ms": 4000, "bar_len_beats": 8, "beat_index": 3, // 1..bar_len_beats (cyclic within bar) "beat_epoch": "2025-09-03T02:12:27.000Z", // start time of this beat "downbeat": false, // true when beat_index==1 "phase": "work", "hlc": "2025-09-03T02:12:27.183Z+17", "policy_hash": "sha256:...", "deadline_at": "2025-09-03T02:12:31.183Z" } ``` ### StatusClaim (agents → Reverb) **Replace** prior fields `{bar, beat}` with: ```json { "agent_id": "...", "task_id": "...", "beat_index": 3, "state": "planning|executing|waiting|review|done|failed", "beats_left": 2, "progress": 0.42, "notes": "...", "hlc": "..." } ``` ### Bar/Window Aggregation - Reverb aggregates per **window** bounded by `downbeat=true` frames. - **No global bar counters** are transmitted. Observability UIs may keep a local `window_id` for navigation. ## B) UCXL Temporal Suffix - (Requires RFC-UCXL 1.1) Attach **relative beat** navigation to any UCXL address: - `;beats=^^N` → target **N beats in the future** from now - `;beats=~~N` → target **N beats in the past** from now - Optional: `;phase=plan|work|review` **Example:** ``` ucxl://proj:any/*/task/ingest;beats=^^4;phase=work ``` ## C) Policy & Promises - All time budgets are **Δbeats**: `wait_budget.help`, `retry.max_phrases`, `promise_beats`, `fail_after_beats`. - **Leases/locks** renew per beat and expire on phase change unless renewed. ## D) Derivations - `beat_index = 1 + floor( (unix_ms / beat_ms) mod bar_len_beats )` (derived locally). - `beat_epoch = floor_to_multiple(now, beat_ms)`. - `Δbeats(target_time) = round( (target_time - now) / beat_ms )`. ## E) Compatibility Notes - Old fields `{bar, beat}` are **deprecated**; if received, they can be ignored or mapped to local windows. - HLC remains the canonical merge key for causality. ## F) Action Items 1. Update the **spec wire model** sections accordingly. 2. Regenerate the **Go prototype** using `BeatIndex/BeatEpoch/Downbeat` instead of `Bar/Beat` counters. 3. Add UCXL parsing for `;beats=^^/~~` in RUSTLE. - [ ] TODO: RUSTLE update for BACKBEAT compatibility