chore: align slurp config and scaffolding
94
docs/development/sec-slurp-ucxl-beacon-pin-steward.md
Normal file
@@ -0,0 +1,94 @@
# SEC-SLURP UCXL Beacon & Pin Steward Design Notes

## Purpose
- Establish the authoritative UCXL context beacon that bridges SLURP persistence with WHOOSH/role-aware agents.
- Define the Pin Steward responsibilities so DHT replication, healing, and telemetry satisfy SEC-SLURP 1.1a acceptance criteria.
- Provide an incremental execution plan aligned with the Persistence Wiring Report and DHT Resilience Supplement.

## UCXL Beacon Data Model
- **manifest_id** (`string`): deterministic hash of `project:task:address:version`.
- **ucxl_address** (`ucxl.Address`): canonical address that produced the manifest.
- **context_version** (`int`): monotonic version from SLURP temporal graph.
- **source_hash** (`string`): content hash emitted by `persistContext` (LevelDB) for change detection.
- **generated_by** (`string`): CHORUS agent id / role bundle that wrote the context.
- **generated_at** (`time.Time`): timestamp from SLURP persistence event.
- **replica_targets** (`[]string`): desired replica node ids (Pin Steward enforces `replication_factor`).
- **replica_state** (`[]ReplicaInfo`): health snapshot (`node_id`, `provider_id`, `status`, `last_checked`, `latency_ms`).
- **encryption** (`EncryptionMetadata`):
  - `dek_fingerprint` (`string`)
  - `kek_policy` (`string`): BACKBEAT rotation policy identifier.
  - `rotation_due` (`time.Time`)
- **compliance_tags** (`[]string`): SHHH/WHOOSH governance hooks (e.g. `sec-high`, `audit-required`).
- **beacon_metrics** (`BeaconMetrics`): summarized counters for cache hits, DHT retrieves, validation errors.
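
The field list above could map onto Go types roughly as follows. This is a minimal sketch, not the SLURP implementation: the Go field names, the `BeaconMetrics` counter names, the use of a plain `string` in place of `ucxl.Address`, and the choice of SHA-256 for `manifest_id` are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"time"
)

// ReplicaInfo mirrors one entry of replica_state.
type ReplicaInfo struct {
	NodeID      string
	ProviderID  string
	Status      string
	LastChecked time.Time
	LatencyMS   int64
}

// EncryptionMetadata mirrors the encryption block of the manifest.
type EncryptionMetadata struct {
	DEKFingerprint string
	KEKPolicy      string // BACKBEAT rotation policy identifier
	RotationDue    time.Time
}

// BeaconMetrics holds summarized counters (field names are assumed).
type BeaconMetrics struct {
	CacheHits        uint64
	DHTRetrieves     uint64
	ValidationErrors uint64
}

// BeaconManifest is a sketch of the manifest record.
// UCXLAddress is a string stand-in for ucxl.Address.
type BeaconManifest struct {
	ManifestID     string
	UCXLAddress    string
	ContextVersion int
	SourceHash     string
	GeneratedBy    string
	GeneratedAt    time.Time
	ReplicaTargets []string
	ReplicaState   []ReplicaInfo
	Encryption     EncryptionMetadata
	ComplianceTags []string
	Metrics        BeaconMetrics
}

// ManifestID derives a deterministic id from "project:task:address:version".
// SHA-256 with hex encoding is an assumption; the notes only say "deterministic hash".
func ManifestID(project, task, address string, version int) string {
	input := fmt.Sprintf("%s:%s:%s:%d", project, task, address, version)
	sum := sha256.Sum256([]byte(input))
	return hex.EncodeToString(sum[:])
}
```

Calling `ManifestID("chorus", "task-1", "ucxl://demo/ctx", 3)` twice yields the same 64-character hex string, and bumping the version yields a different one. Because a UCXL address may itself contain colons, a real implementation might prefer length-prefixed fields over a plain `:` join.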

### Storage Strategy
- Primary persistence in LevelDB (`pkg/slurp/slurp.go`) using key prefix `beacon::<manifest_id>`.
- Secondary replication to DHT under `dht://beacon/<manifest_id>` enabling WHOOSH agents to read via Pin Steward API.
- Optional export to UCXL Decision Record envelope for historical traceability.
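
The two key layouts above can be kept in one place so that the LevelDB and DHT paths never drift apart. The helper names below are hypothetical; only the `beacon::` prefix and the `dht://beacon/` path come from the notes.

```go
package main

import "strings"

const (
	levelDBPrefix = "beacon::"      // primary LevelDB key prefix
	dhtPrefix     = "dht://beacon/" // secondary DHT location prefix
)

// LevelDBKey builds the primary storage key for a manifest.
func LevelDBKey(manifestID string) string {
	return levelDBPrefix + manifestID
}

// DHTKey builds the DHT location used for secondary replication.
func DHTKey(manifestID string) string {
	return dhtPrefix + manifestID
}

// ManifestIDFromKey recovers the manifest id from either key form.
// The boolean is false when the key matches neither layout or the id is empty.
func ManifestIDFromKey(key string) (string, bool) {
	for _, prefix := range []string{levelDBPrefix, dhtPrefix} {
		if strings.HasPrefix(key, prefix) {
			id := strings.TrimPrefix(key, prefix)
			return id, id != ""
		}
	}
	return "", false
}
```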

## Beacon APIs
| Endpoint | Purpose | Notes |
|----------|---------|-------|
| `Beacon.Upsert(manifest)` | Persist/update manifest | Called by SLURP after `persistContext` success. |
| `Beacon.Get(ucxlAddress)` | Resolve latest manifest | Used by WHOOSH/agents to locate canonical context. |
| `Beacon.List(filter)` | Query manifests by tags/roles/time | Backs dashboards and Pin Steward audits. |
| `Beacon.StreamChanges(since)` | Provide change feed for Pin Steward anti-entropy jobs | Implements backpressure and bookmark tokens. |

All APIs return an envelope with a UCXL citation and checksum so that the SLURP⇄WHOOSH handoff is auditable.
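
One possible shape for these APIs is sketched below with an in-memory store. The `BeaconStore` interface, the trimmed `Manifest` type, the integer bookmark, and the stale-version check are assumptions for illustration; `List`, backpressure, and the citation/checksum envelope are left out to keep the sketch short.

```go
package main

import (
	"errors"
	"sync"
)

// Manifest is a trimmed stand-in for the beacon manifest.
type Manifest struct {
	ManifestID     string
	UCXLAddress    string
	ContextVersion int
}

// ErrNotFound is returned when no manifest exists for an address.
var ErrNotFound = errors.New("beacon: manifest not found")

// BeaconStore covers a hypothetical subset of the Beacon APIs.
type BeaconStore interface {
	Upsert(m Manifest) error
	Get(ucxlAddress string) (Manifest, error)
	StreamChanges(since uint64) (changes []Manifest, next uint64)
}

// MemoryBeacon keeps the latest manifest per address plus an append-only change log.
type MemoryBeacon struct {
	mu     sync.RWMutex
	latest map[string]Manifest
	log    []Manifest
}

// Compile-time check that MemoryBeacon satisfies BeaconStore.
var _ BeaconStore = (*MemoryBeacon)(nil)

func NewMemoryBeacon() *MemoryBeacon {
	return &MemoryBeacon{latest: make(map[string]Manifest)}
}

// Upsert stores the manifest unless it is older than the one already held.
func (b *MemoryBeacon) Upsert(m Manifest) error {
	if m.ManifestID == "" || m.UCXLAddress == "" {
		return errors.New("beacon: manifest_id and ucxl_address are required")
	}
	b.mu.Lock()
	defer b.mu.Unlock()
	if cur, ok := b.latest[m.UCXLAddress]; ok && cur.ContextVersion > m.ContextVersion {
		return errors.New("beacon: stale context_version")
	}
	b.latest[m.UCXLAddress] = m
	b.log = append(b.log, m)
	return nil
}

// Get resolves the latest manifest for a UCXL address.
func (b *MemoryBeacon) Get(ucxlAddress string) (Manifest, error) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	m, ok := b.latest[ucxlAddress]
	if !ok {
		return Manifest{}, ErrNotFound
	}
	return m, nil
}

// StreamChanges returns every change recorded after the bookmark `since`,
// together with the bookmark to pass on the next call.
func (b *MemoryBeacon) StreamChanges(since uint64) ([]Manifest, uint64) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	end := uint64(len(b.log))
	if since > end {
		since = end
	}
	changes := append([]Manifest(nil), b.log[since:]...)
	return changes, end
}
```

A consumer such as the Pin Steward would start with bookmark `0`, process the returned changes, and persist the returned bookmark before polling again.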

## Pin Steward Responsibilities
1. **Replication Planning**
   - Read manifests via `Beacon.StreamChanges`.
   - Evaluate current `replica_state` vs. `replication_factor` from configuration.
   - Produce queue of DHT store/refresh tasks (`storeAsync`, `storeSync`, `storeQuorum`).
2. **Healing & Anti-Entropy**
   - Schedule `heal_under_replicated` jobs every `anti_entropy_interval`.
   - Re-announce providers on Pulse/Reverb when TTL < threshold.
   - Record outcomes back into manifest (`replica_state`).
3. **Envelope Encryption Enforcement**
   - Request KEK material from KACHING/SHHH as described in SEC-SLURP 1.1a.
   - Ensure DEK fingerprints match `encryption` metadata; trigger rotation if stale.
4. **Telemetry Export**
   - Emit Prometheus counters: `pin_steward_replica_heal_total`, `pin_steward_replica_unhealthy`, `pin_steward_encryption_rotations_total`.
   - Surface aggregated health to WHOOSH dashboards for council visibility.
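
The replication-planning step (compare `replica_state` with `replication_factor`, then queue store tasks) might look like the sketch below. The `"healthy"` status value, the `StoreTask` type, and the rule of picking candidate nodes in sorted order are assumptions; the notes do not specify how targets are chosen.

```go
package main

import "sort"

// ReplicaInfo is a trimmed replica_state entry.
type ReplicaInfo struct {
	NodeID string
	Status string // "healthy" is an assumed status value
}

// Manifest carries only the fields the planner needs.
type Manifest struct {
	ManifestID     string
	ReplicaTargets []string
	ReplicaState   []ReplicaInfo
}

// StoreTask is a hypothetical unit of work for the DHT store queue.
type StoreTask struct {
	ManifestID string
	NodeID     string
}

// PlanReplication returns the store tasks needed to bring the number of
// healthy replicas up to replicationFactor. It returns nil when the manifest
// is already sufficiently replicated.
func PlanReplication(m Manifest, replicationFactor int) []StoreTask {
	healthy := make(map[string]bool)
	for _, r := range m.ReplicaState {
		if r.Status == "healthy" {
			healthy[r.NodeID] = true
		}
	}
	missing := replicationFactor - len(healthy)
	if missing <= 0 {
		return nil
	}

	// Candidates are desired targets that do not yet hold a healthy replica.
	seen := make(map[string]bool)
	var candidates []string
	for _, id := range m.ReplicaTargets {
		if !healthy[id] && !seen[id] {
			seen[id] = true
			candidates = append(candidates, id)
		}
	}
	sort.Strings(candidates) // deterministic order keeps plans reproducible
	if missing > len(candidates) {
		missing = len(candidates)
	}

	tasks := make([]StoreTask, 0, missing)
	for _, id := range candidates[:missing] {
		tasks = append(tasks, StoreTask{ManifestID: m.ManifestID, NodeID: id})
	}
	return tasks
}
```

With targets `n1..n4`, one healthy replica on `n1`, and a factor of 3, the planner queues stores for `n2` and `n3`. A periodic anti-entropy job could run the same function over every manifest and treat a non-empty result as an under-replication event.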

## Interaction Flow
1. **SLURP Persistence**
   - `UpsertContext` → LevelDB write → manifests assembled (`persistContext`).
   - Beacon `Upsert` called with manifest + context hash.
2. **Pin Steward Intake**
   - `StreamChanges` yields manifest → steward verifies encryption metadata and schedules replication tasks.
3. **DHT Coordination**
   - `ReplicationManager.EnsureReplication` invoked with target factor.
   - `defaultVectorClockManager` (temporary) to be replaced with libp2p-aware implementation for provider TTL tracking.
4. **WHOOSH Consumption**
   - WHOOSH SLURP proxy fetches manifest via `Beacon.Get`, caches in WHOOSH DB, attaches to deliverable artifacts.
   - Council UI surfaces replication state + encryption posture for operator decisions.

## Incremental Delivery Plan
1. **Sprint A (Persistence parity)**
   - Finalize LevelDB manifest schema + tests (extend `slurp_persistence_test.go`).
   - Implement Beacon interfaces within SLURP service (in-memory + LevelDB).
   - Add Prometheus metrics for persistence reads/misses.
2. **Sprint B (Pin Steward MVP)**
   - Build steward worker with configurable reconciliation loop.
   - Wire to existing `DistributedStorage` stubs (`StoreAsync/Sync/Quorum`).
   - Emit health logs; integrate with CLI diagnostics.
3. **Sprint C (DHT Resilience)**
   - Swap `defaultVectorClockManager` with libp2p implementation; add provider TTL probes.
   - Implement envelope encryption path leveraging KACHING/SHHH interfaces (replace stubs in `pkg/crypto`).
   - Add CI checks: replica factor assertions, provider refresh tests, beacon schema validation.
4. **Sprint D (WHOOSH Integration)**
   - Expose REST/gRPC endpoint for WHOOSH to query manifests.
   - Update WHOOSH SLURPArtifactManager to require beacon confirmation before submission.
   - Surface Pin Steward alerts in WHOOSH admin UI.

## Open Questions
- Confirm whether Beacon manifests should include DER signatures or rely on UCXL envelope hash.
- Determine storage for historical manifests (append-only log vs. latest-only) to support temporal rewind.
- Align Pin Steward job scheduling with existing BACKBEAT cadence to avoid conflicting rotations.

## Next Actions
- Prototype `BeaconStore` interface + LevelDB implementation in SLURP package.
- Document Pin Steward anti-entropy algorithm with pseudocode and integrate into SEC-SLURP test plan.
- Sync with WHOOSH team on manifest query contract (REST vs. gRPC; pagination semantics).
52
docs/development/sec-slurp-whoosh-integration-demo.md
Normal file
@@ -0,0 +1,52 @@
# WHOOSH ↔ CHORUS Integration Demo Plan (SEC-SLURP Track)

## Demo Objectives
- Showcase end-to-end persistence → UCXL beacon → Pin Steward → WHOOSH artifact submission flow.
- Validate role-based agent interactions with SLURP contexts (resolver + temporal graph) prior to DHT hardening.
- Capture metrics/telemetry needed for SEC-SLURP exit criteria and WHOOSH Phase 1 sign-off.

## Sequenced Milestones
1. **Persistence Validation Session**
   - Run `GOWORK=off go test ./pkg/slurp/...` with stubs patched; demo LevelDB warm/load using `slurp_persistence_test.go`.
   - Inspect beacon manifests via CLI (`slurpctl beacon list`).
   - Deliverable: test log + manifest sample archived in UCXL.

2. **Beacon → Pin Steward Dry Run**
   - Replay stored manifests through Pin Steward worker with mock DHT backend.
   - Show replication planner queue + telemetry counters (`pin_steward_replica_heal_total`).
   - Deliverable: decision record linking manifest to replication outcome.

3. **WHOOSH SLURP Proxy Alignment**
   - Point WHOOSH dev stack (`npm run dev`) at local SLURP with beacon API enabled.
   - Walk through council formation, capture SLURP artifact submission with beacon confirmation modal.
   - Deliverable: screen recording + WHOOSH DB entry referencing beacon manifest id.

4. **DHT Resilience Checkpoint**
   - Switch Pin Steward to libp2p DHT (once wired) and run replication + provider TTL check.
   - Fail one node intentionally, demonstrate heal path + alert surfaced in WHOOSH UI.
   - Deliverable: telemetry dump + alert screenshot.

5. **Governance & Telemetry Wrap-Up**
   - Export Prometheus metrics (cache hit/miss, beacon writes, replication heals) into KACHING dashboard.
   - Publish Decision Record documenting UCXL address flow, referencing SEC-SLURP docs.

## Roles & Responsibilities
- **SLURP Team:** finalize persistence build, implement beacon APIs, own Pin Steward worker.
- **WHOOSH Team:** wire beacon client, expose replication/encryption status in UI, capture council telemetry.
- **KACHING/SHHH Stakeholders:** validate telemetry ingestion and encryption custody notes.
- **Program Management:** schedule demo rehearsal, ensure Decision Records and UCXL addresses are recorded.

## Tooling & Environments
- Local cluster via `docker compose up slurp whoosh pin-steward` (to be scripted in `commands/`).
- Use `make demo-sec-slurp` target to run integration harness (to be added).
- Prometheus/Grafana docker compose for metrics validation.

## Success Criteria
- Beacon manifest accessible from WHOOSH UI within 2s average latency.
- Pin Steward resolves under-replicated manifest within demo timeline (<30s) and records healing event.
- All demo steps logged with UCXL references and SHHH redaction checks passing.

## Open Items
- Need sample repo/issues to feed WHOOSH analyzer (consider `project-queues/active/WHOOSH/demo-data`).
- Determine minimal DHT cluster footprint for the demo (3 vs 5 nodes).
- Align on telemetry retention window for demo (24h?).
32
docs/progress/SEC-SLURP-1.1a-supplemental.md
Normal file
@@ -0,0 +1,32 @@
# SEC-SLURP 1.1a – DHT Resilience Supplement

## Requirements (derived from `docs/Modules/DHT.md`)

1. **Real DHT state & persistence**
   - Replace mock DHT usage with libp2p-based storage or equivalent real implementation.
   - Store DHT/blockstore data on persistent volumes (named volumes/ZFS/NFS) with node placement constraints.
   - Ensure bootstrap nodes are stateful and survive container churn.

2. **Pin Steward + replication policy**
   - Introduce a Pin Steward service that tracks UCXL CID manifests and enforces replication factor (e.g. 3–5 replicas).
   - Re-announce providers on Pulse/Reverb and heal under-replicated content.
   - Schedule anti-entropy jobs to verify and repair replicas.

3. **Envelope encryption & shared key custody**
   - Implement envelope encryption (DEK+KEK) with threshold/organizational custody rather than per-role ownership.
   - Store KEK metadata with UCXL manifests; rotate via BACKBEAT.
   - Update crypto/key-manager stubs to real implementations once available.

4. **Shared UCXL Beacon index**
   - Maintain an authoritative CID registry (DR/UCXL) replicated outside individual agents.
   - Ensure metadata updates are durable and role-agnostic to prevent stranded CIDs.

5. **CI/SLO validation**
   - Add automated tests/health checks covering provider refresh, replication factor, and persistent-storage guarantees.
   - Gate releases on DHT resilience checks (provider TTLs, replica counts).
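
For requirement 3, the envelope pattern (a per-object DEK wrapped by a KEK) can be sketched with the Go standard library alone. This is an illustration under stated assumptions, not the planned `pkg/crypto` implementation: AES-256-GCM, the `Envelope` type, and a SHA-256 fingerprint over the raw DEK are choices made here, and threshold/organizational custody of the KEK is not modeled.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"errors"
)

// Envelope holds the wrapped DEK alongside the payload it protects.
type Envelope struct {
	WrappedDEK     []byte // DEK encrypted under the KEK
	Ciphertext     []byte // payload encrypted under the DEK
	DEKFingerprint string // hex SHA-256 of the DEK (assumed scheme)
}

// seal encrypts plaintext with AES-GCM and prepends the random nonce.
func seal(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// open reverses seal: it splits off the nonce and authenticates the rest.
func open(key, sealed []byte) ([]byte, error) {
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(sealed) < gcm.NonceSize() {
		return nil, errors.New("envelope: ciphertext too short")
	}
	nonce, body := sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():]
	return gcm.Open(nil, nonce, body, nil)
}

func fingerprint(dek []byte) string {
	sum := sha256.Sum256(dek)
	return hex.EncodeToString(sum[:])
}

// Encrypt generates a fresh 32-byte DEK, encrypts the payload with it,
// and wraps the DEK under the supplied 32-byte KEK.
func Encrypt(kek, plaintext []byte) (Envelope, error) {
	dek := make([]byte, 32)
	if _, err := rand.Read(dek); err != nil {
		return Envelope{}, err
	}
	ciphertext, err := seal(dek, plaintext)
	if err != nil {
		return Envelope{}, err
	}
	wrapped, err := seal(kek, dek)
	if err != nil {
		return Envelope{}, err
	}
	return Envelope{WrappedDEK: wrapped, Ciphertext: ciphertext, DEKFingerprint: fingerprint(dek)}, nil
}

// Decrypt unwraps the DEK, checks it against the recorded fingerprint,
// and then decrypts the payload.
func Decrypt(kek []byte, env Envelope) ([]byte, error) {
	dek, err := open(kek, env.WrappedDEK)
	if err != nil {
		return nil, err
	}
	if fingerprint(dek) != env.DEKFingerprint {
		return nil, errors.New("envelope: DEK fingerprint mismatch")
	}
	return open(dek, env.Ciphertext)
}
```

Rotating the KEK under this layout only requires re-wrapping `WrappedDEK`; the payload ciphertext and the DEK fingerprint stay unchanged, which is what makes the fingerprint usable as manifest metadata.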

## Integration Path for SEC-SLURP 1.1

- Incorporate the above requirements as acceptance criteria alongside LevelDB persistence.
- Sequence work to: migrate DHT interactions, introduce Pin Steward, implement envelope crypto, and wire CI validation.
- Attach artifacts (Pin Steward design, envelope crypto spec, CI scripts) to the Phase 1 deliverable checklist.
@@ -5,10 +5,14 @@
- Upgraded SLURP’s lifecycle so initialization bootstraps cached context data from disk, cache misses hydrate from persistence, successful `UpsertContext` calls write back to LevelDB, and shutdown closes the store with error telemetry.
- Introduced `pkg/slurp/slurp_persistence_test.go` to confirm contexts survive process restarts and can be resolved after clearing in-memory caches.
- Instrumented cache/persistence metrics so hit/miss ratios and storage failures are tracked for observability.
- Attempted `GOWORK=off go test ./pkg/slurp`; execution was blocked by legacy references to `config.Authority*` symbols in `pkg/slurp/context`, so the new test did not run.
- Implemented lightweight crypto/key-management stubs (`pkg/crypto/role_crypto_stub.go`, `pkg/crypto/key_manager_stub.go`) so SLURP modules compile while the production stack is ported.
- Updated DHT distribution and encrypted storage layers (`pkg/slurp/distribution/dht_impl.go`, `pkg/slurp/storage/encrypted_storage.go`) to use the crypto stubs, adding per-role fingerprints and durable decoding logic.
- Expanded storage metadata models (`pkg/slurp/storage/types.go`, `pkg/slurp/storage/backup_manager.go`) with fields referenced by backup/replication flows (progress, error messages, retention, data size).
- Incrementally stubbed/simplified distributed storage helpers to inch toward a compilable SLURP package.
- Attempted `GOWORK=off go test ./pkg/slurp`; the original authority-level blocker is resolved, but builds still fail in storage/index code due to remaining stub work (e.g., Bleve queries, DHT helpers).

## Recommended Next Steps
- Address the `config.Authority*` symbol drift (or scope down the impacted packages) so the SLURP test suite can compile cleanly, then rerun `GOWORK=off go test ./pkg/slurp` to validate persistence changes.
- Feed the durable store into the resolver and temporal graph implementations to finish the remaining Phase 1 SLURP roadmap items.
- Expand Prometheus metrics and logging to track cache hit/miss ratios plus persistence errors for SEC-SLURP observability goals.
- Review unrelated changes on `feature/phase-4-real-providers` (e.g., docker-compose edits) and either align them with this roadmap work or revert to keep the branch focused.
- Stub the remaining storage/index dependencies (Bleve query scaffolding, UCXL helpers, `errorCh` queues, cache regex usage) or neutralize the heavy modules so that `GOWORK=off go test ./pkg/slurp` compiles and runs.
- Feed the durable store into the resolver and temporal graph implementations to finish the SEC-SLURP 1.1 milestone once the package builds cleanly.
- Extend Prometheus metrics/logging to track cache hit/miss ratios plus persistence errors for observability alignment.
- Review unrelated changes still tracked on `feature/phase-4-real-providers` (e.g., docker-compose edits) and either align them with this roadmap work or revert for focus.