bootstrap: freeze March 8 release path and evidence tooling

2026-02-26 22:48:50 +11:00
parent 8fa636acbb
commit 2147cec1c5
10 changed files with 682 additions and 280 deletions
--- a/docs/decisions/2026-02-26-resetdata-model-freeze.md
+++ b/docs/decisions/2026-02-26-resetdata-model-freeze.md
@@ -0,0 +1,46 @@
+# DR: ResetData Model Freeze for March 8 Bootstrap Release
+
+Date: February 26, 2026
+Status: Accepted
+Scope: March 8 bootstrap release window
+
+## Decision
+
+Freeze the release model pair to:
+
+- Primary: `openai/gpt-oss-120b`
+- Fallback: `zai-org/glm-4.7-fp8`
+
+## Why
+
+- Both models were validated live against `https://app.resetdata.ai/api/v1/chat/completions` with HTTP 200.
+- `penai/gpt-oss-120b` returned `model_not_found`; standardizing on known-good IDs removes that ambiguity.
+- Existing compose defaults already used `openai/gpt-oss-120b`; align the Go default to the same model.
+
+## Validation snapshot
+
+Probe run date: February 26, 2026 (UTC)
+
+- `zai-org/glm-4.7-fp8` -> 200
+- `openai/gpt-oss-120b` -> 200
+- `penai/gpt-oss-120b` -> 404 (`model_not_found`)
+- `meta/llama-3.1-8b-instruct` -> 200
+- `google/gemma-3-27b-it` -> 200
+
+## Implementation updates
+
+- Updated Go default model:
+  - `pkg/config/config.go`
+- Updated bootstrap gate validations:
+  - `testing/march8_bootstrap_gate.sh`
+- Updated release board:
+  - `docs/progress/MARCH8-BOOTSTRAP-RELEASE-BOARD.md`
+
+## Consequences
+
+- All release validation and e2e runs must use the frozen pair until March 8, 2026.
+- Any model change before release must open a new decision record and rerun live gate + evidence capture.
+
+## UCXL reference
+
+`ucxl://arbiter:release-coordinator@CHORUS:march8-bootstrap/#/docs/decisions/2026-02-26-resetdata-model-freeze.md`
--- a/docs/progress/MARCH8-BOOTSTRAP-RELEASE-BOARD.md
+++ b/docs/progress/MARCH8-BOOTSTRAP-RELEASE-BOARD.md
@@ -0,0 +1,92 @@
+# March 8 Bootstrap Release Board
+
+Date window: February 26, 2026 to March 8, 2026
+Objective: ship a replayable "CHORUS bootstrap path" that uses real inference, produces traceable artifacts, and avoids mock execution in the critical flow.
+
+## Scope lock (do not expand)
+
+Single path only:
+
+1. Issue intake
+2. SWOOSH transition
+3. CHORUS task execution (real model call)
+4. SLURP bundle creation
+5. BUBBLE decision record
+6. UCXL address persisted and retrievable
+
+Everything else is out of scope unless it blocks this path.
+
+## Release gates
+
+All must pass by March 8:
+
+- [ ] G1: No mock fallback in critical task execution path.
+- [ ] G2: ResetData model configuration is canonical and consistent across compose + Go defaults.
+- [ ] G3: At least one primary model and one fallback model validated against ResetData API.
+- [ ] G4: End-to-end run produces DR + UCXL pointer + provenance evidence.
+- [ ] G5: 24h stability test completes with reproducible logs and failure classification.
+- [ ] G6: Operator runbook exists with exact commands used for validation.
+
+## Frozen model pair (locked on February 26, 2026)
+
+- Primary: `openai/gpt-oss-120b`
+- Fallback: `zai-org/glm-4.7-fp8`
+- Validation status: both returned HTTP 200 against `https://app.resetdata.ai/api/v1/chat/completions` on February 26, 2026.
+
+## Daily plan
+
+### Feb 26-28: Remove ambiguity, remove mocks
+
+- [x] Freeze target model pair for release.
+- [x] Validate ResetData auth + chat completion from runtime environment.
+- [x] Remove or hard-disable mock execution in critical path.
+- [ ] Capture first green baseline run (single issue -> artifact path).
+
+### Mar 1-4: Stabilize integration
+
+- [ ] Run repeated e2e cycles under SWOOSH + CHORUS.
+- [ ] Measure pass rate, latency, and top failure classes.
+- [ ] Fix top 3 failure classes only.
+- [ ] Ensure DR/UCXL artifacts are emitted on every successful run.
+
+### Mar 5-7: Hardening + evidence
+
+- [ ] Run 24h soak on frozen config.
+- [ ] Produce validation bundle (commands, logs, outputs, known limits).
+- [ ] Confirm rollback instructions.
+
+### Mar 8: Freeze + release
+
+- [ ] Freeze config/image tags.
+- [ ] Run final gate script.
+- [ ] Publish release note + operator checklist.
+
+## Coordination protocol
+
+- One active lane at a time:
+  - `NOW`
+  - `NEXT`
+  - `BLOCKED`
+- Any new idea goes to backlog unless directly required for a failing gate.
+- Every work item must map to at least one gate ID (`G1`..`G6`).
+- No "architecture expansion" during this window.
+
+## Work lanes
+
+NOW:
+- [x] Create and run bootstrap gate script (`testing/march8_bootstrap_gate.sh`)
+- [ ] Create and run e2e evidence capture (`testing/march8_e2e_evidence.sh`)
+
+NEXT:
+- [ ] Capture first baseline evidence bundle with DR + UCXL + provenance
+
+BLOCKED:
+- [ ] None
+
+## Evidence checklist (release packet)
+
+- [ ] Gate script output (final passing run)
+- [ ] Model validation output (primary + fallback)
+- [ ] E2E run log showing DR + UCXL + provenance
+- [ ] 24h soak summary (pass/fail + failures by class)
+- [ ] Known limitations and immediate post-release priorities