bootstrap: freeze March 8 release path and evidence tooling

This commit is contained in:
anthonyrawlins
2026-02-26 22:48:50 +11:00
parent 8fa636acbb
commit 2147cec1c5
10 changed files with 682 additions and 280 deletions

View File

@@ -0,0 +1,46 @@
# DR: ResetData Model Freeze for March 8 Bootstrap Release
Date: February 26, 2026
Status: Accepted
Scope: March 8 bootstrap release window
## Decision
Freeze the release model pair to:
- Primary: `openai/gpt-oss-120b`
- Fallback: `zai-org/glm-4.7-fp8`
## Why
- Both models were validated live against `https://app.resetdata.ai/api/v1/chat/completions` with HTTP 200.
- `penai/gpt-oss-120b` returned `model_not_found`; remove ambiguity and standardize on known-good IDs.
- Existing compose defaults already used `openai/gpt-oss-120b`; align Go default to the same model.
## Validation snapshot
Probe run date: February 26, 2026 (UTC)
- `zai-org/glm-4.7-fp8` -> 200
- `openai/gpt-oss-120b` -> 200
- `penai/gpt-oss-120b` -> 404 (`model_not_found`)
- `meta/llama-3.1-8b-instruct` -> 200
- `google/gemma-3-27b-it` -> 200
## Implementation updates
- Updated Go default model:
- `pkg/config/config.go`
- Updated bootstrap gate validations:
- `testing/march8_bootstrap_gate.sh`
- Updated release board:
- `docs/progress/MARCH8-BOOTSTRAP-RELEASE-BOARD.md`
## Consequences
- All release validation and e2e runs must use the frozen pair until March 8, 2026.
- Any model change before release must open a new decision record and rerun live gate + evidence capture.
## UCXL reference
`ucxl://arbiter:release-coordinator@CHORUS:march8-bootstrap/#/docs/decisions/2026-02-26-resetdata-model-freeze.md`

View File

@@ -0,0 +1,92 @@
# March 8 Bootstrap Release Board
Date window: February 26, 2026 to March 8, 2026
Objective: ship a replayable "CHORUS bootstrap path" that uses real inference, produces traceable artifacts, and avoids mock execution in the critical flow.
## Scope lock (do not expand)
Single path only:
1. Issue intake
2. SWOOSH transition
3. CHORUS task execution (real model call)
4. SLURP bundle creation
5. BUBBLE decision record
6. UCXL address persisted and retrievable
Everything else is out of scope unless it blocks this path.
## Release gates
All must pass by March 8:
- [ ] G1: No mock fallback in critical task execution path.
- [ ] G2: ResetData model configuration is canonical and consistent across compose + Go defaults.
- [ ] G3: At least one primary model and one fallback model validated against ResetData API.
- [ ] G4: End-to-end run produces DR + UCXL pointer + provenance evidence.
- [ ] G5: 24h stability test completes with reproducible logs and failure classification.
- [ ] G6: Operator runbook exists with exact commands used for validation.
## Frozen model pair (locked on February 26, 2026)
- Primary: `openai/gpt-oss-120b`
- Fallback: `zai-org/glm-4.7-fp8`
- Validation status: both returned HTTP 200 against `https://app.resetdata.ai/api/v1/chat/completions` on February 26, 2026.
## Daily plan
### Feb 26-28: Remove ambiguity, remove mocks
- [x] Freeze target model pair for release.
- [x] Validate ResetData auth + chat completion from runtime environment.
- [x] Remove or hard-disable mock execution in critical path.
- [ ] Capture first green baseline run (single issue -> artifact path).
### Mar 1-4: Stabilize integration
- [ ] Run repeated e2e cycles under SWOOSH + CHORUS.
- [ ] Measure pass rate, latency, and top failure classes.
- [ ] Fix top 3 failure classes only.
- [ ] Ensure DR/UCXL artifacts are emitted every successful run.
### Mar 5-7: Hardening + evidence
- [ ] Run 24h soak on frozen config.
- [ ] Produce validation bundle (commands, logs, outputs, known limits).
- [ ] Confirm rollback instructions.
### Mar 8: Freeze + release
- [ ] Freeze config/image tags.
- [ ] Run final gate script.
- [ ] Publish release note + operator checklist.
## Coordination protocol
- One active lane at a time:
- `NOW`
- `NEXT`
- `BLOCKED`
- Any new idea goes to backlog unless directly required for a failing gate.
- Every work item must map to at least one gate ID (`G1`..`G6`).
- No "architecture expansion" during this window.
## Work lanes
NOW:
- [x] Create and run bootstrap gate script (`testing/march8_bootstrap_gate.sh`)
- [ ] Create and run e2e evidence capture (`testing/march8_e2e_evidence.sh`)
NEXT:
- [ ] Capture first baseline evidence bundle with DR + UCXL + provenance
BLOCKED:
- [ ] None
## Evidence checklist (release packet)
- [ ] Gate script output (final passing run)
- [ ] Model validation output (primary + fallback)
- [ ] E2E run log showing DR + UCXL + provenance
- [ ] 24h soak summary (pass/fail + failures by class)
- [ ] Known limitations and immediate post-release priorities