bootstrap: freeze March 8 release path and evidence tooling
46
docs/decisions/2026-02-26-resetdata-model-freeze.md
Normal file
@@ -0,0 +1,46 @@
# DR: ResetData Model Freeze for March 8 Bootstrap Release

Date: February 26, 2026
Status: Accepted
Scope: March 8 bootstrap release window

## Decision

Freeze the release model pair to:

- Primary: `openai/gpt-oss-120b`
- Fallback: `zai-org/glm-4.7-fp8`

## Why

- Both models were validated live against `https://app.resetdata.ai/api/v1/chat/completions` with HTTP 200.
- `penai/gpt-oss-120b` returned `model_not_found`; remove ambiguity and standardize on known-good IDs.
- Existing compose defaults already used `openai/gpt-oss-120b`; align Go default to the same model.
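
The alignment between compose and Go defaults could be checked mechanically. The following is a minimal sketch, not the project's gate script: the function name `check_model_in_files` is hypothetical, and the caller supplies whichever files actually hold the defaults.

```sh
#!/bin/sh
# Sketch: confirm that every listed config file mentions the frozen primary model ID.
# The function name is hypothetical; file paths are supplied by the caller.

PRIMARY_MODEL="openai/gpt-oss-120b"

# check_model_in_files MODEL FILE...
# Prints one line per file and returns non-zero if any file lacks the model ID.
check_model_in_files() {
    model="$1"
    shift
    missing=0
    for file in "$@"; do
        # -F treats the model ID as a fixed string, so "/" and "." are literal.
        if grep -qF -- "$model" "$file" 2>/dev/null; then
            echo "ok      $file"
        else
            echo "MISSING $file"
            missing=1
        fi
    done
    return "$missing"
}
```

A call such as `check_model_in_files "$PRIMARY_MODEL" pkg/config/config.go docker-compose.yml` would flag a file that still carries a different ID; the compose file name here is an assumption, since the record does not name it.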

## Validation snapshot

Probe run date: February 26, 2026 (UTC)

- `zai-org/glm-4.7-fp8` -> 200
- `openai/gpt-oss-120b` -> 200
- `penai/gpt-oss-120b` -> 404 (`model_not_found`)
- `meta/llama-3.1-8b-instruct` -> 200
- `google/gemma-3-27b-it` -> 200
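
A probe like the one summarized above might look roughly as follows. This is a sketch under stated assumptions, not the command that produced the snapshot: it assumes the endpoint accepts an OpenAI-style chat completion body, that the key lives in an environment variable named `RESETDATA_API_KEY`, and that it is sent as a Bearer token. `probe_model` and `classify_status` are hypothetical names.

```sh
#!/bin/sh
# Sketch of a status probe for one model ID. Assumptions (not from the decision record):
#   - the endpoint accepts an OpenAI-style chat completion body
#   - the API key is in RESETDATA_API_KEY and is sent as a Bearer token

ENDPOINT="https://app.resetdata.ai/api/v1/chat/completions"

# probe_model MODEL -> prints only the HTTP status code of a one-message request.
probe_model() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 60 \
        -H "Authorization: Bearer ${RESETDATA_API_KEY:?set RESETDATA_API_KEY first}" \
        -H "Content-Type: application/json" \
        -d "{\"model\":\"$1\",\"messages\":[{\"role\":\"user\",\"content\":\"ping\"}],\"max_tokens\":1}" \
        "$ENDPOINT"
}

# classify_status CODE -> prints a coarse verdict for an HTTP status code.
classify_status() {
    case "$1" in
        200)     echo "pass" ;;
        401|403) echo "fail: auth" ;;
        404)     echo "fail: not found" ;;
        *)       echo "fail: unexpected status $1" ;;
    esac
}
```

Usage would be along the lines of `classify_status "$(probe_model openai/gpt-oss-120b)"`. The sketch discards the response body, so it cannot distinguish `model_not_found` from other 404 causes; capturing the body would be needed for that.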

## Implementation updates

- Updated Go default model:
  - `pkg/config/config.go`
- Updated bootstrap gate validations:
  - `testing/march8_bootstrap_gate.sh`
- Updated release board:
  - `docs/progress/MARCH8-BOOTSTRAP-RELEASE-BOARD.md`

## Consequences

- All release validation and e2e runs must use the frozen pair until March 8, 2026.
- Any model change before release requires a new decision record and a rerun of the live gate and evidence capture.
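
One way to enforce the frozen pair in validation scripts is a small guard that rejects any other model ID before a run starts. This is an illustrative sketch; `is_frozen_model` is a hypothetical helper, and how the model ID reaches the script is left to the caller.

```sh
#!/bin/sh
# Sketch: accept only the two model IDs frozen by this decision record.
# The helper name is hypothetical.

# is_frozen_model MODEL -> exit status 0 only for the frozen primary or fallback ID.
is_frozen_model() {
    case "$1" in
        openai/gpt-oss-120b|zai-org/glm-4.7-fp8) return 0 ;;
        *) return 1 ;;
    esac
}
```

A script could then start with `is_frozen_model "$MODEL" || { echo "model $MODEL is outside the frozen pair" >&2; exit 1; }`. Models that merely answered 200 in the snapshot, such as `meta/llama-3.1-8b-instruct`, are rejected on purpose.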

## UCXL reference

`ucxl://arbiter:release-coordinator@CHORUS:march8-bootstrap/#/docs/decisions/2026-02-26-resetdata-model-freeze.md`
92
docs/progress/MARCH8-BOOTSTRAP-RELEASE-BOARD.md
Normal file
@@ -0,0 +1,92 @@
# March 8 Bootstrap Release Board

Date window: February 26, 2026 to March 8, 2026
Objective: ship a replayable "CHORUS bootstrap path" that uses real inference, produces traceable artifacts, and avoids mock execution in the critical flow.

## Scope lock (do not expand)

Single path only:

1. Issue intake
2. SWOOSH transition
3. CHORUS task execution (real model call)
4. SLURP bundle creation
5. BUBBLE decision record
6. UCXL address persisted and retrievable

Everything else is out of scope unless it blocks this path.

## Release gates

All must pass by March 8:

- [ ] G1: No mock fallback in critical task execution path.
- [ ] G2: ResetData model configuration is canonical and consistent across compose + Go defaults.
- [ ] G3: At least one primary model and one fallback model validated against the ResetData API.
- [ ] G4: End-to-end run produces DR + UCXL pointer + provenance evidence.
- [ ] G5: 24h stability test completes with reproducible logs and failure classification.
- [ ] G6: Operator runbook exists with exact commands used for validation.
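
The "all must pass" rule can be expressed as a small tally over per-gate results. The sketch below assumes a hypothetical results file with one line per gate, such as `G1 pass` or `G3 fail`; the real gate script may report results differently, and `all_gates_pass` is an illustrative name.

```sh
#!/bin/sh
# Sketch: decide release readiness from a results file with lines like "G1 pass" or "G3 fail".
# The file format and function name are hypothetical.

# all_gates_pass FILE -> exit 0 only if G1..G6 each have a "pass" line and no "fail" line.
# Stops at the first gate that failed or has no recorded result.
all_gates_pass() {
    for gate in G1 G2 G3 G4 G5 G6; do
        # -x requires the whole line to match, so "G1 pass" does not match "G1 passed?".
        if grep -qx "$gate fail" "$1"; then
            echo "$gate: fail"
            return 1
        fi
        if ! grep -qx "$gate pass" "$1"; then
            echo "$gate: no result"
            return 1
        fi
    done
    echo "all gates pass"
}
```

Treating a missing result as a failure keeps a gate that was never run from being counted as green.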

## Frozen model pair (locked on February 26, 2026)

- Primary: `openai/gpt-oss-120b`
- Fallback: `zai-org/glm-4.7-fp8`
- Validation status: both returned HTTP 200 against `https://app.resetdata.ai/api/v1/chat/completions` on February 26, 2026.

## Daily plan

### Feb 26-28: Remove ambiguity, remove mocks

- [x] Freeze target model pair for release.
- [x] Validate ResetData auth + chat completion from runtime environment.
- [x] Remove or hard-disable mock execution in critical path.
- [ ] Capture first green baseline run (single issue -> artifact path).

### Mar 1-4: Stabilize integration

- [ ] Run repeated e2e cycles under SWOOSH + CHORUS.
- [ ] Measure pass rate, latency, and top failure classes.
- [ ] Fix top 3 failure classes only.
- [ ] Ensure DR/UCXL artifacts are emitted on every successful run.
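
Pass rate, latency, and top failure classes can be derived from a simple per-run log. The sketch below assumes a hypothetical tab-separated format (`STATUS`, `LATENCY_MS`, `FAILURE_CLASS`) with single-token class names; none of this is specified by the board, and the function names are illustrative.

```sh
#!/bin/sh
# Sketch: summarize repeated e2e runs from a tab-separated log.
# Hypothetical line format: STATUS<TAB>LATENCY_MS<TAB>FAILURE_CLASS
# STATUS is "pass" or "fail"; FAILURE_CLASS is "-" for passing runs.

# pass_rate FILE -> prints "passed/total", for example "3/5".
pass_rate() {
    awk -F '\t' '$1 == "pass" { p++ } END { printf "%d/%d\n", p, NR }' "$1"
}

# mean_latency_ms FILE -> prints the mean latency over all runs, truncated to an integer.
mean_latency_ms() {
    awk -F '\t' '{ sum += $2 } END { if (NR > 0) printf "%d\n", sum / NR }' "$1"
}

# top_failure_classes FILE -> prints up to three failure classes, most frequent first.
top_failure_classes() {
    awk -F '\t' '$1 == "fail" { print $3 }' "$1" |
        sort | uniq -c | sort -k1,1nr -k2 |
        head -n 3 | awk '{ print $2 }'
}
```

Capping the output at three classes mirrors the "fix top 3 failure classes only" rule, so the summary does not invite work outside the scope lock.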

### Mar 5-7: Hardening + evidence

- [ ] Run 24h soak on frozen config.
- [ ] Produce validation bundle (commands, logs, outputs, known limits).
- [ ] Confirm rollback instructions.

### Mar 8: Freeze + release

- [ ] Freeze config/image tags.
- [ ] Run final gate script.
- [ ] Publish release note + operator checklist.

## Coordination protocol

- One active lane at a time:
  - `NOW`
  - `NEXT`
  - `BLOCKED`
- Any new idea goes to backlog unless directly required for a failing gate.
- Every work item must map to at least one gate ID (`G1`..`G6`).
- No "architecture expansion" during this window.
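
The rule that every work item maps to a gate ID lends itself to a simple lint. The following is a sketch only: it assumes work items are plain text lines that cite gates inline (for example "Fix retry loop (G1, G5)"), which the board does not prescribe, and `has_gate_id` is a hypothetical helper.

```sh
#!/bin/sh
# Sketch: check that a work item description cites at least one gate ID.
# Assumes gates are written inline as G1..G6; the helper name is hypothetical.

# has_gate_id TEXT -> exit 0 if TEXT contains G1..G6 as a standalone token.
has_gate_id() {
    # The surrounding character classes keep "G12" or "LOG1" from matching.
    printf '%s\n' "$1" | grep -Eq '(^|[^A-Za-z0-9])G[1-6]([^A-Za-z0-9]|$)'
}
```

For example, `has_gate_id "Refactor logging" || echo "send to backlog"` would route an unmapped item to the backlog.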

## Work lanes

NOW:
- [x] Create and run bootstrap gate script (`testing/march8_bootstrap_gate.sh`)
- [ ] Create and run e2e evidence capture (`testing/march8_e2e_evidence.sh`)

NEXT:
- [ ] Capture first baseline evidence bundle with DR + UCXL + provenance

BLOCKED:
- [ ] None

## Evidence checklist (release packet)

- [ ] Gate script output (final passing run)
- [ ] Model validation output (primary + fallback)
- [ ] E2E run log showing DR + UCXL + provenance
- [ ] 24h soak summary (pass/fail + failures by class)
- [ ] Known limitations and immediate post-release priorities
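
The checklist could be verified against a release packet directory before publishing. This sketch is purely illustrative: the five file names are hypothetical stand-ins for the checklist items, and the real packet layout may differ.

```sh
#!/bin/sh
# Sketch: report which release packet files are absent or empty.
# The file names below are hypothetical, one per checklist item.

EVIDENCE_FILES="gate-output.log model-validation.log e2e-run.log soak-summary.md known-limitations.md"

# missing_evidence DIR -> prints each expected file that is missing or empty.
# Returns 0 only when every file exists and has content.
missing_evidence() {
    status=0
    for name in $EVIDENCE_FILES; do
        # -s is true only for files that exist and are non-empty.
        if [ ! -s "$1/$name" ]; then
            echo "$name"
            status=1
        fi
    done
    return "$status"
}
```

Checking for non-empty files rather than mere existence guards against a capture step that created the file but wrote nothing.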