<!-- Source: DistOS/councils/08-api-surface.md, commit 7f56ca4d46 by anthonyrawlins, 2026-02-26 — "Initial DistOS project constitution and council design briefs: 12 council design briefs for distributed OS specification project targeting 1024-node Hopper/Grace/Blackwell GPU cluster with Weka parallel filesystem." Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> -->
# Council Design Brief: API Surface and Developer Experience
**Council ID:** `council-api`
**Mission:** Define the complete, coherent, and ergonomic interface between DistOS and its users — operators, application developers, and other systems. This council decides what the operating system looks like from the outside: system calls, SDK bindings, CLI tools, and the conventions that make all of the above consistent and maintainable across language boundaries and API versions.
**UCXL Base Address:** `ucxl://council-api:*@DistOS:api/*`
**Agent Count:** ~40
**Status:** Design Brief — Constitution Phase
---
## 1. Scope and Responsibilities
`council-api` owns the external interface contract of DistOS. Its scope covers:
- Deciding the overall API philosophy: POSIX-compatible extension, clean-slate design, or a layered model that offers both
- Defining GPU-native system calls for kernel launch, memory allocation, device-to-device transfers, stream and graph management, and event synchronisation
- Defining distributed system calls: remote procedure invocation (covering both synchronous RPC and async futures), distributed lock acquisition and release, barriers, and collective operations across node groups
- Designing an async-first API surface that aligns with modern language runtimes (Rust `async`/`await`, Go goroutines, Python `asyncio`)
- Establishing error handling conventions, including integration with UCXL response codes for errors that carry provenance (which node, which operation, at what logical time)
- Designing the SDK for four target languages: C (ABI-stable systems interface), Rust (idiomatic, zero-cost), Go (ergonomic, channel-friendly), and Python (user-friendly, numpy-compatible)
- Designing CLI tooling for cluster management: node status, job submission, resource inspection, log retrieval, and administrative operations
- Defining the API versioning and evolution strategy: how new calls are introduced, how deprecated calls are retired, compatibility guarantees across minor and major versions
- Producing API reference documentation that is precise enough to serve as a normative source alongside the formal spec
- Specifying example applications that exercise non-trivial API paths and serve as integration test targets
Responsibilities this council does **not** own: kernel implementation (owned by subsystem councils); formal verification of API contracts (owned by `council-verify`); security policy enforcement (owned by `council-sec`, though `council-api` designs the authentication and authorisation API surface in coordination with it); monitoring and metering calls (owned by `council-telemetry`, though `council-api` exposes the SDK surface for those).
---
## 2. Research Domains
### 2.1 POSIX Compatibility vs. Clean-Slate Design
POSIX (IEEE 1003.1) defines the canonical Unix system call interface. Its strengths are: near-universal language runtime support, a mature ecosystem of tools, and decades of developer familiarity. Its weaknesses in a GPU-cluster OS context are: blocking I/O semantics that assume CPU-thread models, file-descriptor-centric resource management ill-suited to GPU memory objects, and no native concept of distributed operations or remote memory.
Two design philosophies must be fully researched before the council can decide:
- **POSIX-compatible extension:** Retain the full POSIX interface and extend it with GPU and distributed primitives as optional add-ons. Applications written for Linux run unmodified; GPU-aware applications opt into extensions. This is the approach taken by CUDA (which layers a driver API on top of the OS) and by ROCm/HIP.
- **Clean-slate design:** Design an interface optimal for the DistOS hardware target without backward-compatibility constraints. This allows stronger type safety, async-native semantics, and a capability-based resource model from the first call. Plan 9 (Pike et al.) and Fuchsia (Zircon) are the primary existence proofs.
- **Layered model:** Provide a clean-slate primary API and a POSIX compatibility layer implemented on top of it. This is the architectural recommendation for evaluation. The compatibility layer has a defined cost budget.
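As a rough illustration of the layered model, the sketch below shows a handle-based clean-slate layer with a POSIX-style file-descriptor shim built entirely on top of it. This is plain Python with invented names (`NativeApi`, `PosixShim`, `obj_create`); nothing here is DistOS-specified — the point is only that the shim's cost is exactly the bookkeeping visible in the second class.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class NativeApi:
    """Clean-slate layer: resources are opaque handles with no global
    meaning outside this API (cf. Zircon's zx_handle_t)."""
    _objects: Dict[int, bytes] = field(default_factory=dict)
    _next: int = 1

    def obj_create(self, data: bytes) -> int:
        handle, self._next = self._next, self._next + 1
        self._objects[handle] = data
        return handle

    def obj_read(self, handle: int) -> bytes:
        return self._objects[handle]

class PosixShim:
    """Compatibility layer: maps POSIX fd semantics onto native handles.
    The shim's cost budget is this translation table and nothing else."""
    def __init__(self, native: NativeApi):
        self.native = native
        self.fd_table: Dict[int, int] = {}  # fd -> native handle
        self._next_fd = 3                   # 0-2 reserved, as in POSIX

    def open_bytes(self, data: bytes) -> int:
        fd = self._next_fd
        self._next_fd += 1
        self.fd_table[fd] = self.native.obj_create(data)
        return fd

    def read(self, fd: int) -> bytes:
        return self.native.obj_read(self.fd_table[fd])
```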
Key references:
- The Open Group. *The Single UNIX Specification (SUSv4/POSIX.1-2017)*. The normative POSIX reference.
- Pike, R. et al. "Plan 9 from Bell Labs." *USENIX Summer 1990 Technical Conference*. Plan 9's contribution is the 9P protocol: everything is a file, including processes and network connections. The simplicity of the resource model is instructive even if DistOS does not adopt 9P verbatim.
- Pike, R. "The Use of Name Spaces in Plan 9." *EUUG Newsletter* 12(1), 1992.
- Google. *Fuchsia OS: Zircon Kernel Objects*. https://fuchsia.dev/fuchsia-src/concepts/kernel. Zircon uses a capability-based object system with handles as the only way to reference kernel objects. This is the most complete modern clean-slate OS design and must be studied in depth.
### 2.2 GPU-Native System Calls
The CUDA Driver API provides the lowest-level GPU control surface available: `cuInit`, `cuDeviceGet`, `cuCtxCreate`, `cuMemAlloc`, `cuLaunchKernel`, `cuEventRecord`, `cuStreamWaitEvent`. It is the reference for what a GPU system call interface must cover.
Agents must evaluate the tradeoffs between:
- **Driver-level API** (CUDA Driver API / ROCm HIP Low-Level): explicit context management, explicit stream management, maximum control, verbose
- **Runtime API** (CUDA Runtime / ROCm): implicit context, automatic stream assignment, less control, more ergonomic
- **Graph-based execution** (CUDA Graphs / HIP Graphs): capture a sequence of operations as a graph for repeated execution with lower launch overhead. Critical for the 1024-node deployment where kernel launch overhead accumulates.
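The graph-based model can be illustrated with a toy capture/replay loop: operations are recorded once, then the whole sequence is replayed with a single "launch". This is a conceptual sketch only (the `Graph` class and its methods are invented; no real GPU API is involved) — it shows the semantics that make graphs attractive when per-launch overhead accumulates across 1024 nodes.

```python
from typing import Callable, Dict, List

class Graph:
    """Toy model of graph capture and replay (CUDA Graphs style)."""

    def __init__(self) -> None:
        self.ops: List[Callable[[Dict[str, int]], None]] = []

    def capture(self, op: Callable[[Dict[str, int]], None]) -> None:
        # Recording only: nothing executes at capture time.
        self.ops.append(op)

    def launch(self, state: Dict[str, int]) -> None:
        # One replay stands in for one cheap graph launch that would
        # otherwise be N individual kernel launches.
        for op in self.ops:
            op(state)

g = Graph()
g.capture(lambda s: s.__setitem__("x", s["x"] * 2))
g.capture(lambda s: s.__setitem__("x", s["x"] + 1))

state = {"x": 3}
for _ in range(2):  # repeated execution of the captured sequence
    g.launch(state)
```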
Key references:
- NVIDIA. *CUDA Driver API Reference Manual*. https://docs.nvidia.com/cuda/cuda-driver-api/. Normative reference for GPU system call semantics.
- NVIDIA. *CUDA C Programming Guide* (Chapter 3: Programming Interface). Covers the Runtime API and its relationship to the Driver API.
- NVIDIA. *CUDA Graphs* documentation. https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cuda-graphs. The graph execution model is essential for understanding low-latency repeated workloads on Hopper and Blackwell.
- Khronos Group. *OpenCL 3.0 Specification*. https://www.khronos.org/opencl/. The vendor-neutral GPU programming API. DistOS must decide whether to support OpenCL alongside CUDA semantics.
- Khronos Group. *SYCL 2020 Specification*. https://www.khronos.org/sycl/. SYCL provides a C++ abstraction over OpenCL and oneAPI targets. Intel's oneAPI unifies GPU programming across vendors and is a candidate for the DistOS higher-level SDK layer.
- Intel. *oneAPI Programming Guide*. https://www.intel.com/content/www/us/en/developer/tools/oneapi/programming-guide.html.
- NVIDIA. *NVLink and NVSwitch Architecture Overview*. https://www.nvidia.com/en-us/data-center/nvlink/. GPU-to-GPU direct access semantics affect memory system call design.
Blackwell-specific: The GB200 NVL72 introduces NVLink Switch System connecting 72 GPUs in a single flat memory domain. System calls for `cuMemAdvise` and `cuMemPrefetchAsync` take on new semantics in this topology. Agents must review:
- NVIDIA. *NVIDIA Blackwell Architecture Technical Brief*. 2024.
### 2.3 Distributed System Calls
System calls that span nodes are novel: POSIX has no notion of them. The design space covers:
- **Remote procedure invocation:** How does a process on node A invoke a procedure on node B? Synchronous blocking (simple, latency-bound), asynchronous with futures (complex, scalable), or continuation-passing. gRPC is the de facto standard for service-to-service RPC in the cloud but carries HTTP/2 overhead.
- **Distributed locks:** Lease-based locks (Chubby/Zookeeper model), RDMA-based compare-and-swap (best latency), or consensus-based locks for strong guarantees. Each has different failure semantics.
- **Barriers:** Collective synchronisation across node groups. MPI_Barrier semantics are well understood; the question is how to expose this in a general-purpose OS API.
- **Collective operations:** AllReduce, AllGather, Broadcast, Reduce-Scatter. These are first-class operations for distributed ML workloads (the dominant use case on a 1024-node GPU cluster) and must be surfaced as OS-level calls, not just library calls, so the OS can optimise placement and routing.
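The semantics of the collective calls above can be pinned down with a toy model over simulated nodes: after AllReduce, every node holds the global result. Real implementations use ring or tree algorithms over NVLink/RDMA; this sketch (function names invented) captures only the call's contract, not its performance.

```python
from typing import List

def all_reduce_sum(node_values: List[float]) -> List[float]:
    """Every node contributes one value; every node receives the sum."""
    total = sum(node_values)            # Reduce phase
    return [total] * len(node_values)   # Broadcast phase

def all_gather(node_values: List[float]) -> List[List[float]]:
    """Every node receives the full vector of all nodes' contributions."""
    return [list(node_values) for _ in node_values]
```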
Key references:
- Birrell, A. and Nelson, B. "Implementing Remote Procedure Calls." *ACM Transactions on Computer Systems* 2(1), 1984. The foundational RPC paper.
- Google. *gRPC*. https://grpc.io/. The current industry standard for typed RPC. Protocol Buffers schema evolution strategy is directly applicable to DistOS API versioning.
- Burrows, M. "The Chubby Lock Service for Loosely-Coupled Distributed Systems." *OSDI 2006*.
- Hunt, P. et al. "ZooKeeper: Wait-free Coordination for Internet-scale Systems." *USENIX ATC 2010*.
- Message Passing Interface Forum. *MPI: A Message-Passing Interface Standard, Version 4.1*. 2023. The collective operations specification is normative for `council-api`'s collective call design.
- Mellanox/NVIDIA. *RDMA Programming Guide*. InfiniBand verbs API (ibv_post_send, ibv_post_recv, ibv_create_qp) provides the lowest-latency distributed memory access primitives available on the target cluster.
### 2.4 Async-First API Design
A GPU cluster OS serving AI workloads will have I/O patterns dominated by deep asynchrony: thousands of in-flight kernel launches, streaming data from Weka FS, and collective communications across 1024 nodes. A synchronous-only API would be a fundamental design mistake. Agents must research:
- **Rust async/await:** The Rust async model (futures, the `Poll` trait, the executor model) provides zero-cost abstraction over async I/O. The `tokio` runtime is the dominant executor. The DistOS Rust SDK must integrate naturally with tokio.
- **io_uring (Linux 5.1+):** io_uring provides shared submission and completion rings between kernel and userspace, batching I/O requests and sharply reducing per-I/O syscall overhead (eliminating it entirely in polled mode). Its submission/completion queue model is the reference for how DistOS should design its own async system call interface.
- **Go channels and goroutines:** Go's concurrency model maps well to distributed operations. The DistOS Go SDK must express distributed calls as channels or via the `context.Context` cancellation pattern.
- **Python asyncio:** The Python SDK must be usable from `async def` coroutines. NumPy compatibility for GPU tensor operations should be considered (compatibility with the Numba/CuPy interface).
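The io_uring-style model referenced above can be sketched as a pair of fixed-size rings: userspace pushes submission entries, the kernel consumes them and posts completions, and a full submission ring is the natural back-pressure signal. All structure names (`Sqe`, `Cqe`, `Ring`) are illustrative, not the DistOS ABI.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Sqe:             # submission queue entry
    op: str
    user_data: int     # cookie echoed back in the completion

@dataclass
class Cqe:             # completion queue entry
    user_data: int
    result: int

class Ring:
    def __init__(self, depth: int = 8):
        self.sq: deque = deque(maxlen=depth)  # bounded: gives back-pressure
        self.cq: deque = deque()

    def submit(self, sqe: Sqe) -> bool:
        if len(self.sq) == self.sq.maxlen:
            return False        # ring full: caller must back off
        self.sq.append(sqe)
        return True

    def kernel_step(self) -> None:
        # Stand-in for kernel-side processing of pending submissions.
        while self.sq:
            sqe = self.sq.popleft()
            self.cq.append(Cqe(user_data=sqe.user_data, result=0))

    def reap(self):
        while self.cq:
            yield self.cq.popleft()
```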
Key references:
- Axboe, J. *io_uring and the new Linux async I/O API*. https://kernel.dk/io_uring.pdf. 2019. This paper is essential for understanding the state of the art in async syscall design.
- The Rust Async Book. https://rust-lang.github.io/async-book/. Normative reference for Rust async design patterns.
- Grigorik, I. *High Performance Browser Networking* (Chapter 2 on event loop and async I/O patterns). 2013. O'Reilly. Useful background on event-driven I/O design.
### 2.5 Error Handling Conventions
A cluster OS at this scale will produce a high volume of partial failures: a node goes dark, a GPU kernel faults, a network partition isolates a subsystem. The error handling convention must be:
- **Structured:** Every error carries a type, a severity, a source identifier (node, subsystem, call), and a correlation ID that links it to a UCXL-addressed event in the distributed log.
- **Actionable:** The API must distinguish between errors that the caller should retry (transient), errors that require intervention (permanent), and errors that indicate a usage mistake (programmer error).
- **Traceable:** Error correlation IDs must be UCXL-compatible so that an error returned to a Python application can be resolved to the full distributed event chain using the UCXL resolver.
Key references:
- Google. *Google Cloud API Design Guide: Errors*. https://cloud.google.com/apis/design/errors. The most systematic public treatment of structured API error design. The canonical status codes (OK, INVALID_ARGUMENT, NOT_FOUND, UNAVAILABLE, etc.) should be adopted or adapted.
- Klabnik, S. and Nichols, C. *The Rust Programming Language* (Chapter 9: Error Handling). The Rust approach to `Result<T, E>` and the `?` operator represents the state of the art for recoverable errors in a systems language.
- Syme, D. et al. "Exceptional Syntactic Support for Error Handling in F#." *Haskell Symposium 2020*. Relevant to the higher-level SDK error design.
The UCXL response code integration specifically means that API error structs carry a `ucxl_trace` field containing the UCXL address of the distributed event that caused the failure:
```
error.ucxl_trace = "ucxl://council-fault:monitor@DistOS:fault-tolerance/^^/events/node-042-timeout-2026-03-01T14:22:00Z"
```
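A minimal sketch of how the three conventions (structured, actionable, traceable) might combine in one error type follows. Field and enum names here are invented for illustration; the real struct is whatever the error catalogue deliverable defines.

```python
from dataclasses import dataclass
from enum import Enum

class ErrorClass(Enum):
    TRANSIENT = "transient"   # caller should retry with backoff
    PERMANENT = "permanent"   # requires operator intervention
    USAGE = "usage"           # programmer error; retrying cannot help

@dataclass
class DistError(Exception):
    code: str                 # canonical status code, e.g. UNAVAILABLE
    error_class: ErrorClass
    source_node: str          # provenance: which node produced the error
    ucxl_trace: str           # resolvable via the UCXL resolver

    def should_retry(self) -> bool:
        return self.error_class is ErrorClass.TRANSIENT

err = DistError(
    code="UNAVAILABLE",
    error_class=ErrorClass.TRANSIENT,
    source_node="node-042",
    ucxl_trace="ucxl://council-fault:monitor@DistOS:fault-tolerance"
               "/^^/events/node-042-timeout-2026-03-01T14:22:00Z",
)
```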
### 2.6 SDK Design for Multiple Languages
The SDK must present a coherent surface across four languages with different idioms. The design principles are:
- **C ABI as the foundation:** The canonical system call interface is a C ABI. All other language SDKs are generated or hand-written wrappers over the C ABI. This ensures ABI stability and FFI compatibility with every language.
- **Rust SDK:** Idiomatic, zero-cost wrappers. Use Rust's ownership system to enforce resource lifetimes at compile time (e.g., a `GpuBuffer<T>` type that is `Send` but not `Sync`, reflecting GPU buffer ownership semantics). The Rust SDK should use `#[repr(C)]` structs for ABI compatibility.
- **Go SDK:** Ergonomic wrappers using `cgo` for the C ABI. Expose distributed operations as channel-returning functions. Context-aware: all calls accept `context.Context` for cancellation and timeout propagation.
- **Python SDK:** High-level, NumPy-compatible. Consider auto-generating stub code from a schema. Must be `asyncio`-compatible. Integrate with the Python type system via `Protocol` and `TypedDict`.
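The "C ABI as the foundation" rule implies a recurring wrapper pattern at every FFI boundary: the native layer returns status codes, and each language SDK translates them into its idiomatic error form. The sketch below models that pattern in Python; `distos_native_alloc` is a stand-in for a real C ABI call (which would arrive via ctypes/cffi), and the status numbering is invented.

```python
STATUS_OK = 0
STATUS_UNAVAILABLE = 14   # illustrative numbering, not the DistOS ABI

def distos_native_alloc(size: int) -> tuple:
    """Stand-in for a C ABI call returning (status, handle)."""
    if size <= 0:
        return (STATUS_UNAVAILABLE, 0)
    return (STATUS_OK, 42)

class DistOsError(RuntimeError):
    """Idiomatic Python error carrying the native status code."""
    def __init__(self, status: int):
        super().__init__(f"native call failed with status {status}")
        self.status = status

def gpu_alloc(size: int) -> int:
    """Wrapper convention: status codes become exceptions, success
    returns the handle directly."""
    status, handle = distos_native_alloc(size)
    if status != STATUS_OK:
        raise DistOsError(status)
    return handle
```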
Key references:
- Klabnik, S. and Nichols, C. *The Rust Programming Language*. https://doc.rust-lang.org/book/. Idiomatic Rust patterns.
- Go Authors. *Effective Go*. https://go.dev/doc/effective_go. Idiomatic Go patterns.
- Google. *Google Cloud API Design Guide*. https://cloud.google.com/apis/design. The most comprehensive public API design guide, covering resource-oriented design, standard methods, naming conventions, and backwards compatibility.
- Smith, P. *Designing for Compatibility in Evolving APIs*. IEEE Software 39(4), 2022.
### 2.7 CLI Tooling Design
The cluster management CLI (`distos-ctl` or equivalent) must follow modern CLI design principles:
- Machine-readable output (JSON/YAML with `--output json`) for scripting
- Structured logging with log levels
- Human-readable default output with colour and progress indicators
- Completion generation for bash/zsh/fish
- Subcommand structure: `node`, `job`, `gpu`, `net`, `storage`, `secret`, `log`
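The subcommand-plus-`--output json` convention can be sketched with stdlib `argparse`. The command names come from the list above; the flag semantics and the payload shape are illustrative only.

```python
import argparse
import json

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="distos-ctl")
    p.add_argument("--output", choices=["human", "json"], default="human")
    sub = p.add_subparsers(dest="command", required=True)
    node = sub.add_parser("node")                 # one of the subcommands above
    node.add_argument("action", choices=["status", "list"])
    return p

def run(argv):
    args = build_parser().parse_args(argv)
    # Hypothetical result payload; a real CLI would query the cluster.
    result = {"command": args.command, "action": args.action, "nodes_up": 1024}
    if args.output == "json":
        return json.dumps(result)                 # machine-readable for scripting
    return f"{result['nodes_up']} nodes up"       # human-readable default
```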
Key references:
- Sigurdsson, A. et al. *Command Line Interface Guidelines*. https://clig.dev/. The community-written standard for modern CLI design. Should be treated as the style guide for `distos-ctl`.
- Hashicorp. *Vault CLI design*. The Vault CLI is an exemplar of a well-structured cluster management tool with consistent subcommand and flag conventions.
- Kubernetes. `kubectl` source and documentation. The de facto standard for distributed cluster management CLIs. The DistOS CLI should match `kubectl` conventions where applicable to reduce cognitive load.
### 2.8 API Versioning and Evolution Strategy
A system call interface must be stable. The versioning strategy must address:
- **Compatibility guarantees:** What changes are backwards-compatible (adding optional parameters, adding new calls) vs. breaking (changing parameter semantics, removing calls)?
- **Deprecation lifecycle:** Minimum deprecation notice period, deprecation markers in the SDK, removal schedule.
- **Version negotiation:** How does a client indicate the API version it was compiled against? How does the kernel report available versions?
- **Experimental APIs:** A clearly marked experimental tier for new calls before they enter the stable surface.
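Version negotiation under SemVer-style rules can be sketched as follows: the client states the version it was built against, the kernel reports what it can serve, and a serving version is compatible when the majors match and the kernel's minor is at least the client's. The function names and the negotiation policy are illustrative, not the ratified policy.

```python
def parse(v: str):
    major, minor, patch = (int(x) for x in v.split("."))
    return major, minor, patch

def negotiate(client_built_against: str, kernel_supported: list):
    """Return the newest kernel version compatible with the client,
    or None if the major versions cannot be reconciled."""
    cmaj, cmin, _ = parse(client_built_against)
    candidates = [
        v for v in kernel_supported
        if parse(v)[0] == cmaj and parse(v)[1] >= cmin
    ]
    return max(candidates, key=parse) if candidates else None
```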
Key references:
- Google. *Google Cloud API Versioning*. https://cloud.google.com/apis/design/versioning. URL-based versioning for REST APIs; the principles apply to system call versioning.
- Klabnik, S. "Stability as a Deliverable." https://blog.rust-lang.org/2014/10/30/Stability.html. Rust's stability commitment is a model for how a systems project can make and keep compatibility promises.
- Semantic Versioning Specification. https://semver.org/. The DistOS SDK and ABI will follow SemVer 2.0.
### 2.9 Plan 9 and Fuchsia Zircon Deep Dive
These two systems represent the clearest non-POSIX OS API designs and must be studied in depth:
- **Plan 9:** The 9P protocol represents all system resources as files served over a file system protocol. Network connections, processes, and graphics are files. The simplicity is extreme. The DistOS clean-slate layer need not adopt 9P but should understand its design philosophy.
- Pike, R. et al. "The Use of Name Spaces in Plan 9." *EUUG Newsletter* 12(1), 1992.
- Dorward, S. et al. "The Inferno Operating System." *Bell Labs Technical Journal* 2(1), 1997.
- **Fuchsia / Zircon:** Zircon is a microkernel with capabilities as the security primitive. Every kernel resource is a `zx_handle_t`. Handles are passed between processes explicitly; there is no global namespace for kernel objects. This is the preferred model for DistOS's capability integration with `council-sec`.
- Google. *Zircon Kernel Concepts*. https://fuchsia.dev/fuchsia-src/concepts/kernel/concepts.
- Google. *Zircon Syscall Reference*. https://fuchsia.dev/fuchsia-src/reference/syscalls.
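The Zircon handle model above can be sketched in miniature: kernel objects are reachable only through a per-process handle table (no global namespace), and duplication can only narrow rights, never amplify them. The rights constants and method names are illustrative, not Zircon's actual ABI.

```python
from dataclasses import dataclass

RIGHT_READ, RIGHT_WRITE = 1, 2   # illustrative rights bits

@dataclass
class Handle:
    object_id: int
    rights: int

class Process:
    def __init__(self):
        self.handles = {}        # per-process table; no global namespace
        self._next = 1

    def install(self, object_id: int, rights: int) -> int:
        h = self._next
        self._next += 1
        self.handles[h] = Handle(object_id, rights)
        return h

    def duplicate(self, h: int, rights: int) -> int:
        """Duplicate a handle with equal or narrower rights."""
        src = self.handles[h]
        if rights & ~src.rights:
            raise PermissionError("cannot amplify rights")
        return self.install(src.object_id, rights)
```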
---
## 3. Agent Roles
| Role | Count | Responsibilities |
|------|-------|-----------------|
| Lead API Architect | 1 | Decides overall API philosophy; coordinates with all subsystem councils; owns the master API specification document; resolves conflicts between API and subsystem requirements |
| POSIX Compatibility Analysts | 4 | Audit which POSIX calls must be retained; design the compatibility shim layer; document compatibility coverage gaps |
| GPU Syscall Designers | 6 | Design GPU-native system calls for kernel launch, memory, streams, events, graphs; ensure Hopper/Blackwell/Grace specifics are covered |
| Distributed Syscall Designers | 5 | Design RPC, distributed lock, barrier, and collective operation system calls; consult MPI and RDMA references |
| SDK Designers | 8 | Design language-specific SDKs: 2 per language (C, Rust, Go, Python); responsible for ergonomics, idiom conformance, and ABI stability |
| Async API Specialists | 4 | Design the async call model; specify io_uring-style ring buffer interface; ensure Rust/Go/Python async integration |
| CLI Designers | 3 | Design `distos-ctl` command structure, output formats, and completions |
| Error Handling Architects | 3 | Design structured error types, UCXL trace integration, and error propagation conventions across all SDK layers |
| API Versioning Strategists | 2 | Develop the versioning policy, deprecation lifecycle, compatibility matrix, and experimental API tier |
| Developer Experience Reviewers | 4 | Evaluate API usability; write developer-facing documentation and example applications; run internal "dogfooding" walkthroughs |
**Total:** 40 agents
---
## 4. Key Deliverables
All artifacts use the pattern `ucxl://council-api:{role}@DistOS:api/^^/{artifact-type}/{name}`.
### 4.1 Master API Philosophy Decision Record
```
ucxl://council-api:lead-api-architect@DistOS:api/^^/decisions/dr-api-01-philosophy.md
```
Covers the layered model decision: clean-slate primary API, POSIX compatibility shim, and the cost budget for the shim.
### 4.2 GPU System Call Specification
```
ucxl://council-api:gpu-syscall-designer@DistOS:api/^^/specs/gpu-syscalls.md
```
Full specification of all GPU-native system calls with parameter types, semantics, error codes, and Hopper/Blackwell/Grace specifics.
### 4.3 Distributed System Call Specification
```
ucxl://council-api:distributed-syscall-designer@DistOS:api/^^/specs/distributed-syscalls.md
```
### 4.4 Async Call Interface Specification
```
ucxl://council-api:async-api-specialist@DistOS:api/^^/specs/async-interface.md
```
Documents the submission/completion ring model, back-pressure semantics, and language runtime integration.
### 4.5 C ABI Reference
```
ucxl://council-api:sdk-designer@DistOS:api/^^/specs/c-abi-reference.h
```
The normative C header file. All other SDKs are derived from this.
### 4.6 Language SDK Specifications
```
ucxl://council-api:sdk-designer@DistOS:api/^^/specs/sdk-rust.md
ucxl://council-api:sdk-designer@DistOS:api/^^/specs/sdk-go.md
ucxl://council-api:sdk-designer@DistOS:api/^^/specs/sdk-python.md
```
### 4.7 Error Type Catalogue
```
ucxl://council-api:error-handling-architect@DistOS:api/^^/specs/error-catalogue.md
```
All structured error types with UCXL trace integration, severity levels, and retry guidance.
### 4.8 CLI Specification
```
ucxl://council-api:cli-designer@DistOS:api/^^/specs/distos-ctl-spec.md
```
Full command reference including all subcommands, flags, output formats, and completion scripts.
### 4.9 API Versioning Policy
```
ucxl://council-api:api-versioning-strategist@DistOS:api/^^/policies/versioning-policy.md
```
### 4.10 POSIX Compatibility Coverage Matrix
```
ucxl://council-api:posix-compatibility-analyst@DistOS:api/^^/specs/posix-compatibility-matrix.md
```
Tabulates every POSIX call: supported natively, supported via shim, not supported (with rationale).
### 4.11 Example Applications
```
ucxl://council-api:developer-experience-reviewer@DistOS:api/^^/examples/hello-distributed-gpu.md
ucxl://council-api:developer-experience-reviewer@DistOS:api/^^/examples/allreduce-collective.md
ucxl://council-api:developer-experience-reviewer@DistOS:api/^^/examples/weka-fs-streaming-io.md
```
---
## 5. Decision Points
All DRs use the address pattern `ucxl://council-api:lead-api-architect@DistOS:api/^^/decisions/{dr-id}.md`.
### DP-A01: POSIX vs. Clean-Slate vs. Layered
The foundational design philosophy choice. The default recommendation is the layered model, but this must be validated against: the cost of maintaining the shim layer, the risk of semantic leakage from POSIX into the clean-slate layer, and the developer familiarity benefit.
**Deciding parties:** Lead API Architect, POSIX Compatibility Analysts, `council-synth`
### DP-A02: Async System Call Mechanism
Choose between: io_uring-inspired ring buffer (lowest overhead, Linux precedent), a POSIX-extended `aio_*` interface (familiarity, limited expressiveness), or a fully custom completion port model. This decision is tightly coupled to the `council-mem` memory model (the ring buffer requires shared memory between kernel and userspace).
**Deciding parties:** Async API Specialists, `council-mem`, `council-verify` (for ABI safety check)
### DP-A03: GPU Memory API at the Syscall Layer vs. Library Layer
Should GPU memory allocation (`cuMemAlloc` equivalent) be a kernel-mediated system call (allowing the OS to account for and schedule GPU memory as a first-class resource) or a library call that bypasses the kernel after initial device setup? Kernel mediation adds latency; bypass reduces accounting fidelity.
**Deciding parties:** GPU Syscall Designers, `council-mem`, `council-telemetry`
### DP-A04: RPC Mechanism for Distributed System Calls
Choose the wire protocol for remote procedure calls: gRPC (typed, HTTP/2, mature), a custom binary protocol over RDMA (lowest latency, highest implementation cost), or a two-tier model (gRPC for control plane, RDMA for data plane). The choice directly affects the latency budget for distributed system calls.
**Deciding parties:** Distributed Syscall Designers, `council-net`
### DP-A05: SDK Code Generation vs. Hand-Written Wrappers
Decide whether to generate the Rust, Go, and Python SDKs from a schema definition (IDL, such as Protocol Buffers or a custom DSL) or maintain hand-written wrappers. Generated code is more consistent; hand-written code can be more idiomatic. A hybrid (generate the boilerplate, hand-write ergonomic wrappers) is the likely outcome.
**Deciding parties:** SDK Designers, API Versioning Strategists
### DP-A06: Authentication and Authorisation API
How does a process prove its identity to the kernel and acquire capabilities? Options: token-based (JWT or similar), capability handles (Zircon model), certificate-based (X.509 with a cluster CA), or UCXL-scoped credentials. This decision must be made jointly with `council-sec`.
**Deciding parties:** Lead API Architect, `council-sec`
---
## 6. Dependencies on Other Councils
`council-api` is the integrating council: every subsystem council produces functionality, and `council-api` exposes that functionality through a coherent surface. It is therefore a downstream consumer of requirements from all councils and an upstream provider to `council-docs` and `council-verify`.
| Council | Relationship | What council-api consumes | What council-api produces |
|---------|-------------|--------------------------|--------------------------|
| `council-sched` | Consuming requirements | Job submission semantics, priority model, queue management APIs | Scheduler-facing system calls in API spec |
| `council-mem` | Bidirectional | Memory model, allocation semantics, consistency guarantees | Memory system call specs; async memory API |
| `council-net` | Bidirectional | Network abstraction primitives, RDMA capabilities | Network system calls; distributed RPC wire protocol choice |
| `council-fault` | Consuming requirements | Failure notification model, recovery primitives | Fault-tolerance-related error codes; node failure event API |
| `council-sec` | Bidirectional | Capability model, identity primitives, isolation guarantees | Authentication/authorisation API surface; capability handle design |
| `council-telemetry` | Consuming requirements | Metering call semantics, SLO query interface | Telemetry-facing SDK surface; metering call specs |
| `council-verify` | Providing for verification | N/A | API interface contracts for formal verification |
| `council-qa` | Providing for test design | N/A | API spec enables QA to design conformance tests |
| `council-synth` | Receiving directives | Cross-council conflict resolutions affecting API design | Updates to API spec when directed by synth |
| `council-docs` | Providing for documentation | N/A | All API specs feed directly into the reference documentation |
**Critical path constraint:** `council-api` cannot finalise the distributed system call interface until `council-net` has committed to its RPC and RDMA model (DP-A04 depends on this). GPU system call design can proceed independently from Day 1.
---
## 7. WHOOSH Configuration
### 7.1 Team Formation
```yaml
council_id: council-api
display_name: "API Surface and Developer Experience Council"
target_size: 40
formation_strategy: competency_weighted
required_roles:
  - role: lead-api-architect
    count: 1
    persona: systems-analyst
    competencies: [api-design, posix, distributed-systems, gpu-programming, developer-experience]
  - role: posix-compatibility-analyst
    count: 4
    persona: technical-specialist
    competencies: [posix, linux-kernel, system-calls, abi-stability]
  - role: gpu-syscall-designer
    count: 6
    persona: technical-specialist
    competencies: [cuda, rocm, gpu-memory, hopper-architecture, blackwell-architecture, nvlink]
  - role: distributed-syscall-designer
    count: 5
    persona: technical-specialist
    competencies: [rpc, rdma, mpi-collectives, distributed-locks, grpc]
  - role: sdk-designer
    count: 8
    persona: technical-specialist
    competencies: [c-abi, rust-async, go-concurrency, python-asyncio, ffi, sdk-design]
  - role: async-api-specialist
    count: 4
    persona: technical-specialist
    competencies: [io-uring, async-io, rust-futures, event-driven-design]
  - role: cli-designer
    count: 3
    persona: technical-specialist
    competencies: [cli-design, ux, kubectl-conventions, shell-completion]
  - role: error-handling-architect
    count: 3
    persona: systems-analyst
    competencies: [error-design, structured-errors, distributed-tracing, ucxl]
  - role: api-versioning-strategist
    count: 2
    persona: systems-analyst
    competencies: [api-versioning, semver, deprecation-policy, compatibility]
  - role: developer-experience-reviewer
    count: 4
    persona: technical-writer
    competencies: [developer-documentation, api-usability, example-applications, dogfooding]
```
### 7.2 Quorum Rules
```yaml
quorum:
  decision_threshold: 0.65        # 65% of active agents must agree on API design decisions
  lead_architect_veto: true       # Lead API Architect can block any interface decision
  breaking_change_threshold: 0.85 # Breaking changes require 85% supermajority
  cross_council_approval:
    trigger: api_affects_subsystem
    required: [affected_council_lead, council-synth]
    response_sla_hours: 6
  developer_experience_review:
    trigger: new_public_call
    required: [developer-experience-reviewer_count >= 2]
    purpose: "Ensure every new call meets ergonomics standard before it enters the spec"
```
### 7.3 Subchannels
```yaml
subchannels:
  - id: api-posix-compat
    subscribers: [posix-compatibility-analyst, lead-api-architect]
    purpose: "POSIX coverage analysis, shim design, compatibility gap triage"
    ucxl_feed: "ucxl://council-api:posix-compatibility-analyst@DistOS:api/^^/specs/posix-*"
  - id: api-gpu-syscalls
    subscribers: [gpu-syscall-designer, lead-api-architect, async-api-specialist]
    purpose: "GPU-native system call design; Hopper/Blackwell capability integration"
    ucxl_feed: "ucxl://council-api:gpu-syscall-designer@DistOS:api/^^/specs/gpu-*"
  - id: api-distributed-syscalls
    subscribers: [distributed-syscall-designer, lead-api-architect]
    purpose: "Distributed call design; RPC and RDMA protocol negotiation with council-net"
    ucxl_feed: "ucxl://council-api:distributed-syscall-designer@DistOS:api/^^/specs/distributed-*"
  - id: api-sdk-coordination
    subscribers: [sdk-designer, async-api-specialist, developer-experience-reviewer]
    purpose: "Cross-language SDK consistency; ABI stability coordination"
    ucxl_feed: "ucxl://council-api:sdk-designer@DistOS:api/^^/specs/sdk-*"
  - id: api-error-and-versioning
    subscribers: [error-handling-architect, api-versioning-strategist, lead-api-architect]
    purpose: "Error catalogue development; versioning policy; UCXL trace integration"
    ucxl_feed: "ucxl://council-api:error-handling-architect@DistOS:api/^^/specs/error-*"
  - id: api-cross-council-requirements
    subscribers: [lead-api-architect, distributed-syscall-designer, gpu-syscall-designer]
    purpose: "Inbound requirements from all subsystem councils; tracks what each council needs exposed"
    ucxl_feed: "ucxl://council-*:*@DistOS:*/^^/requirements/api-*"
  - id: api-devex-review
    subscribers: [developer-experience-reviewer, lead-api-architect]
    purpose: "Developer experience walkthroughs; example application drafts; usability feedback"
    ucxl_feed: "ucxl://council-api:developer-experience-reviewer@DistOS:api/^^/examples/*"
```
---
## 8. Success Criteria
1. **Complete API surface:** The master API specification covers all system calls required by all six core subsystem councils. No subsystem has an unaddressed API requirement at the end of Phase 4.
2. **POSIX coverage documented:** The POSIX compatibility matrix exists and classifies every POSIX.1-2017 system call as supported, shim-supported, or explicitly unsupported with rationale.
3. **GPU system calls complete:** All GPU-native system calls for Hopper, Grace, and Blackwell are specified with parameter types, semantics, and error codes. NVLink/NVSwitch topology-aware calls are included.
4. **Distributed system calls complete:** All distributed calls (RPC, locks, barriers, collectives) are specified with failure semantics and consistency guarantees matching the `council-fault` and `council-net` specs.
5. **Four-language SDK specs complete:** C ABI, Rust, Go, and Python SDK specifications exist and have been reviewed for idiomatic correctness by SDK Designers.
6. **Error handling consistent:** All error types are catalogued and every public API call has a documented error table. Every error carries a UCXL trace field.
7. **Versioning policy ratified:** The versioning policy is agreed with `council-synth` and published. The experimental API tier is defined.
8. **Verification-ready contracts:** All interface contracts have been delivered to `council-verify` in Alloy-compatible form by Day 8.
9. **Developer experience validated:** At least three example applications have been written by Developer Experience Reviewers and cover: a simple GPU computation, a distributed collective operation, and a Weka FS streaming I/O pattern.
10. **CLI specification complete:** `distos-ctl` subcommand structure and all primary flags are specified.
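Criterion 6 implies a uniform error shape across the SDKs. As a hedged illustration only — the `DistosError` name, field layout, and the specific code string are assumptions, not ratified spec — one way the Python SDK could back the per-call error tables while keeping a UCXL trace on every error:

```python
from dataclasses import dataclass


@dataclass
class DistosError(Exception):
    """Hypothetical base error for the Python SDK sketch.

    Every error carries a UCXL trace field pointing back to the
    catalogue entry that defines it (success criterion 6).
    """
    code: str        # stable symbolic code from the error catalogue
    message: str     # human-readable summary
    ucxl_trace: str  # catalogue address, e.g. under .../specs/error-*

    def __str__(self) -> str:
        return f"[{self.code}] {self.message} (trace: {self.ucxl_trace})"


# A catalogued error is raised with its trace attached:
err = DistosError(
    code="E-GPU-001",  # placeholder code, not from the real catalogue
    message="kernel launch rejected: stream not owned by caller",
    ucxl_trace="ucxl://council-api:error-handling-architect@DistOS:api/^^/specs/error-*",
)
```

The point of the sketch is the invariant, not the names: no error type without a catalogue code, and no code without a UCXL address a reviewer can follow.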
---
## 9. Timeline
### Phase 1: Research (Days 1-3)
- POSIX Compatibility Analysts catalogue POSIX.1-2017 system calls and assess coverage feasibility
- GPU Syscall Designers survey CUDA Driver API, CUDA Graphs, Hopper/Blackwell architecture documentation, NVLink topology implications
- Distributed Syscall Designers survey MPI collectives, gRPC, RDMA verbs, ZooKeeper/Chubby lock models
- SDK Designers survey language ecosystems: Rust async patterns, Go `cgo` patterns, Python asyncio/CuPy
- Async API Specialists study io_uring interface in depth
- Lead API Architect drafts the API philosophy options paper for DP-A01
- Deliverable: `ucxl://council-api:lead-api-architect@DistOS:api/^^/research/api-philosophy-options.md`
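io_uring's core idea — a submission queue the caller batches operations into and a completion queue the kernel drains results into, with one boundary crossing per batch rather than per operation — is the pattern the Async API Specialists are evaluating. A minimal pure-Python model of the two-queue handshake (the names and structure are illustrative, not a proposed ABI):

```python
from collections import deque


class RingModel:
    """Toy model of an io_uring-style submission/completion queue pair.

    The caller batches operations into the submission queue (SQ),
    submit() processes the batch (standing in for the kernel), and
    results appear on the completion queue (CQ) keyed by user_data.
    """

    def __init__(self):
        self.sq = deque()  # submission entries: (user_data, op)
        self.cq = deque()  # completion entries: (user_data, result)

    def prep(self, user_data, op):
        """Queue an operation without entering the 'kernel'."""
        self.sq.append((user_data, op))

    def submit(self):
        """One batched 'syscall': drain the SQ, push completions to the CQ."""
        n = len(self.sq)
        while self.sq:
            user_data, op = self.sq.popleft()
            self.cq.append((user_data, op()))
        return n

    def completions(self):
        """Reap all available completions."""
        while self.cq:
            yield self.cq.popleft()


ring = RingModel()
ring.prep(1, lambda: "read-done")
ring.prep(2, lambda: "write-done")
submitted = ring.submit()           # one boundary crossing for two ops
results = dict(ring.completions())  # {1: 'read-done', 2: 'write-done'}
```

What makes this pattern attractive for an async-first surface is that the `user_data` tag maps cleanly onto a Rust future, a Go channel send, or a Python `asyncio.Future` resolution on the SDK side.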
### Phase 2: Architecture (Days 3-6)

- Resolve DP-A01 (philosophy), DP-A02 (async mechanism), DP-A04 (RPC wire protocol), DP-A06 (auth/authz) — all in consultation with relevant councils
- Lead API Architect drafts the call taxonomy: which calls belong in which layer (kernel/shim/library)
- GPU Syscall Designers draft the GPU system call prototype spec for Hopper and Blackwell
- Distributed Syscall Designers draft the distributed call prototype spec, contingent on DP-A04 resolution
- Error Handling Architects draft the error type taxonomy and UCXL trace integration
- Deliverable: `ucxl://council-api:lead-api-architect@DistOS:api/^^/research/call-taxonomy.md`
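The call taxonomy assigns every public call to exactly one layer of the kernel/shim/library split. A sketch of how that classification might be tabulated and enforced — the specific call assignments here are placeholders for illustration, not decisions, which come out of DP-A01:

```python
from enum import Enum


class Layer(Enum):
    KERNEL = "kernel"      # true system call, stable ABI
    SHIM = "shim"          # POSIX compatibility layer over native calls
    LIBRARY = "library"    # userspace SDK convenience, no ABI guarantee


# Placeholder assignments for illustration only.
CALL_TAXONOMY = {
    "gpu_kernel_launch": Layer.KERNEL,
    "dist_barrier":      Layer.KERNEL,
    "posix_open":        Layer.SHIM,
    "stream_pipeline":   Layer.LIBRARY,
}


def layer_of(call: str) -> Layer:
    """Look up a call's layer; an unclassified call is a spec gap."""
    try:
        return CALL_TAXONOMY[call]
    except KeyError:
        raise KeyError(f"unclassified call: {call} (taxonomy incomplete)")
```

Keeping the taxonomy as a single machine-readable table makes the Phase 4 completeness check (no subsystem with an unaddressed API requirement) a lookup rather than a document audit.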
### Phase 3: Formal Specification (Days 6-10)
- Full API spec written: GPU syscalls, distributed syscalls, async interface, C ABI reference
- Language SDK specifications written in parallel by SDK Designers
- Error catalogue completed and UCXL trace integration specified
- Alloy interface contracts delivered to `council-verify` for structural verification
- CLI specification drafted by CLI Designers
- POSIX compatibility matrix completed
- Deliverable: `ucxl://council-api:gpu-syscall-designer@DistOS:api/^^/specs/gpu-syscalls.md` and all companion specs
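The POSIX compatibility matrix reduces to a three-way classification with a mandatory rationale on the unsupported rows. A hedged sketch of one possible row format — the call names and verdicts below are examples, not entries from the actual matrix:

```python
from dataclasses import dataclass
from enum import Enum


class PosixStatus(Enum):
    SUPPORTED = "supported"      # native kernel implementation
    SHIM = "shim-supported"      # emulated by the compatibility layer
    UNSUPPORTED = "unsupported"  # explicitly rejected, rationale required


@dataclass(frozen=True)
class MatrixRow:
    syscall: str
    status: PosixStatus
    rationale: str = ""

    def __post_init__(self):
        # Unsupported entries must carry a rationale (success criterion 2).
        if self.status is PosixStatus.UNSUPPORTED and not self.rationale:
            raise ValueError(f"{self.syscall}: unsupported without rationale")


# Example rows; the fork verdict is a hypothetical, not a council decision.
rows = [
    MatrixRow("read", PosixStatus.SUPPORTED),
    MatrixRow("fork", PosixStatus.UNSUPPORTED,
              "hypothetical: distributed task model, no fork semantics"),
]
```

Encoding the rationale requirement as a constructor check means an incomplete matrix fails loudly during spec tooling runs instead of surfacing as a blank cell in review.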
### Phase 4: Integration (Days 10-12)
- Resolve any outstanding API requirements from subsystem councils surfaced during their Phase 3 spec work
- DP-A03 and DP-A05 resolved with full DR records
- API versioning policy ratified by `council-synth`
- Developer Experience Reviewers conduct walkthroughs of all three example applications
- Deliver final interface contracts to `council-verify` for re-verification after any Phase 3 changes
- Deliverable: Versioning policy, three example applications
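One plausible shape for the ratified policy is that stable calls follow semantic versioning while the experimental tier opts out of compatibility guarantees entirely. That can be sketched as a single compatibility predicate — the `0.y.z` marker for the experimental tier and the rules below are assumptions pending the `council-synth` decision:

```python
def compatible(provided: str, required: str) -> bool:
    """Return True if an SDK built against `required` can run on `provided`.

    Assumed rules: experimental APIs ("0.y.z") require an exact version
    match; stable APIs require the same major version and a provided
    minor version >= the required minor version.
    """
    p_major, p_minor, _ = (int(x) for x in provided.split("."))
    r_major, r_minor, _ = (int(x) for x in required.split("."))
    if r_major == 0:  # experimental tier: no guarantees across versions
        return provided == required
    return p_major == r_major and p_minor >= r_minor


compatible("1.4.0", "1.2.7")  # → True: stable, minor additions are fine
compatible("0.3.0", "0.2.0")  # → False: experimental, exact match only
```

A predicate like this is what the SDK loaders and `distos-ctl` would evaluate at startup, so the policy document and the enforcement code cannot drift apart.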
### Phase 5: Documentation (Days 12-14)
- Developer Experience Reviewers produce the developer-facing API reference document
- SDK Designers produce getting-started guides for each language
- All specs integrated into the master DistOS specification document via `council-docs`
- Final UCXL navigability check: every API call traces back to the council decision that introduced it
- Deliverable: `ucxl://council-api:developer-experience-reviewer@DistOS:api/^^/docs/api-reference.md`