ByteOr v1 failure and lifecycle
Synced from repo docs
This page is synced from docs/spec/v1-failure-lifecycle.md via docs/public-docs.json. Edit the owning repo source instead of this generated copy. GitHub source: https://github.com/byteor-systems/byteor/blob/master/docs/spec/v1-failure-lifecycle.md
This document defines the normative failure and lifecycle behavior for the ByteOr OSS v1 runtime surface.
RFC 2119/8174 keywords in this document are normative.
Validation before execution
- Executors MUST validate a spec before running it.
- Backing/open helpers MUST validate SHM headers and layout compatibility before exposing the mapping to runtime code.
- Openers MUST treat existing mappings as untrusted input.
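As a rough illustration of validate-before-expose, the sketch below checks a byte slice that stands in for an existing mapping. The header layout, the `MAGIC` value, the field offsets, and every name here (`validate_header`, `OpenError`, `Header`) are assumptions made for this example; they are not the ByteOr wire format or API.

```rust
// Hypothetical header: magic, layout version, capability bits, payload length.
// Each field is a little-endian u32. None of this is the real ByteOr layout.
pub const MAGIC: u32 = 0x4259_5452;
pub const SUPPORTED_LAYOUT: u32 = 1;
pub const HEADER_LEN: usize = 16;

#[derive(Debug, PartialEq)]
pub enum OpenError {
    Truncated { have: usize, need: usize },
    BadMagic(u32),
    LayoutMismatch { found: u32, expected: u32 },
    MissingCapabilities(u32),
}

#[derive(Debug, PartialEq)]
pub struct Header {
    pub layout: u32,
    pub capabilities: u32,
    pub payload_len: u32,
}

// Callers must have checked that `offset + 4 <= bytes.len()`.
fn read_u32(bytes: &[u8], offset: usize) -> u32 {
    u32::from_le_bytes([
        bytes[offset],
        bytes[offset + 1],
        bytes[offset + 2],
        bytes[offset + 3],
    ])
}

/// Treats `bytes` as untrusted input and returns a parsed header only if
/// every check passes. Any failure is a hard-open failure.
pub fn validate_header(bytes: &[u8], required_caps: u32) -> Result<Header, OpenError> {
    if bytes.len() < HEADER_LEN {
        return Err(OpenError::Truncated { have: bytes.len(), need: HEADER_LEN });
    }
    let magic = read_u32(bytes, 0);
    if magic != MAGIC {
        return Err(OpenError::BadMagic(magic));
    }
    let layout = read_u32(bytes, 4);
    if layout != SUPPORTED_LAYOUT {
        return Err(OpenError::LayoutMismatch { found: layout, expected: SUPPORTED_LAYOUT });
    }
    let capabilities = read_u32(bytes, 8);
    let missing = required_caps & !capabilities;
    if missing != 0 {
        return Err(OpenError::MissingCapabilities(missing));
    }
    let payload_len = read_u32(bytes, 12);
    // The mapping must be at least as long as the header claims.
    let available = bytes.len() - HEADER_LEN;
    if payload_len as usize > available {
        let need = HEADER_LEN.saturating_add(payload_len as usize);
        return Err(OpenError::Truncated { have: bytes.len(), need });
    }
    Ok(Header { layout, capabilities, payload_len })
}
```

In this sketch runtime code only ever sees a `Header` value, so a mapping that fails any check cannot reach it by construction.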
Startup
- A runtime MUST either complete initialization successfully or fail with an error.
- If a peer dies during SHM initialization, openers MUST fail after a bounded wait rather than hang indefinitely.
- A partially initialized mapping MUST NOT be treated as a valid live runtime.
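One way to picture the bounded-wait rule is a polling loop with a deadline over an initialization state word. This is a minimal sketch under stated assumptions: the state values, `wait_until_ready`, and `InitError` are hypothetical names, and a plain `AtomicU32` stands in for a field inside the shared mapping.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical initialization states written by the creating peer.
pub const STATE_INITIALIZING: u32 = 1;
pub const STATE_READY: u32 = 2;

#[derive(Debug, PartialEq)]
pub enum InitError {
    /// The peer never published STATE_READY within the allowed time.
    TimedOut { last_state: u32 },
}

/// Polls `state` until it reads STATE_READY or `timeout` elapses.
/// A peer that died mid-initialization leaves the state stuck, so the
/// opener gives up with an error instead of waiting forever.
pub fn wait_until_ready(state: &AtomicU32, timeout: Duration) -> Result<(), InitError> {
    let deadline = Instant::now() + timeout;
    loop {
        let observed = state.load(Ordering::Acquire);
        if observed == STATE_READY {
            return Ok(());
        }
        if Instant::now() >= deadline {
            return Err(InitError::TimedOut { last_state: observed });
        }
        thread::sleep(Duration::from_millis(1));
    }
}
```

Because the only success path is an observed `STATE_READY`, a mapping left in any other state is never treated as live.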
Runtime failure handling
- If any stage thread returns an error, the runtime MUST request a cooperative global stop.
- If any stage thread panics, the runtime MUST request a cooperative global stop.
- stop_and_join()-style runtime shutdown MUST join the worker threads it created before returning to the caller.
- Error reporting SHOULD identify the first failing stage when that information is available.
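The rules above could be sketched with a shared stop flag, a slot that records the first failure, and a shutdown method that joins every worker. Only the name stop_and_join comes from this document; the `Runtime` struct, the `Stage` type, and `StageFailure` are hypothetical shapes chosen for the example, not the ByteOr API.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical stage body: runs until done or until it sees the stop flag.
pub type Stage = Box<dyn FnOnce(&AtomicBool) -> Result<(), String> + Send + 'static>;

#[derive(Debug, Clone, PartialEq)]
pub enum StageFailure {
    Error { stage: usize, message: String },
    Panic { stage: usize },
}

pub struct Runtime {
    stop: Arc<AtomicBool>,
    first_failure: Arc<Mutex<Option<StageFailure>>>,
    workers: Vec<JoinHandle<()>>,
}

impl Runtime {
    pub fn spawn(stages: Vec<Stage>) -> Runtime {
        let stop = Arc::new(AtomicBool::new(false));
        let first_failure = Arc::new(Mutex::new(None));
        let mut workers = Vec::new();
        for (stage, body) in stages.into_iter().enumerate() {
            let stop = Arc::clone(&stop);
            let first_failure = Arc::clone(&first_failure);
            workers.push(thread::spawn(move || {
                // Catch panics so they are handled like returned errors.
                let outcome = catch_unwind(AssertUnwindSafe(|| body(&*stop)));
                let failure = match outcome {
                    Ok(Ok(())) => return,
                    Ok(Err(message)) => StageFailure::Error { stage, message },
                    Err(_) => StageFailure::Panic { stage },
                };
                {
                    // Keep only the first failure that gets recorded.
                    let mut slot = first_failure.lock().unwrap();
                    if slot.is_none() {
                        *slot = Some(failure);
                    }
                }
                // Request a cooperative global stop; other stages poll this flag.
                stop.store(true, Ordering::Release);
            }));
        }
        Runtime { stop, first_failure, workers }
    }

    /// Requests a stop, joins every worker this runtime created, then
    /// reports the first recorded failure, if any.
    pub fn stop_and_join(self) -> Option<StageFailure> {
        self.stop.store(true, Ordering::Release);
        for worker in self.workers {
            let _ = worker.join();
        }
        let failure = self.first_failure.lock().unwrap().take();
        failure
    }
}
```

The stop is cooperative in this sketch: a stage that never checks the flag would keep `stop_and_join` blocked, so stage bodies are assumed to poll it regularly.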
Backpressure and progress
- Producer progress MUST stop before overrunning the minimum active gating sequence.
- A slow or blocked downstream stage MAY stall upstream publication through backpressure.
- Executors MUST preserve the validated dependency/barrier structure; they MUST NOT skip declared dependencies to "keep things moving".
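The gating rule can be illustrated with a small claim check over a fixed-capacity ring. This is a sketch under assumptions, not the ByteOr implementation: `try_claim` is a hypothetical helper, and sequences are modeled as running counts (slots published so far, slots consumed so far by each gating consumer).

```rust
/// Returns `Some(next_seq)` if the producer may publish slot `next_seq`,
/// or `None` if doing so would overrun the slowest gating consumer.
///
/// `next_seq` is the number of slots published so far, `consumed` holds
/// how many slots each gating consumer has finished, and `capacity` is
/// the ring size. All of these are illustrative conventions.
pub fn try_claim(next_seq: u64, capacity: u64, consumed: &[u64]) -> Option<u64> {
    // The minimum active gating sequence bounds the producer.
    // With no gating consumers, nothing holds the producer back.
    let min_gating = consumed.iter().copied().min().unwrap_or(next_seq);
    // A consumer ahead of the producer is an invalid state; refuse to publish.
    let in_flight = next_seq.checked_sub(min_gating)?;
    if in_flight < capacity {
        Some(next_seq)
    } else {
        None // Backpressure: the caller waits and retries later.
    }
}
```

For example, with a capacity of 4 and consumers at 0 and 2, a producer that has published 4 slots gets `None` and stalls until the slowest consumer advances.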
SHM lifecycle
ByteOr v1 uses file-backed SHM mappings for the SHM path.
- Mapping files MAY remain on disk after an abnormal exit.
- A stale file MUST be revalidated before reuse.
- Layout mismatch, missing required capabilities, truncated mappings, or malformed headers MUST be treated as hard-open failures.
- The OSS runtime does not promise durable recovery from crashes; stale artifacts are an operator cleanup concern.
Non-goals
The current v1 OSS lifecycle contract does not include:
- dynamic stage attach/detach
- automatic recovery from arbitrary corrupted SHM state
- durable replay guarantees for every backing/runtime combination
- best-effort execution of known-invalid specs