Cloud Incident and Replay
ByteOr Cloud provides an integrated incident capture and replay workflow through the control plane.
Artifact Collection
Agents upload artifacts to the control plane:
- Incident bundles from incident-bundle captures
- State snapshots from snapshot captures
- Runtime diagnostic outputs
Artifacts are stored with metadata including agent, environment, capture time, and content hash.
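As a rough illustration of how the content hash in that metadata could be used during an investigation, the sketch below builds a record and checks downloaded bytes against it. The ArtifactRecord type, its field names, and the choice of SHA-256 are assumptions made for this example; the actual ByteOr Cloud schema and hash algorithm are not specified here.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ArtifactRecord:
    # Hypothetical field names; the source lists only the metadata categories.
    agent: str
    environment: str
    captured_at: datetime
    content_hash: str


def content_hash(payload: bytes) -> str:
    # SHA-256 is an assumption; the real hash algorithm is not stated.
    return hashlib.sha256(payload).hexdigest()


def make_record(agent: str, environment: str, payload: bytes) -> ArtifactRecord:
    # Stamp the record with the current UTC time and the payload's hash.
    return ArtifactRecord(
        agent=agent,
        environment=environment,
        captured_at=datetime.now(timezone.utc),
        content_hash=content_hash(payload),
    )


def matches(record: ArtifactRecord, payload: bytes) -> bool:
    # True when the bytes hash to the value stored in the record.
    return record.content_hash == content_hash(payload)
```

Comparing the stored hash against the downloaded bytes before launching a replay is one way to confirm that the artifact under investigation is the one that was captured.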
Investigation Flow
- Identify the incident, agent, deployment, and environment
- Open the stored artifact record in the Cloud UI
- Confirm artifact metadata and capture time
- Launch a dry-run replay to validate the reconstructed input
- Review the replay audit result
- Escalate to execute-mode replay only when the environment posture and approval coverage allow it
Dry-Run Replay
Use dry run for investigation:
- Inspect behavior without executing against live infrastructure
- Validate that the captured input reproduces the issue
- Confirm the pipeline version matches the incident
- Test remediation strategies
Execute-Mode Replay
Use execute mode only when:
- Dry run confirms the issue
- The environment posture permits governed execution
- A matching pipeline version can be resolved
- Approval coverage is attached
The backend validates all prerequisites before allowing execution to begin.
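The four prerequisites above can be sketched as a single gate that reports every unmet condition. This is a minimal illustration, not the backend's actual validation logic; ReplayRequest and its fields are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ReplayRequest:
    # Hypothetical fields mirroring the four prerequisites in the list above.
    dry_run_confirmed: bool
    posture_allows_execution: bool
    pipeline_version: Optional[str]  # None when no matching version resolves
    approval_attached: bool


def unmet_prerequisites(req: ReplayRequest) -> List[str]:
    # Collect every failed check so the caller sees all blockers at once.
    unmet = []
    if not req.dry_run_confirmed:
        unmet.append("dry run has not confirmed the issue")
    if not req.posture_allows_execution:
        unmet.append("environment posture does not permit governed execution")
    if req.pipeline_version is None:
        unmet.append("no matching pipeline version resolved")
    if not req.approval_attached:
        unmet.append("approval coverage is not attached")
    return unmet


def may_execute(req: ReplayRequest) -> bool:
    # Execution is allowed only when nothing is unmet.
    return not unmet_prerequisites(req)
```

Returning the full list of unmet conditions, rather than failing on the first one, lets an operator address all blockers in one pass.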
Monitoring Replay
Cloud surfaces replay status:
- Replay request creation and authorization check
- Pipeline version resolution
- Execution status and progress
- Resulting audit records
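One way to picture these status items is as an ordered set of stages that a replay moves through. The sketch below assumes the stages occur in the order listed, which the source does not state explicitly; the ReplayStage names are hypothetical and do not correspond to a documented API.

```python
from enum import Enum
from typing import Optional


class ReplayStage(Enum):
    # Hypothetical stage names, in the order the status items are listed.
    REQUEST_CREATED = 1
    AUTHORIZATION_CHECKED = 2
    VERSION_RESOLVED = 3
    EXECUTING = 4
    AUDIT_RECORDED = 5


def next_stage(stage: ReplayStage) -> Optional[ReplayStage]:
    # Enum iteration follows definition order, so index + 1 is the next stage.
    stages = list(ReplayStage)
    position = stages.index(stage)
    if position + 1 < len(stages):
        return stages[position + 1]
    return None  # AUDIT_RECORDED is the final stage in this sketch
```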
Operational Escalation
Escalate when:
- The source environment cannot be resolved
- The artifact spec hash does not map to a known pipeline version
- Approval coverage is missing or rejected
- The agent shows repeated authorization failures during investigation
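The escalation conditions above could be checked mechanically along the following lines. This is a sketch under stated assumptions: InvestigationState, its fields, the approval status strings, and the threshold of three authorization failures for "repeated" are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: "repeated" is not quantified in the source; 3 is illustrative.
AUTH_FAILURE_THRESHOLD = 3


@dataclass(frozen=True)
class InvestigationState:
    # Hypothetical fields mirroring the escalation conditions listed above.
    source_environment: Optional[str]  # None when it cannot be resolved
    pipeline_version: Optional[str]    # None when the spec hash maps to nothing
    approval_status: str               # "approved", "missing", or "rejected"
    auth_failures: int


def escalation_reasons(state: InvestigationState) -> List[str]:
    # Return every condition that applies; an empty list means no escalation.
    reasons = []
    if state.source_environment is None:
        reasons.append("source environment cannot be resolved")
    if state.pipeline_version is None:
        reasons.append("artifact spec hash does not map to a known pipeline version")
    if state.approval_status in ("missing", "rejected"):
        reasons.append("approval coverage is " + state.approval_status)
    if state.auth_failures >= AUTH_FAILURE_THRESHOLD:
        reasons.append("agent shows repeated authorization failures")
    return reasons
```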