Benchmarking
This page defines the public benchmarking posture for the byteor-enterprise workspace.
The checked-in benchmark surface lives under bench/byteor-enterprise-bench, with regression thresholds in bench/perf_baseline.kv and automation through cargo run -p xtask -- perf.
What this page covers
- how enterprise benchmarks should be interpreted publicly
- which checked-in harnesses back published numbers
- what context must accompany product-level performance claims
Public benchmark posture
Enterprise benchmark numbers are reproducible baseline evidence for product surfaces such as EdgePlane, ActionGraph, DataGuard, and SingleRing-backed execution.
They are not hosted-service SLOs and they are not a substitute for operator readiness checks.
As noted in Operator Flow, doctor output is a contract check, not a benchmark.
Current harness and gate
The public enterprise benchmark flow is:
- run the harness with cargo run -p byteor-enterprise-bench for developer iteration or cargo run -p byteor-enterprise-bench -- --machine for lower-noise machine output
- enforce or refresh the checked-in gate with cargo run -p xtask -- perf and cargo run -p xtask -- perf --update
- compare against bench/perf_baseline.kv, which stores conservative throughput, p99, and CPU thresholds for each enterprise bench name
The harness supports the same tuning surfaces the runtime uses, including memlock, CPU pinning, scheduler policy, and RT priority.
Bench families currently covered
The checked-in enterprise baseline currently includes:
- edgeplane_identity_stage
- actiongraph_http_post_dry_run
- dataguard_chain_mapkv
- dataguard_fused_mapkv
- single_ring_chain
- single_ring_dag
- single_ring_sharded
Use those bench names directly when reporting results so readers can line them up with the checked-in baseline.
Minimum methodology disclosure
When you publish or compare enterprise numbers, include at least:
- the exact bench name and product surface
- whether the run used default or machine mode
- CPU allowlist, pinning preset, scheduler policy, and RT priority if any
- memlock posture, tmpdir backing, and host/kernel details
- throughput plus tail latency and CPU cost
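One way to keep a published result from silently dropping an item on this list is to collect the disclosure in a single record and check it for blanks before publishing. The sketch below is illustrative only: the Disclosure type, its field names, and missing_fields are hypothetical and are not part of the byteor-enterprise workspace.

```rust
// Hypothetical record type: the field names are illustrative and are not
// taken from the byteor-enterprise workspace.
#[allow(dead_code)]
#[derive(Debug, Clone, Default)]
struct Disclosure {
    bench_name: String,
    product_surface: String,
    machine_mode: bool, // false = default mode
    cpu_allowlist: String,
    pinning_preset: String,
    scheduler_policy: String,
    rt_priority: Option<u32>, // None = no RT priority requested
    memlock_posture: String,
    tmpdir_backing: String,
    host_kernel: String,
    throughput: String,
    tail_latency: String,
    cpu_cost: String,
}

/// Returns the names of required text fields that are still blank.
/// An empty result means every text item on the checklist was filled in.
fn missing_fields(d: &Disclosure) -> Vec<&'static str> {
    let required = [
        ("bench_name", d.bench_name.as_str()),
        ("product_surface", d.product_surface.as_str()),
        ("cpu_allowlist", d.cpu_allowlist.as_str()),
        ("pinning_preset", d.pinning_preset.as_str()),
        ("scheduler_policy", d.scheduler_policy.as_str()),
        ("memlock_posture", d.memlock_posture.as_str()),
        ("tmpdir_backing", d.tmpdir_backing.as_str()),
        ("host_kernel", d.host_kernel.as_str()),
        ("throughput", d.throughput.as_str()),
        ("tail_latency", d.tail_latency.as_str()),
        ("cpu_cost", d.cpu_cost.as_str()),
    ];
    required
        .iter()
        .filter(|(_, value)| value.trim().is_empty())
        .map(|(name, _)| *name)
        .collect()
}
```

The measured values are kept as free-form strings here so that units travel with the numbers; a stricter record could use typed fields instead.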
If the run depends on Linux privilege posture, say so explicitly. mlockall, SCHED_FIFO, and SCHED_RR are meaningful only when the host limits and capabilities allow them.
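As a rough illustration of what "host limits" means in practice, the sketch below reads soft limits out of text in the format of Linux's /proc/self/limits, where the "Max locked memory" and "Max realtime priority" rows bound mlockall and the real-time scheduler policies for an unprivileged process. The helper name soft_limit and the SoftLimit type are hypothetical. Limits are only one input: capabilities such as CAP_IPC_LOCK and CAP_SYS_NICE can also permit these calls, and this sketch does not inspect them.

```rust
/// Soft limit parsed from one row of /proc/self/limits.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SoftLimit {
    Unlimited,
    Value(u64),
}

/// Finds the row whose name starts the line (for example "Max locked memory")
/// and parses its first column, which is the soft limit.
/// Returns None if the row is absent or the column is not a number.
fn soft_limit(limits_text: &str, row_name: &str) -> Option<SoftLimit> {
    let line = limits_text.lines().find(|l| l.starts_with(row_name))?;
    // Everything after the row name is whitespace-separated columns:
    // soft limit, hard limit, and optionally units.
    let token = line[row_name.len()..].split_whitespace().next()?;
    if token == "unlimited" {
        Some(SoftLimit::Unlimited)
    } else {
        token.parse().ok().map(SoftLimit::Value)
    }
}
```

On a Linux host the input would typically come from std::fs::read_to_string("/proc/self/limits"). A locked-memory limit far below the benchmark's working set, or a real-time priority limit of 0, is worth stating alongside the published result.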
Reading the baseline safely
bench/perf_baseline.kv is a regression gate, not a marketing table.
- throughput thresholds are lower bounds
- p99 and CPU thresholds are upper bounds
- the stored numbers include headroom so the gate remains stable across reasonable CI and lab noise
Use the file to detect regressions and to compare host postures. Do not reuse it as a universal promise for unrelated deployments.
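The bound directions above can be sketched as a small check. This is a minimal illustration, not the logic of cargo run -p xtask -- perf: the Threshold and Bound types, the gate_violations function, and the metric names used with them are hypothetical, and the real schema of bench/perf_baseline.kv is not assumed.

```rust
// Sketch only: types and metric names are hypothetical, not the real
// schema of bench/perf_baseline.kv.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Bound {
    /// Measured value must be at least the limit (throughput).
    Lower,
    /// Measured value must be at most the limit (p99, CPU).
    Upper,
}

#[derive(Debug, Clone, Copy)]
struct Threshold {
    metric: &'static str,
    bound: Bound,
    limit: f64,
}

/// Returns one message per violated threshold; an empty result means the
/// gate passes. A threshold with no matching measurement is reported as a
/// violation rather than skipped.
fn gate_violations(thresholds: &[Threshold], measured: &[(&str, f64)]) -> Vec<String> {
    let mut out = Vec::new();
    for t in thresholds {
        let value = measured
            .iter()
            .find(|(name, _)| *name == t.metric)
            .map(|(_, v)| *v);
        match (value, t.bound) {
            (None, _) => out.push(format!("{}: no measurement", t.metric)),
            (Some(v), Bound::Lower) if v < t.limit => {
                out.push(format!("{}: {} is below lower bound {}", t.metric, v, t.limit))
            }
            (Some(v), Bound::Upper) if v > t.limit => {
                out.push(format!("{}: {} is above upper bound {}", t.metric, v, t.limit))
            }
            _ => {}
        }
    }
    out
}
```

Because the stored limits include headroom, a passing run says only that nothing regressed past the gate; it does not say how close the run came to the best observed numbers.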