Benchmarking

This page explains how the ByteOr OSS workspace publishes benchmark evidence and how to compare those numbers safely.

The checked-in benchmark surface for this repo lives in bench/byteor-bench, with conservative regression thresholds in bench/perf_baseline.kv and gating automation driven by cargo run -p xtask -- perf.

What these numbers are for

  • catch regressions in the transport and execution hot paths
  • compare like-for-like topology changes in a reproducible way
  • publish conservative baseline evidence for the OSS runtime surface

They are not blanket production guarantees. Treat them as harness-backed baselines for a stated topology and host posture.

Public benchmark surface

The current public OSS benchmark flow is:

  1. run the harness with cargo run -p byteor-bench --release for quick smoke output, or with cargo run -p byteor-bench --release -- --machine for machine-parseable results
  2. enforce or refresh the checked-in baseline with cargo run -p xtask -- perf and cargo run -p xtask -- perf --update
  3. compare the run against bench/perf_baseline.kv, which stores throughput, p99 latency, and CPU budgets for each bench name

The baseline is intentionally conservative because CI and shared lab hosts vary.
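Put together, the flow above runs end to end with the commands already named in this page (output shapes will vary by host):

```shell
# 1. Quick smoke run (human-readable output).
cargo run -p byteor-bench --release

# Machine-parseable results, suitable for comparison against the baseline.
cargo run -p byteor-bench --release -- --machine

# 2. Enforce the checked-in baseline, or refresh it after an intentional change.
cargo run -p xtask -- perf
cargo run -p xtask -- perf --update
```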

Current bench families

The OSS harness currently covers:

  • event-lane roundtrip and slot-forward paths
  • fan-in and MPSC queue shapes over SHM backing
  • SingleRing chain, DAG, and sharded execution shapes
  • startup-inclusive versus steady-state SingleRing comparisons

Use the bench names from the machine output when reporting results so they line up with the checked-in baseline keys.

Methodology expectations

When you publish or compare numbers, record at least:

  • the exact bench name and payload shape
  • whether the run is startup-inclusive or steady-state
  • SHM backing location and whether mappings were prefaulted
  • CPU placement, scheduler policy, memlock posture, and NUMA locality
  • throughput plus tail latency, not throughput alone
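As a minimal illustration of the last point, one report helper can derive both throughput and a nearest-rank p99 estimate from the same per-op latency samples. This is a sketch only: the helper and sample data are hypothetical, not byteor-bench API.

```rust
/// Hypothetical summary of one bench run: throughput plus tail latency.
/// Illustrates the reporting shape only; not part of the harness.
fn summarize(latencies_ns: &mut Vec<u64>) -> (f64, u64) {
    latencies_ns.sort_unstable();
    let total_ns: u64 = latencies_ns.iter().sum();
    // Throughput: ops per second over the measured window.
    let ops_per_s = latencies_ns.len() as f64 * 1e9 / total_ns as f64;
    // p99: nearest-rank percentile over the sorted samples.
    let rank = (latencies_ns.len() * 99 + 99) / 100 - 1;
    let p99_ns = latencies_ns[rank];
    (ops_per_s, p99_ns)
}

fn main() {
    // 95 fast ops plus 5 slow outliers: a throughput-only number would
    // look healthy while the tail tells a different story.
    let mut samples: Vec<u64> = vec![100; 95];
    samples.extend(vec![10_000u64; 5]);
    let (ops_per_s, p99_ns) = summarize(&mut samples);
    println!("ops/s={ops_per_s:.0} p99_ns={p99_ns}");
}
```

The outliers barely dent the throughput figure but dominate the p99, which is exactly why the methodology asks for both.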

If you change host posture materially, document it next to the result. CPU governor changes, huge-page setup, BYTEOR_BENCH_TMPDIR, BYTEOR_BENCH_SHM_PREFAULT, and BYTEOR_BENCH_SINGLE_RING_MODE can all move the outcome.

How to read the baseline file

bench/perf_baseline.kv stores three limits per bench:

  • min_ops_per_s: lower bound for throughput
  • max_p99_ns_per_op: upper bound for tail latency
  • max_cpu_ns_per_op: upper bound for CPU cost per operation

xtask perf enforces those limits with headroom rather than expecting exact reproduction. That makes the file useful as a regression gate without pretending every host is identical.
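To make the gate concrete, here is a sketch of a baseline check. The key=value line layout (bench.min_ops_per_s=...) and the 10% headroom factor are assumptions for illustration; the real bench/perf_baseline.kv format and the xtask headroom may differ.

```rust
use std::collections::HashMap;

/// The three per-bench limits described above.
#[derive(Default)]
struct Limits {
    min_ops_per_s: f64,
    max_p99_ns_per_op: f64,
    max_cpu_ns_per_op: f64,
}

/// Parse "bench.limit=value" lines (an assumed layout, not the real file format).
fn parse_baseline(text: &str) -> HashMap<String, Limits> {
    let mut out: HashMap<String, Limits> = HashMap::new();
    for line in text.lines().filter(|l| !l.trim().is_empty() && !l.starts_with('#')) {
        let (key, value) = line.trim().split_once('=').expect("key=value line");
        let (bench, limit) = key.rsplit_once('.').expect("bench.limit key");
        let v: f64 = value.trim().parse().expect("numeric value");
        let entry = out.entry(bench.to_string()).or_default();
        match limit {
            "min_ops_per_s" => entry.min_ops_per_s = v,
            "max_p99_ns_per_op" => entry.max_p99_ns_per_op = v,
            "max_cpu_ns_per_op" => entry.max_cpu_ns_per_op = v,
            other => panic!("unknown limit {other}"),
        }
    }
    out
}

/// Gate one run against the limits with headroom (e.g. 0.10 for 10%)
/// instead of expecting exact reproduction across hosts.
fn passes(l: &Limits, ops_per_s: f64, p99_ns: f64, cpu_ns: f64, headroom: f64) -> bool {
    ops_per_s >= l.min_ops_per_s * (1.0 - headroom)
        && p99_ns <= l.max_p99_ns_per_op * (1.0 + headroom)
        && cpu_ns <= l.max_cpu_ns_per_op * (1.0 + headroom)
}

fn main() {
    let baseline = parse_baseline(
        "demo_bench.min_ops_per_s=1000000\n\
         demo_bench.max_p99_ns_per_op=2000\n\
         demo_bench.max_cpu_ns_per_op=1500\n",
    );
    let limits = &baseline["demo_bench"];
    // A run slightly under the throughput floor still passes with 10% headroom.
    println!("{}", passes(limits, 950_000.0, 1900.0, 1400.0, 0.10));
}
```

The design point is the direction of each comparison: throughput gets a lower bound relaxed downward, while the two latency/CPU budgets get upper bounds relaxed upward.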

Provenance

Need the canonical source? Use the public hub to orient yourself, then jump to repo-owned docs or rustdoc when you need contract-level detail.