Performance Tuning
This page covers performance tuning for IndexBus transport and ByteOr runtime deployments.
IndexBus Tuning
Wait Strategy
Use spin for latency-sensitive production workloads with dedicated cores. Use backoff for development, testing, and shared-host deployments.
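The trade-off between the two strategies can be sketched in a few lines. This is an illustrative model only, not the IndexBus implementation: the function names, the ready callback, and the sleep bounds are all assumptions made for the example.

```python
import time
from typing import Callable


def wait_spin(ready: Callable[[], bool]) -> int:
    """Busy-poll until ready() is true: lowest wake-up latency, but occupies a full core."""
    polls = 0
    while not ready():
        polls += 1
    return polls


def wait_backoff(ready: Callable[[], bool],
                 initial_sleep_s: float = 1e-6,
                 max_sleep_s: float = 1e-3) -> int:
    """Poll with exponentially growing sleeps: frees the core, adds wake-up latency."""
    polls = 0
    sleep_s = initial_sleep_s
    while not ready():
        polls += 1
        time.sleep(sleep_s)
        # Double the sleep after every miss, capped at max_sleep_s (hypothetical bound).
        sleep_s = min(sleep_s * 2, max_sleep_s)
    return polls
```

The spinning loop never yields the CPU, which is why it only makes sense on a dedicated core. The backoff loop lets other processes on a shared host run, at the cost of reacting up to max_sleep_s late.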
SHM Placement
- Place SHM files on tmpfs for general use
- Use hugetlbfs for large regions to reduce TLB pressure
- Ensure SHM backing is on local storage, not network-mounted
- Clean stale SHM files before starting new deployments
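One way to confirm where an SHM file actually lives is to look up its mount in /proc/mounts. The sketch below is a minimal Linux-only check. The helper names are hypothetical, and it ignores escaped characters in mount points.

```python
import os
from typing import Optional

# Filesystem types treated as suitable local SHM backing in this sketch.
LOCAL_SHM_FS = {"tmpfs", "hugetlbfs"}


def fs_type_for(path: str, mounts_text: str) -> Optional[str]:
    """Return the filesystem type of the longest mount point containing path."""
    path = os.path.abspath(path)
    best_len, best_type = -1, None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later mount on the same point override an earlier one.
            if len(mount_point) >= best_len:
                best_len, best_type = len(mount_point), fs_type
    return best_type


def is_local_shm_backing(path: str, mounts_text: str) -> bool:
    """True if path sits on tmpfs or hugetlbfs according to the mount table."""
    return fs_type_for(path, mounts_text) in LOCAL_SHM_FS
```

On a live host, the mount table would come from reading /proc/mounts and the path would be wherever the deployment places its SHM files. A network filesystem type such as nfs4 makes the check fail.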
Lane Sizing
- Size lane capacity based on expected burst size, not average throughput
- Over-provisioning capacity wastes memory; under-provisioning causes backpressure
- Monitor router counters for routing distribution and drop counts
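As a rough worked example of burst-based sizing, the sketch below estimates how many slots a lane needs to absorb one burst while the consumer drains at its steady rate. The formula, the headroom factor, and the power-of-two rounding are assumptions for illustration, not IndexBus requirements.

```python
import math


def lane_capacity(burst_msgs_per_s: float, drain_msgs_per_s: float,
                  burst_duration_s: float, headroom: float = 1.5) -> int:
    """Slots needed to absorb one burst without backpressure, rounded up to a power of two."""
    # Backlog that builds up while producers outpace the consumer.
    backlog = max(burst_msgs_per_s - drain_msgs_per_s, 0.0) * burst_duration_s
    slots = max(1, math.ceil(backlog * headroom))
    # Round up to the next power of two (assumed here; common for ring buffers).
    return 1 << (slots - 1).bit_length()
```

For instance, a 2 ms burst at 2,000,000 msg/s against a consumer draining 1,000,000 msg/s builds a backlog of 2,000 messages; with 1.5x headroom that rounds up to 4,096 slots. Sizing from the average rate alone would miss this backlog entirely.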
Memory Locking
- Use mlockall to prevent page faults in the hot path
- Verify memory-lock limits with doctor
- isolated-core profile requests memory locking by default
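For readers who want to see what the memory-lock limit check amounts to, here is a minimal Linux-only sketch using the standard resource module and a direct mlockall call through ctypes. The helper names are hypothetical, and the MCL_* values are the ones used on x86 and ARM Linux; this is not how doctor or the runtime implement it.

```python
import ctypes
import ctypes.util
import os
import resource

# Linux values from <sys/mman.h> on x86 and ARM; other architectures may differ.
MCL_CURRENT = 1
MCL_FUTURE = 2


def memlock_limit_allows(required_bytes: int) -> bool:
    """True if the soft RLIMIT_MEMLOCK would permit locking required_bytes."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return soft == resource.RLIM_INFINITY or soft >= required_bytes


def lock_all_memory() -> None:
    """Lock current and future pages into RAM; raises OSError if the kernel refuses."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    if libc.mlockall(MCL_CURRENT | MCL_FUTURE) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Checking the limit before calling lock_all_memory gives a clearer failure message than the ENOMEM or EPERM the kernel would otherwise return. Processes holding CAP_IPC_LOCK are not bound by the limit.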
CPU Tuning
Pinning
Scheduling
RT scheduling requires permissions. Verify with doctor.
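The permission in question can be inspected from a process directly. The following Linux-only sketch reads the RLIMIT_RTPRIO ceiling and attempts a switch to SCHED_FIFO. The helper names are hypothetical, and the scheduling policy the runtime actually requests is not stated here, so SCHED_FIFO is an assumption.

```python
import os
import resource


def rt_priority_ceiling() -> int:
    """Highest SCHED_FIFO priority this process may request via its rlimit."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_RTPRIO)
    kernel_max = os.sched_get_priority_max(os.SCHED_FIFO)
    if soft == resource.RLIM_INFINITY:
        return kernel_max
    return min(soft, kernel_max)


def try_enable_fifo(priority: int) -> bool:
    """Attempt to switch the calling process to SCHED_FIFO; False if not permitted."""
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
        return True
    except PermissionError:
        return False
```

A ceiling of 0 means the process cannot request any real-time priority unless it holds CAP_SYS_NICE.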
Core Isolation
For the isolated-core profile:
- Use the isolcpus kernel parameter to dedicate cores
- Ensure no other workloads are scheduled on isolated cores
- Verify isolation with doctor
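Independently of doctor, the kernel reports which CPUs it considers isolated in /sys/devices/system/cpu/isolated, using the same list syntax as the isolcpus parameter. A small sketch for reading it follows; the function names are hypothetical.

```python
from typing import Set


def parse_cpu_list(text: str) -> Set[int]:
    """Parse a kernel CPU list such as '2-5,8' into a set of CPU ids."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_isolated_cpus(path: str = "/sys/devices/system/cpu/isolated") -> Set[int]:
    """Return the CPUs the running kernel reports as isolated (empty set if none)."""
    with open(path) as f:
        return parse_cpu_list(f.read())
```

An empty result on a host that is supposed to run the isolated-core profile suggests the isolcpus parameter was not applied at boot.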
Monitoring
Router Counters
Monitor IndexBus router counters for:
- Total routed messages
- Per-output distribution
- Drop counts under pressure
Runtime Metrics
Cloud exposes metrics at /metrics including:
- Request counters by route
- Auth success/failure rates
- Rate limiting events
- Worker job states
Agent Telemetry
Monitor agent-reported metrics:
- Heartbeat intervals
- Applied vs. requested tuning
- Degraded tuning reasons
- Artifact upload success rates
Benchmarking
Use the baseline benchmark suite for reproducible measurements:
- Run on isolated hardware for consistent results
- Compare against the published baseline numbers
- Report both throughput and latency percentiles (p50, p99, p99.9)
- Document the exact hardware and kernel configuration used
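For the latency percentiles, a nearest-rank calculation over the raw samples is usually enough. The sketch below is one possible way to produce the three figures; it is not the benchmark suite's own reporting code, and the function names and nanosecond unit are assumptions.

```python
import math
from typing import Dict, Sequence


def percentile(sorted_samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile of an ascending-sorted, non-empty sample list."""
    n = len(sorted_samples)
    # round() guards against floating-point noise at exact rank boundaries.
    rank = max(1, math.ceil(round(p / 100.0 * n, 9)))
    return sorted_samples[rank - 1]


def latency_report(samples_ns: Sequence[float]) -> Dict[str, float]:
    """Summarize latency samples as p50, p99, and p99.9."""
    ordered = sorted(samples_ns)
    return {name: percentile(ordered, p)
            for name, p in (("p50", 50.0), ("p99", 99.0), ("p99.9", 99.9))}
```

A meaningful p99.9 needs well over a thousand samples, since it is otherwise decided by a single observation.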
Performance Baselines
Conservative CI-safe thresholds from bench/perf_baseline.kv (OSS readiness gate):
These are conservative CI-safe thresholds, not peak numbers. The perf gate enforces throughput ≥ baseline × 95% and latency ≤ baseline × 105%.
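The gate rule stated above reduces to two comparisons. The sketch below restates it in code; the function and constant names are hypothetical, and it does not parse bench/perf_baseline.kv, whose format is not described here.

```python
THROUGHPUT_FLOOR = 0.95   # measured throughput must be >= baseline * 95%
LATENCY_CEILING = 1.05    # measured latency must be <= baseline * 105%


def throughput_passes(measured: float, baseline: float) -> bool:
    """True if measured throughput is within 5% below the baseline, or better."""
    return measured >= baseline * THROUGHPUT_FLOOR


def latency_passes(measured: float, baseline: float) -> bool:
    """True if measured latency is within 5% above the baseline, or better."""
    return measured <= baseline * LATENCY_CEILING
```

With a throughput baseline of 1,000, a run measuring 960 passes and one measuring 940 fails. With a latency baseline of 100, a run measuring 104 passes and one measuring 106 fails.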