
Benchmarking Service Mesh Latency the Right Way

2026-01-15

If you have ever evaluated service mesh proxies, you have probably seen impressive latency numbers on a vendor slide deck. "Sub-microsecond overhead," they claim. But when you dig into the methodology, you discover the benchmark was run on localhost, with a single connection, zero contention, and the timer started after the TCP handshake was already complete.

At Cox, we believe benchmarks should reflect reality, not best-case fantasies. Our methodology rests on three rules; a sketch of the measurement loop follows the list.

1. Measure end-to-end: the timer starts when the client sends the request and stops when the response is fully received.
2. Run under load: benchmarks include realistic concurrency, multiple services, and background health checks.
3. Report distributions, not averages: we publish P50, P95, P99, and P99.9, because the tail is where production pain lives.
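To make those rules concrete, here is a minimal sketch of such a measurement loop in Go. The target URL, worker count, and request count are hypothetical stand-ins rather than values from our harness, and error handling is pared to the bone; the point is that the timer brackets the full request, connection setup included, and that the harness reports percentiles rather than a mean.

```go
// latency_bench.go — a minimal sketch of the measurement loop described
// above. The endpoint, concurrency, and request counts are illustrative.
package main

import (
	"fmt"
	"io"
	"net/http"
	"sort"
	"sync"
	"time"
)

// percentile returns the nearest-rank percentile from a sorted sample set.
func percentile(sorted []time.Duration, p float64) time.Duration {
	idx := int(float64(len(sorted)-1) * p)
	return sorted[idx]
}

func main() {
	const (
		target    = "http://service-under-test:8080/ping" // hypothetical endpoint
		workers   = 64                                    // realistic concurrency, not a single connection
		perWorker = 500
	)

	var (
		mu      sync.Mutex
		samples []time.Duration
		wg      sync.WaitGroup
	)

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			client := &http.Client{Timeout: 5 * time.Second}
			for i := 0; i < perWorker; i++ {
				// Timer starts before the request is sent, so connection
				// setup, TLS, and every proxy hop sit inside the window.
				start := time.Now()
				resp, err := client.Get(target)
				if err != nil {
					continue // a real harness would count errors separately
				}
				// Timer stops only after the body is fully received.
				io.Copy(io.Discard, resp.Body)
				resp.Body.Close()
				elapsed := time.Since(start)

				mu.Lock()
				samples = append(samples, elapsed)
				mu.Unlock()
			}
		}()
	}
	wg.Wait()

	if len(samples) == 0 {
		fmt.Println("no successful samples")
		return
	}
	sort.Slice(samples, func(i, j int) bool { return samples[i] < samples[j] })
	for _, p := range []float64{0.50, 0.95, 0.99, 0.999} {
		fmt.Printf("P%-5g %v\n", p*100, percentile(samples, p))
	}
}
```

The nearest-rank percentile is deliberately simple. A production harness would also track error rates and would typically use a streaming quantile structure such as an HDR histogram instead of holding every sample in memory.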

We also run every benchmark on the same commodity hardware our clients use — no custom kernel patches, no DPDK, no dedicated NUMA nodes. If a number only holds on a $50,000 custom rig, it is not a useful number for most teams.

Since adopting this discipline, we have found that our published numbers consistently match what clients see in production, which has done more for our reputation than any marketing campaign.