Cox

Benchmarking Execution Latency the Right Way

2026-01-15

If you have ever evaluated trading technology, you have probably seen impressive latency numbers on a vendor slide deck. "Sub-microsecond execution," they claim. But when you dig into the methodology, you discover the benchmark was run on localhost, with a single instrument, zero contention, and the timer started after the order object was already constructed.

At Cox, we believe benchmarks should reflect reality, not best-case fantasies. Our methodology rests on three principles:

1. Measure end-to-end: the timer starts when a market data packet arrives at the network card and stops when the order acknowledgement comes back from the exchange.
2. Run under load: benchmarks include realistic concurrency, multiple instruments, and background risk checks.
3. Report distributions, not averages: we publish P50, P95, P99, and P99.9, because the tail is where production pain lives.
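To make the third principle concrete, here is a minimal sketch of percentile reporting over a batch of end-to-end latency samples. The function names and the nearest-rank percentile method are illustrative, not our production code, which records latencies continuously rather than in post-hoc batches:

```python
def percentile(samples, p):
    """Nearest-rank percentile of a list of latency samples."""
    ordered = sorted(samples)
    # nearest-rank index for the p-th percentile, clamped to valid range
    k = min(len(ordered) - 1, max(0, int(round(p / 100.0 * len(ordered))) - 1))
    return ordered[k]

def latency_report(samples_us):
    """Summarise end-to-end latencies (microseconds) as published percentiles.

    We report the tail (P99, P99.9) alongside P50 because a benchmark
    that only quotes a median or mean hides exactly the outliers that
    hurt in production.
    """
    return {f"P{p}": percentile(samples_us, p) for p in (50, 95, 99, 99.9)}
```

A mean over the same samples can look excellent while P99.9 is an order of magnitude worse, which is why we never publish averages alone.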

We also run every benchmark on the same commodity hardware our clients use—no FPGA accelerators, no co-located servers with exotic kernel patches. If a number only holds on a $50,000 custom rig, it is not a useful number for most firms.

Since adopting this discipline, we have found that our published numbers consistently match what clients see in production, which has done more for our reputation than any marketing campaign.