Quality · query latency · storage

Which retriever sits on the Pareto frontier?

A production-oriented comparison of dense, late-interaction, and hybrid retrieval systems across standard, reasoning-heavy, and separation benchmarks.
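As a minimal sketch of what "on the Pareto frontier" means for the three axes named above: a system is on the frontier if no other system is at least as good on quality, query latency, and storage while being strictly better on at least one. The `System` type, the function names, and the sample values below are hypothetical placeholders for illustration. They are not measurements from this comparison, and the sketch assumes quality is higher-is-better while latency and index size are lower-is-better.

```python
from typing import NamedTuple


class System(NamedTuple):
    name: str
    quality: float   # higher is better (assumed)
    p50_ms: float    # query p50 latency in ms, lower is better
    index_gb: float  # index size in GB, lower is better


def dominates(a: System, b: System) -> bool:
    """True if a is no worse than b on every axis and strictly better on one."""
    no_worse = (
        a.quality >= b.quality
        and a.p50_ms <= b.p50_ms
        and a.index_gb <= b.index_gb
    )
    strictly_better = (
        a.quality > b.quality
        or a.p50_ms < b.p50_ms
        or a.index_gb < b.index_gb
    )
    return no_worse and strictly_better


def pareto_frontier(systems: list[System]) -> list[System]:
    """Keep every system that no other system dominates (O(n^2) scan)."""
    return [s for s in systems if not any(dominates(o, s) for o in systems)]


# Placeholder values only, NOT results from the benchmark.
systems = [
    System("system_a", 0.60, 10.0, 2.0),
    System("system_b", 0.70, 40.0, 8.0),
    System("system_c", 0.55, 12.0, 3.0),  # dominated by system_a
    System("system_d", 0.70, 50.0, 9.0),  # dominated by system_b
]

frontier = pareto_frontier(systems)
print([s.name for s in frontier])  # ['system_a', 'system_b']
```

With these placeholder values, `system_a` and `system_b` both stay on the frontier because each wins on at least one axis: one is faster and smaller, the other scores higher on quality.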

Quality vs. latency

All measured systems

Model | Backend | Quality | Query p50 (ms) | Index (GB)