Pragmatic Replay-Oriented Architecture

Feb 20, 2020

How can we simplify concurrency, improve logging, improve maintainability, and reduce regression defects without throwing away everything we have and re-building it from scratch using a new framework? or without spending countless hours every release reviewing each function to make sure all the bases are covered?

The auction engine at the Intercontinental Exchange that sets the global benchmark price of Gold and Silver was organically evolved from monolith to modular monolith, and then to message-driven modular monolith. The final architecture gave us simpler, actor-model-like concurrency, and the ability to implement production traffic replays in non-prod.

We’d like to share with you our pragmatic approach to using event buses for reproducing prod scenarios in non-prod, and to test software changes with days/months/years of real production traffic. We will share the event bus implementations we used, how we integrated the replay tests into our development process, and the limitations of replay-based testing.