Making Stream Processing Stateless
Stream processing architectures are used in operational reporting, IoT, real-time bidding, and monitoring. In all of these areas, processing relies on data aggregation and the state management that comes with it.
Many of the frameworks available on the market, including Apache Storm, provide state persistence capabilities. It is no surprise that they quickly make their way into production. But in heavily loaded systems, disk- and SSD-based storage easily becomes a performance bottleneck and complicates software evolution. Under typical data consistency and performance requirements, such a design can become a taxing and hard-to-maintain choice.
In this talk, we will discuss the benefits of the Kappa stateless stream processing architecture, using the example of an Apache Kafka/Apache Storm based system that processes millions of messages per second. We will review the pros and cons of volatile in-memory state, inspect technology-agnostic patterns that reemerge across multiple applications, including event reprocessing, derived logical time and synchronization, and discuss precision/performance trade-offs.
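To make the patterns named above concrete, here is a minimal sketch of volatile in-memory state that is rebuilt by replaying the event log, with a logical clock derived from event timestamps instead of the wall clock. It is an illustration only and not the system described in the talk: the names Event, WindowedCounter and rebuild are hypothetical, the list LOG stands in for a Kafka topic, and no Kafka or Storm API is used.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    offset: int     # position in the log (hypothetical stand-in for a Kafka offset)
    timestamp: int  # event time in seconds, carried by the message itself
    key: str
    value: int


class WindowedCounter:
    """Volatile in-memory aggregation: nothing here is persisted."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.sums = defaultdict(int)  # (window_start, key) -> running sum
        self.logical_time = 0         # derived from events, not the wall clock

    def process(self, event):
        window_start = event.timestamp - event.timestamp % self.window_seconds
        self.sums[(window_start, event.key)] += event.value
        # Logical time only moves forward, driven by the data itself.
        self.logical_time = max(self.logical_time, event.timestamp)

    def closed_windows(self):
        """Windows whose end is at or before the current logical time."""
        return {
            k: v for k, v in self.sums.items()
            if k[0] + self.window_seconds <= self.logical_time
        }


def rebuild(log, window_seconds, from_offset=0):
    """Recover state after a restart by reprocessing the log from an offset."""
    counter = WindowedCounter(window_seconds)
    for event in log:
        if event.offset >= from_offset:
            counter.process(event)
    return counter


# Assumed sample data, invented for this sketch.
LOG = [
    Event(0, 100, "a", 1),
    Event(1, 105, "b", 2),
    Event(2, 112, "a", 3),
    Event(3, 121, "a", 4),
]
```

Because logical time comes from the events, replaying the same log yields the same windows no matter when the replay happens. The trade-off in this sketch is recovery time: the further back the replay has to start, the longer the state takes to rebuild.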