Distributed stateful stream processing is challenging, especially when it comes to handling failures and recovery. We use Apache Flink, a distributed stream processing engine that has long provided exactly-once semantics within the Flink application itself. Flink takes checkpoints at a regular, configurable interval and writes them to a persistent storage system.
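To make the checkpoint-and-recover idea concrete, here is a minimal single-process sketch in plain Python. It is a toy model, not Flink's API or its actual checkpointing algorithm: the function name `run_job`, the JSON checkpoint file, and the `fail_at` parameter are all hypothetical and exist only for illustration. The sketch assumes a replayable source (a list indexed by offset) and snapshots the operator state together with the source offset every `interval` records.

```python
import json
import os
import tempfile


def run_job(records, checkpoint_path, interval, fail_at=None):
    """Sum `records`, checkpointing state and source offset every `interval` records.

    Toy model only: `fail_at` (hypothetical) simulates a crash before the
    record at that offset is processed.
    """
    # On startup, restore from the last completed checkpoint if one exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            snapshot = json.load(f)
    else:
        snapshot = {"offset": 0, "total": 0}
    offset, total = snapshot["offset"], snapshot["total"]

    while offset < len(records):
        if fail_at is not None and offset == fail_at:
            raise RuntimeError("simulated failure")
        total += records[offset]
        offset += 1
        if offset % interval == 0:
            # Write to a temp file and rename, so a crash mid-write
            # never leaves a half-written checkpoint behind.
            tmp_path = checkpoint_path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump({"offset": offset, "total": total}, f)
            os.replace(tmp_path, checkpoint_path)
    return total


if __name__ == "__main__":
    records = list(range(1, 11))  # 1..10, sum is 55
    path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
    try:
        run_job(records, path, interval=3, fail_at=7)  # crashes after 7 records
    except RuntimeError:
        pass
    # Restart: resumes from the checkpoint taken at offset 6 and replays the rest.
    print(run_job(records, path, interval=3))  # 55
```

The seventh record was processed before the simulated crash but after the last checkpoint, so its effect is discarded on restart and the record is replayed from the source. Because state and offset are always saved together, each record contributes to the final sum exactly once. A shorter interval means less replay after a failure at the cost of more frequent writes to storage.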