Optimize checkpointing in your Amazon Managed Service for Apache Flink applications with buffer debloating and unaligned checkpoints – Part 2

AWS Big Data

We’ve already discussed how checkpoints, when triggered by the job manager, signal all source operators to snapshot their state; the signal is then broadcast downstream as a special record called a checkpoint barrier. When barriers from all upstream partitions have arrived, the sub-task takes a snapshot of its state.

Amazon Managed Service for Apache Flink now supports Apache Flink version 1.18

AWS Big Data

The DataStream API now supports features like side outputs and broadcast state, and gaps in the windowing API have been closed. [...] where the operator state couldn’t be properly restored when snapshot compression is enabled. Also, we recommend testing the updated application before proceeding with the update.

Optimize checkpointing in your Amazon Managed Service for Apache Flink applications with buffer debloating and unaligned checkpoints – Part 1

AWS Big Data

Each of the distributed components of an application asynchronously snapshots its state to an external persistent datastore. The challenge is taking snapshots that guarantee exactly-once consistency. When a downstream operator’s sub-task has received checkpoint barriers from all of its input channels, it starts snapshotting its state.
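
The barrier alignment described in these excerpts can be illustrated with a small toy model. This is only a sketch, not Flink's implementation: the `Barrier` and `AlignedSubtask` names are hypothetical, the "state" is a running sum, and it assumes a single checkpoint in flight at a time. Records that arrive on a channel after that channel's barrier are held back until barriers from every channel have arrived, so the snapshot reflects only pre-barrier records.

```python
from dataclasses import dataclass, field


@dataclass
class Barrier:
    """Hypothetical stand-in for a checkpoint barrier record."""
    checkpoint_id: int


@dataclass
class AlignedSubtask:
    """Toy sub-task that sums integers arriving on several input channels."""
    num_channels: int
    total: int = 0
    blocked: set = field(default_factory=set)      # channels whose barrier has arrived
    buffered: list = field(default_factory=list)   # records held back during alignment
    snapshots: dict = field(default_factory=dict)  # checkpoint_id -> state at snapshot

    def receive(self, channel, item):
        if isinstance(item, Barrier):
            self.blocked.add(channel)
            if len(self.blocked) == self.num_channels:
                # Barriers from every input channel have arrived: snapshot the state.
                self.snapshots[item.checkpoint_id] = self.total
                self.blocked.clear()
                # Only now process the records that were held back.
                pending, self.buffered = self.buffered, []
                for value in pending:
                    self.total += value
        elif channel in self.blocked:
            # This record follows the barrier on its channel, so it must not
            # affect the snapshot; hold it back until alignment completes.
            self.buffered.append(item)
        else:
            self.total += item


task = AlignedSubtask(num_channels=2)
task.receive(0, 1)
task.receive(0, Barrier(1))
task.receive(0, 10)          # after the barrier on channel 0: buffered
task.receive(1, 2)           # still before the barrier on channel 1: applied
task.receive(1, Barrier(1))  # alignment completes, snapshot is taken
print(task.snapshots)        # {1: 3}
print(task.total)            # 13
```

The buffering step is where aligned checkpoints can stall under backpressure, which is the problem that the unaligned checkpoints named in the article titles are meant to address.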