article thumbnail

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

Whenever there is an update to the Iceberg table, a new snapshot of the table is created, and the metadata pointer points to the current table metadata file. At the top of the hierarchy is the metadata file, which stores information about the table’s schema, partition information, and snapshots. tableProperty("format-version", "2").partitionedBy($"product_category").createOrReplace()

Data Lake 117
article thumbnail

Modernize a legacy real-time analytics application with Amazon Managed Service for Apache Flink

AWS Big Data

In this post, we discuss ways to modernize your legacy, on-premises, real-time analytics architecture to build serverless data analytics solutions on AWS using Amazon Managed Service for Apache Flink. EventTime – This is the time when the call center stream is sent (via the KDG in this post).