Use AWS Glue ETL to perform merge, partition evolution, and schema evolution on Apache Iceberg

AWS Big Data

However, altering the schema and partitions of a table in a traditional data lake can be a disruptive and time-consuming task, requiring renaming or recreating entire tables and reprocessing large datasets. Iceberg creates snapshots of the table contents. Each snapshot is the complete set of data files in the table at a point in time.
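The snapshot idea in this excerpt can be illustrated with a small toy model. This is only a sketch of the concept, not Iceberg's actual metadata format or API: the `ToyTable` and `Snapshot` classes and the file names are hypothetical, and real Iceberg tracks files through manifest lists rather than plain in-memory sets.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a table snapshot."""
    snapshot_id: int
    data_files: frozenset  # the complete set of data files visible in this snapshot


@dataclass
class ToyTable:
    """Hypothetical table that records a new snapshot on every commit."""
    snapshots: list = field(default_factory=list)

    def current_files(self) -> frozenset:
        # The latest snapshot defines the current table contents.
        return self.snapshots[-1].data_files if self.snapshots else frozenset()

    def commit(self, added=(), removed=()) -> Snapshot:
        # A commit never mutates an earlier snapshot; it records a new,
        # complete file set derived from the current one.
        files = (self.current_files() - frozenset(removed)) | frozenset(added)
        snapshot = Snapshot(len(self.snapshots) + 1, files)
        self.snapshots.append(snapshot)
        return snapshot


table = ToyTable()
s1 = table.commit(added=["a.parquet"])
s2 = table.commit(added=["b.parquet"])
s3 = table.commit(added=["c.parquet"], removed=["a.parquet"])  # e.g. a rewrite
```

Because each snapshot in this sketch holds its own complete file set, earlier snapshots stay readable after later commits, which is the property that reading a table "as of" an older snapshot relies on.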

How to Use Apache Iceberg in CDP’s Open Lakehouse

Cloudera

Time Travel: Reproduce a query as of a given time or snapshot ID, which can be used, for example, for historical audits and rollback of erroneous operations.

But if the partition scheme needs changing, you’ll typically have to recreate the table from scratch.

7 2002 5271359
1 2008 7009728
2 2007 7453215
3 2006 7141922
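Time travel by timestamp or by snapshot ID comes down to choosing which snapshot a query reads. The following is a minimal sketch under stated assumptions, not Iceberg's implementation: the `HISTORY` list, its timestamps, the file names, and the three helper functions are all hypothetical and exist only to show the lookup logic.

```python
from bisect import bisect_right

# Hypothetical snapshot log, ordered by commit time:
# (snapshot_id, commit_time_ms, data_files)
HISTORY = [
    (1, 1000, frozenset({"a.parquet"})),
    (2, 2000, frozenset({"a.parquet", "b.parquet"})),
    (3, 3000, frozenset({"b.parquet", "c.parquet"})),
]


def files_as_of_snapshot(history, snapshot_id):
    """Return the file set recorded for an exact snapshot ID."""
    for sid, _, files in history:
        if sid == snapshot_id:
            return files
    raise KeyError(f"unknown snapshot id: {snapshot_id}")


def files_as_of_time(history, ts_ms):
    """Return the file set of the latest snapshot committed at or before ts_ms."""
    commit_times = [t for _, t, _ in history]
    idx = bisect_right(commit_times, ts_ms)
    if idx == 0:
        raise ValueError(f"no snapshot at or before {ts_ms}")
    return history[idx - 1][2]


def rollback_to(history, snapshot_id):
    """Return the snapshot ID that becomes current; the log itself is left intact."""
    files_as_of_snapshot(history, snapshot_id)  # validates that the ID exists
    return snapshot_id


# An audit query "as of" t=2500 sees snapshot 2, not the latest snapshot.
audit_view = files_as_of_time(HISTORY, 2500)
current_id = rollback_to(HISTORY, 2)
```

In this sketch a rollback only moves the "current" pointer back to an earlier snapshot; nothing is deleted, so the erroneous snapshot remains available for inspection.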