Use AWS Glue ETL to perform merge, partition evolution, and schema evolution on Apache Iceberg

AWS Big Data

Iceberg creates snapshots of the table contents. Each snapshot is the complete set of data files in the table at a point in time. The data files in a snapshot are tracked in one or more manifest files, each of which contains a row for every data file it lists, along with that file's partition data and metrics.
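
As a rough illustration of the layout this excerpt describes, here is a toy Python model. It is not Iceberg's actual API or file format; the class names, field names, and file paths are all hypothetical. A snapshot points to manifests, and each manifest holds one row per data file with its partition data and a metric.

```python
from dataclasses import dataclass, field


@dataclass
class ManifestEntry:
    """One manifest row: a data file plus its partition data and a metric."""
    file_path: str
    partition: dict
    record_count: int


@dataclass
class Manifest:
    """A manifest file, modeled as a list of rows (hypothetical structure)."""
    entries: list = field(default_factory=list)


@dataclass
class Snapshot:
    """The complete set of data files in the table at a point in time."""
    snapshot_id: int
    manifests: list = field(default_factory=list)

    def data_files(self):
        # Walk every manifest and collect the data files it tracks.
        return [e.file_path for m in self.manifests for e in m.entries]

    def total_records(self):
        # Metrics live in the manifest rows, so no data file is opened here.
        return sum(e.record_count for m in self.manifests for e in m.entries)


def build_example_snapshot():
    # Made-up paths and counts, for illustration only.
    m1 = Manifest([ManifestEntry("data/year=1995/a.parquet", {"year": 1995}, 100)])
    m2 = Manifest([ManifestEntry("data/year=1996/b.parquet", {"year": 1996}, 250)])
    return Snapshot(snapshot_id=1, manifests=[m1, m2])
```

Calling `build_example_snapshot().data_files()` lists both data files without touching the files themselves, which is the point of keeping per-file metadata in manifests.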

Snapshot 108
How to Use Apache Iceberg in CDP’s Open Lakehouse

Cloudera

Time Travel: Reproduce a query as of a given time or snapshot ID, which can be used, for example, for historical audits and for rolling back erroneous operations. We see that as of the first snapshot (7445571238522489274) we had data from the years 1995 to 2005 in the table. Rows excerpted from the article's query output: 7 2002 5271359; 1 2008 7009728; 2 2007 7453215.
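
The idea behind time travel can be sketched with a small, self-contained Python model. This is a toy, not Iceberg or CDP code: the `HISTORY` list, the commit times, the short snapshot IDs, and the year range of the second snapshot are assumptions made for illustration. Only the 1995 to 2005 range of the first snapshot comes from the excerpt.

```python
from bisect import bisect_right

# (commit_time, snapshot_id, years present in the table), oldest first.
# Commit times and IDs are invented; real snapshot IDs are long integers.
HISTORY = [
    (100, 1, list(range(1995, 2006))),  # first snapshot: 1995 to 2005
    (200, 2, list(range(1995, 2009))),  # assumed later snapshot: 1995 to 2008
]


def snapshot_as_of(history, timestamp):
    """Return the latest snapshot committed at or before `timestamp`."""
    times = [commit_time for commit_time, _, _ in history]
    idx = bisect_right(times, timestamp)
    if idx == 0:
        raise ValueError("no snapshot exists at or before this time")
    return history[idx - 1]


def snapshot_by_id(history, snapshot_id):
    """Return the snapshot with the given ID."""
    for snap in history:
        if snap[1] == snapshot_id:
            return snap
    raise KeyError(snapshot_id)
```

Here `snapshot_as_of(HISTORY, 150)` resolves to the first snapshot, so a query run against it sees only the years 1995 to 2005, even though a newer snapshot exists. Rolling back an erroneous operation amounts to making an older snapshot the current one again.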