Use AWS Glue ETL to perform merge, partition evolution, and schema evolution on Apache Iceberg

AWS Big Data

Apache Iceberg manages these schema changes in a backward-compatible way through its innovative metadata table evolution architecture. With Lake Formation, you can manage fine-grained access control for your data lake data on Amazon S3 and its metadata in the Data Catalog. Iceberg maintains the table state in metadata files.

Speed up queries with the cost-based optimizer in Amazon Athena

AWS Big Data

Starting today, the Athena SQL engine uses a cost-based optimizer (CBO), a new feature that uses table and column statistics stored in the AWS Glue Data Catalog as part of the table’s metadata. By using these statistics, CBO improves query run plans and boosts the performance of queries run in Athena.

Trending Sources

Themes and Conferences per Pacoid, Episode 10

Domino Data Lab

It also represents part of the current focus for Project Jupyter: adding support for collaboration, enhanced security, projects as top-level entities, data registry, metadata management, and telemetry about usage. My answer was almost immediate: Daniel Kahneman.

A CDO’s Guide to the Data Catalog

Alation

In 2002, Capital One became the first company to appoint a Chief Data Officer (CDO). Through serving as a centralized conduit for discovering and requesting access to data, a data catalog provides CDOs and their data governance teams with information and metadata to determine which people should see and access what data.