Data Lake, Management and Metadata

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

…licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time.

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Multicloud data lake analytics with Amazon Athena

Use Apache Iceberg in a data lake to support incremental data processing

Choosing an open table format for your transactional data lake on AWS

Understanding the Differences Between Data Lakes and Data Warehouses

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

Improve operational efficiencies of Apache Iceberg tables built on Amazon S3 data lakes

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Use AWS Glue ETL to perform merge, partition evolution, and schema evolution on Apache Iceberg

Build a real-time GDPR-aligned Apache Iceberg data lake

Salesforce debuts Zero Copy Partner Network to ease data integration

Build a data lake with Apache Flink on Amazon EMR

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

AWS Lake Formation 2023 year in review

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

How Knowledge Graphs Power Data Mesh and Data Fabric

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

Where Do Data Catalogs Fit in Metadata Management?

Introducing MongoDB Atlas metadata collection with AWS Glue crawlers

How Cargotec uses metadata replication to enable cross-account data sharing

Introducing AWS Glue crawlers using AWS Lake Formation permission management

Data Lakes on Cloud & its Usage in Healthcare

The Data Lakehouse: Blending Data Warehouses and Data Lakes

How Ruparupa gained updated insights with an Amazon S3 data lake, AWS Glue, Apache Hudi, and Amazon QuickSight

Informatica’s new data management clouds target health, finance services

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

Data Lakes: What Are They and Who Needs Them?

Introducing Apache Hudi support with AWS Glue crawlers

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Introducing AWS Glue crawler and create table support for Apache Iceberg format

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

Use Amazon Athena with Spark SQL for your open-source transactional table formats

Apache Iceberg optimization: Solving the small files problem in Amazon EMR

Data governance in the age of generative AI

Data Management Requirements for the Enterprise Data Lake

How Morningstar used tag-based access controls in AWS Lake Formation to manage permissions for an Amazon Redshift data warehouse

Collibra Brings Effective Data Governance to Line-of-Business

Manage your data warehouse cost allocations with Amazon Redshift Serverless tagging

Configure cross-Region table access with the AWS Glue Catalog and AWS Lake Formation

How to use foundation models and trusted governance to manage AI workflow risk

Query your Apache Hive metastore with AWS Lake Formation permissions
