Data Lake and Metadata - Data Leaders Brief

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

Apache Iceberg is an Apache-licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time.

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Multicloud data lake analytics with Amazon Athena

Use Apache Iceberg in a data lake to support incremental data processing

Understanding the Differences Between Data Lakes and Data Warehouses

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Build a real-time GDPR-aligned Apache Iceberg data lake

Use AWS Glue ETL to perform merge, partition evolution, and schema evolution on Apache Iceberg

Data Lakes on Cloud & Its Usage in Healthcare

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

The Data Lakehouse: Blending Data Warehouses and Data Lakes

How Knowledge Graphs Power Data Mesh and Data Fabric

Salesforce debuts Zero Copy Partner Network to ease data integration

Data Lakes: What Are They and Who Needs Them?

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

Introducing AWS Glue crawler and create table support for Apache Iceberg format

Where Do Data Catalogs Fit in Metadata Management?

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Use Amazon Athena with Spark SQL for your open-source transactional table formats

Apache Iceberg optimization: Solving the small files problem in Amazon EMR

Data governance in the age of generative AI

Configure cross-Region table access with the AWS Glue Catalog and AWS Lake Formation

Query your Apache Hive metastore with AWS Lake Formation permissions

Collibra Brings Effective Data Governance to Line-of-Business

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

The Future of the Data Lakehouse – Open

Gartner Data & Analytics Sydney 2022

Migrate Hive data from CDH to CDP public cloud

What is a Data Mesh?

Informatica’s new data management clouds target health, finance services

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

How gaming companies can use Amazon Redshift Serverless to build scalable analytical applications faster and easier

AWS Lake Formation 2022 year in review

Doing Cloud Migration and Data Governance Right the First Time

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Educating ChatGPT on Data Lakehouse

What is a data architect? Skills, salaries, and how to become a data framework master

What Is a Data Catalog?

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

How Cloudera Supports Zero Trust for Data

Cloud Data Science News – Beta 6

Operational Database Security – Part 2

Putting the Business Back Into Business Innovation
