Data Lake, IT and Metadata - Data Leaders Brief

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without first having to structure it, and then run different types of analytics for better business insights.

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Multicloud data lake analytics with Amazon Athena

Use Apache Iceberg in a data lake to support incremental data processing

Understanding the Differences Between Data Lakes and Data Warehouses

Choosing an open table format for your transactional data lake on AWS

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

How to modernize data lakes with a data lakehouse architecture

Improve operational efficiencies of Apache Iceberg tables built on Amazon S3 data lakes

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Build a real-time GDPR-aligned Apache Iceberg data lake

Use AWS Glue ETL to perform merge, partition evolution, and schema evolution on Apache Iceberg

Data Swamp, Data Lake, Data Lakehouse: What to Know

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

Build a data lake with Apache Flink on Amazon EMR

How Cargotec uses metadata replication to enable cross-account data sharing

Introducing MongoDB Atlas metadata collection with AWS Glue crawlers

Data Lakes on Cloud & Its Usage in Healthcare

Salesforce debuts Zero Copy Partner Network to ease data integration

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

AWS Lake Formation 2023 year in review

How Ruparupa gained updated insights with an Amazon S3 data lake, AWS Glue, Apache Hudi, and Amazon QuickSight

Data Cataloging in the Data Lake: Alation + Kylo

How Knowledge Graphs Power Data Mesh and Data Fabric

Regeneron turns to IT to accelerate drug discovery

Driving Business Value and ROI from a Hybrid Cloud Data Lake

Governing data in relational databases using Amazon DataZone

Salesforce readies Einstein Copilot to unleash generative AI across its offerings

Data Lakes: What Are They and Who Needs Them?

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

Use AWS Glue Data Catalog views to analyze data

Introducing Apache Hudi support with AWS Glue crawlers

Introducing AWS Glue crawler and create table support for Apache Iceberg format

Where Do Data Catalogs Fit in Metadata Management?

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

What is an open data lakehouse and why you should care?

Data Mesh 101: What it is and Why You Should Care

Unstructured data management and governance using AWS AI/ML and analytics services

Use Amazon Athena with Spark SQL for your open-source transactional table formats

6 BI challenges IT teams must address

Get Your Analytics Insights Instantly – Without Abandoning Central IT
