Analytics, Data Lake and Metadata

Multicloud data lake analytics with Amazon Athena

AWS Big Data

MARCH 18, 2024

Many organizations operate data lakes that span multiple cloud data stores. In these cases, an integrated query layer lets you run analytical queries across those stores and streamline your data analytics processes.

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Use Apache Iceberg in a data lake to support incremental data processing

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Choosing an open table format for your transactional data lake on AWS

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Improve operational efficiencies of Apache Iceberg tables built on Amazon S3 data lakes

Build a real-time GDPR-aligned Apache Iceberg data lake

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Use AWS Glue ETL to perform merge, partition evolution, and schema evolution on Apache Iceberg

Empower your Jira data in a data lake with Amazon AppFlow and AWS Glue

How Cargotec uses metadata replication to enable cross-account data sharing

Build a data lake with Apache Flink on Amazon EMR

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

Introducing MongoDB Atlas metadata collection with AWS Glue crawlers

Data Lakes on Cloud & Its Usage in Healthcare

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

AWS Lake Formation 2023 year in review

Use AWS Glue Data Catalog views to analyze data

How Ruparupa gained updated insights with an Amazon S3 data lake, AWS Glue, Apache Hudi, and Amazon QuickSight

Gartner Data & Analytics Sydney 2022

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

The Data Lakehouse: Blending Data Warehouses and Data Lakes

Governing data in relational databases using Amazon DataZone

How Knowledge Graphs Power Data Mesh and Data Fabric

Data Lakes: What Are They and Who Needs Them?

Introducing Apache Hudi support with AWS Glue crawlers

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

Introducing AWS Glue crawler and create table support for Apache Iceberg format

Use Amazon Athena with Spark SQL for your open-source transactional table formats

How Amazon Devices scaled and optimized real-time demand and supply forecasts using serverless analytics

Where Do Data Catalogs Fit in Metadata Management?

Apache Iceberg optimization: Solving the small files problem in Amazon EMR

How gaming companies can use Amazon Redshift Serverless to build scalable analytical applications faster and easier

Data governance in the age of generative AI

Configure cross-Region table access with the AWS Glue Catalog and AWS Lake Formation

Use your corporate identities for analytics with Amazon EMR and AWS IAM Identity Center

Implement a serverless CDC process with Apache Iceberg using Amazon DynamoDB and Amazon Athena

Introducing AWS Glue crawlers using AWS Lake Formation permission management

Build a multi-Region and highly resilient modern data architecture using AWS Glue and AWS Lake Formation

How BMO improved data security with Amazon Redshift and AWS Lake Formation
