Big Data, Blog and Data Lake - Data Leaders Brief

Multicloud data lake analytics with Amazon Athena

AWS Big Data

MARCH 18, 2024

Many organizations operate data lakes spanning multiple cloud data stores. In these cases, you may want an integrated query layer that runs analytical queries across those stores and streamlines your analytics processes. [...] Upload the data file to the S3 bucket created by the CloudFormation stack.

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Differentiating Between Data Lakes and Data Warehouses

Is Data Virtualization the Secret Behind Operationalizing Data Lakes?

Use Apache Iceberg in a data lake to support incremental data processing

Enable business users to analyze large datasets in your data lake with Amazon QuickSight

Data Lakes on Cloud & Its Usage in Healthcare

Create an Apache Hudi-based near-real-time transactional data lake using AWS DMS, Amazon Kinesis, AWS Glue streaming ETL, and data visualization using Amazon QuickSight

Data Modeling 301 for the cloud: data lake and NoSQL data modeling and design

Modern Data Architecture: Data Warehousing, Data Lakes, and Data Mesh Explained

Efficiently crawl your data lake and improve data access with an AWS Glue crawler using partition indexes

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

Migrate data from Azure Blob Storage to Amazon S3 using AWS Glue

Apache Iceberg optimization: Solving the small files problem in Amazon EMR

The Future of the Data Lakehouse – Open

Enhance data security and governance for Amazon Redshift Spectrum with VPC endpoints

Automated data governance with AWS Glue Data Quality, sensitive data detection, and AWS Lake Formation

Automate large-scale data validation using Amazon EMR and Apache Griffin

Announcing the AWS Well-Architected Data Analytics Lens

AI and ML: No Longer the Stuff of Science Fiction

Using AWS AppSync and AWS Lake Formation to access a secure data lake through a GraphQL API

Deploy and Optimize Your Snowflake Environment Faster With Accelerators

Handle UPSERT data operations using open-source Delta Lake and AWS Glue

Enhance query performance using AWS Glue Data Catalog column-level statistics

4 ways generative AI addresses manufacturing challenges

Data architecture strategy for data quality

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Set up advanced rules to validate quality of multiple datasets with AWS Glue Data Quality

Munich Re Launches Enterprise-Wide Data-Driven Platform for Analytics

Keys to Ensure that Data isn’t Slowing Down your Innovation Efforts

TDC Digital leverages IBM Cloud for transparent billing and improved customer satisfaction

10 Things AWS Can Do for Your SaaS Company

Dive deep into AWS Glue 4.0 for Apache Spark

AWS Glue Data Quality is Generally Available

Generic orchestration framework for data warehousing workloads using Amazon Redshift RSQL

How SumUp made digital analytics more accessible using AWS Glue

Implement slowly changing dimensions in a data lake using AWS Glue and Delta

Interact with Apache Iceberg tables using Amazon Athena and cross account fine-grained permissions using AWS Lake Formation

2020 Data Impact Award Winner Spotlight: United Overseas Bank

What Is a Data Catalog?

Data science vs data analytics: Unpacking the differences

My introduction and my love for DATA
