article thumbnail

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time.

article thumbnail

AWS Lake Formation 2022 year in review

AWS Big Data

In this post, we are excited to summarize the features that the AWS Glue Data Catalog, AWS Glue crawler, and Lake Formation teams delivered in 2022. Whether you are a data platform builder, data engineer, data scientist, or any technology leader interested in data lake solutions, this post is for you.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback. In early 2022, AWS announced general availability of Athena ACID transactions, powered by Apache Iceberg. and later supports the Apache Iceberg framework for data lakes.

Data Lake 116
article thumbnail

Enable business users to analyze large datasets in your data lake with Amazon QuickSight

AWS Big Data

Events and many other security data types are stored in Imperva’s Threat Research Multi-Region data lake. Imperva harnesses data to improve their business outcomes. As part of their solution, they are using Amazon QuickSight to unlock insights from their data.

article thumbnail

Build a real-time GDPR-aligned Apache Iceberg data lake

AWS Big Data

Data lakes are a popular choice for today’s organizations to store their data around their business activities. As a best practice of a data lake design, data should be immutable once stored. A data lake built on AWS uses Amazon Simple Storage Service (Amazon S3) as its primary storage environment.

article thumbnail

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

AWS Big Data

Building a data lake on Amazon Simple Storage Service (Amazon S3) provides numerous benefits for an organization. However, many use cases, like performing change data capture (CDC) from an upstream relational database to an Amazon S3-based data lake, require handling data at a record level.

article thumbnail

Real estate CIOs drive deals with data

CIO Business Intelligence

“The only thing we have on premise, I believe, is a data server with a bunch of unstructured data on it for our legal team,” says Grady Ligon, who was named Re/Max’s first CIO in October 2022. billion in 2022, resource industries $82.1 billion in 2022, and personal and consumer services at $82.6 billion in 2022.