Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time.

Demystifying Modern Data Platforms

Cloudera

July brings summer vacations, holiday gatherings, and for the first time in two years, the return of the Massachusetts Institute of Technology (MIT) Chief Data Officer symposium as an in-person event. A key area of focus for the symposium this year was the design and deployment of modern data platforms.

The Very Group adopts a data catalog to better organize and leverage its online retail capabilities

CIO Business Intelligence

Very has come full circle as a business built on catalog data, but it took some introspection to figure out the best way to get there. “Understanding what data you’ve got locked in all these different stores is a big part of the jigsaw puzzle.” We’ve done it in our financial services area, and some of our marketing area.

What Is Alation Connected Sheets? Q&A with the Creators

Alation

introduces Alation Connected Sheets, a new product under Alation Cloud Service that empowers spreadsheet users with access to trusted data. Talo Thomson, head of content marketing, Alation: Krishna, Sathish: Thanks for agreeing to speak with me today. And it’s very difficult to manage these silos of data analysis.

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Backtesting index rebalancing arbitrage with Amazon EMR and Apache Iceberg

AWS Big Data

Major market indexes, such as S&P 500, are subject to periodic inclusions and exclusions for reasons beyond the scope of this post (for an example, refer to CoStar Group, Invitation Homes Set to Join S&P 500; Others to Join S&P 100, S&P MidCap 400, and S&P SmallCap 600).

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

AWS Big Data

Data-driven organizations treat data as an asset and use it across different lines of business (LOBs) to drive timely insights and better business decisions. This leads to having data across many instances of data warehouses and data lakes using a modern data architecture in separate AWS accounts.