Data Integration, Data Lake, Enterprise and Snapshot

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

Businesses are constantly evolving, and data leaders are challenged every day to meet new requirements. For many enterprises and large organizations, it is not feasible to have one processing engine or tool to deal with the various business requirements. This post is co-written with Andries Engelbrecht and Scott Teal from Snowflake.

Data Lake, Snapshot, Metadata, Data Architecture

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without first having to structure it, and then run different types of analytics for better business insights.

Data Lake, Metadata, Snapshot, Recreation/Entertainment

A Closer Look at The Next Phase of Cloudera’s Hybrid Data Lakehouse

Cloudera

MARCH 5, 2024

But even with its rise, AI is still a struggle for some enterprises. AI, and any analytics for that matter, are only as good as the data upon which they are based. Cloudera is now the only provider to offer an open data lakehouse with Apache Iceberg for cloud and on-premises. And that’s where the rub is.

Snapshot, Data Lake, Enterprise, Data Governance

Simplifying data processing at Capitec with Amazon Redshift integration for Apache Spark

AWS Big Data

NOVEMBER 10, 2023

The data sourcing problem: To ensure the reliability of PySpark data pipelines, it’s essential to have consistent record-level data from both dimensional and fact tables stored in the Enterprise Data Warehouse (EDW). These tables are then joined with tables from the Enterprise Data Lake (EDL) at runtime.

Data Processing, Data Lake, Data Warehouse, Optimization

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 2: AWS Glue Studio Visual Editor

AWS Big Data

MARCH 20, 2023

In the first post of this series, we described how AWS Glue for Apache Spark works with Apache Hudi, Linux Foundation Delta Lake, and Apache Iceberg tables using the native support of those data lake formats. Even without prior experience using Hudi, Delta Lake, or Iceberg, you can easily achieve typical use cases.

Visualization, Data Lake, Snapshot, Big Data

5 Reasons to Use Apache Iceberg on Cloudera Data Platform (CDP)

Cloudera

MARCH 23, 2022

By being a truly open table format, Apache Iceberg fits well within the vision of the Cloudera Data Platform (CDP). Let’s highlight some of those benefits, and why choosing CDP and Iceberg can future-proof your next-generation data architecture. 4: Enterprise grade. 1: Multi-function analytics. Financial regulation.

Metadata, Data Architecture, Machine Learning, Cost-Benefit

How Tricentis unlocks insights across the software development lifecycle at speed and scale using Amazon Redshift

AWS Big Data

MARCH 3, 2023

Tricentis is the global leader in continuous testing for DevOps, cloud, and enterprise applications. Finally, data integrity is of paramount importance. Every event in the data source can be relevant, and our customers don’t tolerate data loss, poor data quality, or discrepancies between the source and Tricentis Analytics.

Software, Data Lake, Testing, Cost-Benefit

Chose Both: Data Fabric and Data Lakehouse

Cloudera

SEPTEMBER 12, 2022

A data fabric answers perhaps the biggest question of all: what data do we have to work with? Managing individual data sources and making them available through traditional enterprise data integration, as end users request them, simply does not scale, especially in light of a growing number of sources and growing data volume.

Unstructured Data, Data Architecture, Data Lake, Snapshot

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

NOVEMBER 29, 2023

We have seen strong customer demand to expand dbt's scope to cloud-based data lakes, because data lakes are increasingly the enterprise solution for large-scale data initiatives due to their power and capabilities.

Data Lake, Management, Metrics, Data Warehouse

Data Leaders Brief
