Remove 2022 Remove Blog Remove Data Transformation Remove Testing
article thumbnail

How to Use Apache Iceberg in CDP’s Open Lakehouse

Cloudera

In June 2022, Cloudera announced the general availability of Apache Iceberg in the Cloudera Data Platform (CDP). The general availability covers Iceberg running within some of the key data services in CDP, including Cloudera Data Warehouse ( CDW ), Cloudera Data Engineering ( CDE ), and Cloudera Machine Learning ( CML ).

article thumbnail

Cloudera Data Engineering 2021 Year End Review

Cloudera

This enabled new use-cases with customers that were using a mix of Spark and Hive to perform data transformations. . As exciting 2021 has been as we delivered killer features for our customers, we are even more excited for what’s in store in 2022. Test Drive CDP Pubic Cloud. Figure 3: CDE Pipeline authoring UI.

Snapshot 115
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Automate alerting and reporting for AWS Glue job resource usage

AWS Big Data

Data transformation plays a pivotal role in providing the necessary data insights for businesses in any organization, small and large. To gain these insights, customers often perform ETL (extract, transform, and load) jobs from their source systems and output an enriched dataset. 1X 1 4 16 64 G.2X 2X 2 8 32 128 G.4X

article thumbnail

DataOps Observability: Taming the Chaos (part 1)

DataKitchen

DataOps Observability can help you ensure that your complex data pipelines and processes are accurate and that they deliver as designed. Observability also validates that your data transformations, models, and reports are performing as expected. to monitor your data operations. without replacing staff or systems?to

Testing 169
article thumbnail

DataOps Observability: Taming the Chaos (part 1)

DataKitchen

DataOps Observability can help you ensure that your complex data pipelines and processes are accurate and that they deliver as designed. Observability also validates that your data transformations, models, and reports are performing as expected. to monitor your data operations. without replacing staff or systems?to

Testing 130
article thumbnail

Supercharge Your Data Lakehouse with Apache Iceberg in Cloudera Data Platform

Cloudera

These tools empower analysts and data scientists to easily collaborate on the same data, with their choice of tools and analytic engines. No more lock-in, unnecessary data transformations, or data movement across tools and clouds just to extract insights out of the data.

article thumbnail

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by data warehouses (such as Amazon Redshift ) customers who are looking to keep their data transform logic separate from storage and engine.

Data Lake 102