How to Use Apache Iceberg in CDP’s Open Lakehouse

Cloudera

The general availability covers Iceberg running within some of the key data services in CDP, including Cloudera Data Warehouse (CDW), Cloudera Data Engineering (CDE), and Cloudera Machine Learning (CML). Cloudera Data Engineering (Spark 3) with Airflow enabled. Cloudera Machine Learning. Time travel.

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

AWS Big Data

Next, we create an AWS Cloud9 interactive development environment (IDE). The following are some highlighted steps: Run a snapshot query. %%sql You can also use transactional data lake features such as snapshot queries, incremental queries, time travel, and DML queries. Choose Create key pair.

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by data warehouse (such as Amazon Redshift) customers who are looking to keep their data transform logic separate from storage and engine.
