Data Transformation, Interactive, Publishing and Testing

Introducing Cloudera DataFlow Designer: Self-service, No-Code Dataflow Design

Cloudera

DECEMBER 9, 2022

Developers need to onboard new data sources, chain multiple data transformation steps together, and explore data as it travels through the flow. Interactivity when needed while saving costs. To meet this need, we’ve introduced a new concept called test sessions with the DataFlow Designer.

Testing, Cost-Benefit, Interactive, Visualization

Cloudera DataFlow Designer: The Key to Agile Data Pipeline Development

Cloudera

MARCH 14, 2023

Allows them to iteratively develop processing logic and test with as little overhead as possible. Plays nice with existing CI/CD processes to promote a data pipeline to production. Provides monitoring, alerting, and troubleshooting for production data pipelines.

Testing, Publishing, Metadata, Interactive

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

Trending Sources

Simplify Metrics on Apache Druid With Rill Data and Cloudera

Cloudera

JULY 21, 2022

As creators and experts in Apache Druid, Rill understands the data store’s importance as the engine for real-time, highly interactive analytics. Cloudera Data Warehouse. Efficient batch data processing. Complex data transformations. Figure 1: Rill and Cloudera Architecture. Apache Hive. Windowing functions.

Metrics, Slice and Dice, Data Warehouse, Dashboards

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

SEPTEMBER 29, 2022

from the business interactions), but if not available, then through confirmation techniques of an independent nature. It will indicate whether data is void of significant errors. Also known as data validation, integrity refers to the structural testing of data to ensure that the data complies with procedures.

Data Quality, Metrics, Data-driven, Management

End-to-end development lifecycle for data engineers to build a data integration pipeline using AWS Glue

AWS Big Data

JULY 26, 2023

To grow the power of data at scale for the long term, it’s highly recommended to design an end-to-end development lifecycle for your data integration pipelines. The following are common asks from our customers: Is it possible to develop and test AWS Glue data integration jobs on my local laptop?

Data Integration, Snapshot, Testing, Visualization

Gain insights from historical location data using Amazon Location Service and AWS analytics services

AWS Big Data

MARCH 13, 2024

Developers can use the support in Amazon Location Service for publishing device position updates to Amazon EventBridge to build a near-real-time data pipeline that stores locations of tracked assets in Amazon Simple Storage Service (Amazon S3). You can test this solution yourself using the AWS Samples GitHub repository.

Analytics, IoT, Metadata, Internet of Things

AI, the Power of Knowledge and the Future Ahead: An Interview with Head of Ontotext’s R&I Milena Yankova

Ontotext

APRIL 4, 2019

Within a large enterprise, there is a huge amount of data accumulated over the years – many decisions have been made and different methods have been tested. Milena Yankova: What we did for the BBC in the previous Olympics was that we helped journalists publish their reports faster. I think artists can relax.

Recreation/Entertainment, Testing, Enterprise, Knowledge Discovery

Modernize a legacy real-time analytics application with Amazon Managed Service for Apache Flink

AWS Big Data

OCTOBER 11, 2023

We introduce you to Amazon Managed Service for Apache Flink Studio and get started querying streaming data interactively using Amazon Kinesis Data Streams. Traditionally, such a legacy call center analytics platform would be built on a relational database that stores data from streaming sources.

Management, Metadata, Analytics, Dashboards

Use AWS Glue DataBrew recipes in your AWS Glue Studio visual ETL jobs

AWS Big Data

JULY 27, 2023

DataBrew is a visual data preparation tool that enables you to clean and normalize data without writing any code. The over 200 transformations it provides are now available to be used in an AWS Glue Studio visual job. Create a DataBrew recipe: Start by registering the data store for the claims file.

Visualization, Cost-Benefit, Data Quality, Interactive

Migrate from Amazon Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics Studio

AWS Big Data

JUNE 29, 2023

In this post, we discuss why AWS recommends moving from Kinesis Data Analytics for SQL Applications to Amazon Kinesis Data Analytics for Apache Flink to take advantage of Apache Flink’s advanced streaming capabilities. Kinesis Data Analytics Studio allows us to create a notebook, which is a web-based development environment.

Data Analytics, Analytics, IoT, Data Lake

How Alation’s Data Team Uses the Modern Data Stack to Power Insights

Alation

OCTOBER 27, 2022

Few actors in the modern data stack have inspired as much enthusiasm and fervent support as dbt. This data transformation tool enables data analysts and engineers to transform, test and document data in the cloud data warehouse. Jason: What’s the value of using dbt with the data catalog?

Metrics, Dashboards, Sales, Reporting

Improve observability across Amazon MWAA tasks

AWS Big Data

FEBRUARY 6, 2023

For data pipeline orchestration, the Apache Airflow UI is a user-friendly tool that provides detailed views into your data pipeline. When it comes to pipeline health management, each service that your tasks are interacting with could be storing or publishing logs to different locations, such as an S3 bucket or Amazon CloudWatch logs.

Management, Interactive, Metadata, Publishing

Data platform trinity: Competitive or complementary?

IBM Big Data Hub

JANUARY 18, 2023

For these workloads, data lake vendors usually recommend extracting data into flat files to be used solely for model training and testing purposes. This adds an additional ETL step, making the data even more stale. Data lakehouse was created to solve these problems. Data discoverability.

Data Lake, Data Warehouse, Data-driven, Metadata

Exploring the AI and data capabilities of watsonx

IBM Big Data Hub

JULY 17, 2023

While they require task-specific labeled data for fine tuning, they also offer clients the best cost performance trade-off for non-generative use cases. offers a Prompt Lab, where users can interact with different prompts using prompt engineering on generative AI models for both zero-shot prompting and few-shot prompting.

Machine Learning, Data Warehouse, Modeling, Cost-Benefit

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

NOVEMBER 29, 2023

dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by data warehouse (such as Amazon Redshift) customers who are looking to keep their data transform logic separate from storage and engine.

Data Lake, Management, Metrics, Data Warehouse

What is Data Mapping?

Jet Global

FEBRUARY 23, 2024

This field guide to data mapping will explore how data mapping connects volumes of data for enhanced decision-making. Why Data Mapping is Important: Data mapping is a critical element of any data management initiative, such as data integration, data migration, data transformation, data warehousing, or automation.

Data Warehouse, Reporting, Data Transformation, Sales

What Is Embedded Analytics?

Jet Global

MAY 1, 2023

This is in contrast to traditional BI, which extracts insight from data outside of the app. As rich, data-driven user experiences are increasingly intertwined with our daily lives, end users are demanding new standards for how they interact with their business data. Yes—but basic dashboards won’t be enough.

Analytics, Cost-Benefit, Visualization, Dashboards

Data Leaders Brief
