Measure performance of AWS Glue Data Quality for ETL pipelines

AWS Big Data

In recent years, data lakes have become a mainstream architecture, and data quality validation is a critical factor in improving the reusability and consistency of the data. In this post, we provide benchmark results of running increasingly complex data quality rulesets over a predefined test dataset.
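
The benchmark in the post runs Data Quality Definition Language (DQDL) rulesets of growing size. As a rough sketch of the mechanics only (not the post's benchmark harness), a ruleset like the one below can be registered and evaluated against a Glue Data Catalog table with boto3; the database, table, ruleset, and IAM role names are placeholders.

```python
import boto3

glue = boto3.client("glue", region_name="us-east-1")

# A DQDL ruleset; each additional rule increases evaluation cost,
# which is what the benchmark in the post measures.
ruleset = """
Rules = [
    IsComplete "order_id",
    IsUnique "order_id",
    ColumnValues "quantity" > 0,
    Completeness "customer_id" > 0.95
]
"""

# Register the ruleset against a catalog table (names are placeholders).
glue.create_data_quality_ruleset(
    Name="orders_benchmark_ruleset",
    Ruleset=ruleset,
    TargetTable={"DatabaseName": "sales_db", "TableName": "orders"},
)

# Kick off an evaluation run; timing runs like this across rulesets of
# different sizes is the essence of the benchmark.
run = glue.start_data_quality_ruleset_evaluation_run(
    DataSource={"GlueTable": {"DatabaseName": "sales_db", "TableName": "orders"}},
    Role="arn:aws:iam::123456789012:role/GlueDataQualityRole",  # placeholder
    RulesetNames=["orders_benchmark_ruleset"],
)
print("Started evaluation run:", run["RunId"])
```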

Visualize data quality scores and metrics generated by AWS Glue Data Quality

AWS Big Data

AWS Glue Data Quality allows you to measure and monitor the quality of data in your data repositories. It’s important for business users to be able to see quality scores and metrics to make confident business decisions and debug data quality issues. An AWS Glue crawler then catalogs the results so they can be queried and visualized.
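
As a hedged sketch of one way to get those scores somewhere a crawler can catalog them (not necessarily the article's exact pipeline), recent evaluation results can be pulled with boto3 and written to S3 as JSON Lines; the bucket and key below are placeholders.

```python
import json
import boto3

glue = boto3.client("glue")
s3 = boto3.client("s3")

# Fetch recent Data Quality evaluation results and their overall scores.
rows = []
for item in glue.list_data_quality_results(MaxResults=25)["Results"]:
    detail = glue.get_data_quality_result(ResultId=item["ResultId"])
    rows.append({
        "result_id": detail["ResultId"],
        "score": detail.get("Score"),  # overall score between 0.0 and 1.0
        "rules_passed": sum(1 for r in detail["RuleResults"] if r["Result"] == "PASS"),
        "rules_total": len(detail["RuleResults"]),
    })

# Land the scores as JSON Lines so a Glue crawler can catalog them and
# Athena/QuickSight can query and chart them (bucket/key are placeholders).
body = "\n".join(json.dumps(r) for r in rows)
s3.put_object(
    Bucket="my-dq-results-bucket",
    Key="glue-dq/scores/latest.jsonl",
    Body=body.encode("utf-8"),
)
```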

Trending Sources

Fire Your Super-Smart Data Consultants with DataOps

DataKitchen

Ensuring that data is available, secure, correct, and fit for purpose is neither simple nor cheap. Companies end up paying outside consultants enormous fees while still suffering the effects of poor data quality and lengthy cycle times. When a job is automated, there is little advantage to outsourcing.

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

AWS Big Data

For the past 5 years, BMS has used a custom framework called Enterprise Data Lake Services (EDLS) to create ETL jobs for business users. Manually upgrading, testing, and deploying over 5,000 jobs every few quarters was time-consuming, error-prone, costly, and not sustainable.

What is a Data Mesh?

DataKitchen

First-generation – expensive, proprietary enterprise data warehouse and business intelligence platforms maintained by a specialized team drowning in technical debt. Second-generation – gigantic, complex data lake maintained by a specialized team drowning in technical debt. See the pattern?

A Day in the Life of a DataOps Engineer

DataKitchen

The biggest challenge is broken data pipelines due to highly manual processes. Figure 1 shows a manually executed data analytics pipeline. First, a business analyst consolidates data from some public websites, an SFTP server and some downloaded email attachments, all into Excel.
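
For illustration only (this is not DataKitchen's tooling), the same consolidation could be scripted instead of done by hand; every URL, host, credential, and file path below is a placeholder.

```python
import pandas as pd
import paramiko

# 1. Public website: read the first HTML table on the page.
web_df = pd.read_html("https://example.com/monthly-report")[0]

# 2. SFTP server: download a CSV, then load it.
ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect("sftp.example.com", username="analyst", password="***")
sftp = ssh.open_sftp()
sftp.get("/outbound/sales.csv", "sales.csv")
sftp.close()
ssh.close()
sftp_df = pd.read_csv("sales.csv")

# 3. Email attachment already saved to disk.
email_df = pd.read_excel("downloads/regional_numbers.xlsx")

# Consolidate into one workbook, one sheet per source.
with pd.ExcelWriter("consolidated.xlsx") as writer:
    web_df.to_excel(writer, sheet_name="web", index=False)
    sftp_df.to_excel(writer, sheet_name="sftp", index=False)
    email_df.to_excel(writer, sheet_name="email", index=False)
```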

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

AWS Big Data

A modern data platform entails maintaining data across multiple layers, targeting diverse platform capabilities like high performance, ease of development, cost-effectiveness, and DataOps features such as CI/CD, lineage, and unit testing. In this architecture, AWS Glue is used to load files into Amazon Redshift through the S3 data lake.
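
As a minimal sketch of that load step (assuming the layered layout described in the post, with placeholder bucket, connection, and table names), a Glue PySpark job can pick up files from the S3 data lake and write them to Amazon Redshift, where dbt models take over the in-warehouse transformations.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read raw files from the S3 data lake layer (bucket/path are placeholders).
raw = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://my-data-lake/raw/orders/"]},
    format="parquet",
)

# Load into Redshift through a Glue catalog connection; dbt models then
# transform the staged table inside the warehouse.
glue_context.write_dynamic_frame.from_jdbc_conf(
    frame=raw,
    catalog_connection="redshift-connection",  # placeholder connection name
    connection_options={"dbtable": "staging.orders", "database": "analytics"},
    redshift_tmp_dir="s3://my-data-lake/tmp/",
)

job.commit()
```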