article thumbnail

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics, Part 3: Visualization and trend analysis using Amazon QuickSight

AWS Big Data

In Part 2 of this series, we discussed how to enable AWS Glue job observability metrics and integrate them with Grafana for real-time monitoring. Grafana provides powerful customizable dashboards to view pipeline health. QuickSight makes it straightforward for business users to visualize data in interactive dashboards and reports.

Metrics 108
article thumbnail

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

6) Data Quality Metrics Examples. Reporting being part of an effective DQM, we will also go through some data quality metrics examples you can use to assess your efforts in the matter. It involves: Reviewing data in detail Comparing and contrasting the data to its own metadata Running statistical models Data quality reports.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Introducing Amazon MWAA larger environment sizes

AWS Big Data

Running Apache Airflow at scale puts proportionally greater load on the Airflow metadata database, sometimes leading to CPU and memory issues on the underlying Amazon Relational Database Service (Amazon RDS) cluster. A resource-starved metadata database may lead to dropped connections from your workers, failing tasks prematurely.

article thumbnail

The Future of Data Lineage and the Role of Metadata

Alation

Most of the time we think about data fields & files, columns & tables, reports & dashboards. It’s important to realize that we need visibility into lineage and relationships between all data and data-related assets, including business terms, metric definitions, policies, quality rules, access controls, algorithms, etc.

article thumbnail

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

For on-demand ingestion for past time durations where you don’t expect new objects to be created, consider using supported pipeline metrics such as recordsOut.count to create Amazon CloudWatch alarms that can stop the pipeline. For a list of supported metrics, refer to Monitoring pipeline metrics.

Data Lake 116
article thumbnail

Use Amazon OpenSearch Ingestion to migrate to Amazon OpenSearch Serverless

AWS Big Data

OSI is a fully managed, serverless data collector that delivers real-time log, metric, and trace data to OpenSearch Service domains and OpenSearch Serverless collections. Migration of metadata such as security roles and dashboard objects will be covered in another subsequent post.

article thumbnail

Deploy Amazon QuickSight dashboards to monitor AWS Glue ETL job metrics and set alarms

AWS Big Data

In this post, we explore how to combine AWS Glue usage information and metrics with centralized reporting and visualization using QuickSight. You have metrics available per job run within the AWS Glue console, but they don’t cover all available AWS Glue job metrics, and the visuals aren’t as interactive compared to the QuickSight dashboard.

Metrics 89