article thumbnail

Modernize your ETL platform with AWS Glue Studio: A case study from BMS

AWS Big Data

In addition to using native managed AWS services that BMS didn’t need to worry about upgrading, BMS was looking to offer an ETL service to non-technical business users that could visually compose data transformation workflows and seamlessly run them on the AWS Glue Apache Spark-based serverless data integration engine.

article thumbnail

Cloudera DataFlow Designer: The Key to Agile Data Pipeline Development

Cloudera

Allows them to iteratively develop processing logic and test with as little overhead as possible. Plays nice with existing CI/CD processes to promote a data pipeline to production. Provides monitoring, alerting, and troubleshooting for production data pipelines.

Testing 80
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

The Ultimate Guide to Modern Data Quality Management (DQM) For An Effective Data Quality Control Driven by The Right Metrics

datapine

This person (or group of individuals) ensures that the theory behind data quality is communicated to the development team. 2 – Data profiling. Data profiling is an essential process in the DQM lifecycle. This means there are no unintended data errors, and it corresponds to its appropriate designation (e.g.,

article thumbnail

Introducing Cloudera DataFlow Designer: Self-service, No-Code Dataflow Design

Cloudera

Developers need to onboard new data sources, chain multiple data transformation steps together, and explore data as it travels through the flow. This allows developers to make changes to their processing logic on the fly while running some test data through their flow and validating that their changes work as intended.

Testing 95
article thumbnail

Turnkey Cloud DataOps: Solution from Alation and Accenture

Alation

More specifically, IDF has been integrated with Alation at an API level; this means that all generated pipeline code, metadata attributes, configuration files, and lineage are automatically synced (representing a huge time savings). They can better understand data transformations, checks, and normalization. Transparency is key.

article thumbnail

Data Preparation and Data Mapping: The Glue Between Data Management and Data Governance to Accelerate Insights and Reduce Risks

erwin

Organizations have spent a lot of time and money trying to harmonize data across diverse platforms , including cleansing, uploading metadata, converting code, defining business glossaries, tracking data transformations and so on. So questions linger about whether transformed data can be trusted.

article thumbnail

Gain insights from historical location data using Amazon Location Service and AWS analytics services

AWS Big Data

You can also use the data transformation feature of Data Firehose to invoke a Lambda function to perform data transformation in batches. Athena is used to run geospatial queries on the location data stored in the S3 buckets. You can test this solution yourself using the AWS Samples GitHub repository.