Data Analytics, Data Architecture, Data Lake and Metrics

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.

Data Lake

Data Lake Analytics Snapshot Optimization

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Optimization Statistics

Webinars

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

How To Get Promoted In Product Management

MORE WEBINARS

Data science vs data analytics: Unpacking the differences

IBM Big Data Hub

SEPTEMBER 19, 2023

Though you may encounter the terms “data science” and “data analytics” used interchangeably in conversations or online, they refer to two distinctly different concepts. Data analytics is the act of examining datasets to extract value and find answers to specific questions.

Data Science

Data Science Data Analytics Prescriptive Analytics Analytics

Eight Top DataOps Trends for 2022

DataKitchen

NOVEMBER 29, 2021

In 2022, data organizations will institute robust automated processes around their AI systems to make them more accountable to stakeholders. Quality test suites will enforce “equity,” like any other performance metric. Data gets meshier: 2022 will bring further momentum behind modular enterprise architectures like data mesh.

Testing

Testing Data Lake Data Architecture Manufacturing

Visualize data quality scores and metrics generated by AWS Glue Data Quality

AWS Big Data

JUNE 6, 2023

AWS Glue Data Quality allows you to measure and monitor the quality of data in your data repositories. It’s important for business users to be able to see quality scores and metrics to make confident business decisions and debug data quality issues. An AWS Glue crawler crawls the results.

Data Quality

Data Quality Metrics Visualization Dashboards

Build a multi-Region and highly resilient modern data architecture using AWS Glue and AWS Lake Formation

AWS Big Data

JANUARY 24, 2023

This solution replicates only the metadata in the Data Catalog, not the actual underlying data. To have a redundant data lake using Lake Formation and AWS Glue in an additional Region, we recommend replicating the Amazon S3-based storage using S3 replication, S3 sync, aws-s3-copy-sync-using-batch, or the S3 Batch Replication process.

Data Architecture

Data Architecture Metadata Data Lake Snapshot

Why the Data Journey Manifesto?

DataKitchen

JUNE 12, 2023

We had been talking about “Agile Analytic Operations,” “DevOps for Data Teams,” and “Lean Manufacturing For Data,” but the concept was hard to get across and communicate. I spent much time de-categorizing DataOps: we are not discussing ETL, Data Lake, or Data Science.

Testing

Testing Data Lake Dashboards Data Science

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To bring their customers the best deals and user experience, smava follows the modern data architecture principles with a data lake as a scalable, durable data store and purpose-built data stores for analytical processing and data consumption.

Data Lake

Data Lake Data Warehouse Data-driven B2B

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

Stream processing can enable the chatbot to access real-time data and adapt to changes in availability and price, providing the best guidance to the customer and enhancing the customer experience. When the model finds an anomaly or abnormal metric value, it should immediately produce an alert and notify the operator.

Data Lake

Data Lake Unstructured Data Management Modeling

DataOps For Business Analytics Teams

DataKitchen

JANUARY 3, 2022

A DataOps process hub offers a way for business analytics teams to cope with fast-paced requirements without expanding staff or sacrificing quality. Analytics hub and spoke: the data analytics function in large enterprises is generally distributed across departments and roles.

Business Analytics

Business Analytics Analytics Testing Dashboards

A Day in the Life of a DataOps Engineer

DataKitchen

OCTOBER 11, 2021

First, you must understand the existing challenges of the data team, including the data architecture and end-to-end toolchain. The biggest challenge is broken data pipelines due to highly manual processes. Figure 1 shows a manually executed data analytics pipeline.

Testing

Testing Metadata Dashboards Statistics

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

The following figure shows some of the metrics derived from the study. Plan on how you can enable your teams to use ML to move from descriptive to prescriptive analytics. The AWS modern data architecture shows a way to build a purpose-built, secure, and scalable data platform in the cloud.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

Building data lakes from continuously changing transactional data of databases and keeping data lakes up to date is a complex task and can be an operational challenge. You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes.

Data Lake

Data Lake Dashboards Metrics Metadata

Introducing the AWS ProServe Hadoop Migration Delivery Kit TCO tool

AWS Big Data

FEBRUARY 6, 2023

Use case overview: Migrating Hadoop workloads to Amazon EMR accelerates big data analytics modernization, increases productivity, and reduces operational cost. Refactoring coupled compute and storage into a decoupled architecture is a modern data solution. George Zhao is a Senior Data Architect at AWS ProServe.

Cost-Benefit

Cost-Benefit Data Lake Dashboards Big Data

Extend your data mesh with Amazon Athena and federated views

AWS Big Data

JULY 28, 2023

In this post, we show how to create and query views on federated data sources in a data mesh architecture featuring data producers and consumers. The term data mesh refers to a data architecture with decentralized data ownership. The following diagram depicts our data architecture.

Big Data

Big Data Data Architecture Data Lake Interactive

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

AWS Big Data

JANUARY 8, 2024

This is the first post in a blog series that offers common architectural patterns for building real-time data streaming infrastructures using Kinesis Data Streams for a wide range of use cases. Refer to Amazon Kinesis Data Streams integrations for additional details. Lambda is good for event-based and stateless processing.

Analytics

Analytics IoT Data-driven Snapshot

Create a modern data platform using the Data Build Tool (dbt) in the AWS Cloud

AWS Big Data

NOVEMBER 9, 2023

dbt allows users to write data transformation code, run it, and test the output, all within the framework it provides. Use case: The Enterprise Data Analytics group of a large jewelry retailer embarked on their cloud journey with AWS in 2021. Third-party APIs – These provide analytics and survey data related to ecommerce websites.

Data Warehouse

Data Warehouse Testing Data Quality Reporting

Design a data mesh on AWS that reflects the envisioned organization

AWS Big Data

JANUARY 22, 2024

Parameters of success: Acast succeeded in bootstrapping and scaling a new team- and domain-oriented data product and its corresponding infrastructure and setup, resulting in less friction in gathering insights and happier users and consumers. Srikant Das is an Acceleration Lab Solutions Architect at Amazon Web Services.

Data-driven

Data-driven Advertising Metadata Data Architecture
