Blog, Data Architecture, Data Processing and Metadata

Blog

Data Architecture

Data Processing

Metadata

The Top Three Entangled Trends in Data Architectures: Data Mesh, Data Fabric, and Hybrid Architectures

Cloudera

SEPTEMBER 29, 2022

Each of these trends claim to be complete models for their data architectures to solve the “everything everywhere all at once” problem. Data teams are confused as to whether they should get on the bandwagon of just one of these trends or pick a combination. First, we describe how data mesh and data fabric could be related.

Data Architecture

Data Architecture Metadata Data Warehouse Sales

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

Over the years, data lakes on Amazon Simple Storage Service (Amazon S3) have become the default repository for enterprise data and are a common choice for a large set of users who query data for a variety of analytics and machine leaning use cases. Analytics use cases on data lakes are always evolving. Choose ETL Jobs.

Data Lake

Data Lake Metadata Snapshot Recreation/Entertainment

Join 52,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

MORE WEBINARS

Trending Sources

KGF 2023: Bikes To The Moon, Datastrophies, Abstract Art And A Knowledge Graph Forum To Embrace Them All

Ontotext

DECEMBER 1, 2023

Atanas Kiryakov presenting at KGF 2023 about Where Shall and Enterprise Start their Knowledge Graph Journey Only data integration through semantic metadata can drive business efficiency as “it’s the glue that turns knowledge graphs into hubs of metadata and content”.

Metadata

Metadata Sales Consulting Enterprise

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

MORE WEBINARS

Announcing Alation 4.0 with Alation Connect

Alation

FEBRUARY 20, 2020

But it is the contextual usage information of data that is critical to answer data consumers’ and stewards’ questions. Cataloging both data and queries provides insight into. How recently the data was used. How recently the data was updated. What the mapping is of technical metadata to business descriptions.

Metadata

Metadata Enterprise Data Processing Data Architecture

How Cargotec uses metadata replication to enable cross-account data sharing

AWS Big Data

JUNE 7, 2023

This is a guest blog post co-written with Sumesh M R from Cargotec and Tero Karttunen from Knowit Finland. They chose AWS Glue as their preferred data integration tool due to its serverless nature, low maintenance, ability to control compute resources in advance, and scale when needed.

Metadata

Metadata Data Lake Machine Learning Big Data

Boosting Object Storage Performance with Ozone Manager

Cloudera

JULY 19, 2023

It is a replicated, highly-available service that is responsible for managing the metadata for all objects stored in Ozone. As Ozone scales to exabytes of data, it is important to ensure that Ozone Manager can perform at scale. The hardware specifications are included at the end of this blog.

Management

Management Metadata Metrics Optimization

Design a data mesh on AWS that reflects the envisioned organization

AWS Big Data

JANUARY 22, 2024

Cost and resource efficiency – This is an area where Acast observed a reduction in data duplication, and therefore cost reduction (in some accounts, removing the copy of data 100%), by reading data across accounts while enabling scaling. In this approach, teams responsible for generating data are referred to as producers.

Data-driven

Data-driven Advertising Metadata Data Architecture

Habib Bank manages data at scale with Cloudera Data Platform

Cloudera

NOVEMBER 17, 2022

While Cloudera CDH was already a success story at HBL, in 2022, HBL identified the need to move its customer data centre environment from Cloudera’s CDH to Cloudera Data Platform (CDP) Private Cloud to accommodate growing volumes of data. ” Prior to the upgrade, HBL’s 27 node cluster ran on CDH 6.1

Management

Management Data Lake Consulting Unstructured Data

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. This data is sent to Apache Kafka, which is hosted on Amazon Managed Streaming for Apache Kafka (Amazon MSK).

Data Lake

Data Lake Analytics Snapshot Optimization

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

The Delta tables created by the EMR Serverless application are exposed through the AWS Glue Data Catalog and can be queried through Amazon Athena. Solution overview The following diagram shows the overall architecture of the solution that we implement in this post. For Name , enter emr-delta-blog. For Type , choose Spark.

Data Lake

Data Lake Dashboards Metrics Metadata

The Gartner 2022 Leadership Vision for Data and Analytics Leaders Questions and Answers

Andrew White

JANUARY 9, 2022

On Thursday January 6th I hosted Gartner’s 2022 Leadership Vision for Data and Analytics webinar. But I would search on our web site for AI and mid-size enterprise; or ask Anthony Mullen who leads our AI research in our data and analytics team. Can you remind where we can find the mentioned blog?

Analytics

Analytics Measurement Modeling Data-driven

Empowering data mesh: The tools to deliver BI excellence

erwin

APRIL 16, 2024

One such innovation gaining traction is the data mesh framework. The data mesh approach distributes data ownership and decentralizes data architecture, paving the way for enhanced agility and scalability. This empowers individual teams to own and manage their data.

Metadata

Metadata Data Quality Data Governance Modeling

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Cloudera

OCTOBER 7, 2021

In this blog, I will demonstrate the value of Cloudera DataFlow (CDF) , the edge-to-cloud streaming data platform available on the Cloudera Data Platform (CDP) , as a Data integration and Democratization fabric. Components of a Data Mesh. Introduction.

Metadata

Metadata Cost-Benefit Enterprise Interactive

Data Leaders Brief

The Top Three Entangled Trends in Data Architectures: Data Mesh, Data Fabric, and Hybrid Architectures

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Webinars

Trending Sources

KGF 2023: Bikes To The Moon, Datastrophies, Abstract Art And A Knowledge Graph Forum To Embrace Them All

Webinars

Announcing Alation 4.0 with Alation Connect

How Cargotec uses metadata replication to enable cross-account data sharing

Boosting Object Storage Performance with Ozone Manager

Design a data mesh on AWS that reflects the envisioned organization

Habib Bank manages data at scale with Cloudera Data Platform

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

The Gartner 2022 Leadership Vision for Data and Analytics Leaders Questions and Answers

Empowering data mesh: The tools to deliver BI excellence

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Stay Connected