Data Architecture, Data Processing, Management and Metadata

Data Architecture

Data Processing

Management

Metadata

The Top Three Entangled Trends in Data Architectures: Data Mesh, Data Fabric, and Hybrid Architectures

Cloudera

SEPTEMBER 29, 2022

Each of these trends claim to be complete models for their data architectures to solve the “everything everywhere all at once” problem. Data teams are confused as to whether they should get on the bandwagon of just one of these trends or pick a combination. First, we describe how data mesh and data fabric could be related.

Data Architecture

Data Architecture Metadata Data Warehouse Sales

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

Over the years, data lakes on Amazon Simple Storage Service (Amazon S3) have become the default repository for enterprise data and are a common choice for a large set of users who query data for a variety of analytics and machine leaning use cases. Analytics use cases on data lakes are always evolving.

Data Lake

Data Lake Metadata Snapshot Recreation/Entertainment

Join 52,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

The Key to Sustainable Energy Optimization: A Data-Driven Approach for Manufacturing

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

How To Get Promoted In Product Management

MORE WEBINARS

Trending Sources

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

Organizations often need to manage a high volume of data that is growing at an extraordinary rate. At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. The data can be reattached to UltraWarm when needed.

Data Lake

Data Lake Analytics Dashboards Metrics

Webinars

The Key to Sustainable Energy Optimization: A Data-Driven Approach for Manufacturing

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

How To Get Promoted In Product Management

MORE WEBINARS

KGF 2023: Bikes To The Moon, Datastrophies, Abstract Art And A Knowledge Graph Forum To Embrace Them All

Ontotext

DECEMBER 1, 2023

It enriched their understanding of the full spectrum of knowledge graph business applications and the technology partner ecosystem needed to turn data into a competitive advantage. Content and data management solutions based on knowledge graphs are becoming increasingly important across enterprises.

Metadata

Metadata Sales Consulting Enterprise

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

One important feature is to run different workloads such as business intelligence (BI), Machine Learning (ML), Data Science and data exploration, and Change Data Capture (CDC) of transactional data, without having to maintain multiple copies of data.

Data Lake

Data Lake Sales Data Warehouse Snapshot

Announcing Alation 4.0 with Alation Connect

Alation

FEBRUARY 20, 2020

Information about popular users of data is very important, as it provides knowledge from someone else, often outside of a consumer’s immediate network. For data stewards, this means answering: Where should I focus if there’s data to manage? How do I communicate semantic changes to consumers who rely on this data?

Metadata

Metadata Enterprise Data Processing Data Architecture

How Cargotec uses metadata replication to enable cross-account data sharing

AWS Big Data

JUNE 7, 2023

They chose AWS Glue as their preferred data integration tool due to its serverless nature, low maintenance, ability to control compute resources in advance, and scale when needed. To share the datasets, they needed a way to share access to the data and access to catalog metadata in the form of tables and views.

Metadata

Metadata Data Lake Machine Learning Big Data

Boosting Object Storage Performance with Ozone Manager

Cloudera

JULY 19, 2023

The Ozone Manager is a critical component of Ozone. It is a replicated, highly-available service that is responsible for managing the metadata for all objects stored in Ozone. As Ozone scales to exabytes of data, it is important to ensure that Ozone Manager can perform at scale.

Management

Management Metadata Metrics Optimization

Build SAML identity federation for Amazon OpenSearch Service domains within a VPC

AWS Big Data

FEBRUARY 7, 2024

Amazon OpenSearch Service is a fully managed search and analytics service powered by the Apache Lucene search library that can be operated within a virtual private cloud (VPC). Create an Amazon Route 53 public hosted zone such as mydomain.com to be used for routing internet traffic to your domain. Take note of the group ID.

Dashboards

Dashboards Data Processing Metadata Consulting

Habib Bank manages data at scale with Cloudera Data Platform

Cloudera

NOVEMBER 17, 2022

We needed a solution to manage our data at scale, to provide greater experiences to our customers. With Cloudera Data Platform, we aim to unlock value faster and offer consistent data security and governance to meet this goal. Aqeel Ahmed Jatoi, Lead – Architecture, Governance and Control, Habib Bank Limited.

Management

Management Data Lake Consulting Unstructured Data

SAP enhances Datasphere and SAC for AI-driven transformation

CIO Business Intelligence

MARCH 6, 2024

SAP announced today a host of new AI copilot and AI governance features for SAP Datasphere and SAP Analytics Cloud (SAC). The company is expanding its partnership with Collibra to integrate Collibra’s AI Governance platform with SAP data assets to facilitate data governance for non-SAP data assets in customer environments. “We

Unstructured Data

Unstructured Data Dashboards Business Intelligence Data Governance

Design a data mesh on AWS that reflects the envisioned organization

AWS Big Data

JANUARY 22, 2024

The existing monolith and centralized architecture was struggling to meet the growing demands of data consumers. Data engineers were finding it increasingly challenging to maintain and scale the data infrastructure, resulting in data access, data silos, and inefficiencies in data management.

Data-driven

Data-driven Advertising Metadata Data Architecture

CIOs rise to the ESG reporting challenge

CIO Business Intelligence

JANUARY 30, 2024

“Always the gatekeepers of much of the data necessary for ESG reporting, CIOs are finding that companies are even more dependent on them,” says Nancy Mentesana, ESG executive director at Labrador US, a global communications firm focused on corporate disclosure documents. The complexity is at a much higher level.”

Reporting

Reporting Data Quality Strategy Data-driven

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

AWS provides different services for building data ingestion pipelines: AWS Glue is a serverless data integration service that ingests data in batches from on-premises databases and data stores in the cloud. Streaming data with low latency needs is stored in Amazon Kinesis Data Streams for real-time consumption.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. Moreover, running advanced analytics and ML on disparate data sources proved challenging. To overcome these issues, Orca decided to build a data lake.

Data Lake

Data Lake Analytics Snapshot Optimization

How Novo Nordisk built distributed data governance and control at scale

AWS Big Data

APRIL 28, 2023

When building a scalable data architecture on AWS, giving autonomy and ownership to the data domains are crucial for the success of the platform. In this post, you will learn how to build decentralized governance with Lake Formation and AWS Identity and Access Management (IAM) using attribute-based access control (ABAC).

Data Governance

Data Governance Management Data-driven Data Lake

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes. Amazon EMR Serverless is a serverless option in Amazon EMR that makes it easy for data analysts and engineers to run open-source big data analytics frameworks without configuring, managing, and scaling clusters or servers.

Data Lake

Data Lake Dashboards Metrics Metadata

The Gartner 2022 Leadership Vision for Data and Analytics Leaders Questions and Answers

Andrew White

JANUARY 9, 2022

On Thursday January 6th I hosted Gartner’s 2022 Leadership Vision for Data and Analytics webinar. Here is a recent note on synthetic data: Maverick* Research: Forget About Your Real Data — Synthetic Data Is the Future of AI. – This is a question for our data management team.

Analytics

Analytics Measurement Modeling Data-driven

Empowering data mesh: The tools to deliver BI excellence

erwin

APRIL 16, 2024

The data mesh framework In the dynamic landscape of data management, the search for agility, scalability, and efficiency has led organizations to explore new, innovative approaches. One such innovation gaining traction is the data mesh framework. This empowers individual teams to own and manage their data.

Metadata

Metadata Data Quality Data Governance Modeling

How Cloudera Data Flow Enables Successful Data Mesh Architectures

Cloudera

OCTOBER 7, 2021

The concept of the data mesh architecture is not entirely new; Its conceptual origins are rooted in the microservices architecture, its design principles (i.e., need to integrate multiple “point solutions” used in a data ecosystem) and organization reasons (e.g., Components of a Data Mesh.

Metadata

Metadata Cost-Benefit Enterprise Interactive

How Amazon Finance Automation built a data mesh to support distributed data ownership and centralize governance

AWS Big Data

JULY 14, 2023

Consumers prioritized data discoverability, fast data access, low latency, and high accuracy of data. Producers prioritized ownership, governance, access management, and reuse of their datasets. These inputs reinforced the need of a unified data strategy across the FinOps teams.

Finance

Finance Metadata Big Data Recreation/Entertainment

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To speed up the self-service analytics and foster innovation based on data, a solution was needed to provide ways to allow any team to create data products on their own in a decentralized manner. To create and manage the data products, smava uses Amazon Redshift , a cloud data warehouse.

Data Lake

Data Lake Data Warehouse Data-driven B2B

What Is Embedded Analytics?

Jet Global

MAY 1, 2023

Now, Delta managers can get a full understanding of their data for compliance purposes. Additionally, with write-back capabilities, they can clear discrepancies and input data. This also includes IT departments that develop and manage applications used by internal stakeholders and partners.

Analytics

Analytics Cost-Benefit Visualization Dashboards

Data Leaders Brief

The Top Three Entangled Trends in Data Architectures: Data Mesh, Data Fabric, and Hybrid Architectures

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Webinars

Trending Sources

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Webinars

KGF 2023: Bikes To The Moon, Datastrophies, Abstract Art And A Knowledge Graph Forum To Embrace Them All

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Announcing Alation 4.0 with Alation Connect

How Cargotec uses metadata replication to enable cross-account data sharing

Boosting Object Storage Performance with Ozone Manager

Build SAML identity federation for Amazon OpenSearch Service domains within a VPC

Habib Bank manages data at scale with Cloudera Data Platform

SAP enhances Datasphere and SAC for AI-driven transformation

Design a data mesh on AWS that reflects the envisioned organization

CIOs rise to the ESG reporting challenge

Create an end-to-end data strategy for Customer 360 on AWS

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

How Novo Nordisk built distributed data governance and control at scale

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

The Gartner 2022 Leadership Vision for Data and Analytics Leaders Questions and Answers

Empowering data mesh: The tools to deliver BI excellence

How Cloudera Data Flow Enables Successful Data Mesh Architectures

How Amazon Finance Automation built a data mesh to support distributed data ownership and centralize governance

How smava makes loans transparent and affordable using Amazon Redshift Serverless

What Is Embedded Analytics?

Stay Connected