Data Architecture, Data Lake and Machine Learning

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

They understand that a one-size-fits-all approach no longer works, and recognize the value in adopting scalable, flexible tools and open data formats to support interoperability in a modern data architecture to accelerate the delivery of new solutions.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Five Modern Data Architecture Trends

David Menninger's Analyst Perspectives

MARCH 30, 2020

I was recently asked to identify key modern data architecture trends. Data architectures have changed significantly to accommodate larger volumes of data as well as new types of data such as streaming and unstructured data. Here are some of the trends I see continuing to impact data architectures.

Data Architecture

Data Architecture Unstructured Data Data Lake Data Governance

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake

Data Lake Sales Data Warehouse Snapshot

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

AWS Big Data

APRIL 24, 2023

Building a data lake on Amazon Simple Storage Service (Amazon S3) provides numerous benefits for an organization. However, many use cases, like performing change data capture (CDC) from an upstream relational database to an Amazon S3-based data lake, require handling data at a record level.

Data Lake

Data Lake Data Governance Cost-Benefit Machine Learning

What is a Data Mesh?

DataKitchen

AUGUST 3, 2021

The data mesh design pattern breaks giant, monolithic enterprise data architectures into subsystems or domains, each managed by a dedicated team. First-generation – expensive, proprietary enterprise data warehouse and business intelligence platforms maintained by a specialized team drowning in technical debt.

Data Architecture

Data Architecture Data Lake Cost-Benefit Data Warehouse

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

AWS Lake Formation 2022 year in review

AWS Big Data

JANUARY 31, 2023

We have collected some of the key talks and solutions on data governance, data mesh, and modern data architecture published and presented in AWS re:Invent 2022, and a few data lake solutions built by customers and AWS Partners for easy reference. Starting with Amazon EMR release 6.7.0,

Data Lake

Data Lake Data Governance Data Architecture Machine Learning

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architecture is a complex and varied field and different organizations and industries have unique needs when it comes to their data architects. Solutions data architect: These individuals design and implement data solutions for specific business needs, including data warehouses, data marts, and data lakes.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Real estate CIOs drive deals with data

CIO Business Intelligence

JULY 26, 2023

The CIO delights in detailing the work of Re/Max’s technology team, which is building the pipelines and cloud-native applications to deliver agents in the field the most refined and insightful data from more than 500 MLS listing services in the US and Canada as quickly as possible.

Data Lake

Data Lake Digital Transformation Machine Learning Data Architecture

The Future of the Data Lakehouse – Open

CIO Business Intelligence

JUNE 23, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes.

Data Lake

Data Lake Data Warehouse Machine Learning Cost-Benefit

Building a vision for real-time artificial intelligence

CIO Business Intelligence

APRIL 12, 2023

After walking his executive team through the data hops, flows, integrations, and processing across different ingestion software, databases, and analytical platforms, they were shocked by the complexity of their current data architecture and technology stack. It isn’t easy.

Machine Learning

Machine Learning Cost-Benefit Data-driven Strategy

The Future of the Data Lakehouse – Open

Cloudera

JUNE 18, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes.

Data Lake

Data Lake Data Warehouse Machine Learning Cost-Benefit

Educating ChatGPT on Data Lakehouse

Cloudera

MARCH 17, 2023

As the use of ChatGPT becomes more prevalent, I frequently encounter customers and data users citing ChatGPT’s responses in their discussions. I love the enthusiasm surrounding ChatGPT and the eagerness to learn about modern data architectures such as data lakehouses, data meshes, and data fabrics.

Unstructured Data

Unstructured Data Data Lake Data Warehouse Machine Learning

Data science vs data analytics: Unpacking the differences

IBM Big Data Hub

SEPTEMBER 19, 2023

Overview: Data science vs data analytics. Think of data science as the overarching umbrella that covers a wide range of tasks performed to find patterns in large datasets, structure data for use, train machine learning models and develop artificial intelligence (AI) applications.

Data Science

Data Science Data Analytics Prescriptive Analytics Analytics

Databricks’ new data lakehouse aims at media, entertainment sector

CIO Business Intelligence

APRIL 25, 2022

The other 10% represents the effort of initial deployment, data-loading, configuration and the setup of administrative tasks and analysis that is specific to the customer, Henschen said. Partner solutions to boost functionality, adoption.

Recreation/Entertainment

Recreation/Entertainment Data Lake Data Warehouse Unstructured Data

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

Foundation models (FMs) are large machine learning (ML) models trained on a broad spectrum of unlabeled and generalized datasets. Both engines provide native ingestion support from Kinesis Data Streams and Amazon MSK via a separate streaming pipeline to a data lake or data warehouse for analysis.

Data Lake

Data Lake Unstructured Data Management Modeling

Snowflake Builds on Its Success

David Menninger's Analyst Perspectives

JANUARY 15, 2021

Traditional on-premises data processing solutions have led to a hugely complex and expensive set of data silos where IT spends more time managing the infrastructure than extracting value from the data.

IT

IT Data Architecture Big Data Data Processing

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

AWS Big Data

SEPTEMBER 13, 2023

The Analytics specialty practice of AWS Professional Services (AWS ProServe) helps customers across the globe with modern data architecture implementations on the AWS Cloud. Of those tables, some are larger (such as in terms of record volume) than others, and some are updated more frequently than others.

Data Lake

Data Lake Data Processing Metadata Snapshot

Modernizing Data Analytics Architecture with the Denodo Platform on Azure

Data Virtualization

JANUARY 19, 2023

Today, many businesses are modernizing their on-premises data warehouses or cloud-based data lakes using Microsoft Azure Synapse Analytics. Unfortunately, with data spread.

Data Analytics

Data Analytics Data Lake Data Warehouse Analytics

Convergent Evolution

Peter James Thomas

AUGUST 18, 2018

That was the Science, here comes the Technology… A Brief Hydrology of Data Lakes. Overlapping with the above, from around 2012, I began to get involved in also designing and implementing Big Data Architectures; initially for narrow purposes and later Data Lakes spanning entire enterprises.

Data Lake

Data Lake Data Warehouse Data mining Statistics

2020 Data Impact Award Winner Spotlight: United Overseas Bank

Cloudera

JANUARY 13, 2021

To drive the vision of becoming a data-enabled organisation, UOB developed the EDAG (Enterprise Data Architecture and Governance) platform. The platform is built on a data lake that centralises data in UOB business units across the organisation.

Digital Transformation

Digital Transformation Data-driven Data Lake Big Data

The hidden history of Db2

IBM Big Data Hub

JULY 5, 2022

In today’s world of complex data architectures and emerging technologies, databases can sometimes be undervalued and unrecognized. Empower real-time decision making and perform heavy computational analysis with built-in ML, insanely fast ingest, and querying of data in motion and at rest. Database complexity, simplified.

Data Lake

Data Lake Data Warehouse Publishing Structured Data

Introducing the AWS ProServe Hadoop Migration Delivery Kit TCO tool

AWS Big Data

FEBRUARY 6, 2023

Refactoring coupled compute and storage to a decoupled architecture is a modern data solution. It enables compute such as EMR instances and storage such as Amazon Simple Storage Service (Amazon S3) data lakes to scale. George Zhao is a Senior Data Architect at AWS ProServe.

Cost-Benefit

Cost-Benefit Data Lake Dashboards Big Data

Our Next Phase of Growth: Enterprise Data Catalogs

Alation

FEBRUARY 13, 2020

Following a very successful year of growth in Alation’s business, this announcement marks a milestone for Alation and the enterprise data catalog market. What started six years ago as one startup trying to improve the way people work with data has become a full-blown market category – Machine Learning Data Catalogs.

Enterprise

Enterprise Data Lake Machine Learning Data-driven

Insiders Cite The Wondrous Benefits Of Big Data In Fortnite

Smart Data Collective

AUGUST 9, 2019

However, more mainstream games use big data as well. Fortnite is one of the games that uses big data to offer great service to its customers. Even Forbes Tech Council has written about the benefits of data lakes in Fortnite. They stated that other people can benefit from learning about them too.

Big Data

Big Data Data Lake Data Architecture Machine Learning

5 misconceptions about cloud data warehouses

IBM Big Data Hub

FEBRUARY 2, 2023

In today’s world, data warehouses are a critical component of any organization’s technology ecosystem. They provide the backbone for a range of use cases such as business intelligence (BI) reporting, dashboarding, and machine-learning (ML)-based predictive analytics, that enable faster decision making and insights.

Data Warehouse

Data Warehouse Cost-Benefit Unstructured Data Data Architecture

Deep dive into the AWS ProServe Hadoop Migration Delivery Kit TCO tool

AWS Big Data

FEBRUARY 6, 2023

He mainly works with enterprise customers to help data lake migration and modernization, and provides guidance and technical assistance on big data projects such as Hadoop, Spark, data warehousing, real-time data processing, and large-scale machine learning.

Dashboards

Dashboards Optimization Data Lake Cost-Benefit

The New Normal for FP&A: Data Analytics

Jedox

OCTOBER 22, 2020

In addition to using data to inform your future decisions, you can also use current data to make immediate decisions. Some of the technologies that make modern data analytics so much more powerful than they used to be include data management, data mining, predictive analytics, machine learning and artificial intelligence.

Data Analytics

Data Analytics Analytics Unstructured Data Data mining

Breaking State and Local Data Silos with Modern Data Architectures

Cloudera

AUGUST 30, 2022

Modern data architectures. To eliminate or integrate these silos, the public sector needs to adopt robust data management solutions that support modern data architectures (MDAs). Deploying modern data architectures.

Data Architecture

Data Architecture Data Lake Metadata Data Warehouse

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Optimization Statistics

Extract data from SAP ERP using AWS Glue and the SAP SDK

AWS Big Data

FEBRUARY 8, 2023

Solution overview. Vyaire’s iDataHub powered by AWS Glue has been effectively used for data movement between SAP ERP and ServiceMax. AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning (ML), and application development.

Testing

Testing Data Integration Data Lake Enterprise

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.

Data Lake

Data Lake Analytics Snapshot Optimization

Modern Data Architecture for Telecommunications

Cloudera

SEPTEMBER 6, 2022

Data has continued to grow both in scale and in importance through this period, and today telecommunications companies are increasingly seeing data architecture as an independent organizational challenge, not merely an item on an IT checklist. Previously, there were three types of data structures in telco:

Data Architecture

Data Architecture Cost-Benefit Digital Transformation Business Driver

The year’s top 10 enterprise AI trends — so far

CIO Business Intelligence

SEPTEMBER 21, 2023

To make all this possible, the data had to be collected, processed, and fed into the systems that needed it in a reliable, efficient, scalable, and secure way. Data warehouses then evolved into data lakes, and then data fabrics and other enterprise-wide data architectures.

Enterprise

Enterprise Consulting Modeling Cost-Benefit

Eight Top DataOps Trends for 2022

DataKitchen

NOVEMBER 29, 2021

Data Gets Meshier. 2022 will bring further momentum behind modular enterprise architectures like data mesh. The data mesh addresses the problems characteristic of large, complex, monolithic data architectures by dividing the system into discrete domains managed by smaller, cross-functional teams.

Testing

Testing Data Lake Data Architecture Manufacturing

Carhartt turns to data under new CIO

CIO Business Intelligence

NOVEMBER 25, 2022

As part of that transformation, Agusti has plans to integrate a data lake into the company’s data architecture and expects two AI proofs of concept (POCs) to be ready to move into production within the quarter. Today, we backflush our data lake through our data warehouse. We’re still in that journey.”

Data Lake

Data Lake Data Warehouse Unstructured Data Data Architecture

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

Since then, customer demands for better scale, higher throughput, and agility in handling a wide variety of changing but increasingly business-critical analytics and machine learning use cases have exploded, and we have been keeping pace.

Data Warehouse

Data Warehouse Data Lake Analytics Machine Learning

Get maximum value out of your cloud data warehouse with Amazon Redshift

AWS Big Data

APRIL 19, 2023

Building an optimal data system As data grows at an extraordinary rate, data proliferation across your data stores, data warehouse, and data lakes can become a challenge. This performance innovation allows Nasdaq to have a multi-use data lake between teams.

Data Warehouse

Data Warehouse Data Lake Unstructured Data Optimization

How GamesKraft uses Amazon Redshift data sharing to support growing analytics workloads

AWS Big Data

NOVEMBER 13, 2023

Amazon Redshift is a fully managed data warehousing service that offers both provisioned and serverless options, making it more efficient to run and scale analytics without having to manage your data warehouse. Additionally, data is extracted from vendor APIs, including data related to product, marketing, and customer experience.

Data Warehouse

Data Warehouse Data Lake Analytics Data Science

Data platform trinity: Competitive or complementary?

IBM Big Data Hub

JANUARY 18, 2023

In another decade, the internet and mobile started to generate data of unforeseen volume, variety and velocity. It required a different data platform solution. Hence, the data lake emerged, which handles unstructured and structured data at huge volume. Metadata plays a key role here in discovering the data assets.

Data Lake

Data Lake Data Warehouse Data-driven Metadata

Habib Bank manages data at scale with Cloudera Data Platform

Cloudera

NOVEMBER 17, 2022

The Solution: CDP Private Cloud brings a next-generation hybrid architecture with cloud-native benefits to HBL’s data platform. HBL started their data journey in 2019, when a data lake initiative was started to consolidate complex data sources and enable the bank to use a single version of truth for decision making.

Management

Management Data Lake Consulting Unstructured Data

Building a Beautiful Data Lakehouse

CIO Business Intelligence

MARCH 9, 2022

However, they do contain effective data management, organization, and integrity capabilities. As a result, users can easily find what they need, and organizations avoid the operational and cost burdens of storing unneeded or duplicate data copies. Warehouse, data lake convergence. Meet the data lakehouse.

Data Lake

Data Lake Unstructured Data Data Warehouse Data Quality

Supercharge Your Data Lakehouse with Apache Iceberg in Cloudera Data Platform

Cloudera

JUNE 30, 2022

Today’s general availability announcement covers Iceberg running within key data services in the Cloudera Data Platform (CDP), including Cloudera Data Warehousing (CDW), Cloudera Data Engineering (CDE), and Cloudera Machine Learning (CML).

Data Lake

Data Lake Data Architecture Metadata Data Warehouse

How the Public Sector Can Maximize the Value of Dark Data

Cloudera

JANUARY 30, 2023

By 2025, it’s estimated that the amount of data created, consumed, and stored will reach 180 zettabytes, with up to 90% of that unstructured and nearly all of it unused for decision making. The purpose of this blog isn’t to emphasize the cyber risk of dark data but to spotlight its implications.

IoT

IoT Data Architecture Data Lake Machine Learning
