Data Architecture, Data Lake and Strategy

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure the data and then run different types of analytics for better business insights. They are the same.

Data Lake

Data Lake Metadata Snapshot Recreation/Entertainment

Data architecture strategy for data quality

IBM Big Data Hub

JANUARY 5, 2023

Several factors determine the quality of your enterprise data like accuracy, completeness, consistency, to name a few. But there’s another factor of data quality that doesn’t get the recognition it deserves: your data architecture. How the right data architecture improves data quality.

Data Quality

Data Quality Data Architecture Strategy Data Lake

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake

Data Lake Sales Data Warehouse Snapshot

Webinars

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Manufacturing Sustainability Surge: Your Guide to Data-Driven Energy Optimization & Decarbonization

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Understanding User Needs and Satisfying Them

MORE WEBINARS

Architecture for the Data Lake

TDAN

JANUARY 3, 2023

For a while now, vendors have been advocating that people put their data in a data lake when they put their data in the cloud. The Data Lake The idea is that you put your data into a data lake. Then, at a later point in time, the end user analyst can come along and […].

Data Lake

Data Lake Data Architecture Data Warehouse Data Strategy

The Unexpected Cost of Data Copies

Unfortunately, data replication, transformation, and movement can result in longer time to insight, reduced efficiency, elevated costs, and increased security and compliance risk. Read this whitepaper to learn: Why organizations frequently end up with unnecessary data copies.

Data Lake

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

AWS Big Data

APRIL 24, 2023

Building a data lake on Amazon Simple Storage Service (Amazon S3) provides numerous benefits for an organization. However, many use cases, like performing change data capture (CDC) from an upstream relational database to an Amazon S3-based data lake, require handling data at a record level.

Data Lake

Data Lake Data Governance Cost-Benefit Machine Learning

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

The data architect also “provides a standard common business vocabulary, expresses strategic requirements, outlines high-level integrated designs to meet those requirements, and aligns with enterprise strategy and related business architecture,” according to DAMA International’s Data Management Body of Knowledge.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

What you don’t know about data management could kill your business

CIO Business Intelligence

NOVEMBER 28, 2023

Still, to truly create lasting value with data, organizations must develop data management mastery. This means excelling in the under-the-radar disciplines of data architecture and data governance. Contributing to the general lack of data about data is complexity. Seven individuals raised their hands.

Management

Management Data Architecture Data Lake Data Strategy

Deploy and Optimize Your Snowflake Environment Faster With Accelerators

CDW Research Hub

JULY 18, 2022

While many organizations understand the business need for a data and analytics cloud platform , few can quickly modernize their legacy data warehouse due to a lack of skills, resources, and data literacy. One modern data platform solution that provides simplicity and flexibility to grow is Snowflake’s data cloud and platform.

Optimization

Optimization Data Lake Data Warehouse Manufacturing

Detect, mask, and redact PII data using AWS Glue before loading into Amazon OpenSearch Service

AWS Big Data

JANUARY 12, 2024

Ingestion: Data lake batch, micro-batch, and streaming Many organizations land their source data into their data lake in various ways, including batch, micro-batch, and streaming jobs. Amazon AppFlow can be used to transfer data from different SaaS applications to a data lake.

Data Lake

Data Lake Cost-Benefit Visualization Structured Data

AWS Lake Formation 2022 year in review

AWS Big Data

JANUARY 31, 2023

We have collected some of the key talks and solutions on data governance, data mesh, and modern data architecture published and presented in AWS re:Invent 2022, and a few data lake solutions built by customers and AWS Partners for easy reference. Starting with Amazon EMR release 6.7.0,

Data Lake

Data Lake Data Governance Data Architecture Machine Learning

Data Architecture and Strategy in the AI Era

Cloudera

MARCH 28, 2024

But, even with the backdrop of an AI-dominated future, many organizations still find themselves struggling with everything from managing data volumes and complexity to security concerns to rapidly proliferating data silos and governance challenges.

Data Architecture

Data Architecture Strategy Data Lake Data-driven

CIO Ryan Snyder on the benefits of interpreting data as a layer cake

CIO Business Intelligence

AUGUST 2, 2023

A data and analytics capability cannot emerge from an IT or business strategy alone. With both technology and business organization deeply involved in the what, why, and how of data, companies need to create cross-functional data teams to get the most out of it. That strategy is doomed to fail. What are the layers?

Manufacturing

Manufacturing Data Architecture Strategy Data Strategy

A comparative assessment of digital transformation in Italy

CIO Business Intelligence

APRIL 24, 2024

In fact, AMA collects a huge amount of structured and unstructured data from bins, collection vehicles, facilities, and user reports, and until now, this data has remained disconnected, managed by disparate systems and interfaces, through Excel spreadsheets.

Digital Transformation

Digital Transformation Business Intelligence Unstructured Data Data Lake

Building a vision for real-time artificial intelligence

CIO Business Intelligence

APRIL 12, 2023

After walking his executive team through the data hops, flows, integrations, and processing across different ingestion software, databases, and analytical platforms, they were shocked by the complexity of their current data architecture and technology stack. How is data, process, and model drift managed for reliability?

Machine Learning

Machine Learning Cost-Benefit Data-driven Strategy

Data Strategies for Getting Greater Business Value from Distributed Data

Data Virtualization

MAY 19, 2023

Reading Time: 11 minutes The post Data Strategies for Getting Greater Business Value from Distributed Data appeared first on Data Management Blog - Data Integration and Modern Data Management Articles, Analysis and Information.

Data Strategy

Data Strategy Strategy Data Integration Management

Databricks’ new data lakehouse aims at media, entertainment sector

CIO Business Intelligence

APRIL 25, 2022

The other 10% represents the effort of initial deployment, data-loading, configuration and the setup of administrative tasks and analysis that is specific to the customer, the Henschen said. Partner solutions to boost functionality, adoption. The company’s collaboration with Lovelytics is focused on baseball.

Recreation/Entertainment

Recreation/Entertainment Data Lake Data Warehouse Unstructured Data

Why the Data Journey Manifesto?

DataKitchen

JUNE 12, 2023

We had been talking about “Agile Analytic Operations,” “DevOps for Data Teams,” and “Lean Manufacturing For Data,” but the concept was hard to get across and communicate. I spent much time de-categorizing DataOps: we are not discussing ETL, Data Lake, or Data Science.

Testing

Testing Data Lake Dashboards Data Science

Achieving Trusted AI in Manufacturing

Cloudera

JANUARY 30, 2024

While AI stands to drive smart intelligent factories, optimize production processes, enable predictive maintenance and pattern analysis, personalization, sentiment analysis, knowledge management, as well as detect abnormalities, and many other use cases, without a robust data management strategy, the road to effective AI is an uphill battle.

Manufacturing

Manufacturing Contextual Data IoT Digital Transformation

Data science vs data analytics: Unpacking the differences

IBM Big Data Hub

SEPTEMBER 19, 2023

To pursue a data science career, you need a deep understanding and expansive knowledge of machine learning and AI. Having the right data strategy and data architecture is especially important for an organization that plans to use automation and AI for its data analytics.

Data Science

Data Science Data Analytics Prescriptive Analytics Analytics

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

AWS Big Data

SEPTEMBER 13, 2023

The Analytics specialty practice of AWS Professional Services (AWS ProServe) helps customers across the globe with modern data architecture implementations on the AWS Cloud. Of those tables, some are larger (such as in terms of record volume) than others, and some are updated more frequently than others.

Data Lake

Data Lake Data Processing Metadata Snapshot

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

AWS Big Data

AUGUST 31, 2023

Amazon Redshift enables you to directly access data stored in Amazon Simple Storage Service (Amazon S3) using SQL queries and join data across your data warehouse and data lake. With Amazon Redshift, you can query the data in your S3 data lake using a central AWS Glue metastore from your Redshift data warehouse.

Data Lake

Data Lake Data Warehouse Metadata Data Architecture

Belcorp reimagines R&D with AI

CIO Business Intelligence

JUNE 28, 2023

The initial stage involved establishing the data architecture, which provided the ability to handle the data more effectively and systematically. “We Follow a value-focused strategy. This allowed us to derive insights more easily.” These transitions are intricate processes and mistakes are inevitable.

Digital Transformation

Digital Transformation Cost-Benefit Informatics Data mining

5 misconceptions about cloud data warehouses

IBM Big Data Hub

FEBRUARY 2, 2023

This approach has several benefits, such as streamlined migration of data from on-premises to the cloud, reduced query tuning requirements and continuity in SRE tooling, automations, and personnel. This enabled data-driven analytics at scale across the organization 4.

Data Warehouse

Data Warehouse Cost-Benefit Unstructured Data Data Architecture

DataOps For Business Analytics Teams

DataKitchen

JANUARY 3, 2022

It may take six weeks to add a new schema, but the VP may say she needs it for this Friday’s strategy summit. If the IT or data engineering team can’t respond with an enabling data platform in the required time frame, the business analyst does the necessary data work themselves. DataOps Process Hub.

Business Analytics

Business Analytics Analytics Testing Dashboards

What is Data Mesh?

Ontotext

NOVEMBER 16, 2023

Data mesh is still in its infancy, and data personas and organizations are craving clarity and specificity. It is critical to be aware of the “why” and “what” and fully understand the role that knowledge graphs play when considering adopting a data mesh strategy. The debate on what constitutes a data mesh rages on.

Metadata

Metadata Data-driven Data Quality Data Architecture

The hidden history of Db2

IBM Big Data Hub

JULY 5, 2022

In today’s world of complex data architectures and emerging technologies, databases can sometimes be undervalued and unrecognized. IBM has been the pioneer in paving the way for data management technologies and advancements for decades, from the first commercial database to quantum computing systems.

Data Lake

Data Lake Data Warehouse Publishing Structured Data

Implement slowly changing dimensions in a data lake using AWS Glue and Delta

AWS Big Data

MARCH 28, 2023

As organizations across the globe are modernizing their data platforms with data lakes on Amazon Simple Storage Service (Amazon S3), handling SCDs in data lakes can be challenging.

Data Lake

Data Lake Testing Snapshot Sales

Data Warehouse Teams Adapt to Be Data Driven

TDAN

JUNE 16, 2020

When companies embark on a journey of becoming data-driven, usually, this goes hand in and with using new technologies and concepts such as AI and data lakes or Hadoop and IoT. Suddenly, the data warehouse team and their software are not the only ones anymore that turn data […].

Data Warehouse

Data Warehouse Data-driven Data Lake IoT

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Optimization Statistics

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

AWS Big Data

JULY 21, 2023

This leads to having data across many instances of data warehouses and data lakes using a modern data architecture in separate AWS accounts. We recently announced the integration of Amazon Redshift data sharing with AWS Lake Formation.

Data Lake

Data Lake Data Warehouse Marketing Management

The Advantages Of Live Data-Streaming In The Competitive Financial Services Sector (Part I)

Cloudera

AUGUST 21, 2020

To start off, what are the advantages of a forward-looking data-in-motion strategy? Data-in-motion is predominantly about streaming data so enterprises typically have two different ways or binary ways of looking at data. Hi Dinesh, thank you for joining us for today’s Q&A.

Enterprise

Enterprise Data Lake Strategy Metadata

Modern Data Architecture for Telecommunications

Cloudera

SEPTEMBER 6, 2022

Data has continued to grow both in scale and in importance through this period, and today telecommunications companies are increasingly seeing data architecture as an independent organizational challenge, not merely an item on an IT checklist. Previously, there were three types of data structures in telco: .

Data Architecture

Data Architecture Cost-Benefit Digital Transformation Business Driver

Visualize data quality scores and metrics generated by AWS Glue Data Quality

AWS Big Data

JUNE 6, 2023

He is passionate about helping customers build modern data architecture on the AWS Cloud. He has helped customers of all sizes implement data management, data warehouse, and data lake solutions. Avik Bhattacharjee is a Senior Partner Solutions Architect at AWS.

Data Quality

Data Quality Metrics Visualization Dashboards

Build a multi-Region and highly resilient modern data architecture using AWS Glue and AWS Lake Formation

AWS Big Data

JANUARY 24, 2023

This solution only replicates metadata in the Data Catalog, not the actual underlying data. To have a redundant data lake using Lake Formation and AWS Glue in an additional Region, we recommend replicating the Amazon S3-based storage using S3 replication , S3 sync, aws-s3-copy-sync-using-batch or S3 Batch replication process.

Data Architecture

Data Architecture Metadata Data Lake Snapshot

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

AWS Big Data

JANUARY 8, 2024

Kinesis Data Streams has native integrations with other AWS services such as AWS Glue and Amazon EventBridge to build real-time streaming applications on AWS. Refer to Amazon Kinesis Data Streams integrations for additional details.

Analytics

Analytics IoT Data-driven Snapshot

This Structure has Novel Features which are of Considerable Business Interest

Peter James Thomas

APRIL 3, 2020

I have been very much focussing on the start of a data journey in a series of recent articles about Data Strategy [3]. When I discussed these same figures with my sales team earlier, they came up with what I think is a sound strategy to counterpunch. Introduction.

Dashboards

Dashboards Reporting Sales Data Lake

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.

Data Lake

Data Lake Analytics Snapshot Optimization

The year’s top 10 enterprise AI trends — so far

CIO Business Intelligence

SEPTEMBER 21, 2023

Generative AI touches every aspect of the enterprise, and every aspect of society,” says Bret Greenstein, partner and leader of the gen AI go-to-market strategy at PricewaterhouseCoopers. Data warehouses then evolved into data lakes, and then data fabrics and other enterprise-wide data architectures.

Enterprise

Enterprise Consulting Modeling Cost-Benefit

Data Management Challenges for the Modern Enterprise

Data Virtualization

MARCH 3, 2021

Data is the fuel of the digital economy, so data-centric organizations have a distinct advantage. To remain competitive, organizations must have a data management strategy in place to effectively ingest, store, organize, and analyze data while ensuring that it is.

Enterprise

Enterprise Management Strategy Data Lake

Data Management Challenges for the Modern Enterprise

Data Virtualization

MARCH 3, 2021

Data is the fuel of the digital economy, so data-centric organizations have a distinct advantage. To remain competitive, organizations must have a data management strategy in place to effectively ingest, store, organize, and analyze data while ensuring that it is.

Management

Management Enterprise Strategy IT

Data Management Challenges for the Modern Enterprise

Data Virtualization

MARCH 3, 2021

Data is the fuel of the digital economy, so data-centric organizations have a distinct advantage. To remain competitive, organizations must have a data management strategy in place to effectively ingest, store, organize, and analyze data while ensuring that it is.

Management

Management Enterprise Strategy IT

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

A Gartner Marketing survey found only 14% of organizations have successfully implemented a C360 solution, due to lack of consensus on what a 360-degree view means, challenges with data quality, and lack of cross-functional governance structure for customer data.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

Migrate an existing data lake to a transactional data lake using Apache Iceberg

Data architecture strategy for data quality

Webinars

Trending Sources

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Webinars

Architecture for the Data Lake

The Unexpected Cost of Data Copies

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

What is a data architect? Skills, salaries, and how to become a data framework master

What you don’t know about data management could kill your business

Deploy and Optimize Your Snowflake Environment Faster With Accelerators

Detect, mask, and redact PII data using AWS Glue before loading into Amazon OpenSearch Service

AWS Lake Formation 2022 year in review

Data Architecture and Strategy in the AI Era

CIO Ryan Snyder on the benefits of interpreting data as a layer cake

A comparative assessment of digital transformation in Italy

Building a vision for real-time artificial intelligence

Data Strategies for Getting Greater Business Value from Distributed Data

Databricks’ new data lakehouse aims at media, entertainment sector

Why the Data Journey Manifesto?

Achieving Trusted AI in Manufacturing

Data science vs data analytics: Unpacking the differences

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Query your Iceberg tables in data lake using Amazon Redshift (Preview)

Belcorp reimagines R&D with AI

5 misconceptions about cloud data warehouses

DataOps For Business Analytics Teams

What is Data Mesh?

The hidden history of Db2

Implement slowly changing dimensions in a data lake using AWS Glue and Delta

Data Warehouse Teams Adapt to Be Data Driven

Choosing an open table format for your transactional data lake on AWS

Implement tag-based access control for your data lake and Amazon Redshift data sharing with AWS Lake Formation

The Advantages Of Live Data-Streaming In The Competitive Financial Services Sector (Part I)

Modern Data Architecture for Telecommunications

Visualize data quality scores and metrics generated by AWS Glue Data Quality

Build a multi-Region and highly resilient modern data architecture using AWS Glue and AWS Lake Formation

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

This Structure has Novel Features which are of Considerable Business Interest

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

The year’s top 10 enterprise AI trends — so far

Data Management Challenges for the Modern Enterprise

Data Management Challenges for the Modern Enterprise

Data Management Challenges for the Modern Enterprise

Create an end-to-end data strategy for Customer 360 on AWS

Stay Connected