Diving Deeper into the Data Lake

David Menninger's Analyst Perspectives

A data lake is a centralized repository designed to house big data in structured, semi-structured and unstructured form. Our data lake research has uncovered some points to consider in your efforts, and I’d like to offer a deeper dive into our findings.

Why Your Data Lake Needs Bad Data

David Menninger's Analyst Perspectives

Everyone talks about data quality, as they should. Our research shows that improving the quality of information is the top benefit of data preparation activities. Data quality efforts are focused on clean data. Yes, clean data is important. but so is bad data.


Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Data Lakes Meet Data Warehouses

David Menninger's Analyst Perspectives

In this analyst perspective, Dave Menninger takes a look at data lakes. He explains the term “data lake,” describes common use cases and shares his views on some of the latest market trends.

A Guide to Build your Data Lake in AWS

Analytics Vidhya

ArticleVideo Book This article was published as a part of the Data Science Blogathon. Introduction Data Lake architecture for different use cases – Elegant. The post A Guide to Build your Data Lake in AWS appeared first on Analytics Vidhya.

Ultimate Guide to the Cloud Data Lake Engine

This guide describes how to evaluate cloud data lake engine offerings based on their ability to deliver on their promise of improving performance, data accessibility, and operational efficiency as compared with earlier methods of querying the data lake.

Understanding the Differences Between Data Lakes and Data Warehouses

Smart Data Collective

Data lakes and data warehouses are probably the two most widely used structures for storing data. Data Warehouses and Data Lakes in a Nutshell. The stored data is unprocessed, and the structure is usually applied when it is retrieved.

What are the differences between Data Lake and Data Warehouse?

Analytics Vidhya

Overview Understand the meaning of data lake and data warehouse We will see what are the key differences between Data Warehouse and Data Lake. The post What are the differences between Data Lake and Data Warehouse?

Differentiating Between Data Lakes and Data Warehouses

Smart Data Collective

The market for data warehouses is booming. While there is a lot of discussion about the merits of data warehouses, not enough discussion centers around data lakes. We talked about enterprise data warehouses in the past, so let’s contrast them with data lakes.

Schema Evolution in Data Lakes


Whereas a data warehouse will need rigid data modeling and definitions, a data lake can store different types and shapes of data. In a data lake, the schema of the data can be inferred when it’s read, providing the aforementioned flexibility.

Data Virtualization: The Key to a Successful Data Lakes

Data Virtualization

If you’ve decided to implement a data lake, you might want to keep Gartner’s assessment in mind, which is that about 80% of all data lakes projects will actually fail.

Data Virtualization: The Key to a Successful Data Lake

Data Virtualization

If you’ve decided to implement a data lake, you might want to keep Gartner’s assessment in mind, which is that about 80% of all data lake projects will actually fail.

The Next-Generation Cloud Data Lake: An Open, No-Copy Data Architecture

A next-gen cloud data lake architecture has emerged that brings together the best attributes of the data warehouse and the data lake. This new open data architecture is built to maximize data access with minimal data movement and no data copies.

Data Virtualization: The Key to a Successful Data Lake

Data Virtualization

If you’ve decided to implement a data lake, you might want to keep Gartner’s assessment in mind, which is that about 80% of all data lake projects will actually fail.

The Data Lake is Dead; Long Live the Data Lake!


Martin Wilcox examines the failure of data lakes

Here’s Why Automation For Data Lakes Could Be Important

Smart Data Collective

Data Lakes are among the most complex and sophisticated data storage and processing facilities we have available to us today as human beings. Analytics Magazine notes that data lakes are among the most useful tools that an enterprise may have at its disposal when aiming to compete with competitors via innovation. There were a lot of promises made about Big Data that fell at the feet of data scientists to make happen. Big Data is, well…big.

Data Lakes vs. Data Warehouses


Understand the differences between the two most popular options for storing big data

The Differences Between Data Warehouses and Data Lakes


The amount of data being generated and stored every day has exploded. Companies of all kinds are sitting on stockpiles of data that could someday prove valuable. Instead, businesses are increasingly turning to data lakes to store massive amounts of unstructured data.

12 Considerations When Evaluating Data Lake Engine Vendors for Analytics and BI

Businesses today compete on their ability to turn big data into essential business insights. Modern enterprises leverage cloud data lakes as the platform used to store data. 57% of the enterprises currently using a data lake cite improved business agility as a benefit.

Deriving Value from Data Lakes with AI


Artificial Intelligence and machine learning are the future of every industry, especially data and analytics. Data is growing at a phenomenal rate and that’s not going to stop anytime soon. Once your data is prepared for analysis, the next question is: how else can AI help you?

Rapidminer Platform Supports Entire Data Science Lifecycle

David Menninger's Analyst Perspectives

Rapidminer is a visual enterprise data science platform that includes data extraction, data mining, deep learning, artificial intelligence and machine learning (AI/ML) and predictive analytics.

7 Key Benefits of Proper Data Lake Ingestion

Smart Data Collective

It’s impossible to deny the importance of data in several industries, but that data can get overwhelming if it isn’t properly managed. The reality is businesses that are collecting data will likely be doing so on several levels. Covering Data Types.

Navigating Data Entities, BYOD, and Data Lakes in Microsoft Dynamics

Jet Global

Consultants and developers familiar with the AX data model could query the database using any number of different tools, including a myriad of different report writers. Data Entities. Currently, over 1,700 data entities are available and counting. The Data Warehouse Approach.

The Unexpected Cost of Data Copies

This paper will discuss why organizations frequently end up with multiple data copies and how a secure "no-copy" data strategy enabled by the Dremio data lake service can help reduce complexity, boost efficiency, and dramatically reduce costs.

Unlocking the Potential of Machine Learning in a Data Lake

Data Virtualization

With data becoming the brain food to the intelligence of every organization, regardless of size or sector, it has become crucial to harness this data to achieve the best results, make the most informed decisions and improve productivity. Technology artificial intelligence big data Data integration Data Lake data virtualization Logical Data Lake Machine learning

What's the difference between data lakes and data warehouses?

IBM Big Data Hub

If you’ve heard the debate among IT professionals about data lakes versus data warehouses, you might be wondering which is better for your organization. You might even be wondering how these two approaches are different at all

Data Lakes on Cloud & it’s Usage in Healthcare


Data lakes are centralized repositories that can store all structured and unstructured data at any desired scale. Data can be stored as-is, without first structuring it, and different types of analytics can be run on it, from dashboards and visualizations to big data processing, real-time analytics, and machine learning to improve decision making. The power of the data lake lies in the fact that it often is a cost-effective way to store data.

Complexity Drives Costs: A Look Inside BYOD and Azure Data Lakes

Jet Global

The Data Security Problem: How We Got Here. You can extract data from relational databases, including Microsoft’s SQL Server using the SQL query language. In addition to reading data, however, you can also use the SQL language to insert, update, or delete records from a database.

Checklist Report: Preparing for the Next-Generation Cloud Data Architecture

Data architectures have evolved dramatically. It is time to reconsider the fundamental ways that information is accumulated, managed, and then provisioned to the different downstream data consumers.

Driving Business Value and ROI from a Hybrid Cloud Data Lake


For many enterprises, a hybrid cloud data lake is no longer a trend, but becoming reality. With an on-premise deployment, enterprises have full control over data security, data access, and data governance. Data that needs to be tightly controlled (e.g.

Data Lakes: What Are They and Who Needs Them?

Jet Global

The sheer scale of data being captured by the modern enterprise has necessitated a monumental shift in how that data is stored. From the humble database through to data warehouses , data stores have grown both in scale and complexity to keep pace with the businesses they serve, and the data analysis now required to remain competitive. What’s in a Data Lake? Data warehouses do a great job of standardizing data from disparate sources for analysis.

Data Lake Vs. Big Data Warehouse: Why You Don’t Have To Choose


Learn about the difference between a data lake and a big data warehouse, and define how to structure your big data solution in accordance with your business needs

Informatica Earns Data Digital Innovation Award for 2021

David Menninger's Analyst Perspectives

Analytics Business Intelligence Collaboration Data Governance Information Management Data Digital Technology blockchain data lakes AI & Machine Learning

Top Considerations for Building an Open Cloud Data Lake

In this paper, we explore the top considerations for building a cloud data lake including architectural principles, when to use cloud data lake engines and how to empower non-technical users.

Reality and misconceptions about big data analytics, data lakes and the future of AI

IBM Big Data Hub

With the amount of choices surrounding big data analytics, data lakes and AI, it can sometimes be difficult to tell fact from fiction.

Alternative approaches to implementing your data lake


ScienceSoft answers burning questions about big data lake design and implementation. We look at different approaches to its architecture and contemplate if there exists a preferred technology among the available stack

Providing transactional data to your Hadoop and Kafka data lake

IBM Big Data Hub

The data lake may be all about Apache Hadoop, but integrating operational data can be a challenge. Learn how to deliver real-time feeds of transactional data from mainframes and distributed environments directly into Hadoop clusters and make constantly changing data more available

Data Cataloging in the Data Lake: Alation + Kylo


We are living in a new era of data defined by two massively disruptive trends – one architectural and the other organizational. Architecturally the introduction of Hadoop, a file system designed to store massive amounts of data, radically affected the cost model of data.

Data Analytics in the Cloud for Developers and Founders

Speaker: Javier Ramírez, Senior AWS Developer Advocate, AWS

You have lots of data, and you are probably thinking of using the cloud to analyze it. But how will you move data into the cloud? In which format? How will you validate and prepare the data? What about streaming data? Can data scientists discover and use the data? Can business people create reports via drag and drop? Can operations monitor what’s going on? Will the data lake scale when you have twice as much data? Is your data secure? In this session, we address common pitfalls of building data lakes and show how AWS can help you manage data and analytics more efficiently.

Incorta Streamlines Analytics with Direct Data Access

David Menninger's Analyst Perspectives

The amount of data flowing into organizations is growing exponentially, creating a need to process more data more quickly than ever before. business intelligence embedded analytics Analytics Collaboration Data Governance Data Preparation Data Information Management (IM) data lakes

Cloudera Consolidates Its Data Platform

David Menninger's Analyst Perspectives

Organizations are dealing with exponentially increasing data that ranges broadly from customer-generated information, financial transactions, edge-generated data and even operational IT server logs.

IT 161

Data Management Requirements for the Enterprise Data Lake

In(tegrate) the Clouds

SnapLogic published Eight Data Management Requirements for the Enterprise Data Lake. They are: Storage and Data Formats. The company also recently hosted a webinar on Democratizing the Data Lake with Constellation Research and published 2 whitepapers from Mark Madsen. big data integration data lake hadoop snaplogicIngest and Delivery. Discovery and Preparation. Transformation and Analytics. Streaming. Scheduling and Workflow.

Get out of the data swamp with a governed data lake

IBM Big Data Hub

Making your data lake a “governed data lake” is the game changer. Without governance, organizations risk securing the data and as well as protecting it. When data is cataloged and governed, an organization can effectively discover, classify, track history and lineage, quality of data and thereby use it with trust and confidence. A governed data lake contains data that’s accessible, clean, trusted and protected.

Introducing Precisely for Data Integrity

David Menninger's Analyst Perspectives

Data is becoming more valuable and more important to organizations. At the same time, organizations have become more disciplined about the data on which they rely to ensure it is robust, accurate and governed properly.