Data Lakes Meet Data Warehouses

David Menninger's Analyst Perspectives

In this analyst perspective, Dave Menninger takes a look at data lakes. He explains the term “data lake,” describes common use cases and shares his views on some of the latest market trends.

Schema Evolution in Data Lakes


Whereas a data warehouse will need rigid data modeling and definitions, a data lake can store different types and shapes of data. In a data lake, the schema of the data can be inferred when it’s read, providing the aforementioned flexibility.
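
To illustrate the schema-on-read idea in the excerpt above, here is a minimal Python sketch. It assumes the lake holds newline-delimited JSON records; the helper name infer_schema and the sample records are hypothetical and not taken from the article. Nothing is declared up front: the field names and types are discovered only when the data is read.

```python
import json


def infer_schema(records):
    """Map each field name to the set of Python type names seen while reading."""
    schema = {}
    for line in records:
        row = json.loads(line)  # parse one raw record at read time
        for field, value in row.items():
            schema.setdefault(field, set()).add(type(value).__name__)
    return schema


# Hypothetical raw records with differing shapes, stored as-is in the lake.
raw_lines = [
    '{"id": 1, "name": "sensor-a"}',
    '{"id": 2, "name": "sensor-b", "temp_c": 21.5}',
    '{"id": "3", "tags": ["x", "y"]}',
]

schema = infer_schema(raw_lines)
for field in sorted(schema):
    print(field, sorted(schema[field]))
```

A field that shows up with more than one type (here, id) is the kind of drift a warehouse's fixed schema would reject at load time, whereas a lake leaves it for the reader to resolve.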

The Data Lake is Dead; Long Live the Data Lake!


Martin Wilcox examines the failure of data lakes

Deriving Value from Data Lakes with AI


Artificial Intelligence and machine learning are the future of every industry, especially data and analytics. Data is growing at a phenomenal rate and that’s not going to stop anytime soon. Once your data is prepared for analysis, the next question is: how else can AI help you?

Data Analytics in the Cloud for Developers and Founders

Speaker: Javier Ramírez, Senior AWS Developer Advocate, AWS

You have lots of data, and you are probably thinking of using the cloud to analyze it. But how will you move data into the cloud? In which format? How will you validate and prepare the data? What about streaming data? Can data scientists discover and use the data? Can business people create reports via drag and drop? Can operations monitor what’s going on? Will the data lake scale when you have twice as much data? Is your data secure? In this session, we address common pitfalls of building data lakes and show how AWS can help you manage data and analytics more efficiently.

Here’s Why Automation For Data Lakes Could Be Important

Smart Data Collective

Data lakes are among the most complex and sophisticated data storage and processing facilities available today. Analytics Magazine notes that data lakes are among the most useful tools an enterprise has at its disposal when it aims to out-innovate competitors. Many promises were made about Big Data, and it fell to data scientists to deliver on them. Big Data is, well… big.

7 Key Benefits of Proper Data Lake Ingestion

Smart Data Collective

It’s impossible to deny the importance of data in several industries, but that data can get overwhelming if it isn’t properly managed. The reality is that businesses collecting data will likely be doing so on several levels.

Data Lakes on Cloud & Its Usage in Healthcare


Data lakes are centralized repositories that can store all structured and unstructured data at any desired scale. The power of the data lake lies in the fact that it is often a cost-effective way to store data.

Data Lake Consolidation – the Aggregator Analogy

Perficient Data & Analytics

In my last post I introduced the concept of the Data Lake as a Consolidator and the critical success factor of applying robust Information Governance to this environment. So, a Data Lake as Consolidator.

Data Lakes: What Are They and Who Needs Them?

Jet Global

The sheer scale of data being captured by the modern enterprise has necessitated a monumental shift in how that data is stored. To address the flood of data and the needs of enterprise businesses to store, sort, and analyze that data, a new storage solution has evolved: the data lake.

Working with the Data Lake Aggregator – Standards and Templates

Perficient Data & Analytics

In my previous blog, I described the concept of an “Information Catalog” and how it plays a vital role in ensuring communication between the Data Lake Aggregator and Suppliers and Consumers is efficient and effective due to the common language that it provides.

Hadoop and data lakes require further examination


After the hype comes disillusionment and the growing realization that Hadoop and data lakes do not provide the answer for all analytic tasks.

Data Lake Participants – Roles and Responsibilities

Perficient Data & Analytics

As you may recall, last time I introduced the analogy of the Aggregator to describe utilizing a Data Lake as a Consolidator of information, and I mentioned the three key roles in this model: the Supplier, the Aggregator and the Consumer. In this post I will provide a little more detail on the responsibilities possessed by each of these roles that, when carried out diligently, provide an effective environment for obtaining significant value from the Lake.

Data Lake Vs. Big Data Warehouse: Why You Don’t Have To Choose


Learn about the difference between a data lake and a big data warehouse, and define how to structure your big data solution in accordance with your business needs

Power BI + Azure Data Lake = Velocity & Scale to your Analytics

Perficient Data & Analytics

Context – Bring data together from various web, cloud and on-premise data sources and rapidly drive insights. The biggest challenge Business Analysts and BI developers have is the need to ingest and process medium to large data sets on a regular basis.

Data Lake as Aggregator – The Critical Role of the Catalog

Perficient Data & Analytics

My previous blog described a Data Lake using a Supplier-Aggregator-Consumer analogy and discussed the roles each of these parties plays.

Unlocking the Potential of Machine Learning in a Data Lake

Data Virtualization

With data becoming the brain food to the intelligence of every organization, regardless of size or sector, it has become crucial to harness this data to achieve the best results, make the most informed decisions and improve productivity.

Alternative approaches to implementing your data lake


ScienceSoft answers burning questions about big data lake design and implementation. We look at different approaches to its architecture and contemplate if there exists a preferred technology among the available stack

Data Lake and Information Governance – The Key Takeaways

Perficient Data & Analytics

A Data Lake can be a highly valuable asset to any enterprise, and there is a myriad of technology solutions available for leveraging the processes to feed, maintain and retrieve information from the Lake. This is the primary Takeaway to keep in mind when a Data Lake solution is being considered – or is already in place but needing improvement – by any organization. So, this completes my journey into Data Lakes and the Information Governance needed.

What's the difference between data lakes and data warehouses?

IBM Big Data Hub

If you’ve heard the debate among IT professionals about data lakes versus data warehouses, you might be wondering which is better for your organization. You might even be wondering how these two approaches are different at all

Data Lakes and the Information Governance Critical Success Factor

Perficient Data & Analytics

Since my last post I’ve been working for a client that is actively engaged in establishing a Data Lake for the purpose of supporting their analytics efforts, but also looking to “re-architect” the way their systems collaborate by using this Data Lake environment to control and consolidate all information-sharing interactions within their environment.

Data Lakes, Not Just For Analytics Anymore

Perficient Data & Analytics

Data Lakes have been around since the early part of this decade, and most Fortune 500 companies have a Data Lake or are building one. The push to lake data has predominantly been driven by analytical use cases where Data Scientists can wrangle and prepare data for their study or model building. In my next blog post, I will investigate these challenges that companies are facing as Big Data becomes operational.

Five Modern Data Architecture Trends

David Menninger's Analyst Perspectives

I was recently asked to identify key modern data architecture trends. Data architectures have changed significantly to accommodate larger volumes of data as well as new types of data such as streaming and unstructured data.

Providing transactional data to your Hadoop and Kafka data lake

IBM Big Data Hub

The data lake may be all about Apache Hadoop, but integrating operational data can be a challenge.

Test principles – Data Warehouse vs Data Lake vs Data Vault

Perficient Data & Analytics

Understand Data Warehouse, Data Lake and Data Vault and their specific test principles. This blog sheds light on the terms data warehouse, data lake and data vault. Let us begin with the data warehouse.

Reality and misconceptions about big data analytics, data lakes and the future of AI

IBM Big Data Hub

With the number of choices surrounding big data analytics, data lakes and AI, it can sometimes be difficult to tell fact from fiction.

Data Management on Display at Informatica World 2019

David Menninger's Analyst Perspectives

Under that focus, Informatica's conference emphasized capabilities across six areas (all strong areas for Informatica): data integration, data management, data quality & governance, Master Data Management (MDM), data cataloging, and data security.

Data Management Requirements for the Enterprise Data Lake

In(tegrate) the Clouds

SnapLogic published Eight Data Management Requirements for the Enterprise Data Lake. They include: Storage and Data Formats; Ingest and Delivery; Discovery and Preparation; Transformation and Analytics; Streaming; Scheduling and Workflow. The company also recently hosted a webinar on Democratizing the Data Lake with Constellation Research and published 2 whitepapers from Mark Madsen.

Data in 2020: Ventana Research Agenda

David Menninger's Analyst Perspectives

Ventana Research recently announced its 2020 research agenda for data, continuing the guidance we’ve offered for nearly two decades to help organizations derive optimal value and improve business outcomes. Data volumes continue to grow while data latency requirements continue to shrink.

Get out of the data swamp with a governed data lake

IBM Big Data Hub

Making your data lake a “governed data lake” is the game changer. Without governance, organizations put the security and protection of their data at risk. A governed data lake contains data that’s accessible, clean, trusted and protected.

3 principles for climbing the AI ladder with IBM Governed Data Lake

IBM Big Data Hub

Recently, we capped off the first leg of the “Enabling digital business with an IBM governed data lake” road shows in the Asia Pacific region with our customers and partners

Feed your data lake with change data capture for real-time integration and analytics

IBM Big Data Hub

His business units had a presence in 180 countries worldwide with geographically dispersed data warehouses and business intelligence applications in various locations.

The business value of a governed data lake

IBM Big Data Hub

Imagine a searchable data management system that would enable you to review crowdsourced, categorized and classified data. Consider that this system would apply to all types of data — structured and unstructured — and become more robust as more users analyze it

Big Data for Business: A Requirement for Today’s Business Analytics

David Menninger's Analyst Perspectives

Organizations now must store, process and use data of significantly greater volume and variety than in the past.

Cloudera announces support for Azure’s next-generation Data Lake Store


The Cloudera platform delivers a one-stop shop that allows you to store any kind of data, process and analyze it in many different ways in a single environment, and integrate with the rest of your data infrastructure. Before they can fully realize the benefits of the cloud, they have had to adjust to new data models and new processes. Eventual consistency and other pitfalls can be a nightmare for engineers trying to migrate complex big data infrastructure to the cloud.

The Internet of Things: Real-Time Data and Analytics Enable Business Innovation

David Menninger's Analyst Perspectives

This innovation means that virtually any appropriately designed device can generate and transmit data about its operations, which can facilitate monitoring and a range of automatic functions.

AmFam's Data Journey From Legacy To Cloud: Teaching People To Fish In The Data Lake

Bruno Aziza

AmFam’s journey from a data-rich company to a data-driven company

Deliver on the Promise of the Cloud with Alation and Databricks


The post Deliver on the Promise of the Cloud with Alation and Databricks appeared first on Alation.

Why is a data catalog essential to making your data lakes successful?

IBM Big Data Hub

However, all industries depend on data to be successful, and this impacts the way enterprises plan and execute their operations. All industries, from healthcare to retail to banking, are digitally transforming themselves every day to become more agile and stay competitive.

IDG Contributor Network: How to overcome the bottlenecks between data lakes and analytics for customer engagement

CIO Business Intelligence

Many organizations in a variety of industries struggle to access the customer data they need to provide personalized and contextual experiences across all touchpoints. Recently, data lakes have been touted as the best way to manage the variety of collected customer data, with many big data and analytics solutions focused on a self-service approach to leveraging the value of the data lake.

News and Announcements from Tableau and TC18

David Menninger's Analyst Perspectives

Once again I attended Tableau's Users Conference, along with 17,000 other attendees, affectionately self-referred to as "data nerds".

2020 Cloud Trends: Ideas That Could Influence Your Critical Data Management Decisions

Data Virtualization
