A Detailed Introduction on Data Lakes and Delta Lakes

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: A data lake is a central repository that lets us store all of our structured and unstructured data at scale.


What is a Data Pipeline?

Jet Global

The key components of a data pipeline are typically: Data Sources: The origin of the data, such as a relational database, data warehouse, data lake, file, API, or other data store. This can include tasks such as data ingestion, cleansing, filtering, aggregation, or standardization.

Celebrating Data Superheroes: The 2021 Data Impact Awards Winners

Cloudera

By adopting a custom-developed application based on the Cloudera ecosystem, Carrefour has combined its legacy systems into one platform that provides access to customer data in a single data lake. EVA unifies data from MTN’s different operator systems, creating a 360° view of subscribers.