2012, Data Lake and Data Warehouse

2012

Data Lake

Data Warehouse

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

Join 52,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

The Product Manager’s Guide to Optimizing DX for Systemic Impact

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

How To Get Promoted In Product Management

MORE WEBINARS

Trending Sources

Accelerate analytics on Amazon OpenSearch Service with AWS Glue through its native connector

AWS Big Data

DECEMBER 21, 2023

As the volume and complexity of analytics workloads continue to grow, customers are looking for more efficient and cost-effective ways to ingest and analyse data. OpenSearch Service is used for multiple purposes, such as observability, search analytics, consolidation, cost savings, compliance, and integration.

Analytics

Analytics IT Data Lake Visualization

Webinars

The Product Manager’s Guide to Optimizing DX for Systemic Impact

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

How To Get Promoted In Product Management

MORE WEBINARS

Convergent Evolution

Peter James Thomas

AUGUST 18, 2018

That was the Science, here comes the Technology… A Brief Hydrology of Data Lakes. Overlapping with the above, from around 2012, I began to get involved in also designing and implementing Big Data Architectures; initially for narrow purposes and later Data Lakes spanning entire enterprises. In Closing.

Data Lake

Data Lake Data Warehouse Data mining Statistics

Break data silos and stream your CDC data with Amazon Redshift streaming and Amazon MSK

AWS Big Data

DECEMBER 13, 2023

A CDC-based approach captures the data changes and makes them available in data warehouses for further analytics in real-time. usually a data warehouse) needs to reflect those changes in near real-time. This post showcases how to use streaming ingestion to bring data to Amazon Redshift.

Data Warehouse

Data Warehouse Snapshot Data Processing Management

Simplify and speed up Apache Spark applications on Amazon Redshift data with Amazon Redshift integration for Apache Spark

AWS Big Data

APRIL 20, 2023

Apache Spark enables you to build applications in a variety of languages, such as Java, Scala, and Python, by accessing the data in your Amazon Redshift data warehouse. Amazon Redshift integration for Apache Spark helps developers seamlessly build and run Apache Spark applications on Amazon Redshift data.

Data Lake

Data Lake Data Warehouse Sales Data-driven

How SumUp made digital analytics more accessible using AWS Glue

AWS Big Data

JUNE 6, 2023

Founded in 2012, SumUp is the financial partner for more than 4 million small merchants in over 35 markets worldwide, helping them start, run and grow their business. Unless, of course, the rest of their data also resides in the Google Cloud. The Data Science teams also use this data for churn prediction and CLTV modeling.

Analytics

Analytics Data Lake Testing Optimization

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

AWS Big Data

APRIL 19, 2023

Many customers run big data workloads such as extract, transform, and load (ETL) on Apache Hive to create a data warehouse on Hadoop. He is passionate about big data and data analytics. Sandeep Singh is a Lead Consultant at AWS ProServe, focused on analytics, data lake architecture, and implementation.

Metadata

Metadata Testing Data Lake Consulting

Introducing Amazon Q data integration in AWS Glue

AWS Big Data

APRIL 30, 2024

It can generate data integration jobs for extracts and loads to S3 data lakes including file formats like CSV, JSON, and Parquet, and ingestion into open table formats like Apache Hudi, Delta, and Apache Iceberg. Prerequisites Before going forward with this tutorial, complete the following prerequisites: Set up AWS Glue Studio.

Data Integration

Data Integration Data Lake Data Warehouse Software

How BMO improved data security with Amazon Redshift and AWS Lake Formation

AWS Big Data

MARCH 1, 2024

One of the bank’s key challenges related to strict cybersecurity requirements is to implement field level encryption for personally identifiable information (PII), Payment Card Industry (PCI), and data that is classified as high privacy risk (HPR). Only users with required permissions are allowed to access data in clear text.

Data Lake

Data Lake Data Warehouse Management Risk

Simplify access management with Amazon Redshift and AWS Lake Formation for users in an External Identity Provider

AWS Big Data

FEBRUARY 15, 2024

You might be modernizing your data architecture using Amazon Redshift to enable access to your data lake and data in your data warehouse, and are looking for a centralized and scalable way to define and manage the data access based on IdP identities. Choose Register location.

Management

Management Data Lake Sales Data Warehouse

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

APRIL 3, 2019

Most of the data management moved to back-end servers, e.g., databases. So we had three tiers providing a separation of concerns: presentation, logic, data. Note that data warehouse (DW) and business intelligence (BI) practices both emerged circa 1990. We keep feeding the monster data. There are models everywhere.

Data Governance

Data Governance Machine Learning Metadata Big Data

Q&A with Greg Rahn – The changing Data Warehouse market

Cloudera

DECEMBER 12, 2018

After having rebuilt their data warehouse, I decided to take a little bit more of a pointed role, and I joined Oracle as a database performance engineer. I spent eight years in the real-world performance group where I specialized in high visibility and high impact data warehousing competes and benchmarks.

Data Warehouse

Data Warehouse Marketing Big Data Data Lake

How The Cloud Made ‘Data-Driven Culture’ Possible | Part 1

BizAcuity

MAY 10, 2022

2012: Amazon Redshift, the first of its kind cloud-based data warehouse service comes into existence. Fact: IBM built the world’s first data warehouse in the 1980’s. Google launches BigQuery, its own data warehousing tool and Microsoft introduces Azure SQL Data Warehouse and Azure Data Lake Store.

Data-driven

Data-driven IoT Unstructured Data Data Lake

Simplify AWS Glue job orchestration and monitoring with Amazon MWAA

AWS Big Data

MAY 19, 2023

Organizations across all industries have complex data processing requirements for their analytical use cases across different analytics systems, such as data lakes on AWS , data warehouses ( Amazon Redshift ), search ( Amazon OpenSearch Service ), NoSQL ( Amazon DynamoDB ), machine learning ( Amazon SageMaker ), and more.

Machine Learning

Machine Learning Metrics Management Big Data

Simplify external object access in Amazon Redshift using automatic mounting of the AWS Glue Data Catalog

AWS Big Data

JULY 28, 2023

Amazon Redshift is a petabyte-scale, enterprise-grade cloud data warehouse service delivering the best price-performance. Today, tens of thousands of customers run business-critical workloads on Amazon Redshift to cost-effectively and quickly analyze their data using standard SQL and existing business intelligence (BI) tools.

Data Lake

Data Lake Data Governance Data Warehouse Modeling

Data Leaders Brief

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

Webinars

Trending Sources

Accelerate analytics on Amazon OpenSearch Service with AWS Glue through its native connector

Webinars

Convergent Evolution

Break data silos and stream your CDC data with Amazon Redshift streaming and Amazon MSK

Simplify and speed up Apache Spark applications on Amazon Redshift data with Amazon Redshift integration for Apache Spark

How SumUp made digital analytics more accessible using AWS Glue

Accelerate HiveQL with Oozie to Spark SQL migration on Amazon EMR

Introducing Amazon Q data integration in AWS Glue

How BMO improved data security with Amazon Redshift and AWS Lake Formation

Simplify access management with Amazon Redshift and AWS Lake Formation for users in an External Identity Provider

Themes and Conferences per Pacoid, Episode 8

Q&A with Greg Rahn – The changing Data Warehouse market

How The Cloud Made ‘Data-Driven Culture’ Possible | Part 1

Simplify AWS Glue job orchestration and monitoring with Amazon MWAA

Simplify external object access in Amazon Redshift using automatic mounting of the AWS Glue Data Catalog

Stay Connected