Data Lake, Data Warehouse, Modeling and Snapshot

Data Lake

Data Warehouse

Modeling

Snapshot

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake

Data Lake Sales Data Warehouse Snapshot

Manage your data warehouse cost allocations with Amazon Redshift Serverless tagging

AWS Big Data

MARCH 27, 2023

Amazon Redshift Serverless makes it simple to run and scale analytics without having to manage your data warehouse infrastructure. For Filter by resource type , you can filter by Workgroup , Namespace , Snapshot , and Recovery Point. Tags allows you to assign metadata to your AWS resources.

Data Warehouse

Data Warehouse Management Snapshot Data Lake

Join 52,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

How To Get Promoted In Product Management

MORE WEBINARS

Trending Sources

Analytics Vidhya

Build an Amazon Redshift data warehouse using an Amazon DynamoDB single-table design

AWS Big Data

JUNE 21, 2023

In traditional databases, we would model such applications using a normalized data model (entity-relation diagram). These types of queries are suited for a data warehouse. Amazon Redshift is fully managed, scalable, cloud data warehouse. To house our data, we need to define a data model.

Data Warehouse

Data Warehouse Data Lake OLAP Cost-Benefit

Webinars

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

How To Get Promoted In Product Management

MORE WEBINARS

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

A modern data architecture enables companies to ingest virtually any type of data through automated pipelines into a data lake, which provides highly durable and cost-effective object storage at petabyte or exabyte scale.

Data Lake

Data Lake Metadata Optimization Statistics

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

AWS Big Data

SEPTEMBER 13, 2023

A modern data architecture is an evolutionary architecture pattern designed to integrate a data lake, data warehouse, and purpose-built stores with a unified governance model. Of those tables, some are larger (such as in terms of record volume) than others, and some are updated more frequently than others.

Data Lake

Data Lake Data Processing Metadata Snapshot

Power enterprise-grade Data Vaults with Amazon Redshift – Part 2

AWS Big Data

NOVEMBER 16, 2023

Amazon Redshift is a popular cloud data warehouse, offering a fully managed cloud-based service that seamlessly integrates with an organization’s Amazon Simple Storage Service (Amazon S3) data lake, real-time streams, machine learning (ML) workflows, transactional workflows, and much more—all while providing up to 7.9x

Enterprise

Enterprise Data Warehouse Snapshot Cost-Benefit

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

AWS Big Data

NOVEMBER 29, 2023

dbt is an open source, SQL-first templating engine that allows you to write repeatable and extensible data transforms in Python and SQL. dbt is predominantly used by data warehouses (such as Amazon Redshift ) customers who are looking to keep their data transform logic separate from storage and engine.

Data Lake

Data Lake Management Metrics Data Warehouse

Dimensional modeling in Amazon Redshift

AWS Big Data

JULY 19, 2023

Amazon Redshift is a fully managed and petabyte-scale cloud data warehouse that is used by tens of thousands of customers to process exabytes of data every day to power their analytics workload. You can structure your data, measure business processes, and get valuable insights quickly can be done by using a dimensional model.

Modeling

Modeling Sales Data Warehouse Snapshot

Simplifying data processing at Capitec with Amazon Redshift integration for Apache Spark

AWS Big Data

NOVEMBER 10, 2023

This integration expands the possibilities for AWS analytics and machine learning (ML) solutions, making the data warehouse accessible to a broader range of applications. Your applications can seamlessly read from and write to your Amazon Redshift data warehouse while maintaining optimal performance and transactional consistency.

Data Processing

Data Processing Data Lake Data Warehouse Optimization

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

Foundation models (FMs) are large machine learning (ML) models trained on a broad spectrum of unlabeled and generalized datasets. This scale and general-purpose adaptability are what makes FMs different from traditional ML models. FMs are multimodal; they work with different data types such as text, video, audio, and images.

Data Lake

Data Lake Unstructured Data Management Modeling

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

AWS Big Data

JANUARY 17, 2024

With Amazon EMR 6.15, we launched AWS Lake Formation based fine-grained access controls (FGAC) on Open Table Formats (OTFs), including Apache Hudi, Apache Iceberg, and Delta lake. Many large enterprise companies seek to use their transactional data lake to gain insights and improve decision-making.

Data Lake

Data Lake Snapshot Big Data Data-driven

MLOps and DevOps: Why Data Makes It Different

O'Reilly on Data

OCTOBER 19, 2021

Let’s start by considering the job of a non-ML software engineer: writing traditional software deals with well-defined, narrowly-scoped inputs, which the engineer can exhaustively and cleanly model in the code. Not only is data larger, but models—deep learning models in particular—are much larger than before.

IT Testing Experimentation Software

How Amazon Devices scaled and optimized real-time demand and supply forecasts using serverless analytics

AWS Big Data

FEBRUARY 1, 2023

With data volumes exhibiting a double-digit percentage growth rate year on year and the COVID pandemic disrupting global logistics in 2021, it became more critical to scale and generate near-real-time data. The response times for these data sources are critical to our key stakeholders.

Optimization

Optimization Forecasting Data Lake Metadata

Unleashing the power of Presto: The Uber case study

IBM Big Data Hub

SEPTEMBER 25, 2023

Uber’s DNA as an analytics company At its core, Uber’s business model is deceptively simple: connect a customer at point A to their destination at point B. Uber understood that digital superiority required the capture of all their transactional data, not just a sampling. It lands as raw data in HDFS.

OLAP

OLAP Data Lake Data-driven Snapshot

Enrich your customer data with geospatial insights using Amazon Redshift, AWS Data Exchange, and Amazon QuickSight

AWS Big Data

MARCH 18, 2024

With these new attributes, you are able to build a segmentation model to identify distinct groups of customers that you can target with personalized messaging. This data is available to subscribe to on AWS Data Exchange—and with data sharing, you don’t need to pay to store a copy of it in your account in order to query it.

Data Warehouse

Data Warehouse Visualization Snapshot Data-driven

Five actionable steps to GDPR compliance (Right to be forgotten) with Amazon Redshift

AWS Big Data

JULY 28, 2023

Organizations must comply with these requests provided that there are no legitimate grounds for retaining the personal data, such as legal obligations or contractual requirements. Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud. Amazon Redshift offers backups and snapshots of the data.

Snapshot

Snapshot Metadata Measurement Data Warehouse

Simplify external object access in Amazon Redshift using automatic mounting of the AWS Glue Data Catalog

AWS Big Data

JULY 28, 2023

Amazon Redshift is a petabyte-scale, enterprise-grade cloud data warehouse service delivering the best price-performance. Today, tens of thousands of customers run business-critical workloads on Amazon Redshift to cost-effectively and quickly analyze their data using standard SQL and existing business intelligence (BI) tools.

Data Lake

Data Lake Data Governance Data Warehouse Modeling

Snowflake and Domino: Better Together

Domino Data Lab

JANUARY 11, 2021

Arming data science teams with the access and capabilities needed to establish a two-way flow of information is one critical challenge many organizations face when it comes to unlocking value from their modeling efforts. Domino integrates with Snowflake to solve this challenge by providing a modern approach to data.

Recreation/Entertainment

Recreation/Entertainment Data Science Data Warehouse Modeling

How Tricentis unlocks insights across the software development lifecycle at speed and scale using Amazon Redshift

AWS Big Data

MARCH 3, 2023

Every event in the data source can be relevant, and our customers don’t tolerate data loss, poor data quality, or discrepancies between the source and Tricentis Analytics. While aggregating, summarizing, and aligning to a common information model, all transformations must not affect the integrity of data from its source.

Software

Software Data Lake Testing Cost-Benefit

Interview with Dominic Sartorio, Senior Vice President for Products & Development, Protegrity

Corinium

APRIL 25, 2019

Then when there is a breach, it comes as a shock, “wow, I didn’t even know that application had access to so much sensitive data”. Step One in any data security program should first be to discover and classify datasets that are sensitive, and know where that data is, and understand who really needs it to do their jobs.

Insurance

Insurance Risk IoT Cost-Benefit

Data Leaders Brief

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

Manage your data warehouse cost allocations with Amazon Redshift Serverless tagging

Webinars

Trending Sources

Build an Amazon Redshift data warehouse using an Amazon DynamoDB single-table design

Webinars

Choosing an open table format for your transactional data lake on AWS

Simplify operational data processing in data lakes using AWS Glue and Apache Hudi

Power enterprise-grade Data Vaults with Amazon Redshift – Part 2

Build and manage your modern data stack using dbt and AWS Glue through dbt-glue, the new “trusted” dbt adapter

Dimensional modeling in Amazon Redshift

Simplifying data processing at Capitec with Amazon Redshift integration for Apache Spark

Exploring real-time streaming for generative AI Applications

Enforce fine-grained access control on Open Table Formats via Amazon EMR integrated with AWS Lake Formation

MLOps and DevOps: Why Data Makes It Different

How Amazon Devices scaled and optimized real-time demand and supply forecasts using serverless analytics

Unleashing the power of Presto: The Uber case study

Enrich your customer data with geospatial insights using Amazon Redshift, AWS Data Exchange, and Amazon QuickSight

Five actionable steps to GDPR compliance (Right to be forgotten) with Amazon Redshift

Simplify external object access in Amazon Redshift using automatic mounting of the AWS Glue Data Catalog

Snowflake and Domino: Better Together

How Tricentis unlocks insights across the software development lifecycle at speed and scale using Amazon Redshift

Interview with Dominic Sartorio, Senior Vice President for Products & Development, Protegrity

Stay Connected