Big Data, Data Lake and Workshop

Create an Apache Hudi-based near-real-time transactional data lake using AWS DMS, Amazon Kinesis, AWS Glue streaming ETL, and data visualization using Amazon QuickSight

AWS Big Data

AUGUST 3, 2023

Data analytics on operational data in near-real time is becoming a common need. Due to the exponential growth of data volume, it has become common practice to replace read replicas with data lakes for better scalability and performance. For more information, see Changing the default settings for your data lake.

Data Lake, Visualization, Dashboards, Insurance

AWS Lake Formation 2022 year in review

AWS Big Data

JANUARY 31, 2023

We have collected some of the key talks and solutions on data governance, data mesh, and modern data architecture published and presented in AWS re:Invent 2022, and a few data lake solutions built by customers and AWS Partners for easy reference. Starting with Amazon EMR release 6.7.0,

Data Lake, Data Governance, Data Architecture, Machine Learning

Extend your data mesh with Amazon Athena and federated views

AWS Big Data

JULY 28, 2023

Clean up To clean up the resources created for this post, complete the following steps: On the Amazon S3 console, empty the bucket athena-federation-workshop-. If you’re using the AWS CLI, delete the objects in the athena-federation-workshop- bucket with the following code. Big Data Architect on Amazon Athena.

Big Data, Data Architecture, Data Lake, Interactive

Reference guide to build inventory management and forecasting solutions on AWS

AWS Big Data

APRIL 11, 2023

By collecting data from store sensors using AWS IoT Core, ingesting it using AWS Lambda to Amazon Aurora Serverless, and transforming it using AWS Glue from a database to an Amazon Simple Storage Service (Amazon S3) data lake, retailers can gain deep insights into their inventory and customer behavior.

Forecasting, Management, IoT, Data-driven

Automate the archive and purge data process for Amazon RDS for PostgreSQL using pg_partman, Amazon S3, and AWS Glue

AWS Big Data

AUGUST 22, 2023

Gain a high-level understanding of AWS Glue and its components by using the following hands-on workshop. Vivek Shrivastava is a Principal Data Architect, Data Lake in AWS Professional Services. He is a big data enthusiast and holds 14 AWS Certifications.

Data Processing, Testing, Data Lake, Data Integration

Introducing Amazon EMR on EKS job submission with Spark Operator and spark-submit

AWS Big Data

JUNE 6, 2023

Amazon EMR on EKS provides a deployment option for Amazon EMR that allows organizations to run open-source big data frameworks on Amazon Elastic Kubernetes Service (Amazon EKS). With EMR on EKS, Spark applications run on the Amazon EMR runtime for Apache Spark. Replace in the following code with the bucket name.

Optimization, Data Lake, Cost-Benefit, Management

Automate deployment of an Amazon QuickSight analysis connecting to an Amazon Redshift data warehouse with an AWS CloudFormation template

AWS Big Data

FEBRUARY 16, 2023

For more information about automating dashboard deployment, customizing access to the QuickSight console, configuring for team collaboration, and implementing multi-tenancy and client user segregation, check out the videos Virtual Admin Workshop: Working with Amazon QuickSight APIs and Admin Level-Up Virtual Workshop, V2 on YouTube.

Data Warehouse, Sales, Visualization, Data Processing

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.

Data Lake, Analytics, Snapshot, Optimization

Perform upserts in a data lake using Amazon Athena and Apache Iceberg

AWS Big Data

APRIL 27, 2023

Amazon Athena supports the MERGE command on Apache Iceberg tables, which allows you to perform inserts, updates, and deletes in your data lake at scale using familiar SQL statements that are compliant with ACID (Atomic, Consistent, Isolated, Durable). The first task performs an initial copy of the full data into an S3 folder.

Data Lake, Snapshot, Optimization, Data Transformation

How AWS helped Altron Group accelerate their vision for optimized customer engagement

AWS Big Data

JULY 13, 2023

Altron is a pioneer in providing data-driven solutions for its customers, combining technical expertise with in-depth customer understanding to deliver highly differentiated technology solutions. Data quality for account and customer data – Altron wanted to enable data quality and data governance best practices.

Optimization, B2B, Data Quality, Sales

How Aura from Unity revolutionized their big data pipeline with Amazon Redshift Serverless

AWS Big Data

APRIL 4, 2024

Amazon Redshift is a recommended service for online analytical processing (OLAP) workloads such as cloud data warehouses, data marts, and other analytical data stores. You can use simple SQL to analyze structured and semi-structured data, operational databases, and data lakes to deliver the best price/performance at any scale.

Big Data, Data Warehouse, Advertising, OLAP

Your guide to AWS Analytics at AWS re:Invent 2023

AWS Big Data

NOVEMBER 13, 2023

2:30 PM – 3:30 PM (PDT) Mandalay Bay ANT335 | Get the most out of your data warehousing workloads. 5:30 PM – 6:30 PM (PDT) Caesars Forum ANT349-R | Advanced real-time analytics and ML in your data warehouse [REPEAT].

Analytics, Data Lake, Data Warehouse, Data-driven

What’s cooking with Amazon Redshift at AWS re:Invent 2023

AWS Big Data

NOVEMBER 15, 2023

Sessions can be big room breakout sessions, usually with a customer speaker, or more intimate and technical chalk talks, workshops, or builder sessions. Take a look, plan your week, and soak in the learning!

Data Lake, Data Warehouse, B2B, Deep Learning

Real-time streaming data top picks you cannot miss at AWS re:Invent 2023

AWS Big Data

NOVEMBER 8, 2023

Putting your data to work with generative AI – Innovation Talk | Thursday, November 30 | 12:30 – 1:30 PM PST | The Venetian. Join Mai-Lan Tomsen Bukovec, Vice President, Technology at AWS, to learn how you can turn your data lake into a business advantage with generative AI. Reserve your seat now!

Data-driven, Data Lake, Machine Learning, Cost-Benefit

Build a data lake with Apache Flink on Amazon EMR

AWS Big Data

JANUARY 27, 2023

Verify all table metadata is stored in the AWS Glue Data Catalog. Consume data with Athena or Amazon EMR Trino for business analysis. Update and delete source records in Amazon RDS for MySQL and validate that the changes are reflected in the data lake tables. The Flink Table API/SQL can integrate with the AWS Glue Data Catalog.

Data Lake, Metadata, Business Analysis, Data-driven

The Gartner 2021 Leadership Vision for Data & Analytics Leaders Webinar Q&A

Andrew White

JANUARY 11, 2021

Will the data warehouse as a software tool play a role in the future of Data & Analytics strategy? You cannot get away from a formalized delivery capability focused on regular, scheduled, structured and reasonably governed data. Data lakes don’t offer this, nor should they. E.g. Data Lakes in Azure – as SaaS.

Data Analytics, Analytics, Data-driven, Finance
