article thumbnail

Create an Apache Hudi-based near-real-time transactional data lake using AWS DMS, Amazon Kinesis, AWS Glue streaming ETL, and data visualization using Amazon QuickSight

AWS Big Data

Data analytics on operational data at near-real time is becoming a common need. Due to the exponential growth of data volume, it has become common practice to replace read replicas with data lakes to have better scalability and performance. For more information, see Changing the default settings for your data lake.

article thumbnail

AWS Lake Formation 2022 year in review

AWS Big Data

We have collected some of the key talks and solutions on data governance, data mesh, and modern data architecture published and presented in AWS re:Invent 2022, and a few data lake solutions built by customers and AWS Partners for easy reference. Starting with Amazon EMR release 6.7.0,

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Extend your data mesh with Amazon Athena and federated views

AWS Big Data

Clean up To clean up the resources created for this post, complete the following steps: On the Amazon S3 console, empty the bucket athena-federation-workshop-. If you’re using the AWS CLI, delete the objects in the athena-federation-workshop- bucket with the following code. Big Data Architect on Amazon Athena.

article thumbnail

Reference guide to build inventory management and forecasting solutions on AWS

AWS Big Data

By collecting data from store sensors using AWS IoT Core , ingesting it using AWS Lambda to Amazon Aurora Serverless , and transforming it using AWS Glue from a database to an Amazon Simple Storage Service (Amazon S3) data lake, retailers can gain deep insights into their inventory and customer behavior.

article thumbnail

Automate the archive and purge data process for Amazon RDS for PostgreSQL using pg_partman, Amazon S3, and AWS Glue

AWS Big Data

Gain a high-level understanding of AWS Glue and its components by using the following hands-on workshop. Vivek Shrivastava is a Principal Data Architect, Data Lake in AWS Professional Services. He is a big data enthusiast and holds 14 AWS Certifications.

article thumbnail

Introducing Amazon EMR on EKS job submission with Spark Operator and spark-submit

AWS Big Data

Amazon EMR on EKS provides a deployment option for Amazon EMR that allows organizations to run open-source big data frameworks on Amazon Elastic Kubernetes Service (Amazon EKS). With EMR on EKS, Spark applications run on the Amazon EMR runtime for Apache Spark. Replace in the following code with the bucket name.

article thumbnail

Automate deployment of an Amazon QuickSight analysis connecting to an Amazon Redshift data warehouse with an AWS CloudFormation template

AWS Big Data

For more information about automating dashboard deployment, customizing access to the QuickSight console, configuring for team collaboration, and implementing multi-tenancy and client user segregation, check out the videos Virtual Admin Workshop: Working with Amazon QuickSight APIs and Admin Level-Up Virtual Workshop, V2 on YouTube.