article thumbnail

Implement data warehousing solution using dbt on Amazon Redshift

AWS Big Data

For more information, refer SQL models. Snapshots – These implements type-2 slowly changing dimensions (SCDs) over mutable source tables. Seeds – These are CSV files in your dbt project (typically in your seeds directory), which dbt can load into your data warehouse using the dbt seed command. A Redshift cluster.

article thumbnail

Use Amazon Athena with Spark SQL for your open-source transactional table formats

AWS Big Data

These formats enable ACID (atomicity, consistency, isolation, durability) transactions, upserts, and deletes, and advanced features such as time travel and snapshots that were previously only available in data warehouses. For more information, refer to Amazon S3: Allows read and write access to objects in an S3 Bucket.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Enable Multi-AZ deployments for your Amazon Redshift data warehouse

AWS Big Data

For more information to create key using KMS, refer to Creating keys. Complete the following steps to create a Multi-AZ deployment restored from a snapshot: On the Amazon Redshift console, in the navigation pane, choose Clusters. Choose the Maintenance Select a snapshot and choose Restore snapshot , Restore to provisioned cluster.

article thumbnail

Join a streaming data source with CDC data for real-time serverless data analytics using AWS Glue, AWS DMS, and Amazon DynamoDB

AWS Big Data

Data lakes are not transactional by default; however, there are multiple open-source frameworks that enhance data lakes with ACID properties, providing a best of both worlds solution between transactional and non-transactional storage mechanisms. The reference data is continuously replicated from MySQL to DynamoDB through AWS DMS.

article thumbnail

How Klarna Bank AB built real-time decision-making with Amazon Kinesis Data Analytics for Apache Flink

AWS Big Data

This post presents a reference architecture for real-time queries and decision-making on AWS using Amazon Kinesis Data Analytics for Apache Flink. In addition, we explain why the Klarna Decision Tooling team selected Kinesis Data Analytics for Apache Flink for their first real-time decision query service.

article thumbnail

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

AWS Big Data

This is the first post to a blog series that offers common architectural patterns in building real-time data streaming infrastructures using Kinesis Data Streams for a wide range of use cases. In this post, we will review the common architectural patterns of two use cases: Time Series Data Analysis and Event Driven Microservices.

Analytics 111
article thumbnail

Achieve near real time operational analytics using Amazon Aurora PostgreSQL zero-ETL integration with Amazon Redshift

AWS Big Data

For complete getting started guides, refer to Working with Aurora zero-ETL integrations with Amazon Redshift and Working with zero-ETL integrations. Refer to Connect to an Aurora PostgreSQL DB cluster for the options to connect to the PostgreSQL cluster. The following diagram illustrates the architecture implemented in this post.