
Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

Organizations understand that a one-size-fits-all approach no longer works. They recognize the value of adopting scalable, flexible tools and open data formats that support interoperability in a modern data architecture and accelerate the delivery of new solutions.
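As a rough illustration of what adopting an open table format can look like in practice, here is a minimal PySpark sketch that creates an Apache Iceberg table in Amazon S3, registered in the AWS Glue Data Catalog so that other engines (such as Snowflake) can read the same tables. The bucket, database, and table names are hypothetical; this is a sketch, not the setup from the article.

```python
# Minimal sketch: an Iceberg table in S3 with the AWS Glue Data Catalog as the
# Iceberg catalog. Bucket, database, and table names are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-glue-sketch")
    # Register a Spark catalog named "glue" backed by AWS Glue and S3.
    .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.catalog.glue.warehouse", "s3://example-datalake-bucket/warehouse/")  # hypothetical bucket
    .getOrCreate()
)

# Create an Iceberg table and append a row; table metadata lands in the Glue
# Data Catalog and the data files land in S3.
spark.sql("""
    CREATE TABLE IF NOT EXISTS glue.analytics.customer_events (
        event_id    BIGINT,
        customer_id BIGINT,
        event_type  STRING,
        event_ts    TIMESTAMP
    ) USING iceberg
""")

spark.sql("""
    INSERT INTO glue.analytics.customer_events
    VALUES (1, 101, 'signup', current_timestamp())
""")
```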


Modern Data Architecture: Data Warehousing, Data Lakes, and Data Mesh Explained

Data Virtualization

At the heart of every organization lies a data architecture that determines how data is accessed, organized, and used. For this reason, organizations must periodically revisit their data architectures to ensure they remain aligned with current business goals.


Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to structure it first, and run different types of analytics to gain better business insights.
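One hedged sketch of such a migration, assuming Spark with Iceberg's Glue catalog integration: create the target Iceberg table, then register the existing Parquet files with Iceberg's add_files procedure instead of rewriting them. The catalog, database, table, and path names are hypothetical, and this is only one of several migration paths (the snapshot and migrate procedures are alternatives).

```python
# Sketch of an in-place migration to a transactional (Iceberg) table:
# register existing Parquet files into a new Iceberg table without rewriting them.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("iceberg-migration-sketch")
    .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.catalog.glue.warehouse", "s3://example-datalake-bucket/warehouse/")  # hypothetical
    .getOrCreate()
)

# 1. Create the target Iceberg table with the same schema as the legacy data.
spark.sql("""
    CREATE TABLE IF NOT EXISTS glue.analytics.sales_iceberg (
        order_id BIGINT,
        region   STRING,
        amount   DOUBLE,
        order_dt DATE
    ) USING iceberg
""")

# 2. Register the existing Parquet files into the Iceberg table's metadata
#    using Iceberg's add_files procedure (no data rewrite).
spark.sql("""
    CALL glue.system.add_files(
        table => 'analytics.sales_iceberg',
        source_table => '`parquet`.`s3://example-datalake-bucket/raw/sales/`'
    )
""")
```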


Data architecture strategy for data quality

IBM Big Data Hub

Several factors determine the quality of your enterprise data, such as accuracy, completeness, and consistency, to name a few. But there's another factor that doesn't get the recognition it deserves: your data architecture. The right data architecture improves data quality.
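As a loose illustration of those dimensions, here is a minimal Python sketch that computes simple completeness, accuracy, and consistency checks over a small pandas DataFrame. The columns, rules, and sample data are hypothetical and stand in for whatever checks your architecture would enforce.

```python
# Toy data-quality checks for the dimensions named above; columns and rules are hypothetical.
import pandas as pd

orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "email":    ["a@example.com", None, "c@example", "d@example.com"],
    "amount":   [19.99, -5.00, 42.50, 10.00],
})

# Completeness: share of non-null values per column.
completeness = orders.notna().mean()

# Accuracy (proxy): values satisfy a business rule, e.g. amounts are non-negative.
accuracy_amount = (orders["amount"] >= 0).mean()

# Consistency (proxy): values follow an agreed format, e.g. emails contain a domain with a TLD.
consistency_email = orders["email"].dropna().str.contains(r"@.+\..+").mean()

print(completeness, accuracy_amount, consistency_email, sep="\n")
```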


Detect, mask, and redact PII data using AWS Glue before loading into Amazon OpenSearch Service

AWS Big Data

Many organizations land their source data in the data lake in various ways, including batch, micro-batch, and streaming jobs. Amazon AppFlow can be used to transfer data from different SaaS applications to a data lake.
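The detect-and-mask step from the title can be sketched as follows, under the assumption that Amazon Comprehend's detect_pii_entities API stands in for the Glue sensitive data detection transform the article describes; a real Glue job would apply this per record before indexing into Amazon OpenSearch Service. The region and sample text are hypothetical.

```python
# Hedged sketch: mask PII spans in a string using Amazon Comprehend.
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")  # hypothetical region

def mask_pii(text: str, language_code: str = "en") -> str:
    """Replace each detected PII span with a placeholder like [EMAIL]."""
    entities = comprehend.detect_pii_entities(Text=text, LanguageCode=language_code)["Entities"]
    # Replace from the end of the string so earlier offsets stay valid.
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        start, end = entity["BeginOffset"], entity["EndOffset"]
        text = text[:start] + f"[{entity['Type']}]" + text[end:]
    return text

print(mask_pii("Contact Jane Doe at jane.doe@example.com or 555-0100."))
# e.g. "Contact [NAME] at [EMAIL] or [PHONE]."
```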


Dive deep into AWS Glue 4.0 for Apache Spark

AWS Big Data

Data integration is even harder when your organization is dealing with silos that impede data access across different data stores. Seamless data integration is a key requirement in a modern data architecture to break down those silos. Noritaka Sekiyama is a Principal Big Data Architect on the AWS Glue team.
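A minimal sketch of the kind of data integration job this refers to: an AWS Glue for Apache Spark script that reads a table registered in the Glue Data Catalog and writes it to the data lake as Parquet. The database, table, and bucket names are hypothetical, and this is not the job from the article itself.

```python
# Sketch of a Glue for Apache Spark job: catalog source -> S3 data lake (Parquet).
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a table registered in the AWS Glue Data Catalog (hypothetical names).
orders = glue_context.create_dynamic_frame.from_catalog(
    database="sales_db", table_name="orders"
)

# Write it to the data lake in an open format so other engines can query it.
glue_context.write_dynamic_frame.from_options(
    frame=orders,
    connection_type="s3",
    connection_options={"path": "s3://example-datalake-bucket/curated/orders/"},
    format="parquet",
)

job.commit()
```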


Query your Iceberg tables in data lake using Amazon Redshift (Preview)

AWS Big Data

Amazon Redshift enables you to directly access data stored in Amazon Simple Storage Service (Amazon S3) using SQL queries and join it with data in your data warehouse. You can query the data in your S3 data lake through a central AWS Glue metastore from your Redshift data warehouse.
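A hedged sketch of that pattern, assuming the Redshift Data API and hypothetical cluster, role, schema, and table names: register the Glue Data Catalog database as an external schema in Redshift, then join a data lake table with a local warehouse table in one SQL query.

```python
# Sketch: expose a Glue Data Catalog database to Redshift as an external schema,
# then join data lake and warehouse tables. All identifiers are hypothetical.
import boto3

redshift_data = boto3.client("redshift-data", region_name="us-east-1")

create_external_schema = """
CREATE EXTERNAL SCHEMA IF NOT EXISTS datalake
FROM DATA CATALOG
DATABASE 'analytics'
IAM_ROLE 'arn:aws:iam::123456789012:role/example-redshift-spectrum-role';
"""

join_lake_and_warehouse = """
SELECT w.customer_id, w.segment, COUNT(*) AS events
FROM datalake.customer_events AS l   -- table in the S3 data lake (Glue catalog)
JOIN warehouse.customers      AS w   -- local Redshift table
  ON l.customer_id = w.customer_id
GROUP BY w.customer_id, w.segment;
"""

for sql in (create_external_schema, join_lake_and_warehouse):
    redshift_data.execute_statement(
        ClusterIdentifier="example-cluster",
        Database="dev",
        DbUser="example_user",
        Sql=sql,
    )
```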