Data Architecture, Data Lake and Visualization

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

AWS Big Data

APRIL 3, 2024

They understand that a one-size-fits-all approach no longer works, and recognize the value in adopting scalable, flexible tools and open data formats to support interoperability in a modern data architecture to accelerate the delivery of new solutions.

Data Lake

Data Lake Snapshot Metadata Data Architecture

Load data incrementally from transactional data lakes to data warehouses

AWS Big Data

OCTOBER 19, 2023

Data lakes and data warehouses are two of the most important data storage and management technologies in a modern data architecture. Data lakes store all of an organization’s data, regardless of its format or structure.

Data Lake

Data Lake Data Warehouse Visualization Snapshot

Webinars

The Path to Product Excellence: Avoiding Common Pitfalls and Enhancing Communication

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Manufacturing Sustainability Surge: Your Guide to Data-Driven Energy Optimization & Decarbonization

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Understanding User Needs and Satisfying Them

MORE WEBINARS

Automate replication of relational sources into a transactional data lake with Apache Iceberg and AWS Glue

AWS Big Data

FEBRUARY 14, 2023

Organizations have chosen to build data lakes on top of Amazon Simple Storage Service (Amazon S3) for many years. A data lake is the most popular choice for organizations to store all their organizational data generated by different teams, across business domains, from all different formats, and even over history.

Data Lake

Data Lake Statistics Data Architecture Finance

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights and do so with a consistent performance. With this massive data growth, data proliferation across your data stores, data warehouse, and data lakes can become equally challenging.

Data Lake

Data Lake Analytics Dashboards Metrics

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Data architect role: Data architects are senior visionaries who translate business requirements into technology requirements and define data standards and principles, often in support of data or digital transformations. Data architect vs. data engineer: The data architect and data engineer roles are closely related.

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Build a transactional data lake using Apache Iceberg, AWS Glue, and cross-account data shares using AWS Lake Formation and Amazon Athena

AWS Big Data

APRIL 24, 2023

Building a data lake on Amazon Simple Storage Service (Amazon S3) provides numerous benefits for an organization. However, many use cases, like performing change data capture (CDC) from an upstream relational database to an Amazon S3-based data lake, require handling data at a record level.

Data Lake

Data Lake Data Governance Cost-Benefit Machine Learning

Visualize data quality scores and metrics generated by AWS Glue Data Quality

AWS Big Data

JUNE 6, 2023

It provides insights and metrics related to the performance and effectiveness of data quality processes. In this post, we highlight the seamless integration of Amazon Athena and Amazon QuickSight, which enables the visualization of operational metrics for AWS Glue Data Quality rule evaluation in an efficient and effective manner.

Data Quality

Data Quality Metrics Visualization Dashboards

Detect, mask, and redact PII data using AWS Glue before loading into Amazon OpenSearch Service

AWS Big Data

JANUARY 12, 2024

Ingestion: Data lake batch, micro-batch, and streaming. Many organizations land their source data into their data lake in various ways, including batch, micro-batch, and streaming jobs. Amazon AppFlow can be used to transfer data from different SaaS applications to a data lake.

Data Lake

Data Lake Cost-Benefit Visualization Structured Data

Exploring new ETL and ELT capabilities for Amazon Redshift from the AWS Glue Studio visual editor

AWS Big Data

APRIL 20, 2023

In a modern data architecture, unified analytics enable you to access the data you need, whether it’s stored in a data lake or a data warehouse. Select the Visual with a blank canvas, because we’re authoring a job from scratch, then choose Create.

Visualization

Visualization Data Warehouse Big Data Data Lake

Enhance data security and governance for Amazon Redshift Spectrum with VPC endpoints

AWS Big Data

FEBRUARY 16, 2024

Many customers are extending their data warehouse capabilities to their data lake with Amazon Redshift. They are looking to further enhance their security posture where they can enforce access policies on their data lakes based on Amazon Simple Storage Service (Amazon S3). Choose Create endpoint.

Data Lake

Data Lake Data Warehouse Testing Business Objectives

Why the Data Journey Manifesto?

DataKitchen

JUNE 12, 2023

We had been talking about “Agile Analytic Operations,” “DevOps for Data Teams,” and “Lean Manufacturing For Data,” but the concept was hard to get across and communicate. I spent much time de-categorizing DataOps: we are not discussing ETL, Data Lake, or Data Science.

Testing

Testing Data Lake Dashboards Data Science

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

Zero-ETL integration also enables you to load and analyze data from multiple operational database clusters in a new or existing Amazon Redshift instance to derive holistic insights across many applications. Use one click to access your data lake tables using auto-mounted AWS Glue data catalogs on Amazon Redshift for a simplified experience.

Data Warehouse

Data Warehouse Data Lake Analytics Machine Learning

How smava makes loans transparent and affordable using Amazon Redshift Serverless

AWS Big Data

DECEMBER 21, 2023

To bring their customers the best deals and user experience, smava follows the modern data architecture principles with a data lake as a scalable, durable data store and purpose-built data stores for analytical processing and data consumption. This is the Data Mart stage.

Data Lake

Data Lake Data Warehouse Data-driven B2B

Data science vs data analytics: Unpacking the differences

IBM Big Data Hub

SEPTEMBER 19, 2023

Overview: Data science vs data analytics. Think of data science as the overarching umbrella that covers a wide range of tasks performed to find patterns in large datasets, structure data for use, train machine learning models and develop artificial intelligence (AI) applications.

Data Science

Data Science Data Analytics Prescriptive Analytics Analytics

A comparative assessment of digital transformation in Italy

CIO Business Intelligence

APRIL 24, 2024

In fact, AMA collects a huge amount of structured and unstructured data from bins, collection vehicles, facilities, and user reports, and until now, this data has remained disconnected, managed by disparate systems and interfaces, through Excel spreadsheets.

Digital Transformation

Digital Transformation Business Intelligence Unstructured Data Data Lake

CIO Ryan Snyder on the benefits of interpreting data as a layer cake

CIO Business Intelligence

AUGUST 2, 2023

So Thermo Fisher Scientific CIO Ryan Snyder and his colleagues have built a data layer cake based on a cascading series of discussions that allow IT and business partners to act as one team. Martha Heller: What are the business drivers behind the data architecture ecosystem you’re building at Thermo Fisher Scientific?

Manufacturing

Manufacturing Data Architecture Strategy Data Strategy

Estimating Scope 1 Carbon Footprint with Amazon Athena

AWS Big Data

AUGUST 2, 2023

The data architecture diagram below shows an example of how you could use AWS services to calculate and visualize an organization’s estimated carbon footprint. Customers have the flexibility to choose the services in each stage of the data pipeline based on their use case.

Data Lake

Data Lake Measurement Visualization Data Architecture

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

Strategize based on how your teams explore data, run analyses, wrangle data for downstream requirements, and visualize data at different levels. The AWS modern data architecture shows a way to build a purpose-built, secure, and scalable data platform in the cloud.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

Achieving Trusted AI in Manufacturing

Cloudera

JANUARY 30, 2024

Here are some of the key use cases: Predictive maintenance: With time series data (sensor data) coming from the equipment, historical maintenance logs, and other contextual data, you can predict how the equipment will behave and when the equipment or a component will fail. Eliminate data silos.

Manufacturing

Manufacturing Contextual Data IoT Digital Transformation

Exploring real-time streaming for generative AI Applications

AWS Big Data

MARCH 25, 2024

Both engines provide native ingestion support from Kinesis Data Streams and Amazon MSK via a separate streaming pipeline to a data lake or data warehouse for analysis. OpenSearch Service offers visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5

Data Lake

Data Lake Unstructured Data Management Modeling

DataOps For Business Analytics Teams

DataKitchen

JANUARY 3, 2022

Data scientists derive insights from data while business analysts work closely with and tend to the data needs of business units. Business analysts sometimes perform data science, but usually, they integrate and visualize data and create reports and dashboards from data supplied by other groups.

Business Analytics

Business Analytics Analytics Testing Dashboards

Belcorp reimagines R&D with AI

CIO Business Intelligence

JUNE 28, 2023

The initial stage involved establishing the data architecture, which provided the ability to handle the data more effectively and systematically. “We … This allowed us to derive insights more easily.”

Digital Transformation

Digital Transformation Cost-Benefit Informatics Data mining

Architectural patterns for real-time analytics using Amazon Kinesis Data Streams, part 1

AWS Big Data

JANUARY 8, 2024

Kinesis Data Streams has native integrations with other AWS services such as AWS Glue and Amazon EventBridge to build real-time streaming applications on AWS. Refer to Amazon Kinesis Data Streams integrations for additional details. The raw data can be streamed to Amazon S3 for archiving.

Analytics

Analytics IoT Data-driven Snapshot

Deep dive into the AWS ProServe Hadoop Migration Delivery Kit TCO tool

AWS Big Data

FEBRUARY 6, 2023

In this post, we dive deep into the tool, walking through all steps from log ingestion, transformation, visualization, and architecture design to calculate TCO. With QuickSight, you can visualize YARN log data and conduct analysis against the datasets generated by pre-built dashboard templates and a widget.

Dashboards

Dashboards Optimization Data Lake Cost-Benefit

The New Normal for FP&A: Data Analytics

Jedox

OCTOBER 22, 2020

In working with clients, these are some of the most common “pain points” I routinely address: Difficulty in extracting data out of legacy systems. Limited real-time analytics and visuals. Inability to get data quickly. Data accuracy concerns. More time spent accessing data vs. making data-driven decisions.

Data Analytics

Data Analytics Analytics Unstructured Data Data mining

How Cargotec uses metadata replication to enable cross-account data sharing

AWS Big Data

JUNE 7, 2023

Cargotec captures terabytes of IoT telemetry data from their machinery operated by numerous customers across the globe. This data needs to be ingested into a data lake, transformed, and made available for analytics, machine learning (ML), and visualization.

Metadata

Metadata Data Lake Machine Learning Big Data

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

Building data lakes from continuously changing transactional data of databases and keeping data lakes up to date is a complex task and can be an operational challenge. You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes.
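As a rough illustration of what "managing inserts, updates, and deletes" means at the record level, here is a minimal in-memory sketch in plain Python. It does not use Delta, AWS DMS, or Amazon EMR Serverless; the operation codes ("I", "U", "D"), the record shape, and the `apply_changes` helper are hypothetical and chosen only for this example.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical change record: (operation, primary_key, row).
# The codes "I" (insert), "U" (update), and "D" (delete) are assumptions
# made for this sketch, not the format of any specific tool.
Change = Tuple[str, int, Optional[dict]]


def apply_changes(table: Dict[int, dict], changes: Iterable[Change]) -> Dict[int, dict]:
    """Return a copy of `table` with the change records applied in order."""
    result = dict(table)  # shallow copy so the input table is left untouched
    for op, key, row in changes:
        if op in ("I", "U"):
            # Inserts and updates both become an upsert on the primary key.
            result[key] = row
        elif op == "D":
            # Deleting a key that is already absent is treated as a no-op.
            result.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return result


current = {1: {"name": "Ada"}, 2: {"name": "Bob"}}
changes = [
    ("U", 1, {"name": "Ada L."}),
    ("D", 2, None),
    ("I", 3, {"name": "Cy"}),
]
updated = apply_changes(current, changes)
```

A transactional table format performs the equivalent merge against files in object storage rather than a dictionary, but the keyed upsert-and-delete logic is the same idea.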

Data Lake

Data Lake Dashboards Metrics Metadata

Showpad accelerates data maturity to unlock innovation using Amazon QuickSight

AWS Big Data

APRIL 5, 2023

In this post, we share how Showpad used QuickSight to streamline data and insights access across teams and customers. Showpad migrated over 70 dashboards with over 1,000 visuals. The company also used the opportunity to reimagine its data pipeline and architecture.

Dashboards

Dashboards Reporting Cost-Benefit Visualization

Three Trends for Modernizing Analytics and Data Warehousing in 2019

Cloudera

DECEMBER 19, 2018

The most common big data use case is data warehouse optimization. Big data architecture is used to augment different applications, operating alongside or in a discrete fashion with a data warehouse. A big data implementation may even replace a data warehouse entirely with a data lake.

Data Warehouse

Data Warehouse Analytics Big Data Data Architecture

Modern Data Architecture for Telecommunications

Cloudera

SEPTEMBER 6, 2022

Data has continued to grow both in scale and in importance through this period, and today telecommunications companies are increasingly seeing data architecture as an independent organizational challenge, not merely an item on an IT checklist. Previously, there were three types of data structures in telco: …

Data Architecture

Data Architecture Cost-Benefit Digital Transformation Business Driver

This Structure has Novel Features which are of Considerable Business Interest

Peter James Thomas

APRIL 3, 2020

Busy Executives and Managers have their information needs best served via visual exhibits that are focussed on their areas of priority and highlight things that are of specific concern to them. Without paying attention to this, your shiny warehouse or data lake will be a technological curiosity, not an indispensable business tool.

Dashboards

Dashboards Reporting Sales Data Lake

Texas Rangers data transformation modernizes stadium operations

CIO Business Intelligence

OCTOBER 18, 2022

Noel had already established a relationship with consulting firm Resultant through a smaller data visualization project. Resultant recommended a new, on-prem data infrastructure, complete with data lakes to provide stakeholders with a better way to manage data reliability, accuracy, and timeliness.

Data Transformation

Data Transformation Consulting Data Lake Reporting

Augmented data management: Data fabric versus data mesh

IBM Big Data Hub

APRIL 27, 2022

Data fabric and data mesh are emerging data management concepts that are meant to address the organizational change and complexities of understanding, governing and working with enterprise data in a hybrid multicloud ecosystem. The good news is that both data architecture concepts are complementary.

Management

Management Metadata Data Architecture Data Lake

5 Reasons to Use Apache Iceberg on Cloudera Data Platform (CDP)

Cloudera

MARCH 23, 2022

In fact, we recently announced the integration with our cloud ecosystem bringing the benefits of Iceberg to enterprises as they make their journey to the public cloud, and as they adopt more converged architectures like the Lakehouse. 1: Multi-function analytics. 2: Open formats. Flexible and open file formats.

Metadata

Metadata Data Architecture Machine Learning Cost-Benefit

Unstructured data management and governance using AWS AI/ML and analytics services

AWS Big Data

OCTOBER 25, 2023

The following is a high-level architecture of the solution we can build to process the unstructured data, assuming the input data is being ingested to the raw input object store. The steps of the workflow are as follows: Integrated AI services extract data from the unstructured data.

Unstructured Data

Unstructured Data Metadata Management Analytics

The Cloud Connection: How Governance Supports Security

Alation

APRIL 14, 2022

A useful feature for exposing patterns in the data. Visual Profiling. Supports the ability to interact with the actual data and perform analysis on it. Pushing data to a data lake and assuming it is ready for use is shortsighted. Parametrization. A technique to automate changes in iterative passes.

Metadata

Metadata Data Governance Modeling Data-driven

Flexible and secure Data-as-a-Service delivered today

Birst BI

OCTOBER 24, 2019

DaaS is a core component of modern data architecture. It provides a governed standard for accessing existing data objects and pipelines for sharing new data objects within an organization. Because it hides the underlying complexities of connecting to and preparing data sources, DaaS helps expand usage of available data.

Data Warehouse

Data Warehouse Recreation/Entertainment Data Lake Data Architecture
