2023, Data Integration and Data Lake

2023

Data Integration

Data Lake

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure the data and then run different types of analytics for better business insights.

Data Lake

Data Lake Metadata Snapshot Recreation/Entertainment

My Understanding of the Gartner® Hype Cycle™ for Finance Data and Analytics Governance, 2023

Data Virtualization

MARCH 28, 2024

As noted in the Gartner Hype Cycle for Finance Data and Analytics Governance, 2023, “Through. The post My Understanding of the Gartner® Hype Cycle™ for Finance Data and Analytics Governance, 2023 appeared first on Data Management Blog - Data Integration and Modern Data Management Articles, Analysis and Information.

Finance

Finance Digital Transformation Analytics Data Integration

Join 52,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Manufacturing Sustainability Surge: Your Guide to Data-Driven Energy Optimization & Decarbonization

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

MORE WEBINARS

Trending Sources

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics

AWS Big Data

NOVEMBER 20, 2023

For any modern data-driven company, having smooth data integration pipelines is crucial. These pipelines pull data from various sources, transform it, and load it into destination systems for analytics and reporting. When running properly, it provides timely and trustworthy information.

Metrics

Metrics Data Lake Cost-Benefit Dashboards

Webinars

Peak Performance: Continuous Testing & Evaluation of LLM-Based Applications

Manufacturing Sustainability Surge: Your Guide to Data-Driven Energy Optimization & Decarbonization

From Developer Experience to Product Experience: How a Shared Focus Fuels Product Success

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

MORE WEBINARS

Simplifying data processing at Capitec with Amazon Redshift integration for Apache Spark

AWS Big Data

NOVEMBER 10, 2023

The data sourcing problem To ensure the reliability of PySpark data pipelines, it’s essential to have consistent record-level data from both dimensional and fact tables stored in the Enterprise Data Warehouse (EDW). These tables are then joined with tables from the Enterprise Data Lake (EDL) at runtime.

Data Processing

Data Processing Data Lake Data Warehouse Optimization

Load data incrementally from transactional data lakes to data warehouses

AWS Big Data

OCTOBER 19, 2023

Data lakes and data warehouses are two of the most important data storage and management technologies in a modern data architecture. Data lakes store all of an organization’s data, regardless of its format or structure.

Data Lake

Data Lake Data Warehouse Visualization Snapshot

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics, Part 3: Visualization and trend analysis using Amazon QuickSight

AWS Big Data

MARCH 29, 2024

You can slice data by different dimensions like job name, see anomalies, and share reports securely across your organization. With these insights, teams have the visibility to make data integration pipelines more efficient. Looking at the Skewness Job per Job visualization, there was spike on November 1, 2023.

Metrics

Metrics Visualization Dashboards Interactive

5 financial planning software capabilities that drive business value

Jedox

JANUARY 13, 2023

To that end, finance leaders can prioritize solutions that facilitate faster data integrations through prebuilt connectors and offer an intuitive user experience to drive adoption. Data integration. Efficient implementation is a key differentiator for enterprises looking to achieve quick time to value.

Software

Software Finance Forecasting Data Lake

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

AWS Big Data

NOVEMBER 29, 2023

These announcements drive forward the AWS Zero-ETL vision to unify all your data, enabling you to better maximize the value of your data with comprehensive analytics and ML capabilities, and innovate faster with secure data collaboration within and across organizations.

Data Warehouse

Data Warehouse Data Lake Analytics Machine Learning

Improve healthcare services through patient 360: A zero-ETL approach to enable near real-time data analytics

AWS Big Data

MARCH 27, 2024

AWS has invested in a zero-ETL (extract, transform, and load) future so that builders can focus more on creating value from data, instead of having to spend time preparing data for analysis. This means you no longer have to create an external schema in Amazon Redshift to use the data lake tables cataloged in the Data Catalog.

Data Analytics

Data Analytics Analytics Data Warehouse Data Lake

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 2: AWS Glue Studio Visual Editor

AWS Big Data

MARCH 20, 2023

In the first post of this series , we described how AWS Glue for Apache Spark works with Apache Hudi, Linux Foundation Delta Lake, and Apache Iceberg datasets tables using the native support of those data lake formats. Even without prior experience using Hudi, Delta Lake or Iceberg, you can easily achieve typical use cases.

Visualization

Visualization Data Lake Snapshot Big Data

Preparing the foundations for Generative AI

CIO Business Intelligence

FEBRUARY 20, 2024

Data also needs to be sorted, annotated and labelled in order to meet the requirements of generative AI. No wonder CIO’s 2023 AI Priorities study found that data integration was the number one concern for IT leaders around generative AI integration, above security and privacy and the user experience.

Cost-Benefit

Cost-Benefit Data Lake Data Warehouse Data Integration

Getting started with AWS Glue Data Quality from the AWS Glue Data Catalog

AWS Big Data

JUNE 6, 2023

AWS Glue is a serverless data integration service that makes it simple to discover, prepare, and combine data for analytics, machine learning (ML), and application development. Hundreds of thousands of customers use data lakes for analytics and ML to make data-driven business decisions.

Data Quality

Data Quality Data Lake Data-driven Metrics

Introducing AWS Glue serverless Spark UI for better monitoring and troubleshooting

AWS Big Data

NOVEMBER 20, 2023

In AWS, hundreds of thousands of customers use AWS Glue , a serverless data integration service, to discover, combine, and prepare data for analytics and machine learning. The sample job reads data from MySQL database and writes to S3 in Parquet format.

Visualization

Visualization Optimization Data Lake Management

Your guide to AWS Analytics at AWS re:Invent 2023

AWS Big Data

NOVEMBER 13, 2023

2023 AWS Analytics Superheroes We are excited to introduce the 2023 AWS Analytics Superheroes at this year’s re:Invent conference! A shapeshifting guardian and protector of data like Data Lynx? 11:30 AM – 12:30 PM (PDT) Ceasars Forum ANT318 | Accelerate innovation with end-to-end serverless data architecture.

Analytics

Analytics Data Lake Data Warehouse Data-driven

Modeling, Modernization and Automation

BI-Survey

APRIL 27, 2023

Eckerson Group wrote this report in collaboration with BARC by studying the results of a global survey of 238 data & analytics practitioners and leaders. BARC conducted the survey in December 2022 and January 2023, drawing respondents from companies of various sizes and across various industries.

Modeling

Modeling Data Warehouse Data Quality Business Driver

AWS re:Invent 2023 Amazon Redshift Sessions Recap

AWS Big Data

DECEMBER 18, 2023

Sessions ANT203 | What’s new in Amazon Redshift Watch this session to learn about the newest innovations within Amazon Redshift—the petabyte-scale AWS Cloud data warehousing solution. Easily build and train machine learning models using SQL within Amazon Redshift to generate predictive analytics and propel data-driven decision-making.

Data Warehouse

Data Warehouse Machine Learning Data-driven Data Lake

Connect your data for faster decisions with AWS

AWS Big Data

NOVEMBER 7, 2023

In this post, we discuss how we’re delivering on these investments with a number of data integration innovations that span AWS databases, analytics, business intelligence (BI), and ML services. Summary The data integration innovations we have highlighted show our commitment to empowering organizations to easily connect their data.

Dashboards

Dashboards Data-driven Data Integration Data Lake

The Enduring Significance of Data Modeling in the Modern Data-Driven Enterprise

erwin

AUGUST 31, 2023

Reduced Data Redundancy : By eliminating data duplication, it optimizes storage and enhances data quality, reducing errors and discrepancies. Efficient Development : Accurate data models expedite database development, leading to efficient data integration, migration, and application development. by Quest ®.

Data-driven

Data-driven Modeling Enterprise Structured Data

Process price transparency data using AWS Glue

AWS Big Data

MAY 4, 2023

This post walks you through the preprocessing and processing steps required to prepare data published by health insurers in light of this federal regulation using AWS Glue. getvalue(),encoding='utf-8') s3_client.put_object(Body=data, Bucket=bucket, Key=upload_path) s3_client = boto3.client('s3')

Insurance

Insurance Publishing Cost-Benefit Data Lake

Data Leaders Brief

Migrate an existing data lake to a transactional data lake using Apache Iceberg

My Understanding of the Gartner® Hype Cycle™ for Finance Data and Analytics Governance, 2023

Webinars

Trending Sources

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics

Webinars

Simplifying data processing at Capitec with Amazon Redshift integration for Apache Spark

Load data incrementally from transactional data lakes to data warehouses

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics, Part 3: Visualization and trend analysis using Amazon QuickSight

5 financial planning software capabilities that drive business value

Amazon Redshift announcements at AWS re:Invent 2023 to enable analytics on all your data

Improve healthcare services through patient 360: A zero-ETL approach to enable near real-time data analytics

Introducing native support for Apache Hudi, Delta Lake, and Apache Iceberg on AWS Glue for Apache Spark, Part 2: AWS Glue Studio Visual Editor

Preparing the foundations for Generative AI

Getting started with AWS Glue Data Quality from the AWS Glue Data Catalog

Introducing AWS Glue serverless Spark UI for better monitoring and troubleshooting

Your guide to AWS Analytics at AWS re:Invent 2023

Modeling, Modernization and Automation

AWS re:Invent 2023 Amazon Redshift Sessions Recap

Connect your data for faster decisions with AWS

The Enduring Significance of Data Modeling in the Modern Data-Driven Enterprise

Process price transparency data using AWS Glue

Stay Connected