Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure it, and run different types of analytics on it for better business insights.

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Interact with Apache Iceberg tables using Amazon Athena and cross account fine-grained permissions using AWS Lake Formation

AWS Big Data

Register the S3 path storing the table using Lake Formation. We register the full S3 path in Lake Formation: navigate to the Lake Formation console, and in the navigation pane, under Register and ingest, choose Data lake locations. Jack Ye is a software engineer on the Athena Data Lake and Storage team at AWS.
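
The same registration can also be scripted rather than done through the console. Here is a minimal boto3 sketch; the bucket and prefix are hypothetical placeholders.

    import boto3

    # Minimal sketch: register an S3 location with Lake Formation via the API
    # instead of the console. The bucket and prefix below are hypothetical.
    lf = boto3.client("lakeformation")

    lf.register_resource(
        ResourceArn="arn:aws:s3:::example-iceberg-bucket/warehouse/",  # hypothetical path
        UseServiceLinkedRole=True,  # let Lake Formation assume its service-linked role
    )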

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics

AWS Big Data

Use case: A typical workload for AWS Glue for Apache Spark jobs is to load data from a relational database into a data lake with SQL-based transformations. The following is a visual representation of an example job with 10 workers. When the example job ran, the workerUtilization metric showed the following trend.
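
As a sketch of how such a trend could be pulled programmatically, the following assumes the workerUtilization metric is published to CloudWatch under the Glue namespace; the metric, dimension, and job names here are assumptions to verify against the Glue documentation.

    from datetime import datetime, timedelta

    import boto3

    # Sketch: fetch the workerUtilization trend from CloudWatch.
    # Namespace, metric, and dimension names below are assumptions about how
    # the Glue job observability metrics are published; verify against the docs.
    cw = boto3.client("cloudwatch")

    now = datetime.utcnow()
    resp = cw.get_metric_statistics(
        Namespace="Glue",
        MetricName="glue.driver.workerUtilization",
        Dimensions=[
            {"Name": "JobName", "Value": "my-glue-job"},  # hypothetical job name
            {"Name": "JobRunId", "Value": "ALL"},         # aggregate across runs
            {"Name": "Type", "Value": "gauge"},
            {"Name": "ObservabilityGroup", "Value": "resourceUtilization"},
        ],
        StartTime=now - timedelta(hours=6),
        EndTime=now,
        Period=300,
        Statistics=["Average"],
    )

    for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], round(point["Average"], 3))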

Simplifying data processing at Capitec with Amazon Redshift integration for Apache Spark

AWS Big Data

The data sourcing problem: To ensure the reliability of PySpark data pipelines, it’s essential to have consistent record-level data from both dimensional and fact tables stored in the Enterprise Data Warehouse (EDW). These tables are then joined with tables from the Enterprise Data Lake (EDL) at runtime.
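
Here is a minimal PySpark sketch of that runtime join, assuming the Redshift integration for Apache Spark connector; the cluster URL, IAM role, S3 paths, table name, and join key are hypothetical placeholders.

    from pyspark.sql import SparkSession

    # Sketch of the runtime join described above, assuming the Redshift
    # integration for Apache Spark. URL, role, paths, table, and join key
    # are hypothetical placeholders.
    spark = SparkSession.builder.appName("edw-edl-join").getOrCreate()

    # Record-level read of a dimensional table from the EDW (Redshift).
    dim_customer = (
        spark.read.format("io.github.spark_redshift_community.spark.redshift")
        .option("url", "jdbc:redshift://example-cluster:5439/dev")   # hypothetical
        .option("dbtable", "edw.dim_customer")                       # hypothetical
        .option("tempdir", "s3://example-bucket/redshift-temp/")     # hypothetical
        .option("aws_iam_role", "arn:aws:iam::123456789012:role/redshift-spark")
        .load()
    )

    # Fact data already stored in the Enterprise Data Lake (EDL).
    events = spark.read.parquet("s3://example-bucket/edl/events/")   # hypothetical

    # Join EDW dimensions with EDL tables at runtime.
    enriched = events.join(dim_customer, on="customer_id", how="left")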

CIOs press ahead for gen AI edge — despite misgivings

CIO Business Intelligence

If anything, 2023 has proved to be a year of reckoning for businesses, and for IT leaders in particular, as they attempt to come to grips with the disruptive potential of generative AI, just as debates over the best path forward for AI have accelerated and regulatory uncertainty has cast a longer shadow over its outlook in the wake of these events.

Enhance monitoring and debugging for AWS Glue jobs using new job observability metrics, Part 3: Visualization and trend analysis using Amazon QuickSight

AWS Big Data

QuickSight makes it straightforward for business users to visualize data in interactive dashboards and reports. Run venv\Scripts\activate.bat to activate the virtual environment. After this step, the subsequent steps run within the bounds of the virtual environment on the client machine and interact with the AWS account as needed. Let’s drill down into the details.
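
For reference, the environment itself can be created with Python’s standard-library venv module before that activation step; a small sketch follows, where the directory name venv matches the command in the excerpt.

    # Sketch: create the virtual environment with the standard-library venv
    # module; the directory name "venv" matches the command in the excerpt.
    import venv

    venv.create("venv", with_pip=True)

    # Activation remains a shell step and differs by OS:
    #   Windows:     venv\Scripts\activate.bat
    #   Linux/macOS: source venv/bin/activate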
