Data Lake, Data Science and Interactive

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake

Data Lake Sales Data Warehouse Snapshot

Accelerate data science feature engineering on transactional data lakes using Amazon Athena with Apache Iceberg

AWS Big Data

JUNE 20, 2023

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon Simple Storage Service (Amazon S3) and data sources residing in AWS, on-premises, or other cloud systems using SQL or Python. He works with enterprise customers building data products, analytics platforms, and solutions on AWS.

Data Lake

Data Lake Data Science Recreation/Entertainment Experimentation

5 things on our data and AI radar for 2021

O'Reilly on Data

FEBRUARY 19, 2021

The Right Solution for Your Data: Cloud Data Lakes and Data Lakehouses. Data lakes have experienced a fairly robust resurgence over the last few years, specifically cloud data lakes. A Wave of Cloud-Native, Distributed Data Frameworks. What will that lead to in 2021?

Data Lake

Data Lake Data Warehouse Machine Learning Modeling

Create an Apache Hudi-based near-real-time transactional data lake using AWS DMS, Amazon Kinesis, AWS Glue streaming ETL, and data visualization using Amazon QuickSight

AWS Big Data

AUGUST 3, 2023

Data analytics on operational data at near-real time is becoming a common need. Due to the exponential growth of data volume, it has become common practice to replace read replicas with data lakes to have better scalability and performance. For more information, see Changing the default settings for your data lake.

Data Lake

Data Lake Visualization Dashboards Insurance

Azure Data Sources for Data Science and Machine Learning

Jen Stirrup

MAY 5, 2020

Recently, I gave a Make Your Data Work Monday webinar on the complexities of the data sources for data science in Azure, and I thought it important enough to turn into an actual post. How can you differentiate the different opportunities to store your data in Azure?

Machine Learning

Machine Learning Data Science Data Lake Big Data

The Future of the Data Lakehouse – Open

Cloudera

JUNE 18, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes. Iterations of the lakehouse.

Data Lake

Data Lake Data Warehouse Machine Learning Cost-Benefit

Why the Data Journey Manifesto?

DataKitchen

JUNE 12, 2023

We had been talking about “Agile Analytic Operations,” “DevOps for Data Teams,” and “Lean Manufacturing For Data,” but the concept was hard to get across and communicate. I spent much time de-categorizing DataOps: we are not discussing ETL, Data Lake, or Data Science.

Testing

Testing Data Lake Dashboards Data Science

The Future of the Data Lakehouse – Open

CIO Business Intelligence

JUNE 23, 2022

Cloudera customers run some of the biggest data lakes on earth. These lakes power mission critical large scale data analytics, business intelligence (BI), and machine learning use cases, including enterprise data warehouses. On data warehouses and data lakes. Iterations of the lakehouse.

Data Lake

Data Lake Data Warehouse Machine Learning Cost-Benefit

Your guide to AWS Analytics at AWS re:Invent 2023

AWS Big Data

NOVEMBER 13, 2023

11:30 AM – 12:30 PM (PDT) Caesars Forum ANT318 | Accelerate innovation with end-to-end serverless data architecture. 4:30 PM – 5:30 PM (PDT) Wynn ANT207 | Understand your data with business context. 1:00 PM – 2:00 PM (PDT) Venetian ANT201 | Accelerate innovation with real-time data.

Analytics

Analytics Data Lake Data Warehouse Data-driven

OCBC Bank Accelerates Its Data Strategy with Cloudera

Cloudera

DECEMBER 14, 2022

OCBC identified the need to upgrade its data lake technology as part of an enterprise data science initiative to introduce a more resilient infrastructure and platform capable of managing projects with increasing volume, variety and velocity of data, while also enabling real-time analytics.

Data Strategy

Data Strategy Strategy IT Contextual Data

What is a Data Pipeline?

Jet Global

MAY 9, 2024

Data pipelines are designed to automate the flow of data, enabling efficient and reliable data movement for various purposes, such as data analytics, reporting, or integration with other systems. This can include tasks such as data ingestion, cleansing, filtering, aggregation, or standardization.

Data Lake

Data Lake Data Warehouse Business Intelligence Machine Learning

Moving Enterprise Data From Anywhere to Any System Made Easy

Cloudera

JUNE 2, 2022

This blog aims to answer two questions: What is a universal data distribution service? Why does every organization need it when using a modern data stack? Every organization on the hybrid cloud journey needs the ability to take control of their data flows from origination through all points of consumption.

Enterprise

Enterprise Data Lake Data Collection Data-driven

How gaming companies can use Amazon Redshift Serverless to build scalable analytical applications faster and easier

AWS Big Data

MARCH 7, 2023

A data hub contains data at multiple levels of granularity and is often not integrated. It differs from a data lake by offering data that is pre-validated and standardized, allowing for simpler consumption by users. Data hubs and data lakes can coexist in an organization, complementing each other.

Analytics

Analytics Data Warehouse Data Lake Metadata

Accelerate Your Data Mesh in the Cloud with Cloudera Data Engineering and Modak Nabu™

Cloudera

OCTOBER 11, 2021

Modak Nabu automates repetitive tasks in the data preparation process and thus accelerates the data preparation by 4x. Modak Nabu reliably curates datasets for any line of business and personas, from business analysts to data scientists. Customers using Modak Nabu with CDP today have deployed Data Lakes and.

Data Lake

Data Lake Cost-Benefit Data-driven Dashboards

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

Customer 360 (C360) provides a complete and unified view of a customer’s interactions and behavior across all touchpoints and channels. This view is used to identify patterns and trends in customer behavior, which can inform data-driven decisions to improve business outcomes. Then, you transform this data into a concise format.

Data Strategy

Data Strategy Strategy Data Warehouse Prescriptive Analytics

DIY cloud cost management: The strategic case for building your own tools

CIO Business Intelligence

APRIL 25, 2024

At a minimum, your DIY cloud cost optimization team will require an enterprise architect who understands the technology, says Garcia, who also recommends a financial developer or somebody with financial and data science experience. You can then start building data lakes and models around your data.

Management

Management Optimization Strategy Enterprise

Bring your workforce identity to Amazon EMR Studio and Athena

AWS Big Data

MARCH 5, 2024

Amazon EMR Studio is a unified data analysis environment where you can develop data engineering and data science applications. You can now develop and run interactive queries on Amazon Athena from EMR Studio (for more details, refer to Amazon EMR Studio adds interactive query editor powered by Amazon Athena).

Data Lake

Data Lake Management Dashboards Data-driven

Turning Streams Into Data Products

Cloudera

JUNE 16, 2022

CSP was recently recognized as a leader in the 2022 GigaOm Radar for Streaming Data Platforms report. SSB provides a comprehensive interactive user interface for developers, data analysts, and data scientists to write streaming applications with industry standard SQL. “Without context, streaming data is useless.”

Data Lake

Data Lake Manufacturing Metadata Dashboards

Moving Enterprise Data From Anywhere to Any System Made Easy

CIO Business Intelligence

JULY 13, 2022

This blog aims to answer two questions: What is a universal data distribution service? Why does every organization need it when using a modern data stack? Every organization on the hybrid cloud journey needs the ability to take control of their data flows from origination through all points of consumption.

Enterprise

Enterprise Data Lake Data Collection Data-driven

What a quarter century of digital transformation at PayPal looks like

CIO Business Intelligence

OCTOBER 4, 2023

At the lowest layer is the infrastructure, made up of databases and data lakes. The fourth is called the merchant, consumer, and developer experience layer, which includes the web interface, mobile applications, and APIs that allow customers to use PayPal’s service interactively and programmatically.

Digital Transformation

Digital Transformation Deep Learning Data Lake Risk

Federated Learning, Machine Learning, Decentralized Data

Cloudera

DECEMBER 8, 2020

Federated Learning is a paradigm in which machine learning models are trained on decentralized data. Instead of collecting data on a single server or data lake, it remains in place — on smartphones, industrial sensing equipment, and other edge devices — and models are trained on-device. The Turbofan Tycoon prototype.

Machine Learning

Machine Learning Data Lake Reporting Modeling

Breaking barriers in geospatial: Amazon Redshift, CARTO, and H3

AWS Big Data

MAY 16, 2024

However, visualizing and analyzing large-scale geospatial data presents a formidable challenge due to the sheer volume and intricacy of information. The need to balance detail and context while maintaining real-time interactivity can lead to issues of scalability and rendering complexity. To learn more, visit CARTO.

Data Warehouse

Data Warehouse Visualization Cost-Benefit Optimization

Celebrating Data Superheroes: The 2021 Data Impact Awards Winners

Cloudera

NOVEMBER 18, 2021

By adopting a custom developed application based on the Cloudera ecosystem, Carrefour has combined the legacy systems into one platform which provides access to customer data in a single data lake. EVA unifies data from MTN’s different operator systems, creating a 360° view of subscribers.

Data Lake

Data Lake Cost-Benefit Digital Transformation Risk

How Data is Helping Organizations to Improve the Employee Lifecycle

Cloudera

JANUARY 18, 2022

With a solution based on Cloudera Data Science Workbench (CDSW), the bank implemented a more streamlined loan approval process that reduced processing time from a week to just hours. As a result of this innovative data solution, the company helped customers while keeping its default rate low.

Data Lake

Data Lake Digital Transformation Data-driven Dashboards

Announcing the 2020 Data Impact Award Winners

Cloudera

NOVEMBER 18, 2020

OVO UnCover enables access to real-time customer data using advanced, intelligent data analytics and machine learning to personalize the customer product interaction experience. This enabled Merck KGaA to control and maintain secure data access, and greatly increase business agility for multiple users.

Internet Publishing and Broadcasting

Internet Publishing and Broadcasting Data-driven Broadcasting Digital Transformation

Analyze Amazon S3 storage costs using AWS Cost and Usage Reports, Amazon S3 Inventory, and Amazon Athena

AWS Big Data

FEBRUARY 2, 2023

Since its launch in 2006, Amazon Simple Storage Service (Amazon S3) has experienced major growth, supporting multiple use cases such as hosting websites, creating data lakes, serving as object storage for consumer applications, storing logs, and archiving data. This could be your data lake or application S3 bucket.

Reporting

Reporting Data Lake Management Optimization

Create a Value Blizzard with Snowflake and Microsoft Azure

CDW Research Hub

DECEMBER 4, 2019

It operates as a consistent data management framework to manage, move and protect data across disparate sources. Microsoft Azure Data Lake provides a way to store the data, and Snowflake acts as the data fabric in your journey.

Data Warehouse

Data Warehouse Data mining Data Lake Dashboards

Five Strategies to Accelerate Data Product Development

Cloudera

JULY 26, 2021

Once we have identified those capabilities, the second article explores how the Cloudera Data Platform delivers those prerequisite capabilities and has enabled organizations such as IQVIA to innovate in Healthcare with the Human Data Science Cloud. Business and Technology Forces Shaping Data Product Development.

Strategy

Strategy Data Science Marketing Unstructured Data

Machine Learning and AI Underpin Predictive Analytics to Achieve Clinical Breakthroughs

Cloudera

JULY 18, 2018

To arrive at quality data, organizations are spending significant levels of effort on data integration, visualization, and deployment activities. Additionally, organizations are increasingly restrained due to budgetary constraints and having limited data sciences resources.

Machine Learning

Machine Learning Predictive Analytics Analytics Prescriptive Analytics

How to use foundation models and trusted governance to manage AI workflow risk

IBM Big Data Hub

OCTOBER 16, 2023

They are used in everything from robotics to tools that reason and interact with humans. How to scale AI and ML with built-in governance. A fit-for-purpose data store built on an open lakehouse architecture allows you to scale AI and ML while providing built-in governance tools.

Risk

Risk Modeling Management Metadata

Improve productivity by using keyboard shortcuts in Amazon Athena query editor

AWS Big Data

MARCH 7, 2023

Amazon Athena is a serverless, interactive analytics service built on open-source frameworks, supporting open-table and file formats. Athena provides a simplified, flexible way to analyze petabytes of data where it lives. He has over 5 years of experience working in the field of big data and data science.

Data Lake

Data Lake Data-driven Big Data Interactive

A hybrid approach in healthcare data warehousing with Amazon Redshift

AWS Big Data

FEBRUARY 21, 2023

The data vault approach is a method and architectural framework for providing a business with data analytics services to support business intelligence, data warehousing, analytics, and data science needs. Amazon Redshift RA3 instances and Amazon Redshift Serverless are perfect choices for a data vault.

Data Warehouse

Data Warehouse Data Lake Cost-Benefit Metadata

PODCAST: Making AI Real – Episode 4: Unlocking the Value of Enterprise AI with Data Engineering Capabilities

bridgei2i

MARCH 3, 2021

Any specific client interaction you felt Prinkan recently, maybe you can share some interesting use cases on that? Curious to know, like, what keeps you busy apart from data lakes and technologies, what we just discussed? Thanks for making the time for this interaction today. Pavan: Prinkan, loved the conversation.

Enterprise

Enterprise Digital Transformation Data-driven Interactive

5 Trends in Financial Services That Will Change How You Think about Your Data

Data Virtualization

MARCH 23, 2023

Innovative new technologies are redefining the sector, shaping the services that financial organizations offer, the ways in which they interact with consumers, and the ways in which they apply. The financial services sector is accelerating its adoption of digital technology.

Interactive

Interactive Data Integration Technology Management

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

AUGUST 8, 2019

Paco Nathan’s latest monthly article covers Sci Foo as well as why data science leaders should rethink hiring and training priorities for their data science teams. In this episode I’ll cover themes from Sci Foo and important takeaways that data science teams should be tracking. Introduction.

Data Science

Data Science Machine Learning Data Governance Statistics

An A-Z Data Adventure on Cloudera’s Data Platform

Cloudera

DECEMBER 21, 2020

You will learn all the parts of Cloudera’s Data Platform that together will accelerate your everyday Data Worker tasks. This demo-guided blog aims to inspire further curiosity and learning, as well as fuel a fruitful, interactive dialogue – we welcome you to reach out to us if any particular part piques your interest.

Dashboards

Dashboards Visualization Data Warehouse Data Lake

Building Bridges: Data and BI Teams Partnering on an Analytics Solution

Sisense

JANUARY 15, 2021

The modern data team has gained traction in large part thanks to the startups in Silicon Valley that have put an emphasis on collecting, analyzing, and commoditizing data. These younger companies have invested in talent with specific data science skills, particularly with code-driven data analytics.

Analytics

Analytics Data-driven Business Intelligence Visualization

How Amazon Finance Automation built a data mesh to support distributed data ownership and centralize governance

AWS Big Data

JULY 14, 2023

For example, data producers need to onboard their dataset to the global catalog, and complete their permissions management before they can share that with consumers. We made interaction, including producer-consumer onboarding, data access request, approvals, and governance, quicker through the self-service tools in our application.

Finance

Finance Metadata Big Data Recreation/Entertainment

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

APRIL 3, 2019

The top three items are essentially “the devil you know” for firms which want to invest in data science: data platform, integration, data prep. Data governance shows up as the fourth-most-popular kind of solution that enterprise teams were adopting or evaluating during 2019. Rinse, lather, repeat.

Data Governance

Data Governance Machine Learning Metadata Big Data

Data for All: Empowering Users With AI, ML, and Analytics

Sisense

JUNE 12, 2019

For all the people in the data, or should I say “Python” business, it’s a long way from the development environment (e.g. The operational data science pipeline should be able to ingest new data hand in hand with the continuous support of model improvement which keeps the production system stable.

Analytics

Analytics Data-driven Dashboards IoT

7 key Microsoft Azure analytics services (plus one extra)

CIO Business Intelligence

JUNE 29, 2022

The recent announcement of the Microsoft Intelligent Data Platform makes that more obvious, though analytics is only one part of that new brand. Azure Data Explorer is used to store and query data in services such as Microsoft Purview, Microsoft Defender for Endpoint, Microsoft Sentinel, and Log Analytics in Azure Monitor.

Data Lake

Data Lake Analytics Data Warehouse Machine Learning

The Gartner 2021 Leadership Vision for Data & Analytics Leaders Webinar Q&A

Andrew White

JANUARY 11, 2021

As such a head of analytics, BI and data science may emerge. Are you anticipating continued separation of “BI/Analytics” teams from “Data Science” teams or are those roles merging in the years ahead? Many data science labs are set up as shared services. That’s the idea.

Data Analytics

Data Analytics Analytics Data-driven Finance

Two Acquisitions in Two Weeks!

Rita Sallam

AUGUST 17, 2016

For the past couple of years, Gartner has been describing the next wave of BI market disruption, smart data discovery, which Beyondcore has pioneered (along with IBM Watson Analytics, SparkBeyond and DataRPM). Here is what we know about Workday’s plans for Platfora: Workday will no longer sell Platfora as a standalone offering.

Scorecard

Scorecard Visualization Sales Marketing
