Data Lake, Data Processing and Strategy

Migrate an existing data lake to a transactional data lake using Apache Iceberg

AWS Big Data

OCTOBER 3, 2023

A data lake is a centralized repository that you can use to store all your structured and unstructured data at any scale. You can store your data as-is, without having to first structure the data and then run different types of analytics for better business insights. Open AWS Glue Studio. Choose ETL Jobs.

Data Lake, Metadata, Snapshot, Recreation/Entertainment

Build a serverless transactional data lake with Apache Iceberg, Amazon EMR Serverless, and Amazon Athena

AWS Big Data

MARCH 10, 2023

Since the deluge of big data over a decade ago, many organizations have learned to build applications to process and analyze petabytes of data. Data lakes have served as a central repository to store structured and unstructured data at any scale and in various formats.

Data Lake, Sales, Data Warehouse, Snapshot

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

Orca Security’s journey to a petabyte-scale data lake with Apache Iceberg and AWS Analytics

AWS Big Data

JULY 20, 2023

With data becoming the driving force behind many industries today, having a modern data architecture is pivotal for organizations to be successful. In this post, we describe Orca’s journey building a transactional data lake using Amazon Simple Storage Service (Amazon S3), Apache Iceberg, and AWS Analytics.

Data Lake, Analytics, Snapshot, Optimization

Create an end-to-end data strategy for Customer 360 on AWS

AWS Big Data

MARCH 26, 2024

A Gartner Marketing survey found only 14% of organizations have successfully implemented a C360 solution, due to lack of consensus on what a 360-degree view means, challenges with data quality, and lack of cross-functional governance structure for customer data.

Data Strategy, Strategy, Data Warehouse, Prescriptive Analytics

Governing data in relational databases using Amazon DataZone

AWS Big Data

MAY 7, 2024

Amazon DataZone allows you to simply and securely govern end-to-end data assets stored in your Amazon Redshift data warehouses or data lakes cataloged with the AWS Glue Data Catalog. Note that a managed data asset is an asset for which Amazon DataZone can manage permissions.

Metadata, Data Lake, Data Processing, Data-driven

Petabyte-scale log analytics with Amazon S3, Amazon OpenSearch Service, and Amazon OpenSearch Ingestion

AWS Big Data

MARCH 7, 2024

At the same time, they need to optimize operational costs to unlock the value of this data for timely insights, and do so with consistent performance. With this massive data growth, data proliferation across your data stores, data warehouses, and data lakes can become equally challenging.

Data Lake, Analytics, Dashboards, Metrics

Empowering data-driven excellence: How the Bluestone Data Platform embraced data mesh for success

AWS Big Data

FEBRUARY 27, 2024

Each data producer within the organization has its own data lake in Apache Hudi format, ensuring data sovereignty and autonomy. This enables data-driven decision-making across the organization. Lake Formation – Lake Formation emerged as a cornerstone in Bluestone’s data governance strategy.

Data-driven, Data Lake, Data Quality, Data Governance

CIOs weigh where to place AI bets — and how to de-risk them

CIO Business Intelligence

MARCH 18, 2024

The CIO has strategies in place to address all three. Though it operates a multicloud environment, the agency has most of its cloud implementations hosted on Microsoft Azure, with some on AWS and some on ServiceNow’s 311 citizen information platform. AI tools rely on the data in use in these solutions.

Risk, Cost-Benefit, Data Processing, Testing

Dairyland powers up for a generative AI edge

CIO Business Intelligence

APRIL 9, 2024

Beginning in 2021, the Minneapolis-based Microsoft partner helped Dairyland migrate from several custom legacy applications to a commercial implementation of Dynamics 365 and an Azure data lake, which set the stage for the power company’s early foray into AI, according to the systems integrator.

Digital Transformation, Machine Learning, Data Lake, Software

The Strategy Behind our Denodo Partner Program

Data Virtualization

SEPTEMBER 23, 2020

I’m referring not only to our technology partners, but also to our cloud partners that host the Denodo Platform. Denodo is a very partner-friendly company, and here I’d like to share some thoughts about how Denodo works with our partners.

Strategy, Data Processing, Technology, Digital Transformation

TDC Digital leverages IBM Cloud for transparent billing and improved customer satisfaction

IBM Big Data Hub

MAY 19, 2023

According to the research, organizations are adopting cloud ERP models to identify the best alignment with their strategy, business development, workloads and security requirements. Furthermore, TDC Digital had not used any cloud storage solution and experienced latency and downtime while hosting the application in its data center.

Unstructured Data, Data Processing, Manufacturing, Data Lake

What a quarter century of digital transformation at PayPal looks like

CIO Business Intelligence

OCTOBER 4, 2023

One strategy, five keys. From a technological point of view, the brand’s strategic engine is divided into five investment areas. At the lowest layer is the infrastructure, made up of databases and data lakes. These applications live on innumerable servers, yet some technology is hosted in the public cloud.

Digital Transformation, Deep Learning, Data Lake, Risk

Real-time streaming data top picks you cannot miss at AWS re:Invent 2023

AWS Big Data

NOVEMBER 8, 2023

Putting your data to work with generative AI – Innovation Talk. Thursday, November 30 | 12:30 – 1:30 PM PST | The Venetian. Join Mai-Lan Tomsen Bukovec, Vice President, Technology at AWS, to learn how you can turn your data lake into a business advantage with generative AI. Reserve your seat now!

Data-driven, Data Lake, Machine Learning, Cost-Benefit

Periscope Data Expands to Israel, Empowering Data Teams with Powerful Tools

Sisense

DECEMBER 11, 2019

The challenge is to do it right, and a crucial way to achieve it is with decisions based on data and analysis that drive measurable business results. This was the key learning from the Sisense event heralding the launch of Periscope Data in Tel Aviv, Israel — the beating heart of the startup nation. What VCs want from startups.

Data Lake, Big Data, Sales, Data-driven

Running both IT and digital at Alorica

CIO Business Intelligence

JUNE 1, 2022

Finally, make sure you understand your data, because no machine learning solution will work for you if you aren’t working with the right data. Data lakes have a new consumer in AI. Many of our service-based offerings include hosting and executing our customers’ omnichannel platforms.

IT Interactive Marketing Consulting

5 ways to maximize your cloud investment

CIO Business Intelligence

JANUARY 10, 2024

“FinOps is part of the equation, but from a CIO perspective, you need a top-down view that starts with the strategy before you talk about the components of it,” McMasters says. What’s the business case for use of the technology, and the strategy for a two- to three-year period, and where do we need to be two to three years from now?

Cost-Benefit, Measurement, Optimization, Metrics

CDP Private Cloud is a Game-changer for Partners

Cloudera

SEPTEMBER 2, 2020

Recently, Cloudera announced the release of Cloudera CDP Private Cloud, delivering the final component of our hybrid cloud strategy. Additionally, lines of business (LOBs) are able to gain access to a shared data lake that is secured and governed by the use of Cloudera Shared Data Experience (SDX).

Cost-Benefit, Data Warehouse, Data Lake, Machine Learning

10 Things AWS Can Do for Your SaaS Company

Smart Data Collective

FEBRUARY 20, 2022

Data storage databases. Your SaaS company can store and protect any amount of data using Amazon Simple Storage Service (S3), which is ideal for data lakes, cloud-native applications, and mobile apps. Well, let’s find out. Artificial intelligence (AI). Easy to use. Hopefully, it was informative and helpful to you.

Cost-Benefit, Data Lake, Software, Machine Learning

How The Cloud Made ‘Data-Driven Culture’ Possible | Part 1

BizAcuity

MAY 10, 2022

Companies planning to scale their business in the next few years without a definite cloud strategy might want to reconsider. Fourteen years later, in 2020, the pandemic demanded remote work and overnight revisions to business strategy. The platform is built on S3 and EC2 using a hosted Hadoop framework. The rest is history.

Data-driven, IoT, Unstructured Data, Data Lake

Accelerating revenue growth with real-time analytics: Poshmark’s journey

AWS Big Data

MARCH 20, 2023

The data from the Kinesis data stream is consumed by two applications: A Spark streaming application on Amazon EMR is used to write data from the Kinesis data stream to a data lake hosted on Amazon Simple Storage Service (Amazon S3) in a partitioned way.

Analytics, Slice and Dice, Data Processing, Data Lake

5 misconceptions about cloud data warehouses

IBM Big Data Hub

FEBRUARY 2, 2023

Misconception 5: Cloud data warehouses reduce control over your deployment. Some DBAs believe that cloud data warehouses lack the control and flexibility of on-prem data warehouses, making it harder to respond to security threats, performance issues or disasters.

Data Warehouse, Cost-Benefit, Unstructured Data, Data Architecture

Build incremental data pipelines to load transactional data changes using AWS DMS, Delta 2.0, and Amazon EMR Serverless

AWS Big Data

MARCH 3, 2023

Building data lakes from continuously changing transactional data of databases and keeping data lakes up to date is a complex task and can be an operational challenge. You can then apply transformations and store data in Delta format for managing inserts, updates, and deletes.

Data Lake, Dashboards, Metrics, Metadata

Build efficient ETL pipelines with AWS Step Functions distributed map and redrive feature

AWS Big Data

DECEMBER 18, 2023

Solution overview: One of the common functionalities involved in data pipelines is extracting data from multiple data sources and exporting it to a data lake or synchronizing the data to another database. There are multiple tables related to customers and order data in the RDS database.

Metadata, Visualization, Data Lake, Data-driven

Accomplish Agile Business Intelligence & Analytics For Your Business

datapine

APRIL 15, 2020

When it comes to implementing and managing a successful BI strategy, we have always proclaimed: start small, use the right BI tools, and involve your team. You need to determine if you are going with an on-premise or cloud-hosted strategy. You want an organization-wide buy-in of your business intelligence strategy.

Business Intelligence, Analytics, Testing, Dashboards

How Amazon Finance Automation built a data mesh to support distributed data ownership and centralize governance

AWS Big Data

JULY 14, 2023

Consumers prioritized data discoverability, fast data access, low latency, and high accuracy of data. These inputs reinforced the need for a unified data strategy across the FinOps teams. We decided to build a scalable data management product that is based on the best practices of modern data architecture.

Finance, Metadata, Big Data, Recreation/Entertainment

Announcing the 2021 Data Impact Awards

Cloudera

MAY 12, 2021

2020 saw us hosting our first ever fully digital Data Impact Awards ceremony, and it certainly was one of the highlights of our year. We saw a record number of entries and incredible examples of how customers were using Cloudera’s platform and services to unlock the power of data. DATA FOR GOOD.

Digital Transformation, Machine Learning, Optimization, Data Lake

Analyze Amazon S3 storage costs using AWS Cost and Usage Reports, Amazon S3 Inventory, and Amazon Athena

AWS Big Data

FEBRUARY 2, 2023

Since its launch in 2006, Amazon Simple Storage Service (Amazon S3) has experienced major growth, supporting multiple use cases such as hosting websites, creating data lakes, serving as object storage for consumer applications, storing logs, and archiving data. This could be your data lake or application S3 bucket.

Reporting, Data Lake, Management, Optimization

Unlocking Data Storage: The Traditional Data Warehouse vs. Cloud Data Warehouse

Sisense

NOVEMBER 12, 2020

The warehouse being hosted in the cloud makes it more accessible, and with a rise in cloud SaaS products, integrating a company’s myriad cloud apps (Salesforce, Marketo, etc.) with a cloud data warehouse is simple. Data lakes are essentially sets of structured and unstructured data living in flat files in some kind of data storage.

Data Warehouse, Data Lake, OLAP, Data-driven

Federate Amazon QuickSight access with open-source identity provider Keycloak

AWS Big Data

JUNE 13, 2023

Organizations are working toward centralizing their identity and access strategy across all their applications, including on-premises and third-party. Insert your specific host domain name where the Keycloak application resides in the following URL: [link] /realms/aws-realm/protocol/saml/descriptor.

Metadata, Dashboards, Business Intelligence, Management

Stitch Fix seamless migration: Transitioning from self-managed Kafka to Amazon MSK

AWS Big Data

SEPTEMBER 22, 2023

At Stitch Fix, we have been powered by data science since its foundation and rely on many modern data lake and data processing technologies. In our infrastructure, Apache Kafka has emerged as a powerful tool for managing event streams and facilitating real-time data processing.

Management, Metrics, Cost-Benefit, Data Lake

Accelerate your data warehouse migration to Amazon Redshift – Part 7

AWS Big Data

OCTOBER 17, 2023

Tens of thousands of customers use Amazon Redshift to gain business insights from their data. With Amazon Redshift, you can use standard SQL to query data across your data warehouse, operational data stores, and data lake. After you install the data extraction agent, register it in AWS SCT.

Data Warehouse, Data Processing, Data Lake, Management

Themes and Conferences per Pacoid, Episode 8

Domino Data Lab

APRIL 3, 2019

Fun fact: I co-founded an e-commerce company (realistically, a mail-order catalog hosted online) in December 1992 using one of those internetworking applications called Gopher, which was vaguely popular at the time. So we had three tiers providing a separation of concerns: presentation, logic, data. the flywheel effect.

Data Governance, Machine Learning, Metadata, Big Data

Extreme data center pressure? Burst to the cloud with CDP!

Cloudera

NOVEMBER 12, 2020

Moving to a cloud-only based model allows for flexible provisioning, but the costs accrued for that strategy rapidly negate the advantage of flexibility. For example, the bank from our example might have separate destination data lakes for their perpetual and periodic workloads to support addressing these VIP workloads separately.

Data Warehouse, Reporting, Risk, Cost-Benefit

The Gartner 2021 Leadership Vision for Data & Analytics Leaders Webinar Q&A

Andrew White

JANUARY 11, 2021

On January 4th I had the pleasure of hosting a webinar. It was titled, The Gartner 2021 Leadership Vision for Data & Analytics Leaders. This was for the Chief Data Officer, or head of data and analytics. Do you recommend a consulting approach strategy rather than a CDO strategy? Governance.

Data Analytics, Analytics, Data-driven, Finance

What is Data Mapping?

Jet Global

FEBRUARY 23, 2024

An on-premise solution provides a high level of control and customization as it is hosted and managed within the organization’s physical infrastructure, but it can be expensive to set up and maintain. Next, identify the data sources that will be involved in the mapping.

Data Warehouse, Reporting, Data Transformation, Sales

FINRA CIO Steve Randich pushes the public cloud forward

CIO Business Intelligence

FEBRUARY 10, 2023

While managing unstructured data remains a challenge for 36% of organizations, according to the 2022 Foundry Data and Analytics Research survey, many IT leaders are actively seeking ways of harnessing all types of data stored in data lakes.

Unstructured Data, Data Lake, Machine Learning, Enterprise

CIOs rise to the ESG reporting challenge

CIO Business Intelligence

JANUARY 30, 2024

“Always the gatekeepers of much of the data necessary for ESG reporting, CIOs are finding that companies are even more dependent on them,” says Nancy Mentesana, ESG executive director at Labrador US, a global communications firm focused on corporate disclosure documents. “There are several things you need to report attached to that number.”

Reporting, Data Quality, Strategy, Data-driven

UAB IT helps fuel genomic breakthroughs

CIO Business Intelligence

MARCH 10, 2022

And analyst Dr. Nimita Limaye, research vice president of life sciences R&D strategy and technology at IDC, says such IT efforts at institutions like UAB are vital given competition from the private sector. Next up: AI and data lake decisions.

IT, Data Lake, Digital Transformation, Data Governance

10 Keys to a Secure Cloud Data Lakehouse

Cloudera

OCTOBER 25, 2022

The data lakehouse is gaining in popularity because it enables a single platform for all your enterprise data with the flexibility to run any analytic and machine learning (ML) use case. Cloud data lakehouses provide significant scaling, agility, and cost advantages compared to cloud data lakes and cloud data warehouses.

Data Processing, Data Lake, Cost-Benefit, Risk

Announcing the 2020 Data Impact Award Winners

Cloudera

NOVEMBER 18, 2020

In fact, each of the 29 finalists represented organizations running cutting-edge use cases that showcase a winning enterprise data cloud strategy. The technological linchpin of its digital transformation has been its Enterprise Data Architecture & Governance platform.

Internet Publishing and Broadcasting, Data-driven, Broadcasting, Digital Transformation

Capital Group invests big in talent development

CIO Business Intelligence

JULY 29, 2022

That focus includes not only the firm’s customer-facing strategies but also its commitment to investing in the development of its employees, a strategy that is paying off, as evidenced by Capital Group’s No. The bootcamp broadened my understanding of key concepts in data engineering.

Data Lake, Software, Data Processing, Structured Data

Get Your Analytics Insights Instantly – Without Abandoning Central IT

Cloudera

JANUARY 21, 2021

In one institution we recently spoke with, they told us it took them over 30 weeks to procure and deploy a new data warehouse, while with CDW they got everything up and running in just a few seconds (after, of course, a few days obtaining the data migration and policy clearances involved). Central control of security and governance.

Data Lake, Data Warehouse, IT, Analytics

Top 15 data management platforms available today

CIO Business Intelligence

SEPTEMBER 22, 2023

The term “data management platform” can be confusing because, while it sounds like a generalized product that works with all forms of data as part of generalized data management strategies, the term has been more narrowly defined of late as one targeted to marketing departments’ needs. Of course, marketing also works.

Management, Advertising, Data Lake, Sales

Modern Data Architecture for Telecommunications

Cloudera

SEPTEMBER 6, 2022

Previously, there were three types of data structures in telco: Entity data sets, i.e. marketing data lakes. The result has been an extraordinary volume of data redundancy across the business, leading to disaggregated data strategy, unknown compliance exposures, and inconsistencies in data-based processes.

Data Architecture, Cost-Benefit, Digital Transformation, Business Driver
