2022, Metadata and Unstructured Data

2022

Metadata

Unstructured Data

What is a data architect? Skills, salaries, and how to become a data framework master

CIO Business Intelligence

OCTOBER 13, 2023

Established and emerging data technologies: Data architects need to understand established data management and reporting technologies, and have some knowledge of columnar and NoSQL databases, predictive analytics, data visualization, and unstructured data. Are data architects in demand?

Data Architecture

Data Architecture Data Warehouse Statistics Visualization

Use Apache Iceberg in a data lake to support incremental data processing

AWS Big Data

MARCH 2, 2023

Apache Iceberg is an open table format for very large analytic datasets, which captures metadata information on the state of datasets as they evolve and change over time. Iceberg has become very popular for its support for ACID transactions in data lakes and features like schema and partition evolution, time travel, and rollback.

Data Lake

Data Lake Data Processing Metadata Snapshot

Join 52,000+

Insiders

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

MORE WEBINARS

Trending Sources

The Future Is Hybrid Data, Embrace It

Cloudera

JUNE 7, 2022

In the past decade, the amount of structured data created, captured, copied, and consumed globally has grown from less than 1 ZB in 2011 to nearly 14 ZB in 2020. Impressive, but dwarfed by the amount of unstructured data, cloud data, and machine data – another 50 ZB.

IT Data Architecture Unstructured Data Big Data

Webinars

How to Optimize the Developer Experience for Monumental Impact

Generative AI Deep Dive: Advancing from Proof of Concept to Production

Understanding User Needs and Satisfying Them

Beyond the Basics of A/B Tests: Highly Innovative Experimentation Tactics You Need to Know

Leading the Development of Profitable and Sustainable Products

MORE WEBINARS

The Future Is Hybrid Data, Embrace It

CIO Business Intelligence

JUNE 23, 2022

IT Data Architecture Unstructured Data Big Data

Demystifying Modern Data Platforms

Cloudera

SEPTEMBER 15, 2022

July brings summer vacations, holiday gatherings, and for the first time in two years, the return of the Massachusetts Institute of Technology (MIT) Chief Data Officer symposium as an in-person event. A key area of focus for the symposium this year was the design and deployment of modern data platforms.

Data Lake

Data Lake Data Architecture Data-driven Data Warehouse

Optimize data layout by bucketing with Amazon Athena and AWS Glue to accelerate downstream queries

AWS Big Data

APRIL 25, 2024

In the era of data, organizations are increasingly using data lakes to store and analyze vast amounts of structured and unstructured data. Data lakes provide a centralized repository for data from various sources, enabling organizations to unlock valuable insights and drive data-driven decision-making.

Optimization

Optimization Data Lake Cost-Benefit Reporting

Backtesting index rebalancing arbitrage with Amazon EMR and Apache Iceberg

AWS Big Data

JULY 3, 2023

Terminology Let’s first discuss some of the terminology used in this post: Research data lake on Amazon S3 – A data lake is a large, centralized repository that allows you to manage all your structured and unstructured data at any scale. One of the key features of Iceberg is its support for scalable data versioning.

Snapshot

Snapshot Data Lake Testing Strategy

Habib Bank manages data at scale with Cloudera Data Platform

Cloudera

NOVEMBER 17, 2022

While Cloudera CDH was already a success story at HBL, in 2022, HBL identified the need to move its customer data centre environment from Cloudera’s CDH to Cloudera Data Platform (CDP) Private Cloud to accommodate growing volumes of data. Smooth, hassle-free deployment in just six weeks.

Management

Management Data Lake Consulting Unstructured Data

The Modern Data Lakehouse: An Architectural Innovation

Cloudera

SEPTEMBER 9, 2022

Imagine quickly answering burning business questions nearly instantly, without waiting for data to be found, shared, and ingested. Imagine independently discovering rich new business insights from both structured and unstructured data working together, without having to beg for data sets to be made available.

Metadata

Metadata Machine Learning Unstructured Data Data Lake

Choosing an open table format for your transactional data lake on AWS

AWS Big Data

JUNE 9, 2023

Iceberg doesn’t optimize file sizes or run automatic table services (for example, compaction or clustering) when writing, so streaming ingestion will create many small data and metadata files. Metadata table s eliminate slow S3 file listing operations.

Data Lake

Data Lake Metadata Optimization Statistics

Better Analytics Through AI: Our Take on Gartner’s AI Trends

Sisense

AUGUST 21, 2020

Knowledge graphs will be the base of how the data models and data stories are created, first as relatively stable creatures and, in the future, as on-demand, per each question. Trend 5: Augmented data management. Gartner: “Augmented data management uses ML and AI techniques to optimize and improve operations.

Analytics

Analytics Machine Learning Dashboards Visualization

Biggest Trends in Data Visualization Taking Shape in 2022

Smart Data Collective

OCTOBER 13, 2021

There is no disputing the fact that the collection and analysis of massive amounts of unstructured data has been a huge breakthrough. We would like to talk about data visualization and its role in the big data movement. How does Data Virtualization complement Data Warehousing and SOA Architectures?

Visualization

Visualization Cost-Benefit Big Data Prescriptive Analytics

Top Takeaways from the Gartner® Innovation Insight: Data Security Posture Management

Laminar Security

MAY 3, 2023

According to our recent State of Cloud Data Security Report 2023 , 77% of organizations experienced a cloud data breach in 2022. That’s particularly concerning considering that 60% of worldwide corporate data was stored in the cloud during that same period.

Management

Management Risk Risk Management Data Processing

Data Leaders Brief

What is a data architect? Skills, salaries, and how to become a data framework master

Use Apache Iceberg in a data lake to support incremental data processing

Webinars

Trending Sources

The Future Is Hybrid Data, Embrace It

Webinars

The Future Is Hybrid Data, Embrace It

Demystifying Modern Data Platforms

Optimize data layout by bucketing with Amazon Athena and AWS Glue to accelerate downstream queries

Backtesting index rebalancing arbitrage with Amazon EMR and Apache Iceberg

Habib Bank manages data at scale with Cloudera Data Platform

The Modern Data Lakehouse: An Architectural Innovation

Choosing an open table format for your transactional data lake on AWS

Better Analytics Through AI: Our Take on Gartner’s AI Trends

Biggest Trends in Data Visualization Taking Shape in 2022

Top Takeaways from the Gartner® Innovation Insight: Data Security Posture Management

Stay Connected