OLAP and Hadoop: The 4 Differences You Should Know

OLAP and Hadoop are not the same. OLAP (online analytical processing) is a technology for multi-dimensional analytics such as reporting and data mining. Multi-dimensional analysis products date back to the 1970s, although the term OLAP itself was coined in 1993. Hadoop is a technology for massive computation on large data sets. It grew out of the Nutch project, which started in 2002, and was split out as a separate project in 2006. The two can be used together, but there are differences to weigh when choosing between Hadoop/MapReduce data processing and classic OLAP. For this discussion, let's set aside price and assume the business needs have been thought through.

1 Processing Type

For reporting on transactional data and for data mining, use OLAP. For analytics and data discovery, use Hadoop. For known, cleaned data and processes that yield definitive results of high integrity, use OLAP. For unknown, messier data and processes that yield suggestive results, use Hadoop. For example, use OLAP for weather sensor readings, but Hadoop for weather models. OLAP can perform fast reads on high-end servers. Hadoop can perform fast reads and writes across distributed servers.

2 Data Size

OLAP is meant to operate on pre-aggregated data built from a massive number of records. It has good throughput when reading many small records in a data warehouse. Hadoop is meant to operate on massive un-aggregated data held in a lower number of objects. It has high throughput when reading larger objects in a data lake (Harris, n.d.). Does the business need more of the smaller objects or fewer of the larger objects? For example, if summing records is important, then OLAP is good. But if audio analysis is important, then Hadoop is good. Overall, Hadoop has the higher raw throughput.
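The contrast between pre-aggregated and un-aggregated data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `records` data, the `build_cube` helper, and the `map_reduce_total` helper are hypothetical names made up for this example, and a plain dictionary stands in for both an OLAP cube and a MapReduce job.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical raw records: (region, month, amount)
records = [
    ("east", "jan", 100),
    ("east", "feb", 150),
    ("west", "jan", 200),
    ("west", "feb", 50),
]

def build_cube(rows):
    """OLAP-style: aggregate once up front so later reads are lookups."""
    cube = defaultdict(int)
    for region, month, amount in rows:
        cube[(region, month)] += amount  # finest grain
        cube[(region, "*")] += amount    # roll-up over months
        cube[("*", month)] += amount     # roll-up over regions
        cube[("*", "*")] += amount       # grand total
    return dict(cube)

def map_reduce_total(rows, region):
    """MapReduce-style: scan the raw records at query time."""
    mapped = (amount for r, _, amount in rows if r == region)  # map step
    return reduce(lambda a, b: a + b, mapped, 0)               # reduce step

cube = build_cube(records)
print(cube[("east", "*")])                # 250, answered by lookup
print(map_reduce_total(records, "east"))  # 250, answered by a full scan
```

Both calls return the same total. The difference is where the work happens: the cube pays for aggregation when it is built, while the scan pays for it on every query but never needs the data to be shaped in advance.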

3 Interaction

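OLAP is typically queried with SQL, which is based on a relational DB model. Hadoop is commonly queried with HQL (Hive Query Language, also written HiveQL), a SQL-like language that Hive translates into jobs that run on the cluster. Hadoop jobs can also be written directly in a programming language such as Java. HQL has been described as combining object-oriented programming with relational DB concepts (Jeyakanth, 2017), although that description matches Hibernate Query Language, a different technology that shares the acronym. OLAP is good for select-heavy work over relational tables, alongside the usual update, insert, and delete. Hadoop is good for any other manner of object.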
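As a rough illustration of the SQL side of this comparison, the sketch below runs a small aggregate query from Python. It assumes an in-memory SQLite database as a stand-in for an OLAP store, and the `tickets` table and its columns are hypothetical, invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for a relational/OLAP store (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (precinct TEXT, violation TEXT, fine REAL)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        ("north", "speeding", 120.0),
        ("north", "parking", 40.0),
        ("south", "speeding", 150.0),
    ],
)

# A typical reporting query: group by one dimension and aggregate a measure.
rows = conn.execute(
    "SELECT precinct, COUNT(*), SUM(fine) "
    "FROM tickets GROUP BY precinct ORDER BY precinct"
).fetchall()
print(rows)  # [('north', 2, 160.0), ('south', 1, 150.0)]
conn.close()
```

A query of this shape (SELECT with GROUP BY and aggregate functions) is also valid in Hive's query language, so the visible difference for the analyst is often less about syntax and more about where and how the query is executed.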

4 Data Structure

OLAP is meant for a structured dimensional model. It scales well vertically. OLAP likes more of the same things in a relational table. Hadoop, by contrast, is meant for unstructured data and scales well horizontally. Hadoop likes more of different things stored as key/value pairs. Thus, the source of the data is an important consideration. For example, use OLAP for more police ticket transactions and Hadoop for more body cam data. Overall, Hadoop will be better for the maximum total storage needs.
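The key/value idea can be sketched as a tiny map, shuffle, and reduce pipeline in plain Python. This is a single-process illustration, not Hadoop itself: the `objects` list, its fields, and the `map_phase`, `shuffle`, and `reduce_phase` helpers are all hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical objects with differing fields, unlike rows in a fixed table.
objects = [
    {"type": "video", "officer": "A12", "seconds": 600, "codec": "h264"},
    {"type": "audio", "officer": "A12", "seconds": 120},
    {"type": "video", "officer": "B07", "seconds": 300, "gps": (41.9, -87.6)},
]

def map_phase(obj):
    """Emit a (key, value) pair; extra fields are simply ignored."""
    yield obj["officer"], obj["seconds"]

def shuffle(pairs):
    """Group values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Combine all values that share a key."""
    return key, sum(values)

pairs = [kv for obj in objects for kv in map_phase(obj)]
totals = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(totals)  # {'A12': 720, 'B07': 300}
```

Because each object only has to supply the fields the map step reads, records of different shapes can sit side by side. In a real cluster the map and reduce steps would run on many machines, which is where the horizontal scaling comes from.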

Conclusion, OLAP and Hadoop

In most cases, Hadoop can do what OLAP does. OLAP might still be needed if there is a legacy system to consider, if you only need reporting, or if technology maturity is a driver. Generally, however, I lean toward Hadoop/MapReduce over OLAP.



Rick Kapalko

With degrees in Analysis and Management, Mr. Kapalko has spent two decades in both project management of agile development and contract management of operations optimization. He is adept at managing the solution path - realizing business value from engineering projects, people, processes, and data.
