Blog - Data Leaders Brief

Upgrade Journey: The Path from CDH to CDP Private Cloud

Cloudera

SEPTEMBER 28, 2020

The data lifecycle model ingests data using Kafka, enriches that data with Spark-based batch process, performs deep data analytics using Hive and Impala, and finally uses that data for data science using Cloudera Data Science Workbench to get deep insights. Hive, Ranger, Atlas, Spark. Hive, Ranger, Atlas, Spark. Convert Spark 1.x

Testing

Testing Metadata Risk Data Science

Next generation tools for data science

The Unofficial Google Data Science Blog

AUGUST 31, 2016

By DAVID ADAMS Since inception, this blog has defined “data science” as inference derived from data too big to fit on a single computer. While MapReduce remains a fundamental tool, many interesting analyses require more than it can offer. Spark was developed at UC Berkeley to enable exploratory analysis and ML at scale.

Data Science

Data Science Sales Optimization Cost-Benefit

Data Leaders Brief

Upgrade Journey: The Path from CDH to CDP Private Cloud

Next generation tools for data science

Webinars

Stay Connected