Remove Article Remove Data Lake Remove Metadata Remove Statistics
article thumbnail

Migrate Hive data from CDH to CDP public cloud

Cloudera

This blog post outlines detailed step by step instructions to perform Hive Replication from an on-prem CDH cluster to a CDP Public Cloud Data Lake. CDP Data Lake cluster versions – CM 7.4.0, Pre-Check: Data Lake Cluster. Understanding Ranger Policies in Data Lake Cluster. Runtime 7.2.8.

article thumbnail

Convergent Evolution

Peter James Thomas

No this article has not escaped from my Maths & Science section , it is actually about data matters. The image at the start of this article is of an Ichthyosaur (top) and Dolphin. That was the Science, here comes the Technology… A Brief Hydrology of Data Lakes.

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Demystifying Modern Data Platforms

Cloudera

Mark: The first element in the process is the link between the source data and the entry point into the data platform. At Ramsey International (RI), we refer to that layer in the architecture as the foundation, but others call it a staging area, raw zone, or even a source data lake.

article thumbnail

The Data Scientist’s Guide to the Data Catalog

Alation

A data catalog can assist directly with every step, but model development. And even then, information from the data catalog can be transferred to a model connector , allowing data scientists to benefit from curated metadata within those platforms. How Data Catalogs Help Data Scientists Ask Better Questions.

article thumbnail

Themes and Conferences per Pacoid, Episode 12

Domino Data Lab

Paco Nathan ‘s latest monthly article covers Sci Foo as well as why data science leaders should rethink hiring and training priorities for their data science teams. In this episode I’ll cover themes from Sci Foo and important takeaways that data science teams should be tracking. Introduction. What’s a Foo?