Migrate Hive data from CDH to CDP public cloud

Cloudera

This blog post outlines detailed step-by-step instructions for performing Hive replication from an on-prem CDH cluster to a CDP Public Cloud Data Lake. CDP Data Lake cluster versions: CM 7.4.0, Runtime 7.2.8. Pre-Check: Data Lake Cluster. Understanding Ranger Policies in Data Lake Cluster.

Convergent Evolution

Peter James Thomas

No, this article has not escaped from my Maths & Science section; it is actually about data matters. The image at the start of this article is of an Ichthyosaur (top) and a Dolphin. That was the Science, here comes the Technology… A Brief Hydrology of Data Lakes.

Decoding Data Analyst Job Description: Skills, Tools, and Career Paths

FineReport

Given the critical role they play, employers actively seek data analysts to enhance efficiency and stimulate growth. This article explores the data analyst job description, covering essential skills, tools, education, certifications, and experience. SQL manages and retrieves data from databases, handling larger datasets.

Demystifying Modern Data Platforms

Cloudera

Mark: The first element in the process is the link between the source data and the entry point into the data platform. At Ramsey International (RI), we refer to that layer in the architecture as the foundation, but others call it a staging area, raw zone, or even a source data lake.

New Thinking, Old Thinking and a Fairytale

Peter James Thomas

Of course, it can be argued that you can use statistics (and Google Trends in particular) to prove anything [1], but I found the above figures striking. Figures suggest that both BPR and Data Warehouse programmes have a failure rate of 60 – 70% [5]. Source: Google Trends. Gentlemen (and Ladies) Place your Bets.

The Data Scientist’s Guide to the Data Catalog

Alation

The catalog facilitates the synergy of the domain experts’ subject matter expertise with the data scientists’ statistical and coding expertise. Finally, a data catalog can help data scientists find answers to their questions (and avoid re-asking questions that have already been answered). Communicate and Visualize Results.

Fact-based Decision-making

Peter James Thomas

This article is about facts. These normally appear at the end of an article, but it seemed to make sense to start with them in this case: Recently I published Building Momentum – How to begin becoming a Data-driven Organisation. A number of factors can play into the accuracy of data capture. Up-front Acknowledgements.
