A Beginner’s Guide to Structuring Data Science Project’s Workflow

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Aside from dedication to discovery and exploration, succeeding in a Data Science project requires understanding the process and optimizing it, so that the results are reliable and the project is easy to follow, maintain, and modify where necessary.

A Brief Introduction to Apache HBase and it’s Architecture

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Since the 1970s, relational database management systems have solved the problems of storing and maintaining large volumes of structured data.

A brief introduction to SQL Alchemy

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: The structured data we generally deal with is stored in tabular format in relational databases, and it can be accessed with a query language called SQL (often pronounced “sequel”). But, it is […].

Apache Sqoop: Features, Architecture and Operations

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Apache Sqoop is a tool designed for large-scale transfer of data between HDFS and structured data repositories, both importing into HDFS and exporting back out. Relational databases, enterprise data warehouses, and NoSQL systems are all examples of such repositories.

Get to Know Apache HBase from Scratch!

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction to Apache HBase: With the constant growth of structured data, it is getting difficult to store and process petabytes of data efficiently. To provide a massive amount […].

Everything About Apache Hive and its Advantages!

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Hive, developed at Facebook and later maintained as an Apache project, is a data warehouse system created for analyzing structured data. Built on top of the open-source Hadoop platform, Apache Hive was released in October 2010.

Natural Language Processing for Indic Languages

Analytics Vidhya

This article was published as a part of the Data Science Blogathon. Introduction: Over the past few years, advancements in Deep Learning coupled with data availability have led to massive progress in dealing with Natural Language.