Open Source Projects by Google, Uber and Facebook for Data Science and AI

By Asel Mendis, KDnuggets on November 28, 2019 in Advice, AI, Data Science, Data Scientist, Data Visualization, Deep Learning, Facebook, Google, Open Source, Python, Uber

Open source is becoming the standard for sharing and improving technology. Some of the largest organizations in the world, namely Google, Facebook and Uber, are releasing the technologies they use in their own workflows to the public. This lets anyone use tools that were built inside some of the biggest companies in the world. Probably the best-known of these projects are PyTorch and TensorFlow, which have become the de facto standards for Deep Learning.

Open Source Projects by Facebook

PyTorch

PyTorch is one of the most popular Deep Learning libraries in the Data Science community. It has a rich ecosystem of tools that data scientists can use for a variety of tasks. These include BoTorch for Bayesian Optimization, AllenNLP for designing and using deep learning models for Natural Language Processing, fastai for easily building and evaluating neural nets, and skorch for a high-level interface that provides full scikit-learn compatibility.

Prophet

Prophet is an open source time series forecasting library with APIs for both Python and R. It is built to perform well on time series with strong seasonality and can account for holiday effects. It can also handle missing data and outliers. Missing data is a big problem in time series because the observations are supposed to be sequential. A common practice is to impute missing values with the mean or median, which is usually not the best option for time series.
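
To see why mean imputation tends to be a poor fit for sequential data, here is a minimal sketch in plain Python. It does not use Prophet's API. It fills a gap in a trending series two ways: with the overall mean, and with linear interpolation between the nearest observed neighbours. The series values and the helper names `impute_mean` and `impute_linear` are made up for illustration.

```python
from statistics import mean


def impute_mean(series):
    """Replace every None with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in series]


def impute_linear(series):
    """Fill interior gaps by interpolating between the nearest observed points.

    Leading and trailing gaps stay as None because there is no point on
    one side to interpolate from.
    """
    filled = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled


# Hypothetical daily values with a steady upward trend; days 2 and 3 are missing.
series = [10.0, 12.0, None, None, 18.0, 20.0, 22.0, 24.0, 26.0, 28.0]

mean_filled = impute_mean(series)
linear_filled = impute_linear(series)

print("mean:  ", mean_filled)
print("linear:", linear_filled)
```

On this series the mean fill drops a value from late in the trend into an early gap, which creates a bump that was never in the data, while the interpolated values stay between their neighbours. Interpolation is only one alternative and has its own limits, for example on strongly seasonal data.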

Open Source Projects by Uber

CausalML

CausalML is Uber's open source package for uplift modelling and causal inference using machine learning. It allows the user to estimate the Conditional Average Treatment Effect (CATE) or Individual Treatment Effect (ITE) from experimental or observational data.
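
To make the idea of a conditional treatment effect concrete, here is a rough sketch in plain Python. It is not CausalML's API and is far simpler than the estimators the library provides. It assumes a randomized experiment and a single discrete covariate (a customer segment), and estimates the effect within each segment as the difference in mean outcome between the treated and control groups. The function name `cate_by_segment` and the data are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cate_by_segment(rows):
    """Estimate the treatment effect within each segment.

    `rows` is an iterable of (segment, treated, outcome) tuples. The estimate
    is mean(outcome | treated) - mean(outcome | control) per segment, which
    is only meaningful if treatment was randomly assigned within segments.
    """
    outcomes = defaultdict(lambda: {True: [], False: []})
    for segment, treated, outcome in rows:
        outcomes[segment][bool(treated)].append(outcome)

    effects = {}
    for segment, groups in outcomes.items():
        if groups[True] and groups[False]:  # both arms are needed to compare
            effects[segment] = mean(groups[True]) - mean(groups[False])
    return effects


# Hypothetical experiment: did a promotion (treated=1) change weekly spend?
rows = [
    ("new_user", 1, 30.0), ("new_user", 1, 34.0),
    ("new_user", 0, 20.0), ("new_user", 0, 24.0),
    ("loyal_user", 1, 51.0), ("loyal_user", 1, 49.0),
    ("loyal_user", 0, 50.0), ("loyal_user", 0, 48.0),
]

effects = cate_by_segment(rows)
print(effects)
```

In this toy data the promotion moves new users much more than loyal ones, which is the kind of difference uplift modelling is meant to surface. With observational data or many continuous covariates, simple differences like this break down, and that is where model-based estimators come in.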

Ludwig

Ludwig is probably the most famous open source project from Uber. It allows the user to train and test deep learning models without writing any code beyond a YAML configuration file. It is built on top of TensorFlow. A Python API is also available for users who prefer to work in code.
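
As a rough illustration of what such a YAML file contains, the sketch below builds a small model definition as a Python dictionary and prints it as YAML. The `input_features` and `output_features` sections, each a list of features with a `name` and a `type`, follow Ludwig's format. The column names `review_text` and `sentiment` are invented, and `to_yaml` is a hypothetical helper that only handles this one shape. A YAML library would be the usual choice in practice.

```python
def to_yaml(definition):
    """Render a {section: [{key: value, ...}, ...]} mapping as YAML text.

    Hypothetical helper for illustration; it covers only this nesting shape
    and does no quoting or escaping.
    """
    lines = []
    for section, features in definition.items():
        lines.append(f"{section}:")
        for feature in features:
            for position, (key, value) in enumerate(feature.items()):
                prefix = "  - " if position == 0 else "    "
                lines.append(f"{prefix}{key}: {value}")
    return "\n".join(lines)


# A text classifier: one text column in, one category column out.
# The column names are assumptions and would have to match the dataset.
model_definition = {
    "input_features": [{"name": "review_text", "type": "text"}],
    "output_features": [{"name": "sentiment", "type": "category"}],
}

yaml_text = to_yaml(model_definition)
print(yaml_text)
```

The point of the design is that the file only declares which columns are inputs, which are outputs, and what type each one is. Ludwig chooses the model components from those declarations.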

Pyro

Pyro is maintained by Uber AI Labs and is built on top of PyTorch for Deep Probabilistic Programming. It was designed around four principles: Universal, Scalable, Minimal and Flexible. NumPyro, a version of Pyro with a NumPy backend, is available in beta and is being built for faster processing.

kepler.gl

Kepler.gl is Uber's open source geospatial analysis toolbox, designed to scale to large data sets. It was built to help data scientists make an impact with location data using an interactive, data-driven approach. It is built on top of Mapbox GL and Deck.gl.