Accelerate AI Adoption: 3 Steps to Deploy Dataiku for Google Cloud Platform

Dataiku Product, Scaling AI, Tech Blog | Timothy Law

Organizations are always looking for faster ways to equip their data science and analytics teams with cloud platforms and services. Dataiku and Google Cloud Platform (GCP) have partnered to deliver the Dataiku cloud stack accelerator for GCP.

The Dataiku cloud stack accelerator for GCP is a templated approach to deploying and managing Dataiku on GCP. The offering delivers a self-managed AI platform on Google's powerful elastic compute infrastructure in hours. With Dataiku for GCP, customers achieve faster time to value for AI and ML projects, and IT administrators get an easier way to deploy and manage instances.

With Dataiku, GCP customers can make the most of their cloud investment through native integrations with their favorite GCP services, such as Google BigQuery for both storage and compute. Dataiku connects to your GCP data through pre-built storage integrations and pushes processing down to compute layers in GCP, so your data stays secure in your cloud.

GCP customers can scale data science workloads such as model training and scoring by deploying elastic AI clusters powered by Google Kubernetes Engine (GKE) with the Dataiku cloud stack accelerator. And Dataiku features many additional integrations to services like Dataproc, Looker, Google Sheets, NLP and vision AI services, and more. 

Figure 1: Dataiku and GCP native integrations to popular cloud services.
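
The cloud stack accelerator provisions and attaches these GKE clusters for you from the Dataiku interface, but it helps to picture what gets created behind the scenes. As a rough sketch (not the accelerator's own code), an autoscaling GKE cluster could be created with the google-cloud-container Python client; the project, location, and cluster names below are hypothetical.

```python
# Sketch: an autoscaling GKE cluster similar to what Dataiku's elastic AI
# compute runs on. Project, location, and names are hypothetical examples.
from google.cloud import container_v1

client = container_v1.ClusterManagerClient()

cluster = container_v1.Cluster(
    name="dataiku-elastic-ai",  # hypothetical cluster name
    node_pools=[
        container_v1.NodePool(
            name="ml-workloads",
            initial_node_count=1,
            autoscaling=container_v1.NodePoolAutoscaling(
                enabled=True, min_node_count=1, max_node_count=10
            ),
        )
    ],
)

operation = client.create_cluster(
    parent="projects/my-gcp-project/locations/europe-west1",  # hypothetical
    cluster=cluster,
)
print(operation.name)  # long-running operation; poll it until the cluster is ready
```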

Get a Complete Data and AI Stack in 3 Steps

Within three templated, menu-driven steps, enterprises can have a complete data and AI stack up and running, including the ability to run workloads on elastic compute clusters powered by Kubernetes (GKE).

Step 1: Create Service Accounts for Fleet Manager and Dataiku Instances

The Dataiku cloud stack accelerator for GCP uses a feature called fleet manager to deploy, upgrade, back up, restore, and configure one or several Dataiku instances. It handles the entire lifecycle of the Dataiku instances, freeing enterprises from many administration tasks. Essentially, fleet manager provides enterprises with a management console for deploying and managing Dataiku within a Google Cloud account.

The first step is to create two service accounts: one for the fleet manager template and one for the Dataiku instances. Administrators assign roles and permissions at this step for the initial deployment. Roles and permissions can be refined later using Dataiku's advanced security options and GCP Identity and Access Management (IAM).
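
The service accounts can be created in the GCP console, but for illustration here is a minimal sketch using the Google IAM API from Python; the account IDs and display names are hypothetical examples, not names required by Dataiku.

```python
# Sketch: create the two service accounts with the Google IAM API.
# Account IDs and display names are hypothetical examples.
import google.auth
from googleapiclient import discovery

credentials, project_id = google.auth.default()
iam = discovery.build("iam", "v1", credentials=credentials)

for account_id, display_name in [
    ("dataiku-fleet-manager", "Dataiku Fleet Manager"),
    ("dataiku-instances", "Dataiku Instances"),
]:
    iam.projects().serviceAccounts().create(
        name=f"projects/{project_id}",
        body={
            "accountId": account_id,
            "serviceAccount": {"displayName": display_name},
        },
    ).execute()

# Roles (e.g., compute and storage permissions) are then granted to each
# account through IAM policy bindings in the console or via the API.
```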

Step 2: Deploy Fleet Manager and Provision Dataiku

Fleet manager needs to be hosted in a VPC. Administrators can access the fleet manager feature with two simple command lines. Once logged into fleet manager, deploy the Dataiku instance using the “Deploy Elastic Design” option from the Quick Start Blueprints menu. With this option, you deploy Dataiku with elastic AI capabilities powered by Kubernetes (GKE).
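
The VPC hosting fleet manager is typically set up through the template or the GCP console, but as a rough sketch, a dedicated network and subnet could be provisioned with the google-cloud-compute Python client; the project ID, region, network name, and IP range below are hypothetical.

```python
# Sketch: a dedicated VPC and subnet that could host fleet manager.
# Project ID, region, names, and IP range are hypothetical examples.
from google.cloud import compute_v1

project = "my-gcp-project"  # hypothetical project ID
region = "europe-west1"

networks = compute_v1.NetworksClient()
op = networks.insert(
    project=project,
    network_resource=compute_v1.Network(
        name="dataiku-fm-vpc",
        auto_create_subnetworks=False,
    ),
)
op.result()  # wait for the network before adding the subnet

subnets = compute_v1.SubnetworksClient()
subnets.insert(
    project=project,
    region=region,
    subnetwork_resource=compute_v1.Subnetwork(
        name="dataiku-fm-subnet",
        network=f"projects/{project}/global/networks/dataiku-fm-vpc",
        ip_cidr_range="10.10.0.0/24",
    ),
)
```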

Figure 2: After initial resource and role assignments, the cloud stack accelerator offers four one-click options for deploying your Dataiku instance. GCP users benefit from the elastic design option.

An instance is a single installation of a Dataiku design node, automation node, or deployer node, and each instance is backed by a dedicated virtual machine. When you provision an instance, the cloud stack accelerator creates the required cloud resources to host the Dataiku node.

Step 3: Connect to GCP Services and Begin Your Everyday AI Journey

Once you’ve deployed Dataiku, you’ll want to connect all of your favorite GCP services so users can:

  • Scale computation through GKE.
  • Perform interactive query recipes with inputs and outputs in BigQuery (see the recipe sketch after this list).
  • Push down the execution of Dataiku visual recipes to BigQuery. 
  • Use Google Container Registry to manage Docker images.
  • Query the Looker API to load datasets from Looker into Dataiku.
  • Leverage Google compatibility-tested and optimized TensorFlow for training and inference of deep neural networks with the Keras Python interface.
  • Use Google Natural Language and Vision AI services.
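
As one concrete example of the BigQuery integration, here is a minimal sketch of a Dataiku Python recipe that reads and writes BigQuery-backed datasets; the dataset names are hypothetical, and visual or SQL recipes would instead push their execution down to BigQuery directly.

```python
# Sketch of a Dataiku Python recipe reading and writing BigQuery-backed
# datasets. Dataset names ("orders_bq", "daily_revenue_bq") are hypothetical.
import dataiku

# Read the input dataset (stored in BigQuery) into a pandas DataFrame
orders = dataiku.Dataset("orders_bq").get_dataframe()

# Simple aggregation; heavier transformations are better left to visual or
# SQL recipes, which push execution down to BigQuery
daily_revenue = orders.groupby("order_date", as_index=False)["amount"].sum()

# Write the result back to a BigQuery-backed output dataset
dataiku.Dataset("daily_revenue_bq").write_with_schema(daily_revenue)
```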

Figure 3: Deploy Dataiku for GCP in three steps and then maintain instances with backups, one-click upgrades, and fast deployment of additional instances.

In only three steps, you have a complete elastic AI stack. Administrators can then maintain and manage Dataiku instances with the automated maintenance features.

Automate Maintenance to Reduce IT Requirements

The Dataiku cloud stack accelerator for GCP provides an easy-to-use, templated approach to the ongoing administration and maintenance of Dataiku instances. Instantly upgrade to the latest version of Dataiku and set backup, restore, and rollback procedures. Administrators can quickly provision new instances and onboard more users, departments, or business units with simple, menu-driven screens. With Dataiku for GCP, administrators can help their data and analytics teams experience the full value of adopting AI and ML in the cloud.  
