From Citizen Data Scientist to Citizen Data Steward

Use Cases & Projects, Dataiku Product Adam Nathan

This is a blog post from our friends at CoEnterprise. CoEnterprise is an award-winning B2B software and professional services company headquartered in New York City. As a Tableau Premier and Snowflake partner, CoEnterprise delivers supply chain and business analytics solutions and services that empower companies to make faster, smarter business decisions.

Predictive analytics are no longer the exclusive domain of data scientists tucked away in digital laboratories. The need to make machine learning (ML) tools and approaches more accessible has significantly increased the number of organizational roles that can work with them. We’ve seen the birth and proliferation of citizen data science.

As exciting as this has been, new opportunities and challenges have emerged. Increasingly, organizations struggle to democratize their data. Marketers, business analysts, and data engineers have expanding roles and need the tools to support them and work together.  

Without tools that interoperate with other platforms, the future will be limited indeed. Trust, collaborative investigation, real-time alerting, and intuitive visualization stand out as necessities. And it’s happening. 

So, What Has Made This Democratized Ecosystem Possible?

In a recent webinar for Tableau Conference 2021, Tableau Premier Partner CoEnterprise and Dataiku look at an end-to-end solution that addresses these urgent problems. Working with a customer churn problem for a major telecommunications company, the hosts demonstrate how a real-world model is created, visualized, and acted upon by a citizen data scientist, a marketing analyst, and a business analyst.

We learn how Dataiku creates a collaborative environment for the creation, training, scheduling, and deployment of the churn model. We step through feature handling and reduction and the scoring of multiple ML models.  

We see how this is done in Dataiku through a visual machine learning interface as well as data science notebooks for coders. Governance capabilities and collaboration features like wiki documentation and interactive commentary make a comprehensive, documented, and trustworthy solution possible. Many businesses have been left shell-shocked by ineffective investments in data science. Tools like Dataiku are upending these limitations by providing complete governance, observability, and monitoring of models in production. In addition, Dataiku is working with Tableau to enable visualization of model outputs.

One example of this tight coupling of technology is the effortless integration of Dataiku and Tableau. Data needs to be visualized, questioned, and discussed. Models need to be monitored and validated. Analysts need an intuitive environment to make choices and validate them. The webinar looks at how model output data from Dataiku, visualized in Tableau, makes this sophisticated level of analysis and collaboration possible.

Starting with the intuitive and user-friendly Tableau environment, we learn how the model output can be visualized. The telecommunications customer churn model is explored down to the area codes. We see how nested visualizations, interactive heat maps, box-and-whisker plots, and custom visualizations can be designed on the fly, specifically for an analyst’s problem. 

But the democratized ecosystem of the future demands frictionless collaboration. Tableau Online makes this possible through a wealth of intuitive features. From on-page comments and dialog, to the recent integration of Slack to alert users to urgent outlier data, we discover just how far the tools have come. 

Finally, nowhere is this demonstrated more powerfully than in the integration of certified datasets into the portal environment. The effortless exposure of underlying metrics, definitions, data owners, and emerging data quality alerts is leading to fundamental changes in how we think about trustworthy data. Environments like Tableau Online are making the emerging role of “citizen data steward” possible.

All of which raises the question: If we were working on a customer churn problem, why wouldn’t we be using state-of-the-art tools? If the capabilities of Dataiku and Tableau are raising questions about how you currently approach data science or, worse, raising concerns about being left behind, this 30-minute webinar is a fantastic opportunity to rethink your approach.
