Know Your Data Ingredients

We're so eager to see our data that we forget to look at our data.

I've been indulging in more than my share of cooking lessons lately. A few of my favorites:

Claire Saffitz demonstrates how creativity, analysis, and intuition make it possible to replicate your favorite treats, Gourmet-style.

Massimo Bottura showed me how to build on traditional Italian cooking to make a dish your own.

Thomas Keller showed me the most delicate omelet possible — which I promptly screwed up.

Gordon Ramsay explains that there is only one path to perfect scrambled eggs.

Samin Nosrat teaches how to balance the foundational elements (salt, fat, acid, and heat) and shares a Bolognese pasta that I’m still trying to get right.

But forget cooking; I want to talk about ingredients. All of these chefs start by emphasizing the need to use quality ingredients and understand how they contribute to your dish. The same is true for your data.


In the data world, we don't always have the luxury of high-quality data ingredients. Often there is missing data, messy data, and misunderstood data. But if you can't have the best ingredients, it is important to at least understand what you do have so you can make the most of it.

This is an often-overlooked step in the rush to visualize data. We're so eager to see our data that we forget to look at our data. Preparing the ingredients is like being a sous chef. It is both necessary and decidedly unsexy work.

To lay a strong foundation for your visualizations, here are three steps to understand and evaluate your data fields before you throw them into the Cuisinart that is your visualization tool.

(1) Separate your metrics from your dimensions. Metrics are the data values that measure performance: the numbers you tend to aggregate or average. They are often the stars of the show (Metrics are the Characters of Data Stories), like the proteins of your dishes. Dimensions are the data values that describe an attribute: the ways you slice and dice your metrics. Maybe we’ll call them the vegetables because we like to chop them up.
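As a rough illustration of this first step, here is a minimal Python sketch. The sample rows, the field names, and the helpers `split_fields` and `total_by` are all made up for this example. The rule it uses (numeric fields are metrics, everything else is a dimension) is only a first-pass guess: numeric codes such as ZIP codes or IDs are really dimensions, so the result still needs a human look.

```python
from collections import defaultdict
from numbers import Number

# Hypothetical sample rows; the field names are illustrative only.
rows = [
    {"region": "East", "product": "Tea", "revenue": 120.0, "orders": 3},
    {"region": "East", "product": "Coffee", "revenue": 200.0, "orders": 5},
    {"region": "West", "product": "Tea", "revenue": 80.0, "orders": 2},
]

def split_fields(rows):
    """First-pass guess: numeric fields are metrics, the rest are dimensions."""
    metrics, dimensions = [], []
    for field in rows[0]:
        values = [row[field] for row in rows]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
            metrics.append(field)
        else:
            dimensions.append(field)
    return metrics, dimensions

def total_by(rows, dimension, metric):
    """Slice a metric by a dimension: sum the metric within each dimension value."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[metric]
    return dict(totals)

metrics, dimensions = split_fields(rows)
revenue_by_region = total_by(rows, "region", "revenue")
```

Here `revenue` and `orders` land in the metrics pile, `region` and `product` in the dimensions pile, and `total_by` shows the "slice and dice" relationship between the two.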

(2) Now it is time to get to know your metrics. For each one, you’ll want to understand the following:

  • Define - A common definition that everyone in the organization can agree on

  • Calculate - What mathematical operations do you apply to the data before you show the values?

  • Good/bad - How do you evaluate whether the metric result is good or bad for the organization?

  • Relationships - Does this metric relate to other metrics?

  • Format - How should the metric values be presented in your visuals?
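One way to keep these five answers together is to write them down next to the metric itself. The sketch below is one possible shape, not a standard: the `MetricSpec` class, its field names, and the conversion-rate example are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class MetricSpec:
    """Hypothetical record holding the five answers for one metric."""
    name: str
    definition: str                       # Define
    calculate: Callable[[list], float]    # Calculate
    higher_is_better: bool                # Good/bad
    related_metrics: list = field(default_factory=list)  # Relationships
    display_format: str = "{:,.2f}"       # Format

    def render(self, rows):
        """Compute the metric from rows and format it for display."""
        return self.display_format.format(self.calculate(rows))

def conversion_rate(rows):
    """Orders divided by visits, guarding against zero visits."""
    visits = sum(r["visits"] for r in rows)
    orders = sum(r["orders"] for r in rows)
    return orders / visits if visits else 0.0

conversion = MetricSpec(
    name="conversion_rate",
    definition="Orders divided by visits over the same period",
    calculate=conversion_rate,
    higher_is_better=True,
    related_metrics=["visits", "orders"],
    display_format="{:.1%}",
)

# Made-up sample data: 25 orders out of 500 visits.
sample = [{"visits": 200, "orders": 10}, {"visits": 300, "orders": 15}]
label = conversion.render(sample)  # "5.0%"
```

Writing the calculation as a function over the raw rows, rather than averaging pre-computed rates, is a deliberate choice here: a ratio metric usually needs to be recomputed from its parts at each level of aggregation.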

(3) Dimensions need a similar examination, but with a few small differences:

  • Define - A common definition that everyone in the organization can agree on

  • Aggregate - When the dimension is displayed, are its entities grouped into larger buckets?

  • Ordering - Are the dimensional entities/buckets shown in a specific order?

  • Relationships - Does this dimension relate to other dimensions?

  • Format - How should the dimension values be labeled?
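The same idea can be sketched for dimensions. Again, the `DimensionSpec` class, its fields, and the company-size example are hypothetical, chosen only to show how bucketing, ordering, and labeling might be recorded in one place.

```python
from dataclasses import dataclass, field

@dataclass
class DimensionSpec:
    """Hypothetical record holding the five answers for one dimension."""
    name: str
    definition: str                                   # Define
    buckets: dict = field(default_factory=dict)       # Aggregate: raw value -> bucket
    order: list = field(default_factory=list)         # Ordering
    related_dimensions: list = field(default_factory=list)  # Relationships
    labels: dict = field(default_factory=dict)        # Format: bucket -> display label

    def bucket(self, value):
        """Map a raw value to its bucket; unmapped values pass through."""
        return self.buckets.get(value, value)

    def sort_key(self, bucket):
        """Position in the explicit order; unknown buckets sort last."""
        return self.order.index(bucket) if bucket in self.order else len(self.order)

    def label(self, bucket):
        """Display label for a bucket, falling back to the bucket itself."""
        return self.labels.get(bucket, str(bucket))

size = DimensionSpec(
    name="company_size",
    definition="Employee-count band of the customer account",
    buckets={"1-10": "small", "11-50": "small", "51-500": "medium", "501+": "large"},
    order=["small", "medium", "large"],
    related_dimensions=["industry"],
    labels={"small": "Small (1-50)", "medium": "Medium (51-500)", "large": "Large (501+)"},
)

# Made-up raw values as they might arrive from the source data.
raw_values = ["501+", "1-10", "51-500", "11-50"]
buckets_seen = sorted({size.bucket(v) for v in raw_values}, key=size.sort_key)
axis_labels = [size.label(b) for b in buckets_seen]
```

The explicit `order` list matters because alphabetical sorting would put "large" before "small", which is rarely the order a reader expects on an axis.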

And now you are ready to get cooking. In the words of Emeril: Bam!