COVID-19 Data Spreads like a Virus

Anyone else out there confused by all the data you’ve been seeing regarding COVID-19? Not to minimize the severity of our situation, but the explosion of numbers, charts, dashboards and data stories is beginning to feel like a pandemic of its own.

As I was listening to a Data Visualization Society round table discussion about the responsible use of COVID-19 data (properly distanced and webinar-ed, of course), a few thoughts seemed most relevant.

Consider your source (Or: Hey Twitter: SHUT UP!)

It seems that every data vis practitioner out there is scrambling to put a new, insightful view on the situation. As it happens, the Johns Hopkins University dashboard quickly became the de facto standard, and as a result gets a billion views a day. ONE BILLION!! Whoa. Fortunately JHU is on the front lines and has the cred to speak to the meaning of the data they’re uncovering. But not all dashboards (and data sources) are created equal. Feel free to tell “that guy on Twitter” (who a month ago was beach-partying in Florida with his bros, but somehow is now both an expert epidemiologist and a brilliant economist) “thanks, no.”

Data, data everywhere. (And it’s crap.)

Here’s a snapshot of a recent search:

[Image: screenshot of search results, taken April 16, 2020]

As Harry Stevens, Washington Post reporter (responsible for that amazing “flatten the curve” story) pointed out in the round table:

the truth is that a million cases happened weeks before the official count passed that number and so we’re not reporting on reality, but we’re reporting on the numbers.
— Harry Stevens, Washington Post

In other words, be careful when you talk about what those numbers actually mean. He goes on to clarify by saying that some estimate that 80% of the cases don’t get tested, so we don’t actually know the real number of cases, but it’s likely very much higher. But because we have a ton of data, we so want to think that means it’s accurate. Volume is not equivalent to accuracy.
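To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes the 80% estimate is taken at face value and that every tested case ends up in the official count. Both are simplifications, and the function name is made up for illustration. It is arithmetic, not epidemiology.

```python
def implied_true_cases(reported_cases: float, untested_share: float) -> float:
    """Scale a reported count up to the total it would imply.

    Assumption (illustrative only): `untested_share` of all cases are never
    tested, and every tested case is reported.
    """
    if not 0 <= untested_share < 1:
        raise ValueError("untested_share must be at least 0 and below 1")
    # Reported cases are only the tested fraction of the true total.
    return reported_cases / (1 - untested_share)


# With one million reported cases and 80% untested, the implied total
# is about five times the official count.
estimate = implied_true_cases(1_000_000, 0.80)
print(f"Implied total: about {estimate:,.0f} cases")
```

Change the 80% to 50% or 90% and the answer swings from two times the reported count to ten times, which is exactly why a big pile of numbers shouldn’t be mistaken for an accurate one.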

Forecasts are not Predictions (But they’re still useful.)

Forecasts are built by experts, using lots of assumptions, based on very complex and specifically applied statistical models. And forecast modelers know not only how to create those, but also how to interpret them. And in this particular case, the uncertainty bands are huge, because there’s so much about this virus that we just don’t understand. But when the public and politicians see a forecast, they talk about it as if it’s a prediction. But it’s not. Stevens sums it up well: “Forecasts are not predictions, but a way to influence policy.”

To sum up, don’t get too distracted by the latest new shiny when it comes to COVID-19 data. Go deep, understand the meaning of the data, and use forecasts to influence policy, not to predict the future.