Big Data and Public Health: An interview with Dr. Willem van Panhuis about Project Tycho, digitizing disease records, and new ways of doing research in public health

All opinions of the interviewer are my own and do not necessarily reflect those of Novo Nordisk.

One of the huge and perhaps still underappreciated aspects of the internet age is the digitization of information. While the invention of the printing press made the copying of information easy, quick and accurate, print still relied on books and other printed materials that were moved from place to place to spread information. Today digitization of information, cheap (almost free) storage, and the pervasiveness of the internet have vastly reduced barriers to use, transmission and analysis of information.

In an earlier post I described the project by researchers at the University of Pittsburgh that digitized US disease reports over the past 120+ years, creating a computable and freely available database of disease incidence in the US (Project Tycho, http://www.tycho.pitt.edu/). This incredible resource is there for anyone to download and use for research ranging from studies of vaccine efficacy to the building of epidemiological models to making regional public health analyses and comparisons.

Their work fascinates me both for what it says about vaccines and also for its connection to larger issues like Big Data in Public Health. I contacted the lead researcher on the project, Dr. Willem G. van Panhuis, and he very kindly consented to an interview. What follows is our conversation about his work and the implications of this approach for Public Health research.

Dr. Willem van Panhuis. Image credit: Brian Cohen, 2013

Kyle Serikawa: Making this effort to digitize the disease records over the past ~120 years sounds like a pretty colossal undertaking. What inspired you and your colleagues to undertake this work?

Dr. Willem van Panhuis: One of the main goals of our center is to make computational models of how diseases spread and are transmitted. We’re inspired by the idea that by making computational models we can help decision makers with their policy choices. For example, in pandemics, we believe computational models will help decision makers to test their assumptions, to see how making different decisions will have different impacts.

So this led us to the thinking behind the current work. We believe that having better and more complete data will lead to better models and better decisions. Therefore, we needed better data.

On top of this, each model needs to be disease specific because each disease acts differently in how it spreads and what effects it has. In contrast, however, the basic data collection process that goes into creating the model for each disease is actually pretty similar across diseases. There is contacting those with the records of disease prevalence and its spread over time, collecting the data and then making the data ready for analysis. There’s considerable effort in that last part, especially as Health Departments often do not have the capacity to spend a lot of time and effort on responding to data requests by scientists.

The challenges are similar–we go through the same process every time we want to model a disease–so when we learned that a great source of much of the disease data in the public domain is in the form of these weekly surveillance reports published in MMWR and precursor journals, we had the idea: if we digitize the data once for all the diseases, that would provide a useful resource for everybody.

We can make models for ourselves, but we can also allow others to do the same without duplication of effort.

Big Data provide yet more Big Proof of the power of vaccines

All opinions are my own and do not necessarily reflect those of Novo Nordisk.

Time for another screed about the anti-vaccination movement.

Well, not about them per se, but rather about another study that demonstrates how much of a positive difference vaccines have made in the US. The article, from researchers at the University of Pittsburgh and Johns Hopkins University, describes what I can only imagine to be a Herculean effort to digitize disease reporting records from 1888 to 2011 (article behind a paywall, unfortunately). Turns out there are publications that have been collecting weekly reports of disease incidence across US cities for over a century. I have not been able to access the methods, but I can’t shake the image of hordes of undergraduates hunched over yellowed clippings and blurry photocopies of 19th-century tables, laboriously entering numbers one by one into a really extensive Excel spreadsheet.

All told, 87,950,807 individual cases were entered into their database, including location, time, and disease. Not fun, however it was done.

When business takes a stand

All opinions are my own and do not necessarily reflect those of Novo Nordisk.

Texas!

So much of what happens in the US seems to revolve around Texas. It’s a huge, rich, diverse state, with influence that stretches far beyond its boundaries. I mean, you rarely hear about how the politics of Rhode Island affect the nation. I’m just saying. Don’t hate me, people of Rhode Island! All eight of you! Which is still about six more people than read this blog…

That’s why, for example, when Texas experiences outbreaks of whooping cough and measles, it makes the news.  The state is a bellwether for certain cultural and societal trends like the anti-vaccination movement.  And it’s in this context that two recent developments in how businesses are interacting with Texas are fascinating.

Let’s talk textbooks and the death penalty.

One small, wistful story about the shutdown

Seattle, like San Francisco, like New York City, is a city of water and bridges.  I remember reading Winter’s Tale by Mark Helprin back in college. I think it’s a book that I’d benefit from reading now, again, but one of the concepts that struck me even in my callow youth was his observation about cities that have bridges as part of their fundamental being.  He described how a city, to be magnificent, must “project, extend, fling itself in all directions–over the water, in peninsulas, hills, soaring towers, and islands linked by bridges.”

Seattle is that kind of city.

And it made me sad when I heard that one of our bridges, albeit a small and specialized one, was closing because of the government shutdown.

Transparency and the invisible hand in hospital and healthcare costs

All opinions are my own and do not necessarily reflect those of Novo Nordisk.

One of the things that sometimes seems to get lost when people talk about the power of the market to create efficiency is that a free market requires that information be shared and freely available and understandable by everyone.  When information is withheld by one side or the other of a transaction, or when different customers for a service or product are unable to compare prices, the metaphor of the invisible hand breaks down.

You can see this, interestingly enough, in sports as it relates to both the trading of players under contract and the signing of free agents. Since I’m a baseball fan, let me link here to a discussion of research that’s been done looking at Major League Baseball. The studies looked at players traded or signed by a different team as a free agent and how those players performed in subsequent years versus players whose original team re-signed them. It turns out that players who switched teams did, indeed, perform more poorly relative to projections than players who stayed. This suggests that the original teams have proprietary information that allows them to make better decisions about which players to retain. Thus the market for baseball players isn’t quite free and efficient because of information asymmetry.

And unfortunately, information asymmetry is also rampant in other industries such as healthcare.