Big Data and Public Health: An interview with Dr. Willem van Panhuis about Project Tycho, digitizing disease records, and new ways of doing research in public health

All opinions of the interviewer are my own and do not necessarily reflect those of Novo Nordisk.

One of the huge and perhaps still underappreciated aspects of the internet age is the digitization of information. While the invention of the printing press made the copying of information easy, quick and accurate, print still relied on books and other printed materials that were moved from place to place to spread information. Today digitization of information, cheap (almost free) storage, and the pervasiveness of the internet have vastly reduced barriers to use, transmission and analysis of information.

In an earlier post I described the project by researchers at the University of Pittsburgh that digitized US disease reports from the past 120+ years, creating a computable and freely available database of disease incidence in the US (Project Tycho). This incredible resource is there for anyone to download and use for research ranging from studies of vaccine efficacy to building epidemiological models to regional public health analyses and comparisons.

Their work fascinates me both for what it says about vaccines and for its connection to larger issues like Big Data in Public Health. I contacted the lead researcher on the project, Dr. Willem G. van Panhuis, and he very kindly consented to an interview. What follows is our conversation about his work and the implications of this approach for Public Health research.

Dr. Willem van Panhuis. Image credit: Brian Cohen, 2013

Kyle Serikawa: Making this effort to digitize the disease records over the past ~120 years sounds like a pretty colossal undertaking. What inspired you and your colleagues to undertake this work?

Dr. Willem van Panhuis: One of the main goals of our center is to make computational models of how diseases spread and are transmitted. We’re inspired by the idea that by making computational models we can help decision makers with their policy choices. For example, in pandemics, we believe computational models will help decision makers to test their assumptions, to see how making different decisions will have different impacts.

So this led us to the thinking behind the current work. We believe that having better and more complete data will lead to better models and better decisions. Therefore, we needed better data.

On top of this, each model needs to be disease specific because each disease acts differently in how it spreads and what effects it has. In contrast, the basic data collection process that goes into creating the model for each disease is actually pretty similar across diseases. It involves contacting those who hold the records of disease prevalence and spread over time, collecting the data, and then making the data ready for analysis. There’s considerable effort in that last part, especially as Health Departments often do not have the capacity to spend a lot of time and effort on responding to data requests by scientists.

The challenges are similar–we go through the same process every time we want to model a disease–so when we learned that a great source of much of the disease data in the public domain is in the form of these weekly surveillance reports published in MMWR and precursor journals, we had the idea: if we digitize the data once for all the diseases, that would provide a useful resource for everybody.

We can make models for ourselves, but we can also allow others to do the same without duplication of effort.


The Innovator’s Dilemma in biopharma part 1. Framing the industry’s position

All opinions are my own and do not necessarily reflect those of Novo Nordisk

h/t to @Frank_S_David, @scientre, and the LinkedIn Group Big Ideas in Pharma Innovation and R&D Productivity for links and ideas

Joe Nocera’s recent column in the New York Times provided a nice dissection of how Blackberry tumbled from the position it once held at the top of the handheld phone/PDA business market.  In a nutshell it encapsulates how Blackberry fell victim to the Innovator’s Dilemma, the paradigm put forward by Clay Christensen about how and why established companies within an industry often fall victim to disruptive technologies.  This happened even though they were aware of the danger and made efforts to circumvent the dilemma.  In the case of Blackberry, one aspect of their fall was a lack of appreciation for the technology creeping up behind: the iPhone and other mobile devices using touchscreens.  For Blackberry one of their advantages and selling points was a physical keyboard which allowed rapid typing and emailing by business customers.  They couldn’t see why anyone would want something less effective for emails and messaging.

In addition, Blackberry felt both secure in and beholden to their customer base, the businesspeople who used Blackberries strictly as tools for work.  Blackberry (Research in Motion at the time) seemed unable to conceive of the possibility of other markets and, frankly, had no incentive to reach into those markets until it was too late.  By then other phones and operating systems had grown and matured to the point of essentially overtaking the market of smartphone users, of which businesspeople make up just a small fraction.  Too little, too late, and now Blackberry has been trying to sell itself, although recent reports suggest that strategy is also failing.

From Blackberry to biopharma

In this post I’d like to explore the concept of the Innovator’s Dilemma as it might apply to the biopharmaceuticals industry.