Cheetahs hunting and the quantified self

Who doesn’t love cheetahs?  A young person of my acquaintance went so far as to spend a large portion of her time, at a certain age, cavorting on all fours and yipping and chirping like a cheetah.  And of course we all know that cheetahs are the fastest land animals, and that’s how they catch their prey, by outrunning them.

Only that’s wrong.

Yes, cheetahs are wicked fast, reaching about 60 miles per hour, but a recent report in Nature has shown, via novel monitoring techniques, that maneuverability and deceleration skills are the keys to successful hunting.  The researchers designed a new type of monitoring collar that included GPS and accelerometers.  No word on whether the collars also allowed cheetahs to play Words with Friends.

This report highlights the things we can learn as we get better and better at measuring.  Conventional wisdom may remain or be turned on its head, and either outcome is fine.  The key is that we have a better basis upon which to understand that wisdom, that we don’t take things for granted, that we question our assumptions.

The cheetah collars also point to how we can gather so much more data on individuals, whether furry or bipedal (or both), than we ever could before.  I’ve recently been made aware of the quantified self movement (HT @bkolko), and what they hope to do is in line with what was done with these cheetahs.  Take individual monitoring and data gathering to new heights.  No, it won’t involve tracking collars (unless, you know, that’s your thing).  But it will involve using technology to measure what previously we could only guess at, and enable decision making and research in new and powerful ways.

Why Derek Jeter being a lousy defensive shortstop gives me hope for innovation in industry

All opinions are my own and do not necessarily reflect those of Novo Nordisk

Hat tip to Jeff Sullivan for the article that sparked this idea.

It used to be we knew what a good defender was in baseball.  And Derek Jeter was a good defender.  He had balletic grace, he scooped up balls and threw them with flair and panache, with an all-but-patented jump-throw that made announcers gush and coaches shake their heads in awe.  He was the complete package, a player who could hit, field, throw and lead, a first ballot hall of famer.

Except that, when you look closely, it turns out his defense is lousy.

Defense used to be measured (still is, by many) via the eye test.  How does a player look when catching balls in play?  And this was backed up by the statistic of fielding percentage.  How many balls did a player field cleanly?  It makes intuitive sense.  The more balls a player fields correctly, why, the better defender he must be, right?

Except that’s only part of defense.  It’s nice if a player can catch a ball well.  But what about balls that get by him?  In the last decade or so, baseball analysts began studying the concept of range.  All things being equal, the realization came, range is actually more important than errors or how a player looks.  It’s one thing to catch everything that gets to within a few steps to the shortstop’s right and left.  It’s another thing entirely to catch 98% of everything spanning the third baseman’s left pocket to the grass on the far side of second base.  When you consider the huge number of balls that are hit in the vicinity of the shortstop every season, and the relative value of a hit versus an out, those extra feet of range translate into saved runs.  And saved runs contribute to wins.

Just as an aside, current defensive metrics suggest Derek Jeter has cost the Yankees over a hundred runs relative to an average shortstop over his career.  Still a hall of famer.  Not a great defender.

However, those saved runs and that increased range come with a cost.  By definition, the best shortstops will have more chances to make a fielding play, and if you get more chances, you are likely to make more errors.  Indeed, the very fact that a great fielding shortstop is able to get to more hard-hit balls on the edge of his range may well lead to a lower overall fielding percentage as well as a higher number of errors.
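To put rough numbers on that tradeoff, here is a minimal sketch.  Every figure in it is made up for illustration, and fielding percentage is simplified to clean plays divided by balls reached (the official statistic counts putouts, assists and errors).  The Shortstop class and both players are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Shortstop:
    """Hypothetical fielder; all numbers are invented for illustration."""
    label: str
    balls_reached: int  # balls in play the fielder gets to
    errors: int         # misplays among the balls he reached

    @property
    def outs_made(self) -> int:
        # Every ball reached and not misplayed is counted as an out.
        return self.balls_reached - self.errors

    @property
    def fielding_percentage(self) -> float:
        # Simplified: clean plays divided by chances.
        return self.outs_made / self.balls_reached


# Same hypothetical season of balls hit toward shortstop for both players.
sure_handed = Shortstop("limited range", balls_reached=500, errors=10)
rangy = Shortstop("wide range", balls_reached=600, errors=24)

for player in (sure_handed, rangy):
    print(f"{player.label}: fielding pct {player.fielding_percentage:.3f}, "
          f"outs {player.outs_made}")

extra_outs = rangy.outs_made - sure_handed.outs_made
print(f"extra outs from range: {extra_outs}")
```

In this made-up season the rangy shortstop commits more than twice as many errors and has the lower fielding percentage, yet he turns 86 more batted balls into outs.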

Fortunately for those shortstops, baseball teams are getting smarter and are realizing the tradeoff is worth it.  Scouting reports regularly cite range in addition to how a player looks, and fielding percentage is low on the list of statistics an organization cares about in evaluating a player.

And that gives me hope for innovation in two ways.  The first is the point above about the eye test.  We trust what we see and feel.  However, that’s not always the complete story.  Often in trying to implement innovation, there’s a gut feeling by those doing the evaluation–this is innovation, that isn’t, I can tell.  Only, anecdotal evidence suggests that, in fact, people often can’t tell.  Just ask Kodak.  However, if baseball can come to realize that the eye test, while important, is just one part of the evaluation package, industries can also learn that lesson and look for other, possibly less subjective ways to measure innovation.

The second relates to two contradictory things that are often said about innovation, sometimes one right after the other.  We need to innovate.  And we need to de-risk it to make sure that it will work.  Unfortunately, there can be no real innovation without the very real risk of failure.  In an interview with Wired magazine, the inventor James Dyson is described as having worked his way through 5127 prototypes of his bagless vacuum cleaner before hitting success.  But if baseball can come to realize that a decreased probability of fielding success is actually a good thing when it means a shortstop is reaching defensive heights few others can, maybe industries can finally realize that failure, in the right cause, is something to be celebrated and embraced.

Gastric bypass surgery and the ever expanding world of GxE interactions

An early publication article from the Proceedings of the National Academy of Sciences reports the fascinating finding that children born to mothers before and after gastric bypass surgery show differences in the expression of genes involved in, among other things, glucose metabolism and immune function.  The study is small, with only 50 children evenly split between cohorts born to moms before and after gastric bypass, but if it replicates, it’s another piece of evidence showing how the environment influences the way our genes function.

Epigenetics has been a hot topic in genetics research for a while now.  It’s clear that DNA methylation changes over time and within an individual and can affect gene expression.  Studies in a number of institutions such as Washington State University in the lab of Michael Skinner have shown that changes can even persist through multiple generations (in rats, at least).  The PNAS report adds another twist in that the gene by environment interaction arose due to a change in maternal health induced by surgery.

There are a lot of implications to this, including the rather theoretical one of whether this knowledge would induce more potential mothers to undergo gastric bypass surgery, and also practical ones of whether weight loss alone without surgery or via, for example, a lap band, would have the same effect.  But the one I wonder about is what this might imply for drug development.

While many genetic variations are known to affect disease risk and progression, as well as drug metabolism, there has been considerable debate on how to use such data.  In many cases, such as with the majority of genome-wide association study (GWAS) hits, the relative risk conferred by discovered variants has been statistically significant but small.  However, as we have seen with Amgen’s purchase of deCODE, drug development companies are keen to use genetic information to help inform their drug development efforts, to find an edge.

In this PNAS report, however, I see a flag of caution.  I applaud the efforts of Amgen and other companies taking these risks, but this report of possible epigenetic effects following maternal surgery also points out how much we’re still discovering about basic human biology, how much we still don’t know about the diseases we study.  Understanding Gene by Environment interactions is, I think, one of the key factors we deal with in developing drugs, and not one to ignore.  And yet, it feels currently like one of those “unknown knowns,” the things we willfully decide not to think about, even though we know it’s there.

Metrics and the Heisenberg quality of gathering data about behavior

Thinking more about the Global Health Metrics Conference, one element that resonated was that measurement does not occur in a vacuum.  When metrics are gathered, and especially when they are gathered out in the open by global health surveys, for example, there’s the real issue of the act of measuring changing the validity of what’s being measured.  I’ve been thinking about this in the context of hiring and workplace management.  For example, if the media were to report that viewership of Khan Academy videos on YouTube was found to correlate highly with creativity in the workplace, I expect two things would happen.  One, viewership of Khan Academy would spike, and second, the metric would rapidly begin to lose what correlative and predictive power it had.  People would try to game the system.
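A toy simulation makes the point concrete.  This is only a sketch under invented assumptions: "creativity" is a random score, video viewing starts out as creativity plus a little noise, and once the correlation is publicized everyone adds a random amount of extra viewing that has nothing to do with creativity.  The names and numbers are hypothetical.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the run is repeatable
n_people = 2000

# Hypothetical latent trait we actually care about.
creativity = [rng.gauss(0, 1) for _ in range(n_people)]

# Before the news story: viewing tracks creativity, plus some noise.
views_before = [c + rng.gauss(0, 0.5) for c in creativity]

# After the news story: everyone pads their viewing by an amount
# unrelated to creativity (people gaming the metric).
views_after = [v + rng.uniform(0, 5) for v in views_before]

corr_before = pearson(creativity, views_before)
corr_after = pearson(creativity, views_after)
print(f"correlation before gaming: {corr_before:.2f}")
print(f"correlation after gaming:  {corr_after:.2f}")
```

Viewership goes up for everyone, and the correlation with the trait drops, which is the spike-then-decay pattern described above.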

In Global Health, where countries are incentivized to meet certain milestones, it requires real thought to either make sure the milestones are strongly causally related to the health goals, or else that the metrics undergo continual fine-tuning to ensure the desired effect.  If the metric were something like number of healthcare facilities, a country could ensure that number increases but there wouldn’t necessarily be a concomitant increase in actual health services delivery.  I’m sure these are topics the Global Health community wrestles with every day.

It’s kind of like with relationships.  On the one hand, we can tell our partners what we want, and often see them do it; on the other hand, don’t we really secretly want them to already know and behave accordingly, because somehow that’s more genuine?  It’s certainly why social science researchers often mislead their study subjects on the actual purpose of behavioral experiments.  Or, to quote from the movie Buckaroo Banzai, “Character is what you are in the dark.”

Ultimately, it seems best to try to measure behavior as closely as possible to the desired outcome.  That’s why baseball is nice.  We want good hitters, and to find good hitters it’s simple:  we measure how well a player can hit.

A Genomics Researcher’s Take on the Global Health Metrics Conference 2013

Over the past three days I had the opportunity to attend the Global Health Metrics Conference here in Seattle.  This is not my field; I’m a genomics researcher working in biomedical research and drug development, but I’ve also been curious about what’s going on in the area of public and global health.  This seemed like a good place to get a crash course.  The Lancet has kindly published all the abstracts and I wanted to give my impressions of what I heard.

First takeaway:  I was surprised and intrigued by how many parallels I saw between the work I do (primarily transcriptomics and genomics) and the work I saw reported.  Sure, global health researchers use surveys rather than high throughput sequencing, and gather data on nations rather than patients, and deal with the complexities of culture and government instead of human biology, and work in the public sphere as opposed to the private, and use a completely different vocabulary than I do, but other than that it was really similar.  So similar I put together this table:

Biomedical Genomics Research | Global Health Metrics
Increasing amount and types of data | Yup
Biomarkers | Indicators
Growing emphasis on efficacy measurements | Ditto
Struggle to understand what tissue, cell, analyte to measure | Struggle to characterize the right metric to demonstrate effects/efficacy
Gene X Environment interactions poorly understood | Local environment effects beginning to be captured
Personalized medicine | Nation specific solutions
Noisy data, lots of unknowns | Maybe even noisier data and, yeah, unknowns
More focus on longitudinal studies | Already there

And so on.  I’ll elaborate on a few more below.  Another immediate takeaway:  I wasn’t even aware of the Institute for Health Metrics and Evaluation (sorry guys).  Now that I am, it’s a place I’d like to visit.

One thing that really impressed me was the work that IHME has put into making the Global Burden of Disease survey lucid, simple and accessible.  The data presentation by Kyle Foreman and Peter Speyer (@Peterspeyer) was terrific.  Not so much for any specific piece of data (although the trends and findings are all pretty fascinating), but rather for their demonstration of the power of dynamic presentation and facile web-based tools.  Static powerpoint charts are clearly so last decade.  Anyone wanting to check out their presentation can go here, or even better just go directly to the site.  As a scientist who also works with large, multifactorial datasets, I know the struggle to condense that data into a usable, comprehensible form.  I think Peter and Kyle have done a great job, and I also like the potential crowdsourcing aspect of it.  As I’ve commented on before, crowdsourcing methods, whether via games or other techniques, have a real potential to fully utilize large datasets and also to solve big problems.

Of the many talks I heard, a few I’ll highlight, just for the specific points I took away.  On the first day, Tanya Marchant showed interesting and cautionary data about making sure that what you’re measuring really measures what you think you’re measuring.  In this case, measuring the presence of skilled birthing assistants as a proxy for maternal care during childbirth turns out to be incomplete because of other factors such as availability of basic medical supplies.  Reminds me of debates over things like how best to measure drug efficacy in clinical trials–for example, response versus progression free survival in oncology.

Joseph Dieleman presented his work looking at the effects of external health aid to developing nations.  In a perfect world, external aid would just be added to pre-existing health expenditures, and after aid expired, local governments would maintain spending at pre-aid levels, or even higher.  Well, turns out this isn’t always the way this happens.  Aid comes in, local health budget gets shifted “temporarily,” but temporarily turns to permanently when the external aid leaves.  One of the thoughts that went through my head during this conference was to remember the law of unintended consequences.

I enjoyed Michael Wolfson’s talk on functional health status.  Coming from an industry that really likes its tried and true measures like HDL/LDL levels, the concept of looking holistically at factors relating to actually feeling good was a nice contrast, and food for thought.

Bruce Hollingsworth had a great quote in his part, “People need incentives to provide accurate data.”  Yeah.  Tell me about it.  In transcriptomics it’s been a mantra for years that “Garbage in, garbage out,” in terms of incoming biological sample integrity and resulting data quality.  From what I saw, the data you can get trying to measure Global Health is maybe even noisier than the kinds of data that I normally deal with.  My main conjecture for why all hope is not lost due to data quality in Global Health is that GH researchers are able to bias the indicators they sample towards things with (hopefully) real meaning, else they would be adrift in a sea of not very useful data.  Maybe they feel that way anyway?  Bruce also made the point that there are external factors, again, which influence health.  Even people who know where to go for the best treatment may not because the facility is too far away.  Location, location, location.

Speaking of garbage (but not in a bad way), David Phillip’s talk later that day referred to the problem of trying to extract useful data out of vital health records full of things like garbage codes.  That is, causes of death that are supremely unhelpful from a public health perspective, such as (I’m exaggerating here) death by lack of life.  His work on extracting useful proportions from this data based on the overall data distribution reminded me of imputation techniques that are used in genomics.
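For a feel of what redistributing garbage codes can look like, here is a deliberately naive sketch.  It is not the method presented in the talk; it simply assumes that deaths filed under an uninformative code can be reassigned to the informative causes in proportion to how often those causes already appear.  The function name, the cause labels and the counts are all hypothetical.

```python
def redistribute_garbage(counts, garbage_codes):
    """Reassign deaths recorded under garbage codes to the remaining causes,
    in proportion to each remaining cause's share of the informative records.

    Naive proportional scheme for illustration only.
    """
    informative = {cause: n for cause, n in counts.items()
                   if cause not in garbage_codes}
    garbage_total = sum(n for cause, n in counts.items()
                        if cause in garbage_codes)
    informative_total = sum(informative.values())
    if informative_total == 0:
        raise ValueError("no informative causes to redistribute onto")
    return {cause: n + garbage_total * n / informative_total
            for cause, n in informative.items()}


# Made-up registry: 100 of 500 deaths carry an uninformative code.
deaths = {
    "heart disease": 300,
    "stroke": 100,
    "cause unspecified": 100,
}
adjusted = redistribute_garbage(deaths, {"cause unspecified"})

for cause, estimate in adjusted.items():
    print(f"{cause}: {estimate:.0f}")
```

The total number of deaths is preserved; only the uninformative bucket disappears.  A real analysis would presumably condition on things like age, sex and location rather than use one overall proportion, much as genomic imputation leans on local structure rather than a global average.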

There were many more engaging talks, and I also had great conversations at lunch with different people. I suppose I shouldn’t be surprised by the similarities.  I think many research fields these days are converging on a similar emphasis on big data, analytics, efficacy, and finding the right metrics.  I also appreciated the long view shown by so many of these programs.  One of the drawbacks of private industry is the prevalence, often, of the short term view.  I could wish we had the decades-long commitment shown by various Global Health initiatives.

The aspect I find daunting in Global Health is how much uncertainty that community is dealing with, which greatly affects efficacy and efficiency.  An intervention might be exactly the right one when viewed in isolation, but can be so easily derailed by external factors.  Like biology, like baseball, it seems the key thing is to find the metrics that at least tell you that you made the change you hoped for, with the understanding that what happens at the end is so often, unfortunately, out of our control.