Baseball, regression to the mean, and avoiding potential clinical trial biases

This post originally appeared on The Timmerman Report. You should check out the TR.

It’s baseball season. Which means it’s fantasy baseball season. Which means I have to keep reminding myself that, even though it’s already been a month and a half, that’s still a pretty short time in the long rhythm of the season, and every performance has to be viewed with skepticism. Ryan Zimmerman sporting a 0.293 On Base Percentage (OBP)? He’s not likely to end up there. On the other hand, Jake Odorizzi with an Earned Run Average (ERA) less than 2.10? He’s good, but not that good. I try to avoid making trades in the first few months (although with several players on my team on the Disabled List, I may have to break my own rule) because I know that in small samples, big fluctuations in statistical performance don’t really tell us much about actual player talent.

One of the big lessons I’ve learned from following baseball and the revolution in sports analytics is that one of the most powerful forces in player performance is regression to the mean. This is the tendency for most outliers, over the course of repeated measurements, to move toward the mean of both individual and population-wide performance levels. There’s nothing magical about it; it’s just simple statistical truth.
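To make that concrete, here is a minimal simulation (all numbers invented for illustration: 100 hitters, true-talent OBPs drawn around .320, two half-seasons of 300 plate appearances each). The hitters who top the first-half leaderboard do so partly through luck, so as a group they fall back toward the league average in the second half:

```python
import random

random.seed(42)

# 100 players, each with a hypothetical true OBP ("talent") around .320
players = [random.gauss(0.320, 0.020) for _ in range(100)]

def observed_obp(talent, pa=300):
    # Observed OBP over pa plate appearances = binomial noise around true talent
    times_on_base = sum(random.random() < talent for _ in range(pa))
    return times_on_base / pa

first_half = [observed_obp(t) for t in players]
second_half = [observed_obp(t) for t in players]

# Pick the top 10 performers of the first half...
top10 = sorted(range(100), key=lambda i: first_half[i], reverse=True)[:10]

avg_first = sum(first_half[i] for i in top10) / 10
avg_second = sum(second_half[i] for i in top10) / 10

# ...and compare their averages across the two halves. The second-half
# average sits closer to the league mean: regression to the mean.
print(f"Top-10 first half:  {avg_first:.3f}")
print(f"Top-10 second half: {avg_second:.3f}")
```

Nothing about the simulation "remembers" the first half; the pullback happens simply because extreme observed performances are disproportionately likely to contain good luck that doesn't repeat.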

And as I lift my head up from ESPN sports and look around, I’ve started to wonder if regression to the mean might be affecting another interest of mine, and not for the better. I wonder if a lack of understanding of regression to the mean might be a problem in our search for ways to reach better health.

What $85 million could get the NFL: thinking about the NFL concussion settlement

All opinions are my own and do not necessarily reflect those of Novo Nordisk.

Yesterday the NFL and the NFL Players Association reached a settlement concerning compensation for concussions and other football-related injuries. The lawsuit was brought by former NFL players who claimed, among other things, that the NFL downplayed the risk of concussions despite having knowledge of their effects and also did not do all it could to help former players.

The total amount earmarked for the settlement is reported to be $765 million, with the vast majority ($675 million) in a fund to support former players and families in dealing with the aftermath of concussions. Commentators have noted that this appears to be a great victory for the NFL. First, the amount of money is less than many expected even with a settlement. Second, the NFL did not have to go through discovery, which would have laid open exactly what the NFL did know about concussions and possible side effects, as well as potentially other damaging information that, once released in court, could never be private again.

It seems likely that those who were bringing forward the suit settled because they were motivated to help the most needy members of their group. Many former NFL players are suffering dementia and lingering aftereffects from their playing days. Some families of deceased players will also benefit. The former player pool can’t really afford to wait out the protracted timeline of a trial and subsequent appeals, since in the interim many would fall into poverty and even poorer health; some could also die.

Are market cap and present cash flows the best way to measure innovation?

All opinions are my own and do not necessarily reflect those of Novo Nordisk

Forbes, with the help of the folks from The Innovator’s DNA, recently published their coverage and rankings of the 100 most innovative companies. I’m particularly interested in their ranking method, as it contains elements that are near and dear to my heart–namely, metrics and crowdsourcing. In a nutshell, they describe how they use a company’s current market capitalization, along with its current net present value based on cash flows, to extrapolate how much the market feels the company has in potential. The method nicely incorporates crowdsourcing in that the market cap measures how much investors as a whole think a company is really worth, now and in the future, and if that’s higher than expected based on cash flows, that suggests investors are factoring in a bonus to value based on future expectations. Higher future expectations are interpreted as investors seeing a particular company as innovative and having the potential for great leaps forward in offerings and/or income.
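The arithmetic behind this kind of "innovation premium" can be sketched in a few lines. This is a toy version of the idea described above, not Forbes’ actual methodology; the cash flow, growth rate, discount rate, and market cap are all invented numbers:

```python
def npv_of_cash_flows(annual_cash_flow, growth, discount_rate, years=20):
    """Present value of the cash flow stream from the existing business."""
    return sum(
        annual_cash_flow * (1 + growth) ** t / (1 + discount_rate) ** t
        for t in range(1, years + 1)
    )

market_cap = 50_000_000_000          # what the crowd says the firm is worth

existing_value = npv_of_cash_flows(
    annual_cash_flow=2_000_000_000,  # current yearly cash flow (hypothetical)
    growth=0.02,                     # modest growth of the existing business
    discount_rate=0.08,
)

# Whatever the market pays beyond the value of the existing business is
# attributed to expected innovation.
innovation_premium = (market_cap - existing_value) / market_cap

print(f"Value of existing business: ${existing_value:,.0f}")
print(f"Innovation premium: {innovation_premium:.0%}")
```

In this sketch, roughly half the company’s market value can’t be explained by its current business, and that gap is read as the crowd’s bet on future innovation.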

I really like using the crowd in this way, and would love to see an analysis that retrospectively looks at these kinds of values over, say, 1970-1990, and combines that with a mature assessment of which companies have been adjudged by business historians to truly have been innovative standouts, which is not the same as business successes.  We say now that Bell Labs was one of the most innovative places on the planet in the 1900s.  Would the same have been said at the time?

At the same time, I can’t help musing whether this process couldn’t be made even better. Recognizing innovation when it’s happening has obvious advantages for anyone looking to get into the next amazing thing, whether as a participant, an investor, or a policy maker. So let’s start by examining where there might be shortfalls in the Innovator’s DNA method.

Fielding percentage for UK surgeons

All opinions are my own and do not necessarily reflect those of Novo Nordisk.

Last week I posted on how our measurements of defense in baseball have become a lot more sophisticated, and how that gave me hope for the evaluation of innovation. If baseball, one of the most tradition-bound of US sports, can adapt to new metrics, surely business can too.

I was reminded of this with the publication of a recent article about the National Health Service (NHS) in the United Kingdom and their plan to publicize the surgical success rates of clinicians across their country.  Surgeons in eight different specialities will have their mortality rates for specific procedures, length of hospital stays post surgery, and other elements published in tables for anyone to access.  The first group to have this information released is vascular surgeons.

A fascinating aspect of how this is being done is that publication of one’s rates is voluntary, but if a surgeon chooses not to have his or her rates published, that surgeon will be named.  It’s not quite putting people into stocks in the public square, but it is definitely a form of public shaming meant to increase participation.

Nevertheless, six surgeons have opted out and been named. Game theory might predict that these are surgeons on the low end of the measured metrics, taking a calculated risk that the negatives associated with not publishing their rates are less than the negatives that would come with disclosure. But that’s not the case. The NHS has stated that none of these surgeons lie outside the normal range for the reported metrics.

Instead, these doctors are protesting that the metrics are not measuring the right things.   They suggest the metrics don’t take into account the subtleties involved in surgical cases, how procedure names alone don’t properly capture how difficult or easy a procedure might be for a given patient.  Are there comorbidities?  Is a patient in generally poor health?  Is a surgeon one who specializes in tricky, difficult cases which would therefore lead to a lower success rate even though the surgeon him or herself might be highly skilled and effective?  Could these metrics scare new surgeons away from performing more difficult procedures?

This echoes the debate about defense in baseball, and whether standard metrics such as fielding percentage are the best for measuring defensive ability, or if more elaborate measures better reflect reality.

Still, while I agree with the viewpoint that we should always try to improve metrics, I also think the NHS is doing the right thing.  I think in this case the proper analogy might be baseball defense back at the time before the invention of fielding percentage.  In the practice of medicine world-wide there is a surprising lack of information about measures like success rates and efficacy.  As Sir Bruce Keogh said to the BBC: “This has been done nowhere else in the world, and I think it represents a very significant step.”  To take another quote from the article, Professor Ben Bridgewater commented, “We’ve been collecting data on cardiac surgery since 1996 and we’ve been publishing it at individual surgeon level since 2005, and what we’ve seen associated with that is big improvements in quality: the mortality rates in cardiac surgery today are about a third of what they were ten years ago.” That which we don’t measure, we can’t improve.

In the US, that idea is becoming more prominent.  Recent articles in Time and the New York Times have highlighted how transparency is lacking in the United States healthcare system, and the Obama Administration’s emphasis on comparative effectiveness is another thrust in that direction.  What the NHS is doing is a great model and a great start, and I hope they continue to both make these aspects of healthcare more transparent and work to refine their metrics so that they accurately reflect the difficulty of practicing good medicine.

Cheetahs hunting redux: the next step in measuring baseball defense?

I had another thought about the collars that were used to measure cheetah hunting behaviors. For a summary that is not behind a paywall, see here. How long will it be before tools like these are used to measure baseball players playing defense on the field? Tools like FIELDf/x quantify the behavior of baseball players from an external viewpoint. Sportvision’s cameras record elements of the game like positioning, how quickly a defender moves, the kind of jumps he takes when getting to (or missing) the ball, and overall range. This allows a much clearer view of a defender’s territory, ability to reach difficult balls, and general quality.

Now, what if that were combined with the kinds of tools that were used to measure cheetahs?  As the authors of the article point out, the collars they designed could record “some of the highest measured values for lateral and forward acceleration, deceleration and body-mass-specific power for any terrestrial mammal.”  If it can do that for cheetahs, it can certainly do that for Brendan Ryan and Mike Trout, much less Derek Jeter or Raul Ibanez.  By the way, this would obviously not be implemented as a collar.  You don’t have to drug and tag shortstops.  At least not for these purposes.

Instead, these monitoring devices would be attached to the body, possibly in multiple places, to capture kinesthetics. Now, one might ask: can’t all this data just be captured from the Sportvision video feed, algorithmically extracting things like acceleration, body positioning, and so on? Quite possibly; I don’t know enough about that technology. But what about actions taken on fields that are not equipped with Sportvision cameras, which is to say, most of them?

That might end up being the sweet spot for implementing this technology, as an adjunct to training, coaching and scouting. Being able to measure how quickly a high school shortstop actually reacts to the batted ball, based on his lateral acceleration and ability to accelerate and decelerate, would provide a more proximal measure of athleticism when making scouting evaluations. It can also allow quantification of areas for improvement, as well as a measure of progress during coaching. And using these kinds of monitors can also help answer questions about what really is important for defense, based on a comparison of proximal, immediately measured body motions and more distal metrics such as UZR.
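The underlying calculation is simple once you have position samples, whether they come from a body-worn sensor or a camera system. Here is a minimal sketch using finite differences; the 10 Hz sampling rate and the lateral positions of a (hypothetical) shortstop breaking toward a ball are invented for illustration:

```python
dt = 0.1  # seconds between samples (10 Hz sensor, assumed)

# Hypothetical lateral positions in meters: a slow first step,
# then steady acceleration toward the ball.
positions = [0.0, 0.03, 0.12, 0.27, 0.48, 0.75, 1.08, 1.47, 1.90, 2.35]

# Finite differences: velocity between consecutive samples,
# then acceleration between consecutive velocities.
velocities = [(b - a) / dt for a, b in zip(positions, positions[1:])]
accelerations = [(b - a) / dt for a, b in zip(velocities, velocities[1:])]

peak_speed = max(velocities)
peak_accel = max(accelerations)

print(f"Peak lateral speed: {peak_speed:.1f} m/s")
print(f"Peak lateral acceleration: {peak_accel:.1f} m/s^2")
```

A real system would need noise filtering and multi-axis data, but the point stands: once raw motion is captured, first-step quickness becomes a number you can compare across prospects rather than a scout’s impression.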

Like any of these kinds of quantified self tools, though, it remains to be seen how useful this extra data will be.  However, for the savvy organization at any level, I think these kinds of tools are worth thinking about.