Baseball, regression to the mean, and avoiding potential clinical trial biases

This post originally appeared on The Timmerman Report. You should check out the TR.

It’s baseball season. Which means it’s fantasy baseball season. Which means I have to keep reminding myself that, even though it’s already been a month and a half, that’s still a pretty short time in the long rhythm of the season and every performance has to be viewed with skepticism. Ryan Zimmerman sporting a 0.293 On Base Percentage (OBP)? He’s not likely to end up there. On the other hand, Jake Odorizzi with an Earned Run Average (ERA) less than 2.10? He’s good, but not that good. I try to avoid making trades in the first few months (although with several players on my team on the Disabled List, I may have to break my own rule) because I know that in small samples, big fluctuations in statistical performance are, in the end, not really telling us much about actual player talent.

One of the big lessons I’ve learned from following baseball and the revolution in sports analytics is that one of the most powerful forces in player performance is regression to the mean. This is the tendency for most outliers, over the course of repeated measurements, to move toward the mean of both individual and population-wide performance levels. There’s nothing magical, just simple statistical truth.
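For anyone who wants to see the effect rather than take it on faith, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration, not real player data: a league of hypothetical hitters whose true OBP talent is spread around .320, with 150 early plate appearances and 450 later ones, and each plate appearance treated as an independent coin flip. The function names are my own.

```python
import random
import statistics


def simulate_obp(n_players=2000, early_pa=150, late_pa=450, seed=42):
    """Simulate early- and late-season OBP for hypothetical players.

    All numbers are illustrative assumptions: true talent is drawn from a
    normal distribution around .320 and clipped to a plausible range, and
    each plate appearance is an independent on-base / not-on-base trial.
    """
    rng = random.Random(seed)
    players = []
    for _ in range(n_players):
        talent = min(max(rng.gauss(0.320, 0.025), 0.200), 0.450)
        early = sum(rng.random() < talent for _ in range(early_pa)) / early_pa
        late = sum(rng.random() < talent for _ in range(late_pa)) / late_pa
        players.append((talent, early, late))
    return players


def summarize_hot_starts(players, top_fraction=0.10):
    """Compare the hottest early starters with how they do afterwards."""
    ranked = sorted(players, key=lambda p: p[1], reverse=True)
    hot = ranked[: max(1, int(len(ranked) * top_fraction))]
    return {
        "league_mean": statistics.mean(p[1] for p in players),
        "hot_talent": statistics.mean(p[0] for p in hot),
        "hot_early": statistics.mean(p[1] for p in hot),
        "hot_late": statistics.mean(p[2] for p in hot),
    }


if __name__ == "__main__":
    summary = summarize_hot_starts(simulate_obp())
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the hottest starters keep outperforming the league the rest of the way, but by much less than their early numbers suggested. Part of a hot start is real talent, part is luck, and only the talent carries over.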

And as I lift my head up from ESPN sports and look around, I’ve started to wonder if regression to the mean might be affecting another interest of mine, and not for the better. I wonder if a lack of understanding of regression to the mean might be a problem in our search for ways to reach better health.

Making Change

And now for something completely different! Short fiction in honor of the recent unveiling of the Apple Watch and HealthKit.

“I wouldn’t eat that if I were you.”

Sylvia paused, bacon cheeseburger halfway to her mouth, and peered at the neon green band wrapped around her wrist. The wraparound touchscreen was currently showing a cat emoji. It had a frowny face, expression halfway between puzzlement and alarm.

“What did you say?”

“I’m just saying,” said her Best Buddy wristband, “that when we met a few weeks ago, you mentioned wanting to keep your weight in a specific range.” The emoji shrugged. “Little friendly reminder. You know?”

Sylvia carefully put the burger back down and resisted the urge to lick grease off her fingers. She fumbled for her napkin, her fingers leaving translucent streaks on the thin, white paper.

“I–well, yeah. But, I mean, you’ve never said anything like this before like when–” She broke off, remembering the milkshake, the onion rings, the King-size Choconut bar…

“Well it’s not the first thing you do, is it? When you meet someone and you’re just getting to know them?” The cat had morphed into a light pink, animated mouse, standing on its hind legs, bashfully kicking one leg. “But now, we’re friends!”

Baseball, Bayes, Fisher and the problem of the well-trained mind

One of the neat things about the people in the baseball research community is how willing many of them are to continually question the status quo. Maybe it’s because sabermetrics is itself a relatively new field, and so there’s a humility there. Assumptions always, always need to be questioned.

Case in point: a great post by Ken Arneson entitled “10 things I believe about baseball without evidence.” He uses the latest failure of the Oakland A’s in the recent MLB playoffs to highlight areas of baseball we still don’t understand, and for which we may not even be asking the right questions. Why, for example, haven’t the A’s advanced to the World Series for decades despite fielding good and often great teams? Yes, there’s luck and randomness, but at some point the weight of the evidence encourages you to take a second look. Otherwise, you become as dogmatic as those who still point to RBIs as the measure of the quality of a baseball batter. Which they are not.

One of the thought-provoking things Arneson brings up is the question of whether the tools we use shape the way we study phenomena–really, the way we think–and therefore unconsciously limit the kinds of questions we choose to ask. His example is the use of SQL in creating queries, and the assumption built into that data model that records within a SQL database are individual events with no precedence or dependence upon others. And yet, as he points out, the act of hitting a baseball is an ongoing dialog between pitcher and batter. Prior events, we believe, have a strong influence on the outcome. Arneson draws an analogy to linguistic relativity, the hypothesis that the language a person speaks influences aspects of her cognition.

So let me examine this concept in the context of another area of inquiry–biological research–and ask whether something similar might be affecting (and limiting) the kinds of experiments we do and the questions we ask.


Could pro sports lead us to wellness?

Comment From Bill
St. Louis is being hindered in the stretch drive by some kind of GI bug passing through (so to speak) the team. Reports have as many as 15 guys down with it at once. That seems a lot, but given the way a baseball clubhouse works, my question is why don’t we see more of that? Answering that baseball players are fanatically interested in sanitation and hygiene ain’t gonna cut it, I don’t think…

12:10
Dave Cameron: They have access to a lot of drugs.

–comment from a chat at Fangraphs, September 24, 2014

So this comment caught my eye. Ever since I began following sites like BaseballProspectus.com and Fangraphs.com, and reading things like Moneyball, I’ve found myself thinking about efficiency and unappreciated or unexplored resources in different situations.

I realize this was a throwaway line in a baseball chat. But it piqued my interest because it seems to point out something that’s maybe underappreciated and understudied about how sports teams go about their business–specifically, the kinds of things they do to keep their athletes healthy.

My question is, does this represent a potential source of “Found Research” data that could help the rest of us reach wellness?