It is not the case that “the thing you regress to has to be an estimate of true talent.” You regress towards the mean.

You are confusing two topics. (1) Over the long run, players will perform at a level consistent with their true talent. This is the definition of true talent. (2) Mean reversion, which says that those with high measured values were, on average, luckier than those with low measured values.

Sons will tend to be closer to average height than their fathers, and fathers will tend to be closer to average height than their sons.

Sir Francis Galton was a eugenicist interested in how desirable characteristics pass from generation to generation. He learned that when he found parents with some great characteristic, their offspring tended to be more like the population as a whole (more average) than their parents. He called this process regression and invented regression analysis to study its extent.

Applied to baseball, it means that if you take the batting average of the top 10 players in 2009, and take their batting average in 2010, it will be lower, or (more precisely) closer to the average. They got into the top group partly because they were really good batters and partly because they were lucky. The average is not just some point chosen because it is there and useful: it is what you revert to!
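To make the top-10 example concrete, here is a rough simulation sketch. Everything in it is an assumption for illustration, not real data: the league size, the 500 at-bats, the spread of true talent around .265, and the function name `simulate` are all made up. Each player's true talent is held fixed across both seasons, so any drop for the top group comes purely from luck not repeating.

```python
import random


def simulate(n_players=300, at_bats=500, top_n=10, seed=0):
    """Simulate two seasons with fixed true talent and binomial luck.

    All parameters are hypothetical; they are not fitted to real MLB data.
    """
    rng = random.Random(seed)

    # Assumed true-talent batting averages, centered on .265.
    talents = [rng.gauss(0.265, 0.020) for _ in range(n_players)]

    def season(talent):
        # Each at-bat is an independent hit/no-hit trial.
        hits = sum(rng.random() < talent for _ in range(at_bats))
        return hits / at_bats

    year1 = [season(t) for t in talents]
    year2 = [season(t) for t in talents]

    # Pick the top group by *measured* year-1 average, as a fan would.
    top = sorted(range(n_players), key=lambda i: year1[i], reverse=True)[:top_n]

    def mean(xs):
        return sum(xs) / len(xs)

    return {
        "league_year1": mean(year1),
        "top_year1": mean([year1[i] for i in top]),
        "top_talent": mean([talents[i] for i in top]),
        "top_year2": mean([year2[i] for i in top]),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions the top group's second season should land between their first season and the league average: still better than average because they really are good, but lower than before because part of the first season was luck.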

It might be the case that those aggressive putts for par sometimes end up in double-bogeys. Perhaps the frequency of those double-bogeys exactly counterbalances the greater frequency of making those par putts, so that everything is even-steven. But that would be remarkable! More likely, the more aggressive approach to par putting is either a good strategy or a bad strategy overall, and either case proves my point: with humans, the past affects the future in ways that could affect the regression path.

Regression is about returning to true talent level. Luck – good or bad – is just noise around that expected level.

You’re misunderstanding why we regress and the inclusion of league average in regression.

We regress to league average because we only learn so much about what to expect from a player with a certain number of ABs (or whatever appropriate trial type we’re concerned with), and the best (simple) assumption in that case is that they’re closer to average than they appear, so we regress them to league average a certain amount.

This has nothing to do with cancelling out luck, though it does help to compensate for lucky events.
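One common way to write "regress them to league average a certain amount" is as a weighted average, where the weight on the player's own numbers grows with his number of trials. This is only a sketch: the function name `regress_to_mean` and the constant `k` (how many trials it takes before we trust the player's own rate as much as the league's) are assumptions, and the right `k` depends on the stat in question.

```python
def regress_to_mean(observed_rate, trials, league_rate, k):
    """Shrink an observed rate toward the league rate.

    k is a hypothetical regression constant: at trials == k the estimate
    sits exactly halfway between the observed rate and the league rate.
    """
    weight = trials / (trials + k)  # share of trust placed in the player
    return weight * observed_rate + (1 - weight) * league_rate


if __name__ == "__main__":
    # Made-up numbers: a .350 hitter over 100 AB in a .265 league, k = 300.
    print(round(regress_to_mean(0.350, 100, 0.265, 300), 5))
```

With few trials the estimate stays near league average; as the trials pile up it moves toward what the player has actually done.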

Aaron Hill might make a good case study here.

It’s been oft-written that his abysmal .196 BABIP last year was “bad luck”. If so, he’s surely taking his time “regressing” to his “norm”, given his .200 BABIP so far this season (albeit based on a teeny tiny sample size).

Whatever the cause(s) (mechanics, pitch recognition, pitching patterns), his problems are starting to look less aberrational.

You can see both in the lottery. People expect numbers that have already won not to win again (so there are books of previously winning numbers) and people expect gas stations that sell winning tickets to keep selling winning tickets. Both are incorrect.

For Pujols, mean reversion means that his 2011 is likely to be more like other players' 2011 than his previous years were like other players' previous years. So far, that prediction looks like it will be spot on.
