- FanGraphs Baseball - https://www.fangraphs.com/blogs -

The Humility of Statistical Projection

It’s “projection week” here at FanGraphs, which is a nice coincidence, since I was going to post about projections, anyway. While I dabble with my own projections (which probably will never see the light of day), no one wants to hear about that. Instead, I’ve just assembled some (very) non-technical reminders that might be helpful when looking at projections.

I’ve often heard the complaint that projections are “arrogant,” “put too much faith in the numbers,” or the classic “they rely on what a player has already done, but they don’t tell you what a player will do.” I want to emphasize that projection systems are not based on esoteric “tricks,” but rather on the fact that we don’t know very much about the player from the numbers.

Projection is not divination. I’ve sometimes heard that projection systems aren’t worth looking at because “after all, they projected an .800 OPS for player x and he ended up with an .850 OPS.” That’s a straw man, but it gets at the general point: projections are not prophetic divinations of the future, but attempts to measure the “true talent” of players at any given point in time. The “general formula” for player performance is: true talent + luck + environment. (I’ll table discussion of parks and aging for now.)

The problem is that we don’t know, at least from the raw stats, what exactly is “luck” and what represents a player’s “true talent.” Moreover, “luck” doesn’t just mean things like BABIP rates. Even a player getting 700 PA in a season will have varying levels of performance around his true talent, what we call “hot streaks” or “cold streaks.” (Cf. Willie Bloomquist, April 2009.) To single these streaks out raises the question: how do we distinguish the “streaks” from the “true talent” parts of the seasons from which the projections draw? Projection systems use different methods; here I’ll mention basic factors that are used by most good projection systems. These may be old hat, but they are worth discussing because of how often they are passed over.

Regression to the mean. This is a very important concept, so important that I’m leery of screwing up the explanation. The best introductory piece I’ve read is one by Dave Studeman. In short: given a lack of any other information about a player, our “best guess” is that he’s an average member of (some particular) population. The more data we have on the player, the more we can separate him from the “average” population. This is one place where sample size issues come into play. [Note that there is a great deal of debate about how to regress, e.g., what the “population” should be. For examples, search at The Book Blog or Baseball Think Factory.]
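As a rough illustration of the idea, here is a minimal sketch of one common way to regress an observed rate toward a population mean: add a fixed amount of “average” playing time to the player’s actual record. The function name, the .330 population mean, and the 200 PA of added average performance are all assumptions made up for this example, not values from any particular projection system.

```python
def regress_to_mean(observed_rate, pa, population_mean, added_pa):
    """Blend an observed rate with the population mean.

    added_pa is a hypothetical amount of league-average playing time
    mixed into the player's record; the larger it is relative to the
    player's real PA, the harder the estimate is pulled to the mean.
    """
    return (observed_rate * pa + population_mean * added_pa) / (pa + added_pa)


# Illustrative (made-up) inputs: a .400 rate against a .330 population mean.
small_sample = regress_to_mean(0.400, pa=50, population_mean=0.330, added_pa=200)
large_sample = regress_to_mean(0.400, pa=600, population_mean=0.330, added_pa=200)

print(round(small_sample, 4))  # 0.344
print(round(large_sample, 4))  # 0.3825
```

With the same observed rate, the 50 PA estimate stays close to the population mean while the 600 PA estimate moves most of the way toward the player’s own number. That is the sample size issue in miniature.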

Weighted average. Say a projection involves the last three years of performance. Do you simply take the three-year average? Well, no: true talent can change from year to year. More recent years are thus weighted more heavily (5-4-3 for hitters and 5-3-2 for pitchers are common weights). Playing time matters, too. Alex Gordon had a .321 major-league wOBA in 2009, and a .344 in 2008. Do we automatically assume that .321 is closer to his true talent? No, because the .321 was in only 189 PA, while the .344 was in 571 PA.
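To make the Gordon example concrete, here is a small sketch that combines the recency weights with playing time by multiplying each season’s PA by its weight. That way of combining the two is an assumption for illustration, and the function name is hypothetical. Only the two seasons quoted above are used, so the third weight goes unused.

```python
def weighted_rate(seasons, recency_weights):
    """Recency- and PA-weighted average of a rate stat.

    seasons: list of (rate, pa) tuples, most recent season first.
    recency_weights: weights in the same order, e.g. [5, 4, 3].
    Each season counts in proportion to weight * PA (an assumed scheme).
    """
    numerator = 0.0
    denominator = 0.0
    # zip stops at the shorter list, so extra weights are simply ignored.
    for (rate, pa), weight in zip(seasons, recency_weights):
        numerator += weight * pa * rate
        denominator += weight * pa
    return numerator / denominator


# Alex Gordon's wOBA and PA as quoted above: 2009 first, then 2008.
gordon = [(0.321, 189), (0.344, 571)]
estimate = weighted_rate(gordon, [5, 4, 3])

print(round(estimate, 3))  # 0.337
```

Even with 2009 weighted more heavily, the result lands closer to .344 than to .321, because 2008 carries about three times as many plate appearances.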

This isn’t all there is to projection, but you’d be surprised how much work those basic concepts do. Tom Tango’s Marcel works entirely from a weighted average, regression, and a very basic age adjustment, and it hangs in with the “big boys” pretty well. No projection system will ever be perfect, of course. Part of that is the influence of “luck” and the limited samples we have from all players. Part of it is also that some players don’t have that much information available on them. Players develop differently.

The point is that we simply don’t know ahead of time which players will be exceptions. Projection systems generally do better when judged on how they project groups of players, rather than on individual successes or failures, as in the case of Matt Wieters (ahem). The point I’ve been trying to make in a roundabout way is that regression, weighted averages, generic aging curves, etc. might miss on certain players, but are based on studies that show how most players would do. They are humble confessions of ignorance on an individual level, but are still the best overall bet. Expecting anything more leads to folly.

One might express the difference as that between making a conservative, diversified investment and “just knowing” that Enron stock will continue to rise. Tough choice.

More later this week on “breakouts,” “outliers,” and other traps.