In my Monday post about the White Sox’s recent success beating preseason projections, I included a statement that I’ve made a few times over the last few years:
But also, please just keep in mind that projections are not predictions. They are a snapshot of what we think a team’s median true talent level might be, and it should be understood that there’s a pretty sizable margin for error based on things that projection systems simply can’t forecast, and also the errors that come from having imperfect information or imperfect calculations.
I wrote about this distinction a couple of years ago, but I think it’s worth delving into the differences again. For one, FanGraphs has gotten a lot larger over the last few years, so many of you might not have read that piece. I also think there are a few things that I could have stated better in that article, and I want to give more context for why I see the distinction as meaningful rather than as a semantic argument with no practical use.
Let’s start out by acknowledging that predictions are a subset of projections. Or, to put it another way, every prediction is a projection, but a projection isn’t necessarily a prediction. I know that’s a bit of a tongue twister, and it seems like a semantic difference, but think of it like this: Mothers are women, but not all women are mothers. No one would suggest that it is simply semantics to clarify whether a woman is or is not a mother. There’s a meaningful difference there.
So it is with predictions and projections. A prediction is essentially a projection where there is a high degree of confidence in a specific outcome. Not all projections lead to that kind of confidence in one result, however. In fact, in many cases, an accurate projection will result in a range of outcomes where there is no single result that is likely to occur.
Let’s take the NBA’s Draft Lottery, for instance. The 14 non-playoff teams get various combinations of numbers assigned to them, and those numbers correspond to 14 ping pong balls that are placed into a lottery machine. The team with the worst record gets 250 of the 1,000 possible combinations, and then the second worst team gets 199, and each successive team gets fewer than the one in front of them, down to the 14th worst team getting just five of the 1,000 possible combinations. Because the NBA doesn’t want teams to drop too far by random chance, they only draw for the first three selections, and then the remaining teams are slotted in from #4 to #14 based on win-loss record in the previous year. The Wikipedia entry on the draft lottery has a pretty nifty chart showing the various odds of each outcome, which we’ll reproduce here:
If you were the only person on the planet who knew those odds, what kind of predictions would you be willing to make? Would you predict that the team with the #1 seed would win the first overall pick? I’d hope not, because you’d be wrong three times out of four, even though you were selecting the most likely outcome every single time. On the other hand, you probably would be willing to predict that the #14 seed would pick 14th, because a 98.2% chance of being right is pretty darn good.
Depending on how much you value your own credibility, you might even be willing to predict the outcome of picks #8 through #13, since the likelihood of being right on each was greater than 72%. You wouldn’t always be right, but you’d be right often enough that your overall record would come out looking pretty good. But, hopefully, you’d be wise enough to steer away from predicting any kind of specific result for anything in the top 7, where your odds of being right would be between 25% and 60%, meaning you’d be taking the side of something close to (or worse than) a coin flip in each case. If someone asks you to predict who is going to win the #1 overall pick in the NBA Draft Lottery, a correct interpretation of the data is simply “I don’t know.”
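For anyone who wants to check those odds rather than take the chart on faith, here is a minimal sketch that enumerates every possible set of three lottery winners and slots the remaining teams by record. Note the assumption: the text above only gives the 250, 199, and 5 combination counts, so the full list in `COMBINATIONS` is filled in for illustration, and the function names are my own.

```python
from itertools import permutations

# ASSUMPTION: full per-seed combination counts (out of 1,000) under the
# lottery format described above. The article itself only states 250, 199,
# and 5; the values in between are filled in for illustration.
COMBINATIONS = [250, 199, 156, 119, 88, 63, 43, 28, 17, 11, 8, 7, 6, 5]


def pick_distribution(combos=COMBINATIONS, drawn=3):
    """Return dist[seed][pick] = probability that `seed` ends up with `pick`.

    Seeds and picks are 1-indexed; seed 1 is the team with the worst record.
    """
    total = sum(combos)
    seeds = range(1, len(combos) + 1)
    dist = {s: {p: 0.0 for p in seeds} for s in seeds}

    # Enumerate every ordered set of lottery winners for the drawn picks.
    for winners in permutations(seeds, drawn):
        prob = 1.0
        remaining = total
        for s in winners:
            # A repeat winner is redrawn, which is the same as conditioning
            # on the combinations held by teams that have not won yet.
            prob *= combos[s - 1] / remaining
            remaining -= combos[s - 1]

        # Winners take picks 1..drawn; everyone else is slotted by record.
        order = list(winners) + [s for s in seeds if s not in winners]
        for pick, s in enumerate(order, start=1):
            dist[s][pick] += prob
    return dist


def most_likely_slot(dist, seed):
    """Return (pick, probability) for the seed's single most likely pick."""
    pick = max(dist[seed], key=dist[seed].get)
    return pick, dist[seed][pick]


if __name__ == "__main__":
    dist = pick_distribution()
    for seed in (1, 8, 14):
        pick, prob = most_likely_slot(dist, seed)
        print(f"seed {seed}: most likely pick is #{pick} ({prob:.1%})")
```

The point of the exercise is the same as the chart's: the table of probabilities is the projection, and only the rows with one dominant outcome are safe to turn into a prediction.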
Preseason win-loss projections for Major League teams are much like the NBA draft lottery, just with the caveats that we’re not dealing with perfectly known variables and there’s no artificial floor placed below each team to keep them from crashing due to random variation. With all of the unknowns that are simply outside of the realm of forecasting, every possible win-loss record you could dream up for any team is unlikely. It doesn’t matter how good or how bad the team is; the spread of talent across the league is simply not large enough to give us the confidence in any single win-loss record that a prediction requires, given all of the variables that we know we can’t forecast with any kind of certainty.
It doesn’t mean that these forecasts are useless, of course. Despite having a range of unlikely outcomes, we can still come up with a projection that is likely enough to occur for us to make a prediction, but that projection has to be a range of numbers, not a single outcome. Since even the best projection systems tend to have standard deviations from actual win-loss results of 6-10 wins, we can take roughly eight wins as a typical standard deviation and say with something like 95% confidence that a team will finish within two standard deviations, or +/- 16 games, of its mean projection. So, you could confidently predict that a team with a projected 81-81 record would win between 65 and 97 games.
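The arithmetic behind that range can be sketched in a few lines. This assumes projection errors are roughly normally distributed, which is a simplification on my part, and the helper names `win_range` and `coverage` are hypothetical.

```python
from statistics import NormalDist


def win_range(mean_wins, sd, z=2.0):
    """Return the (low, high) win totals z standard deviations around the mean."""
    return mean_wins - z * sd, mean_wins + z * sd


def coverage(half_width, sd):
    """Probability that a normal error with the given sd lands within +/- half_width."""
    # ASSUMPTION: projection errors are normal with mean zero.
    error = NormalDist(mu=0.0, sigma=sd)
    return error.cdf(half_width) - error.cdf(-half_width)


if __name__ == "__main__":
    low, high = win_range(81, 8)
    print(f"81-win projection, 8-win SD: {low:.0f} to {high:.0f} wins")
    # How much a +/- 16 win band covers across the quoted 6-10 win SD range.
    for sd in (6, 8, 10):
        print(f"SD of {sd} wins: +/- 16 covers {coverage(16, sd):.1%}")
```

Under that assumption, the +/- 16 band covers about 95% only if the standard deviation really is eight wins. At the ten-win end of the range it covers closer to 89%, and at the six-win end roughly 99%, which is why "something like 95%" is about as precise as the claim can get.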
The problem, of course, is that’s not very helpful. Anyone could predict that any team will be somewhere between “terrible” and “excellent”, and you certainly don’t need any fancy algorithms to say that a team could finish somewhere between first and last. This is why making preseason predictions is kind of silly. We simply don’t know enough in advance to be confident enough in our forecasts to make declarative statements about small ranges of outcomes.
We don’t have to get to the 95% confidence level that two standard deviations brings about, of course. Knowing that about 68% of teams finish within +/- 8 wins of their projected record is still useful, as long as the results aren’t overstated. With that in mind, we can look at a team with a projected 75-87 record as an unlikely contender. More importantly, we can look at a group of six teams projected for mid-70s records and realize that one of them will probably make a playoff run, since we’d expect about two of the six to fall outside of that one standard deviation range, with one on the high side and one on the low side.
In other words, if we look at all the teams that are projected to win between 75 and 80 games, we might find a list that includes the Orioles, White Sox, Brewers, Pirates, Padres, and Royals. None of them are likely to make the playoffs, but as an aggregate group, this is a pretty good place to start if you’re looking for a “surprise team” in 2013. It doesn’t mean that the surprise team will certainly come from that group (the Orioles weren’t forecast as a mid-70s win team last year, for instance), but starting with the preseason forecasts and knowing the standard deviation can help guide decisions about which teams should be making more aggressive efforts to improve their rosters in the short term versus focusing on the bigger picture.
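The "two of six" figure falls straight out of the 68% number, and the same logic gives a rough sense of how likely it is that at least one team in the group badly beats its projection. A minimal sketch, assuming each team's error is normal and independent of the others (real teams share a schedule, so independence is only an approximation); the function names are hypothetical.

```python
from statistics import NormalDist

# Errors are measured in units of standard deviations.
STANDARD = NormalDist()


def expected_outside(n_teams, z=1.0):
    """Expected number of teams finishing more than z SDs from projection."""
    outside = 2 * (1 - STANDARD.cdf(z))  # both tails
    return n_teams * outside


def prob_at_least_one_above(n_teams, z=1.0):
    """Chance that at least one team beats its projection by more than z SDs."""
    # ASSUMPTION: team errors are independent of one another.
    upper_tail = 1 - STANDARD.cdf(z)
    return 1 - (1 - upper_tail) ** n_teams


if __name__ == "__main__":
    print(f"expected teams outside +/- 1 SD: {expected_outside(6):.2f}")
    print(f"chance at least one beats it by 1 SD: {prob_at_least_one_above(6):.1%}")
```

Under those assumptions the expected count comes out a little under two, and the chance that at least one of the six overshoots by more than a standard deviation is a bit better than a coin flip, which fits "probably" but not "certainly."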
Trouble starts when people treat all projections as if they’re predictions. Every preseason win-loss forecast that comes out over the next six weeks is going to put a single number on each team as the most likely outcome, but it’s important to remember that every single one of those numbers is likely to be wrong, and that the spread in expected wins around that number is pretty large. When a team like the Indians starts upgrading its roster, the hope is not simply that it can push its forecast mean up to 81 wins from 75 wins, which can look like a meaningless difference if one is solely focused on a binary playoffs/no playoffs outcome. The hope is that it can raise the number of opportunities it has for things to break right and end up with 90+ wins, sneaking its way into October baseball in the process.
The conflation of projections and predictions stems partly from the public’s fascination with “making a pick” and then defending it (those kinds of stories are extremely popular and drive a lot of traffic), but it is also born out of the way forecasters have chosen to display their results. If we want to really get across the meaningful difference between projections and predictions, maybe we’d be better off displaying the results of preseason projections as overlapping bell curves rather than a simple standings table with the weighted mean representing the entire projection. Or maybe something like the way the guys at RLYW do it, with pie charts showing how often each team wins its division in their simulations.
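To put a rough number on why a six-win bump matters, here is a sketch of the chance of reaching 90 wins at each projection. It assumes a normal error with an eight-win standard deviation, the midpoint of the 6-10 range mentioned earlier, and `prob_at_least` is a hypothetical helper, not anything a real projection system exposes.

```python
from statistics import NormalDist


def prob_at_least(target_wins, mean_wins, sd=8.0):
    """Chance of finishing with at least target_wins under a normal error model."""
    # ASSUMPTION: errors are normal with an 8-win SD; no continuity correction.
    return 1 - NormalDist(mu=mean_wins, sigma=sd).cdf(target_wins)


if __name__ == "__main__":
    for mean in (75, 81):
        print(f"projected {mean} wins: P(90+) = {prob_at_least(90, mean):.1%}")
```

Under those assumptions, moving the mean from 75 to 81 lifts the chance of a 90-win season from roughly 3% to roughly 13%. Neither number makes the team a favorite, but the second is about four times the first.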
So, forecasters, here’s my request: Show us more than the single weighted mean outcome when doing win-loss records. Give us the estimated likelihood of each win total between 60 and 100. That’s interesting data, and it’s helpful in pointing out that the projections you’re making are not predictions that you’re attempting to stake your reputation on. And, writers quoting those projections, let’s do a better job of calling them what they are. Or, more specifically, what they aren’t. The forecasters are doing a real service by publishing their results. Let’s not pretend that all that work is simply a prediction, no different than a random number pulled out of thin air by a television talking head. There is a difference, and we should try to shine a spotlight on that difference whenever possible.