How’d We Do a Year Ago?

I’m probably not being biased at all when I say we offer a lot of different great features here at FanGraphs, but I’m personally a huge, huge fan of our projected standings and playoff-odds pages. Now that we have ZiPS folded into the mix, things are pretty complete, and it’s exciting to be able to see the numbers whenever one wants to. The numbers are based on depth charts maintained by some of our own authors, and they’re living and breathing, so you can see the direct impact of, say, the Phillies signing A.J. Burnett. (It lifted their playoff odds about four percentage points.) FanGraphs is always improving, and these additions have been a big recent step forward.

Now, as discussed briefly yesterday, we never want the projections to be actually perfect. Thankfully, that’s never going to be a problem, on account of the damned human element. But we do want the projections to be meaningful, because otherwise, what’s the point? We want the data to be smart and more right than wrong. So that brings to mind the question: how did things go last year, in our first thorough experiment with depth charts and team projections?

The 2013 positional power rankings are a helpful page here. We never really projected true standings, but over the course of those rankings, we did project team-by-team WAR with author-maintained depth charts. So it makes sense to compare projected team WAR against end-of-season actual team WAR, which we can get easily from the leaderboards. Win/loss record isn’t as simple as adding WAR to the replacement-level baseline, but WAR does more or less capture the core performance of a team, so this should be informative.
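
For context, that naive baseline-plus-WAR conversion would look something like the following. This is just a sketch, assuming the unified FanGraphs/Baseball-Reference replacement level of a .294 winning percentage (roughly 47.6 wins per 162 games); the function name is purely illustrative.

```python
# Naive WAR-to-wins conversion: replacement-level baseline plus team WAR.
# Assumes the unified replacement level of a .294 winning percentage,
# which works out to roughly 47.6 wins over a 162-game season.
REPLACEMENT_WINS = 0.294 * 162  # ~47.6

def naive_expected_wins(team_war: float) -> float:
    """Rough estimate only; ignores sequencing, bullpen leverage, and luck."""
    return REPLACEMENT_WINS + team_war

# Example: a team projected for 40 WAR pencils out to roughly 88 wins.
print(round(naive_expected_wins(40.0)))
```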

One complication: right after posting the team WAR projections, we changed the replacement level to agree with Baseball-Reference. So I can’t just do a straight comparison, because the projections add up to far more WAR than the end-of-season numbers. What I decided to do was calculate z scores; because the baseline change shifts every team by roughly the same constant, standardizing each set of numbers against its own mean and spread puts them back on a common footing. The Astros, for example, projected to be 2.5 standard deviations worse than the mean. The Braves finished the year 0.8 standard deviations better than the mean. It is possible to do a straight comparison of the z scores, or at least I think it is, and this should give us a sense of how well last year’s teams were projected, relative to the average.
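
In code, that standardization step might look something like this. A minimal sketch with made-up WAR totals and placeholder names, not the actual depth-chart spreadsheet; the point is only that each column gets standardized against its own mean and spread.

```python
import pandas as pd

# Placeholder frame: one row per team, with projected and actual team WAR.
# The numbers are illustrative; in practice this would hold all 30 teams.
teams = pd.DataFrame({
    "team": ["Team A", "Team B", "Team C"],
    "proj_war": [22.0, 43.0, 52.0],
    "actual_war": [15.0, 45.0, 55.0],
})

def z_scores(series: pd.Series) -> pd.Series:
    """Standardize a column: (value - mean) / standard deviation."""
    return (series - series.mean()) / series.std()

# The replacement-level change shifts every projection by roughly the same
# constant, so standardizing each column separately makes them comparable.
teams["proj_z"] = z_scores(teams["proj_war"])
teams["actual_z"] = z_scores(teams["actual_war"])
print(teams)
```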

So here’s a graph, of z scores against z scores. I want to emphasize that I have never been a mathematician. Probably never will be. Maybe I did something stupid, but this shouldn’t tell us nothing.

[Chart: 2013 team WAR z scores, reality vs. projections]

At least to me, it feels like there’s pretty good agreement. There’s an obvious relationship in the direction we’d expect. The Tigers were projected to be 1.8 standard deviations better than the mean, and they came out 1.8 standard deviations better than the mean. The Mariners were projected at -1.0, and they came out at -1.0. The Cubs were projected at -0.5, and they came out at -0.5. Of the 30 teams, 16 had z-score differences no greater than half of a point.

For position-player WAR z scores, the r value came out to 0.7. For pitchers, 0.7. For teams overall, 0.7. Basically, the projections weren’t totally off base. But there were some bad misses, and I’ll highlight them below. Maybe I shouldn’t say “misses” — projections can’t really be wrong. But there were some teams that deviated rather significantly.
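
The correlation and the deviation ranking below can be reproduced the same way. A sketch using only the six highlighted teams’ z scores from this post, so the printed r is illustrative rather than the full-league 0.7.

```python
import pandas as pd

# Projected and actual z scores quoted in the team sections below; the real
# exercise would use all 30 teams, not just the biggest deviations.
z = pd.DataFrame({
    "team": ["Red Sox", "Orioles", "Indians", "Blue Jays", "Angels", "Phillies"],
    "proj_z": [0.5, -0.9, -0.8, 0.6, 1.3, 0.3],
    "actual_z": [2.0, 0.4, 0.3, -0.4, 0.3, -1.4],
})

# Pearson r between projection and reality, then teams sorted by how far
# reality landed from the projection, in either direction.
r = z["proj_z"].corr(z["actual_z"])
z["diff"] = z["actual_z"] - z["proj_z"]
print(round(r, 2))
print(z.sort_values("diff", key=lambda s: s.abs(), ascending=False))
```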

RED SOX

  • Projection: +0.5 standard deviations
  • Reality: +2.0
  • Difference: +1.5

Initially, the Red Sox were ranked in the upper-middle tier. They finished as the best team in baseball, based on regular-season performance and, subsequently, postseason performance. Pitching projections missed by 0.8 standard deviations, but position-player projections missed by 1.5. They got more than expected from most players, especially Shane Victorino and Jacoby Ellsbury. Oh, also John Lackey and that Koji Uehara guy. It wouldn’t be fair or accurate to say the Red Sox came out of nowhere. They were, however, baseball’s biggest positive surprise.

ORIOLES

  • Projection: -0.9 standard deviations
  • Reality: +0.4
  • Difference: +1.3

I mean, this isn’t hard to explain. Baltimore’s pitchers did what they were projected to do. Baltimore’s position players did not, by which I mean, Chris Davis and Manny Machado did not. Davis and Machado were projected for a combined WAR of about 4. They came out to a combined WAR of 13. Pretty easy to surprise when you field two unexpected superstars.

INDIANS

  • Projection: -0.8 standard deviations
  • Reality: +0.3
  • Difference: +1.1

Some of this was as simple as big years from Jason Kipnis and Yan Gomes. But the bulk of this came from the pitching staff, which was projected to be a pretty big problem. The Indians were projected to have baseball’s fourth-worst staff. They came out ever-so-slightly above average, with Justin Masterson getting better, Ubaldo Jimenez getting better, and Corey Kluber and Scott Kazmir being outstanding. I’m never going to blame a projection system for not anticipating Scott Kazmir. That’s a thing. That’s a thing that happened in real life.

BLUE JAYS

  • Projection: +0.6 standard deviations
  • Reality: -0.4
  • Difference: -1.0

This was shared equally between the hitters and the pitchers. The catchers were atrocious, and Melky Cabrera was atrocious, and Maicer Izturis was atrocious, and Jose Reyes got hurt. Then Josh Johnson didn’t do what he was supposed to do, and Ricky Romero was something out of the dark parts of the Bible, and Brandon Morrow got himself injured. The Blue Jays had a lot of bad luck last year, which is why they’re reasonably expected to regress upward in 2014. But by the same token, they’re not expected to regress up into a playoff spot.

ANGELS

  • Projection: +1.3 standard deviations
  • Reality: +0.3
  • Difference: -1.1

This, despite Mike Trout over-performing. That’s what happens when you get way too little out of guys like Albert Pujols and Josh Hamilton. The Angels had every reason to expect those guys to be terrific, hence all the money, but Pujols had to play hurt and Hamilton’s just one of the more volatile players in the game. The pitching staff was its own kind of problematic, but it was the position players who were most responsible for this big negative deviation. Pujols and Hamilton should improve in the future, but then, 2013 counts.

PHILLIES

  • Projection: +0.3 standard deviations
  • Reality: -1.4
  • Difference: -1.7

And here’s the biggest team deviation. The Phillies were projected to be a little better than average. In the end, they were a catastrophe, despite big statistical years from Cliff Lee and Cole Hamels. Of course, an enormous problem was that Roy Halladay was projected for almost four wins, and he was actually worth about negative one. A rotation can’t really recover from that. But Chase Utley was the only position player to be worth more than 1.6 WAR. Carlos Ruiz was a whole lot worse than expected. Delmon Young batted 300 times. The Phillies, like other bad teams from 2013, are expected to be better in 2014, and while they still won’t be good, they should be watchable more than two or three times a week.

—–

Based on our first depth chart and team-projection experience, the numbers seem to be worth paying close attention to. Due to the nature of sample sizes, projections aren’t going to nail the spread in wins that we observe in reality, but last year’s numbers did pretty well in projecting the end-of-season order. Obviously, there are things we can’t predict, and it’s better that way. But when it comes to talking about the season ahead, our projected numbers make for a pretty good foundation.

Jeff made Lookout Landing a thing, but he does not still write there about the Mariners. He does write here, sometimes about the Mariners, but usually not.

38 Comments
Catoblepas
10 years ago

Interesting to me is the slope of the best-fit line in that graph. If a team was projected for +.7 SD, they (on average) ended up at +1, and if they were projected for -.7 SD, they (on average) ended up at -1. This might just be because of the way projections work — they regress to the mean, so the human element pushes them further away — but I think there should be just as much chance that the human element pushes them closer to the mean. That would indicate a potential source of bias, with Fangraphs projecting the bad teams not as bad as they really are and the good teams not as good. Would love to see some more data from past years to get a better sample.

Iron
10 years ago
Reply to  Catoblepas

I don’t think there are nearly enough data points or a high enough r value to make too much of the 0.7 slope. It could easily regress to 1.0.
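
For what it’s worth, the uncertainty on that slope can be roughed out directly. A back-of-the-envelope sketch, assuming both axes are standardized so the fitted slope equals r:

```python
import math

# Standard error of an OLS slope estimated from one 30-team season; with
# standardized axes the slope equals r, and SE = sqrt((1 - r^2) / (n - 2)).
n, r = 30, 0.7
slope_se = math.sqrt((1 - r ** 2) / (n - 2))
print(round(slope_se, 2))  # ~0.13, so one season pins the slope down only loosely
```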

Elias
10 years ago
Reply to  Catoblepas

Generally, a good metric for the accuracy of a projection is the mean squared difference between the projected and actual results. An optimal projection system that minimizes this amount will tend to have the property you’ve noted: the projected values are in a narrower range than the actual values. This is the same reason fewer individual players are projected to hit above .300 than we actually expect to see.

If you were to scale the projections up to match the expected distribution of actual results, you’d actually introduce more error.

Elias
10 years ago
Reply to  Catoblepas

Another way to say this is that we can’t project the noise part of what actually happens during a season, only the signal part. So projections have a variance that is related just to the signal part, whereas actual results have both variance from the signal AND from the noise. So the range of actual results is wider than the projected component.
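
A tiny simulation makes this concrete. A sketch under the assumption that a team’s actual z score is a projectable signal plus independent noise of comparable size:

```python
import numpy as np

rng = np.random.default_rng(0)
n_seasons, n_teams = 10_000, 30

# Assumed model: actual = signal + noise, with similar variance in each part.
signal = rng.normal(0.0, 1.0, size=(n_seasons, n_teams))  # the projectable part
noise = rng.normal(0.0, 1.0, size=(n_seasons, n_teams))   # the unprojectable part
actual = signal + noise

# The best available projection is the signal alone, so its spread is
# narrower than the spread of actual results.
print(signal.std(), actual.std())  # ~1.0 vs ~1.41

# Stretching projections to match the spread of actual results only adds
# error, measured as mean squared difference.
stretched = signal * (actual.std() / signal.std())
print(np.mean((actual - signal) ** 2))     # ~1.0
print(np.mean((actual - stretched) ** 2))  # ~1.17, worse
```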

Catoblepas
10 years ago
Reply to  Elias

That is an excellent explanation, and matches my intuitive expectations. You can expect some batter to hit .375 this season, but not one of your individual projections will have someone hitting .375. Thanks!

Dave
10 years ago
Reply to  Elias

So if you used Monte Carlo projections across the player population and their range of potential outcomes, instead of projecting based on the mean player outcome, would the error on the Monte Carlo projections roll up into more variation than the actual results? We know some players are going to have outlier seasons, but is trying to project which ones more likely to mess up the overall projection at the team level than to capture that human variation?

TanGeng
10 years ago
Reply to  Catoblepas

Since it’s standard deviations, the differences from the mean are normalized. You can’t reasonably expect a 1.0 slope for the best fit unless your projection is nearly dead-on correct.

You certainly can’t expect the slope to be more than 1.0.

Ian R.
10 years ago
Reply to  Catoblepas

Actual records are always going to be slightly more disparate than projected records because of in-season transactions. Good teams become buyers at the trade deadline and get better. Bad teams become sellers at the trade deadline and get worse. Projections can’t anticipate those trades, but we know they’re going to happen.

Ian R.
10 years ago
Reply to  Ian R.

Slight addendum: Right, this analysis is based on WAR, not record. My bad. The same logic still holds, just replace ‘actual record’ and ‘projected record’ with ‘actual WAR’ and ‘projected WAR.’

Sam
10 years ago
Reply to  Catoblepas

Catoblepas, I think the relationship is actually the opposite of what you indicated. The projections are the x-axis, and the actuals are the y-axis. Just look at the Red Sox point (projected at +0.5, actual was +2.0). So the linear regression suggests that the FanGraphs projections were farther from the mean than the actual results, in this case.

channelclemente
10 years ago
Reply to  Catoblepas

Look at the fit residuals, and the variance. You don’t really know anything about individual teams that a coin flip wouldn’t tell you.