CHONE Projected Standings

Here at FanGraphs, we’re big fans of Sean Smith’s CHONE projection system. It’s proven to be equally or more accurate than any other projection system out there, he provides the data for free on his site, and he’s a good guy.

So, today, with news from the game at a virtual standstill, I figured we should take a look at the projected standings that he recently put up, based on the CHONE forecasts and his playing time estimates.

None of the projections should be all that surprising. The East divisions are very good, the West divisions are not, and the AL is still better than the NL. Perhaps fans in LA will be surprised by the lack of wins projected for their two teams, but as we’ve talked about, the Angels’ win total was built on a house of cards last year, and the Dodgers haven’t re-signed Manny Ramirez yet.

I know that whenever a projection system is published, a bunch of you guys immediately look at the results and proclaim they’re too low, because the projected leaders have fewer wins/home runs/strikeouts/whatever than the previous historical leaders. As we always try to explain, that’s because of regression to the mean. We understand that the final AL West winner isn’t likely to have 85 wins. Even CHONE will agree with you on that.

Let’s walk through an example, shall we? The Angels are projected to win the AL West with an 85-77 record. But that’s just a mean projection, based on the range of probabilities of the Angels winning anywhere between 60 and 110 games. Obviously, at the extremes, the odds are very tiny, so the distribution of the probabilities will look like a bell curve. Actually, let’s just show it to you.

[Figure: bell curve showing the probability of each Angels win total]

At each win total between 81 and 89, there’s a greater than 5% chance of that win total occurring, if we agree that the Angels are a true talent 85 win team. There’s less than a 1% chance of each win total at less than 72 or greater than 98, but those are still possibilities, even if they’re pretty unlikely. Those are the individual probabilities – now let’s look at the cumulative probability.
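
If you want to redraw that curve yourself, here is a minimal Python sketch. It assumes each of the 162 games is an independent weighted coin flip that averages out to 85 wins (a binomial model). The post doesn't say which distribution was actually used, so treat the model and the name `win_pmf` as illustrative assumptions rather than CHONE's method.

```python
from math import comb

GAMES = 162  # length of an MLB regular season


def win_pmf(mean_wins, games=GAMES):
    """Return {wins: probability} for a binomial season with the given mean.

    Assumption: every game is an independent trial with the same
    win probability, mean_wins / games.
    """
    p = mean_wins / games
    return {
        w: comb(games, w) * p**w * (1 - p) ** (games - w)
        for w in range(games + 1)
    }


angels = win_pmf(85)

# Print the chance of a few individual win totals.
for wins in range(70, 101, 5):
    print(f"{wins} wins: {angels[wins]:.1%}")
```

Swapping in a different mean (say, 81 for the A's) gives the same shape shifted a few wins to the left.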

[Figure: cumulative probability of Angels win totals]

We find 76 wins at the 90% mark. In other words, we’d expect this Angels team to win at least 76 games 90% of the time. 50% gets you to 85 wins, while 10% gets you to 93 wins. So, while 85 wins is the mean for the Angels, and they have the highest mean of any team in the division, CHONE is not predicting that the division winner will finish with 85 wins.
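
The cumulative view can be sketched in Python as well. This snippet assumes a binomial season with an 85-win mean (an assumption, not something stated in the post), and the helper names are hypothetical. Depending on whether a threshold is read as "at least" or "more than" a given win total, the results can land a win away from the figures quoted above.

```python
from math import comb

GAMES = 162  # length of an MLB regular season


def win_pmf(mean_wins, games=GAMES):
    """Binomial probability of each win total (assumed model)."""
    p = mean_wins / games
    return {
        w: comb(games, w) * p**w * (1 - p) ** (games - w)
        for w in range(games + 1)
    }


def prob_at_least(pmf, wins):
    """P(W >= wins): chance of finishing with this many wins or more."""
    return sum(p for w, p in pmf.items() if w >= wins)


def wins_reached_with_prob(pmf, prob):
    """Largest win total the team reaches or beats with probability >= prob."""
    return max(w for w in pmf if prob_at_least(pmf, w) >= prob)


angels = win_pmf(85)

for prob in (0.90, 0.50, 0.10):
    wins = wins_reached_with_prob(angels, prob)
    print(f"{prob:.0%} of the time: at least {wins} wins")

print(f"Chance of 90 or more wins: {prob_at_least(angels, 90):.1%}")
```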

The Angels have a 19.3% chance of winning 90+ games, based on this distribution. But they’re not the only team in the division. The A’s, with their projected 81-81 record, have a 6.7% chance of winning 90+ games in 2009. The Mariners, with their 78-84 projected record, have a 2.4% chance of winning 90+ games. And the Rangers, projected at 72-90, have a 0.1% chance of winning 90+ games. The sum of these probabilities is 28.5%. In other words, despite projecting the best team in the division to win 85 games, CHONE is still saying that there’s a 28.5% chance that the division winner will win 90+ games.
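
That addition is easy to check in a few lines of Python. The sketch below uses only the four percentages quoted in the paragraph above. The second figure, which treats the four teams' seasons as independent, is an assumption added purely for comparison: a straight sum counts seasons where two teams both reach 90 more than once, and division rivals play each other, so their results are not truly independent either.

```python
from math import prod

# Chance of 90+ wins for each AL West team, as quoted in the text.
p90 = {"Angels": 0.193, "A's": 0.067, "Mariners": 0.024, "Rangers": 0.001}

# Straight sum of the four probabilities.
simple_sum = sum(p90.values())

# Chance that at least one team wins 90+, IF the four outcomes were
# independent (an assumption made only for this comparison).
at_least_one = 1 - prod(1 - p for p in p90.values())

print(f"Simple sum:                 {simple_sum:.1%}")
print(f"At least one (independent): {at_least_one:.1%}")
```

Under the independence assumption the figure comes out a couple of points below the straight sum, which does not change the larger point about division winners.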

Hopefully, this is somewhat helpful – when you look at projected standings, they are giving you relative strength from team to team. In every division, it’s pretty likely that some team will outperform their expected win total, and that the division winner will end up with more wins than the mean of the projected top team. This is not a flaw of projection systems – it is a reality of math.




Dave is a co-founder of USSMariner.com and contributes to the Wall Street Journal.


20 Responses to “CHONE Projected Standings”

  1. mattymatty says:

    [CHONE has] “proven to be equally or more accurate than any other projection system out there…”

    Baseball Prospectus makes the same claim about their PECOTA projection system. What is the above statement based on? I’m not trying to be argumentative, but since these are two competing claims, I’m curious. Thanks.

    • Dave Cameron says:

      BP is in the business of trying to sell you stuff. Take their claims about their own metrics with a few thousand grains of salt.

      Here’s a good thread on the various projection systems.

      • Fresh Hops says:

        BP is also a bunch of people that really care about baseball. It’s not an infomercial and it’s unfair to treat the people at BP like they don’t really believe in their product.

    • Fresh Hops says:

      In the article Dave links there’s a really nice response to the idea that this or that projection system is better:
      “I hope that more researchers evaluate the forecasting systems like David did, so that we can stop with the nonsense that one system is better than the other. As I’ve said in the past, it is a very dubious claim, akin to saying a team that wins 85 games is better than a team that wins 84 games. Marcel will win 83 or 84 games, the good forecasting systems will win 84 or 85, and the not so good will win 81 or 82.”

      There’s far too much that’s unknown in baseball to create any kind of super projection system that can actually predict the season. For any projection system, random chance and unknowns will better explain deviations from predictions than shortcomings of the model. Might CHONE or PECOTA be ever so slightly more accurate than the other? Sure. But in any given season, CHONE and PECOTA are probably more likely to agree with one another than they are with reality, because random chance (something which by definition neither can predict) has more influence on the final standings than the non-chance factors that these systems account for differently.

  2. Chris says:

    Is it possible to get the data on the bell curve off of his website? Or are all projections just a standard normal bell curve?

  3. What I find most interesting regarding the CHONE projections is that the Yankees, Red Sox and Rays grade out as the best teams in baseball. Tough division.

    • AA says:

      That’s because they ARE the best 3 teams in baseball. It’s sad that the 3rd best team in the game won’t even make the playoffs this year.

      • Collin says:

        The worst team in the AL East is better than the best team in the NL West and practically the entire NL Central, according to these rankings. Unquestionably the best baseball division in recent memory.

  4. Mike says:

    Minor quibble: is the chance of someone finishing above 90 wins really 28.5%? Honestly, I think you use stats more often than I do but in this case, I think you have to subtract the probabilities for the events that occur jointly. So there isn’t a 19.3% + 6.7% = 25% chance that either the A’s or Angels will win the division with more than 90 wins, but 19.3%+6.7%-(19.3%*6.7%) = 24.7% chance. In other words, there are scenarios where the Angels might have 93 wins and the A’s 91 or something like that, which you’re double counting.

    When you subtract all the intersections, I get a figure of 26.5%. Doesn’t change your original point, but somewhat different (the difference between a .285 hitter and a .265 hitter, perhaps)…

    Anyway, love this site, it’s now in my daily reading, keep up the great analysis…

    • Mike says:

      Uh I meant 26% instead of 25% …. ha…

    • Dave Cameron says:

      Good catch, Mike. You’re right – I overlooked the situations where multiple teams win 90+ in the same season, so my numbers are very slightly high.

      But as you said, the point is the same.

    • LarryinLA says:

      Mike, your point is right, but your math is off. Your math assumes those percentages are uncorrelated. They’re not; I would guess that the chances of the A’s and Angels winning 90 games are somewhat anti-correlated. So, I expect the real chance is between your number and Dave’s.

  5. steve-o says:

    Wow, last year’s projections weren’t even close. VERY RELIABLE INDEED!

    • Fresh Hops says:

      Before you conclude that they weren’t very reliable, you should take a look at how human “experts” and other computer models did. PECOTA was off by an average of 8.5 wins in 2008, I think. Terrible, right? In the contest I saw, which was between a bunch of “experts” and PECOTA, PECOTA beat every single one of them, and all the experts were off by more than ten wins. That’s a completely different system, I know, but it serves to make the point that predicting baseball is very hard to do, and so what looks like a huge error may actually be the most successful prediction out there.

  6. TangoTiger says:

    Here’s how they did last year:
    http://vegaswatch.net/2008/09/evaluating-april-mlb-predictions-update.html

    PECOTA, MGL, and CHONE are all pretty much neck-and-neck.

    MGL beat PECOTA in 05 and 06, and lost to PECOTA in 07, 08:
    http://www.insidethebook.com/ee/index.php/site/comments/saberists_predict_better_than_insiders/#19

    ***

    As for BPro’s claims: they are about to be tested:
    http://tangotiger.net/forecast/

    Clay Davenport will be supplying me the BPro forecasts for the project.

  7. If Sean pops into this thread, which I assume he will, then: Would you be able to add the Runs Scored and Runs Allowed projections for each team as well? Gracias!

  8. Christian says:

    The Giants are pretty low. I don’t expect them to win, but behind the Padres?

  9. DavidCEisen says:

    The Tigers are the 5th best team? Really?
    2008 Record: 74-88
    2008 Pythagorean Record: 78-84

  10. Matt H. says:

    I like this one for my own personal reasons because PECOTA has the Marlins behind the Nationals
