FanGraphs Baseball



  1. In retrospect, the methodology is not “very flawed”, only slightly flawed. Thanks for sticking with this project all year, and the humility and ability to make it better.

    Comment by Terence — November 28, 2011 @ 4:21 pm

  2. Change the name. “Power Rankings” is so Sports Illustrated.

    Comment by Husker — November 28, 2011 @ 4:35 pm

  3. It doesn’t really matter. You cannot create a system that will totally avoid some knucklehead complaining that you are hating on their team. Just like you cannot convince that same knucklehead that what some guy on the internet thinks of his team has nothing to do with whether or not they actually make the playoffs.

    Comment by MikeS — November 28, 2011 @ 5:14 pm

  4. first, i think it’s great that you’re taking feedback on this, and have clearly thought about this throughout the year. i think this is a great component to the fangraphs site and would be sad if it went away.

    IMO, i think the fan% does what it’s supposed to do for certain teams–yankees, phillies, red sox etc–teams where payroll acts as a normalizing factor through july 31st. however, is making sure those teams are “correct” in the rankings worth the cost of over/undershooting colorado or cleveland?

    …which is why i hesitate to vote for a previous 2-week sample to be included as a weight. “banked wins”, as a commenter above suggested, either by actual record or one of the pythags (my preference is actual record), would seem to correct best for the changing landscape for the first 1/3 through 1/2 of the season and in a way that WAR does not. to paraphrase DC’s phenomenal sox post from april: wins/losses in april, even in small samples, absolutely have an effect on the standings in september.

    the other suggestion i have relates to UZR in team WAR. is there any way to strip the current year UZR from WAR and replace it with a 3 year UZR avg for players where that data is available or a 0 UZR for players without that experience (I say 0 because of Marcel’s success predicting rookie performance compared to other methods)? my concern is that when calcing a team WAR, the error associated with team UZR is the sum of the individual UZR errors. uzr might be a describer of what has happened, but it’s still a model, right? as a consequence, there is a pretty big error associated with team UZR.

    Comment by jcxy — November 28, 2011 @ 5:44 pm

  5. Why not replace FAN% with zips(R) (or equivalent) for all but the initial rankings? Continuing to rely on preseason projections when our best guess about future player performance, and perhaps more importantly future roster construction, changes as the season goes on, seems quite silly.

    Using actual record, zips(R), and some weighting of pythag records (perhaps giving more weight to more recent performance) seems the best way to go. This would basically mean using actual record as a starting point and then projecting ROS record using some combination of zips(R) and pythag. Sounds right to me.

    Comment by Jordan — November 28, 2011 @ 6:32 pm
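Jordan's "actual record as a starting point, then project the rest of the season" idea can be sketched roughly as below. This is only an illustration under stated assumptions: the function names, the 50/50 default blend between the rest-of-season projection and pythag, and the classic exponent of 2 are all hypothetical choices, not anything the rankings actually used.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expectation: RS^k / (RS^k + RA^k), classic form with k = 2."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def projected_final_wins(wins, games_played, ros_projection_pct,
                         runs_scored, runs_allowed,
                         projection_weight=0.5, season_games=162):
    """Banked wins plus a projected win% applied to the remaining games.

    ros_projection_pct stands in for a rest-of-season projection such as
    zips(R); projection_weight is an assumed blend, not a tuned value.
    """
    remaining = season_games - games_played
    pythag = pythag_win_pct(runs_scored, runs_allowed)
    ros_pct = projection_weight * ros_projection_pct + (1 - projection_weight) * pythag
    return wins + remaining * ros_pct


# A 30-32 team with an even run differential and a .500 rest-of-season projection
print(projected_final_wins(30, 62, 0.500, 280, 280))
```

The banked wins never change, so an early hot or cold streak stays in the total while the projection only governs the games still to be played.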

  6. I think you ditch FAN% altogether. That way, when some Indians fan gets upset, the explanation is simple: the numbers show they are playing over their heads and they are regression candidates. FAN% introduces bias, and causes more problems than benefits.

    Comment by Eminor3rd — November 28, 2011 @ 6:32 pm

  7. Why not use ZiPS even on the initial rankings?

    Comment by Eminor3rd — November 28, 2011 @ 6:33 pm

  8. My point was that it doesn’t make sense to keep relying on projections that may well be based on now false information (injuries, trades, unexpected performance). Your point that there are other problems with FAN% is well taken, but it is a separate point.

    Comment by Jordan — November 28, 2011 @ 7:14 pm

  9. Despite thinking these power rankings are flawed, I almost voted for the top option due to the great Seinfeld reference.

    My suggestion would be to use a few different components: Projections, WAR%, Pythag, and Reality in varying weights.
    Projections-I’d try an approach similar to the computers in the BCS. Use ZiPS%, Marcel%, Tango%, and Fan%, dropping the highest and the lowest (so the average of the two middle projections) so that way crazy stuff like Colorado’s Fan% doesn’t completely ruin the rankings.
    WAR%-The Main component. No adjustments for recent changes in the team make-up, in order to calculate past events fairly as well. Should be the primary number fairly early on.
    Pythag and Reality-This number would be a weighted average of the pythag win% (as a predictive tool) and real win%. Both would be weighted toward more recent production to account for trades, signings, and injuries. True Win% or something like that.
    Then for the weighting–True win% and WAR% would be combined for the reality number and then they would be combined for a weighted average with the projection%. I voted for the projections% to be at 50% at about the 50 game mark so obviously I support that here.

    Comment by Jack — November 28, 2011 @ 7:31 pm
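A minimal sketch of the two pieces Jack describes: the BCS-style trimmed projection, and a projection weight that sits at 50% around the 50-game mark. The decay schedule `half_point / (half_point + games_played)` is purely an assumption made for illustration (the comment only fixes the 50%-at-50-games point), and all function names are hypothetical.

```python
def trimmed_projection(projections):
    """Drop the single highest and lowest projection, average what is left."""
    if len(projections) < 3:
        raise ValueError("need at least three projections to trim both ends")
    middle = sorted(projections)[1:-1]
    return sum(middle) / len(middle)


def projection_weight(games_played, half_point=50):
    """Assumed schedule: 1.0 on opening day, 0.5 at half_point games, shrinking after."""
    return half_point / (half_point + games_played)


def power_score(projection_pct, reality_pct, games_played, half_point=50):
    """Weighted average of the projection number and the reality number."""
    w = projection_weight(games_played, half_point)
    return w * projection_pct + (1 - w) * reality_pct


# Four projection systems; the outlier at .600 is discarded along with the lowest
proj = trimmed_projection([0.450, 0.500, 0.520, 0.600])
print(power_score(proj, reality_pct=0.550, games_played=50))
```

With four systems, trimming leaves exactly the two middle values, which matches the "average of the two middle projections" in the comment.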

  10. That’s a fallacy. Even if people still will complain (and nobody said otherwise) you can still have a better justification for your decisions.

    A better system is possible, and I think a lot of the criticism above was legitimate and came from a thoughtful place. Not all knuckleheads.

    Comment by kylemcg — November 28, 2011 @ 7:43 pm

  11. If you build it, they will come

    Comment by jross — November 28, 2011 @ 10:47 pm

  12. for a third metric I would go with season record (actual wins) but also weight the last 14 games to give more of an impact to recent performance.

    something like .4*(season win%) + .6*(last 14 game win%)

    I chose actual wins because Pythagorean wins usually don’t vary far from actual wins. Why not make the results match up with the newspaper?

    Comment by adohaj — November 28, 2011 @ 10:47 pm
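The formula adohaj gives can be written out directly. The .4 and .6 weights and the 14-game window come from the comment; the function and parameter names are made up for this sketch.

```python
def blended_win_pct(season_wins, season_games, recent_wins,
                    recent_games=14, season_weight=0.4, recent_weight=0.6):
    """.4 * (season win%) + .6 * (last 14 game win%), per the comment above."""
    season_pct = season_wins / season_games
    recent_pct = recent_wins / recent_games
    return season_weight * season_pct + recent_weight * recent_pct


# A .500 team on the season that has won 10 of its last 14
print(blended_win_pct(season_wins=50, season_games=100, recent_wins=10))
```

Because the last 14 games carry 60% of the weight, a single hot or cold two-week stretch moves this number far more than it moves the season record.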

  13. I would like to see an impact from significant roster changes incorporated (injury and/or trade). Not quite sure of the ‘how’ though (and it wouldn’t be easy). Some (probably bad) thoughts:

    1) Adjust the WAR % or fan projection % for a key player move (either to or from team)…. could use ZiPS? or simply prorate/extrapolate the current WAR over the rest of the season (though that might introduce significant noise if a player is performing well over/under expectation and gets traded).

    2) Injury would also be subjective as I think you have to account for expected performance minus the projected replacement performance (shouldn’t just assume replacement level) but I think this should adjust the Fan expectation portion. For example the Cards lose Pujols 2 days into the season… obviously fan projections did not likely account for that so it should be adjusted by the expected loss. While in the end the power rankings will reflect injuries (fan projection starts going toward 0), along the way the rankings will probably appear inflated for a team hard hit by injuries.

    I guess the main thought is that if the fan projections are way off due to a somewhat unforeseeable circumstance the rankings will likely not reflect it until you get toward the end of the year when the results become 100% of the rankings.

    Comment by Tom — November 29, 2011 @ 1:31 am

  14. Indeed. Most of the criticism came early in the season, and the stubbornness of the rankings proved right in the end.

    As of May 23rd, Cleveland had the best record in baseball by a large margin, but was ranked #18 by the Power Rankings. In response to posters complaining about Cleveland’s positioning, I commented, “I’ll betcha, from here on out (i.e., rest of season), that Cleveland has closer to the 18th best record in baseball than the #1 record in baseball.” Three commenters accepted that “bet”.

    Turns out, over the rest of the season, Cleveland went 50-67, good for the 23rd best record in baseball over that stretch.

    Comment by Yirmiyahu — November 29, 2011 @ 9:12 am

  15. I’ve long wished that fangraphs in general would use a regressed 3-year sample of UZR when calculating WAR.

    But the system you’re describing is far too complicated for something like these Power Rankings. Every week, you’d need to calculate the regressed UZR for every player in baseball, and then add them together to get each team’s total.

    And consider that we’re talking about team stats. Most of the error should quickly correct itself just based on the larger sample size. And, if you really wanted to tweak the WAR equation for this exercise, it’d probably be better/easier to just toss UZR and use something like defensive efficiency (percentage of balls in play the defense was able to turn into outs). Philosophically, it’s a more pure complement to FIP anyway.

    Comment by Yirmiyahu — November 29, 2011 @ 9:23 am
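Defensive efficiency, as defined in the comment above (share of balls in play turned into outs), is simple enough to compute from team totals. The sketch below is a simplification: it treats every ball in play that is not a hit as an out and ignores reached-on-error, and the function name and inputs are hypothetical.

```python
def defensive_efficiency(balls_in_play, hits_in_play):
    """Share of balls in play converted into outs.

    Simplifying assumption: any ball in play that is not a hit counts as
    an out (errors are ignored). Home runs are not balls in play, so
    hits_in_play should exclude them.
    """
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return 1 - hits_in_play / balls_in_play


# A defense that allowed 1,300 non-HR hits on 4,400 balls in play
print(round(defensive_efficiency(4400, 1300), 3))
```

This needs only team-level totals, which is why it is much cheaper to update weekly than a regressed per-player UZR.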

  16. two thumbs up on wholesale regressed UZR data included in WAR.

    and you’re probably right about feasibility, but…if you will it, dude, it is no dream.

    Comment by jcxy — November 29, 2011 @ 10:35 am

  17. Many of the criticisms came from differing interpretations.

    It would help if the “Power Rankings” had a stated goal.
    1) Predict entire season wins
    2) Predict rest of season wins
    3) Predict entire season quality of team (so a BAL gets a boost for playing the rest of the AL East, Cardinals get dinged for playing the NL Central and the Royals, etc)
    4) Predict rest of season quality (like in #3)

    It would also help if the goal had a stated measurement
    A) Squared error
    B) Absolute error
    C) Try to split the over/under ranking at 50/50 regardless of size (like a default on a mortgage model)

    Pick a goal, do some science to it, and make an objective choice. You’ll have a better response for critics than “well a plurality preferred the 50% weighting to occur at Game XX”.

    Comment by glassSheets — November 29, 2011 @ 2:10 pm
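The three measurements glassSheets lists (A, B, C) could be scored along these lines once a goal such as predicted wins is chosen. This is a sketch with hypothetical function names; how to treat exact hits in the over/under split is an assumption (they are left out of the split here).

```python
def mean_squared_error(predicted, actual):
    """A) Squared error: punishes big misses much more than small ones."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


def mean_absolute_error(predicted, actual):
    """B) Absolute error: every win of miss counts the same."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def over_share(predicted, actual):
    """C) Share of non-exact predictions that came in too high; the target is 0.5."""
    overs = sum(p > a for p, a in zip(predicted, actual))
    unders = sum(p < a for p, a in zip(predicted, actual))
    decided = overs + unders
    # With no misses in either direction, report a perfectly even split
    return overs / decided if decided else 0.5


predicted_wins = [90, 80, 70]
actual_wins = [88, 84, 70]
print(mean_squared_error(predicted_wins, actual_wins))
print(mean_absolute_error(predicted_wins, actual_wins))
print(over_share(predicted_wins, actual_wins))
```

Comparing candidate weightings on one of these against past seasons would give the objective basis the comment asks for.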
