FanGraphs Baseball



  1. Does this improve the case of someone like say Larry Walker?

    Comment by Tomcat — March 28, 2013 @ 9:26 am

  2. If replacement level is set at approx 48 wins a season, and the Astros won 55 games last year…

    Oh my.

    At least they’d make for a semi-competitive AAA team.

    Comment by Rays' 7th Infielder — March 28, 2013 @ 9:29 am

  3. I feel like this change makes being a baseball nerd a lot easier.

    Comment by lonewolf — March 28, 2013 @ 9:31 am

  4. It’s definitely a step forward–but the statistic still has a whole lot of false precision which mere standardization of the baseline won’t fix. Mixing the poorly measured defense value (to say nothing of the baserunning hack) and giving players credit for positional adjustments that are, at best, heroic assumptions, in with the well-described offense variables leads to a measure that is full of a whole lot of mush.

    Comment by Blue — March 28, 2013 @ 9:32 am

  5. Glad to see this open collaboration between these two giants in the Sabermetric field. . . .

    Comment by Jeff T — March 28, 2013 @ 9:32 am

  6. And when can we expect these changes to be made on the website(s)?

    Comment by lonewolf — March 28, 2013 @ 9:34 am

  7. Dave – while you’re collaborating with Sean, it might be helpful to list out the other differences between the two versions of WAR, maybe on the glossary page? I know FIP v. ERA is probably the biggest difference.

    Comment by Bob — March 28, 2013 @ 9:35 am

  8. Probably not much, since the folks keeping him out aren’t relying on fWAR.

    Comment by Bookbook — March 28, 2013 @ 9:37 am

  9. Are the total WAR for players on both sites now updated??

    Comment by Hurtlockertwo — March 28, 2013 @ 9:38 am

  10. I wouldn’t say it improves anyone’s case. This is really just an adjustment in the baseline, so that everyone is now being measured against a common denominator. Walker’s WAR went from +73.2 to +70.0.

    Comment by Dave Cameron — March 28, 2013 @ 9:43 am

  11. The numbers are live now on FanGraphs. Sean will announce the changes at B-R when they’re live there, I’d assume.

    Comment by Dave Cameron — March 28, 2013 @ 9:43 am

  12. They are updated here as of this morning. They’ll be updated on B-R as soon as Sean announces this change as well. I believe that’s happening today.

    Comment by Dave Cameron — March 28, 2013 @ 9:44 am

  13. Good idea, thanks.

    Comment by Dave Cameron — March 28, 2013 @ 9:44 am

  14. bravo.

    Comment by Dave S — March 28, 2013 @ 9:50 am

  15. So are you guys using the same defensive metrics now as well?

    Comment by James — March 28, 2013 @ 9:54 am

  16. Oh good, this makes the shallow “just add up the WARs!” style of analysis which has sadly become the norm around here that much easier

    Comment by Manifunk — March 28, 2013 @ 10:05 am

  17. The false precision is a problem with the *users* of the statistic, not the statistic itself. It’s not the fault of WAR if some writer mistakenly draws a strong conclusion from a 4.2 WAR over a 4.1.

    Comment by mickeyg13 — March 28, 2013 @ 10:05 am

  18. I’m a bit confused on the math. How are we getting a .294 win percentage on 1000 WAR per 2430 available wins? I must be missing something.

    Comment by Brian — March 28, 2013 @ 10:07 am

  19. BRAVO! I have been waiting for this for a long time, and the decision to make it a clean 1000 will make it simpler for the casual reader while still keeping it accurate and reasonable. Thanks David, Dave and Sean.

    I heard that BPro was also considering unifying their replacement level with your sites as well. Is that happening?

    Comment by Darren — March 28, 2013 @ 10:08 am

  20. I think I need to change my name…

    Comment by tomcat — March 28, 2013 @ 10:17 am

  21. Right, but since he played so many fewer games than the other RF in the top 20, he is within a few WAR of the top 7, with most of those around him playing almost 1000 more games.

    Comment by Tomcat — March 28, 2013 @ 10:19 am

  22. So are you guys using the same defensive metrics now as well?

    No. That’s one of the things that Dave’s getting at when he says “there will never be one single agreed upon WAR calculation” and “the common baseline will give us a better opportunity to explore where the real differences are.” Nothing about the way either site calculates WAR is changing, they’re just now starting from a common baseline.

    Comment by Anon21 — March 28, 2013 @ 10:19 am

  23. Jesus, you’re really worked up about this nonexistent problem, huh? Try reading some of the 98% of Fangraphs articles that aren’t positional power rankings, you dumb whiner.

    Comment by Anon21 — March 28, 2013 @ 10:21 am

  24. It still may improve the case, while not improving his chances.

    Comment by tomdog — March 28, 2013 @ 10:22 am

  25. That is like saying that since a hammer can’t cut plywood it isn’t a good tool, or that since you saw a guy using a hammer to put drywall screws in there is a flaw with hammers. WAR has flaws and should be used as a conversation starter, not an ender.

    Comment by Tomcat — March 28, 2013 @ 10:22 am

  26. Yes, relative to other players in that 70 WAR range, Walker is going to take less of a hit. Raising the replacement level rewards better players and decreases the amount of value that can be racked up by accumulating above replacement level but below average performances.

    Comment by Dave Cameron — March 28, 2013 @ 10:22 am

  27. I guess we now can use the “Griffin line” as the career equivalent of the “Mendoza line”

    Comment by tz — March 28, 2013 @ 10:23 am

  28. And as you said in the article, it affects those with longer careers the most.

    Comment by tomdog — March 28, 2013 @ 10:24 am

  29. and btw, love the change here and at baseball reference!

    Comment by tz — March 28, 2013 @ 10:24 am

  30. Or try a different site. All they did was create a common baseline for a statistic to make that particular aspect of the site better. If you don’t like it then you know what you can do.

    Comment by tomdog — March 28, 2013 @ 10:28 am

  31. never mind. I got it now.

    Comment by Brian — March 28, 2013 @ 10:31 am

  32. Just out of curiosity, does anyone know what version of WAR ESPN uses on their site? Just wondering if the numbers there would be making the adjustment as well or if those numbers would still reflect a different baseline.

    Comment by MarinersFan000 — March 28, 2013 @ 10:34 am

  33. Playing the d’s advocate here: isn’t this simply a case of the two WAR peeps getting together so that there can’t be any more, “But the two WAR peeps can’t even come up with the same WAR?” or something similar?

    Comment by JeffD — March 28, 2013 @ 10:36 am

  34. Is pitcher WAR measured by innings or by games appeared in? Shouldn’t replacement level be set per PA?

    Comment by Tomcat — March 28, 2013 @ 10:37 am

  35. I agree the explanation isn’t very clear.

    2430 is the total number of wins available in a full season across all teams, which makes the league as a whole .500 (2430-2430).

    1000 WAR is what’s needed to get to that hypothetical .500 season. So, replacement team level is at 1430 (2430 – 1000) wins.

    1430/4860 = .294

    Hence a replacement level team has a .294 win percentage.

    Comment by FJ — March 28, 2013 @ 10:41 am
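
The arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch: the constant names are made up for illustration, and it assumes 30 teams each playing a 162-game schedule.

```python
# League-wide replacement-level arithmetic (names are illustrative).
TEAMS = 30
GAMES_PER_TEAM = 162
LEAGUE_WAR = 1000  # wins above replacement distributed per season

team_games = TEAMS * GAMES_PER_TEAM            # 4860 team-games (each game counts for two teams)
league_wins = team_games // 2                  # 2430 wins, matched by 2430 losses
replacement_wins = league_wins - LEAGUE_WAR    # wins left for an all-replacement league
replacement_pct = replacement_wins / team_games

print(league_wins, replacement_wins, f"{replacement_pct:.3f}")
```

The denominator is team-games (4860), not games played (2430), which is the step that tends to cause the confusion.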

  36. Is there any way we can get a list of the players who lost the most WAR due to the change in baseline?

    Comment by Caveman Jones — March 28, 2013 @ 10:44 am

  37. But there are still differences in the calculations that, as Dave says, should be considered a feature as they can tell you different things about different players. This is just putting both stats on the same scale to make comparisons easier

    Comment by agam22 — March 28, 2013 @ 10:48 am

  38. B-Ref has a great summary of differences between many WAR systems:

    Comment by Sky — March 28, 2013 @ 10:52 am

  39. does this mean that the cohort of replacement-level players you investigated will now produce a larger negative WAR?

    Comment by kdm628496 — March 28, 2013 @ 10:55 am

  40. Hey Dave, can you go back and fix all the articles written in the last five years? Thanks.

    Comment by Jason — March 28, 2013 @ 11:03 am

  41. How much longer until the new FIP formula gets uploaded into the FG glossary?

    And I’m super curious to see which pitchers’ numbers have changed for the better or the worse.

    Comment by MonkeyEpoxy — March 28, 2013 @ 11:05 am

  42. You are saying WAR is the equivalent of saying, “pretty nice weather today,” which is not much different than Blue’s comment.

    A hammer works perfectly to drive nails or claw them out. What does WAR do (besides signal that you use imprecise aggregates to start conversations)?

    Comment by Cguudgyrdycjvhkj — March 28, 2013 @ 11:08 am

  43. Hey Jason,

    Nothing changes because everyone goes down by the same amount. So in reality nothing changes except the exact numbers.

    Comment by gouis — March 28, 2013 @ 11:08 am

  44. I think it would be great to add (statistical) uncertainties into baseball stats. For example, if a player had a .400 OBP in 600 plate appearances, the rough statistical error would be 1/sqrt(N) ~ 0.041, so you could quote his OBP as .400 +/- 0.041. The same could be done for any rate stat (or counting stat, if you’re careful). Propagating these errors through the WAR calculation could clear up some of these issues.

    For instance, are we sure that a player has exactly 2.1 WAR in a season, or is it more like 2.1 +/- 0.3 WAR? If one wanted to go further (beyond just statistical uncertainties), you could use the varying WAR definitions on the web (fWAR, rWAR, etc.) as measures of the systematic uncertainties in the WAR calculation.

    This seems like relatively simple statistical analysis to me – maybe it’s been suggested before?

    Comment by Tom H. — March 28, 2013 @ 11:10 am
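
As a rough illustration of the error-bar idea in the comment above, the sketch below compares the quoted 1/sqrt(N) rule of thumb with the standard error of a binomial proportion, sqrt(p(1-p)/N). Treating each plate appearance as an independent trial with a fixed on-base probability is an assumption made here for illustration; the function names are hypothetical.

```python
import math

def rough_error(n):
    """Rule-of-thumb uncertainty quoted in the comment: 1/sqrt(N)."""
    return 1 / math.sqrt(n)

def binomial_standard_error(rate, n):
    """Standard error of a proportion, assuming independent trials."""
    return math.sqrt(rate * (1 - rate) / n)

plate_appearances = 600
obp = 0.400

print(f"1/sqrt(N):      {rough_error(plate_appearances):.3f}")
print(f"binomial error: {binomial_standard_error(obp, plate_appearances):.3f}")
```

Under the binomial assumption the uncertainty comes out at roughly half the 1/sqrt(N) figure, so the rule of thumb is best read as a conservative upper estimate.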

  45. Just sort by career PA or IP.

    Comment by Bryce — March 28, 2013 @ 11:13 am

  46. OBP in a season as NO error because it is a full and complete description of the events of that season. There is no need for error bands around it because there is no statistical uncertainty to describe.

    Comment by Blue — March 28, 2013 @ 11:15 am

  47. “has” no error

    Comment by Blue — March 28, 2013 @ 11:15 am

  48. I like this idea, but I don’t think it’s as trivial as you imply. What does it mean to put error bars on the number of doubles a player hit? He hit them; error is zero. You could put error bars on the value of a double, or on your prediction of the talent underlying the number of doubles, but those have very different meanings.

    Comment by Bryce — March 28, 2013 @ 11:16 am

  49. Well, wait. Why would you want error bars on OBP? So far as I’m aware, there is virtually no measurement error associated with OBP, at least when it comes to people who played in the modern era of baseball.

    Comment by Anon21 — March 28, 2013 @ 11:16 am

  50. This is a change for the better. Thanks.

    Comment by Bryce — March 28, 2013 @ 11:17 am

  51. I believe ESPN uses bWAR

    Comment by momomoses7 — March 28, 2013 @ 11:19 am

  52. Chalk another one up to people who don’t understand the difference between description and prediction, I guess.

    Comment by Anon21 — March 28, 2013 @ 11:21 am

  53. This is great. Are you going to update the glossary pages so they will be consistent with the new replacement level? Also, when I was looking at the glossary pages I noticed that you use a different replacement level for starters vs. relievers, and I was just wondering what the new replacement level for relievers is. On a related note, Dave Cameron might not want to use RA Dickey as the “walking example” of a replacement level pitcher in any updated explanations of replacement level.

    Comment by GWR — March 28, 2013 @ 11:32 am

  54. Sean Smith clearly had a tough time in the minors. Poor guy never made it to the bigs.

    Comment by Stan — March 28, 2013 @ 11:34 am

  55. Replacement level is dead! Long live replacement level!

    Comment by Big Jgke — March 28, 2013 @ 11:40 am

  56. There would be no reason to have an error bar for OBP insofar as it’s a descriptive statistic, but when using it to predict future OBP, it might be useful to have error bars in order to make it clear how much predictive value the sample has.

    Comment by Naveed — March 28, 2013 @ 11:45 am

  57. I’d love an article on players whose relative rankings are most affected. How much of the difference in Jack Morris’s all-time rank is resolved? Old rankings 145b/75f vs…?

    Comment by brad — March 28, 2013 @ 11:51 am

  58. Just to make sure I have this right: so a team needs 33.3 WAR in the aggregate to reach .500? (1000 WAR divided by 30 teams).

    Comment by Urban Shocker — March 28, 2013 @ 11:52 am

  59. 81, now, here. Unless he jumps up a lot over at b-r the mystery remains.

    Comment by brad — March 28, 2013 @ 11:54 am

  60. OBP as a measurement of what happened certainly has no (significant) error; however, if we’re trying to get an estimator of his true OBP, error bars are certainly appropriate. The 1/sqrt(N) error bars are approximately correct for large sample sizes, but binomial error bars are most appropriate for a rate stat.

    For example: we want to know the true OBP talent level of a certain player. He has had 4 plate appearances, reaching in two of them. In reality, his OBP has been exactly .500, but we know that this is a flawed measure of his true talent level. The 68% (1 σ) confidence interval for his true talent level is (.186, .814). This means we’re 68% confident that his true OBP lies in that interval, based only on the knowledge we have (his 4 PAs) – there’s just not much information. If he, however, had 400 plate appearances and reached in 200 of them, our 68% CI would be (.474, .526) – we would be much more confident.

    In standard baseball notation, both players have a .500 OBP, but we obviously believe the second one (+/- .026 uncertainty) much more than the first one (+/- .314 uncertainty). This helps to quantify this. It’s only meaningful as a predictor, though, not as a measurement. (You also have to assume that his true talent level is fixed, and not varying over time, which is probably roughly true for most players, at least over the course of a season.)

    Comment by Tom H. — March 28, 2013 @ 11:55 am
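
The comment above does not say which binomial interval it uses. One candidate that lands close to the quoted numbers is the exact (Clopper-Pearson) interval at a 68% level, sketched below with only the standard library. The choice of method, the 0.68 level, and all function names are assumptions for illustration.

```python
import math

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def solve_increasing(f, target, iterations=60):
    """Bisection for f(p) = target, where f is increasing on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_interval(times_on_base, plate_appearances, confidence=0.68):
    """Clopper-Pearson interval for the underlying on-base rate."""
    k, n = times_on_base, plate_appearances
    tail = (1 - confidence) / 2
    # Lower bound: the rate at which seeing k or more is just a tail event.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: prob_at_least(k, n, p), tail)
    # Upper bound: the rate at which seeing k or fewer is just a tail event.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: prob_at_least(k + 1, n, p), 1 - tail)
    return lower, upper

for on_base, pa in ((2, 4), (200, 400)):
    low, high = exact_interval(on_base, pa)
    print(f"{on_base}/{pa}: ({low:.3f}, {high:.3f})")
```

Both players have the same observed .500 OBP, but the interval for 4 plate appearances spans most of the possible range while the one for 400 is only a few points wide on each side.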

  61. Poor Jamie Moyer. 269 Wins just ain’t what it used to be.

    Comment by Steve Jeltz — March 28, 2013 @ 12:11 pm

  62. Hitting singles (or doubles, or triples, or home runs) is essentially a Poisson process – discrete, countable events which happen at random intervals, but at some average rate. Thus, when we try to estimate, for example, HR/FB rate, we’re really trying to estimate the Poisson parameter λ of this process. Even if the observed HR/FB rate has no error (i.e., there’s no chance of misclassification), the estimation of the true HR/FB rate certainly has statistical uncertainties.

    Comment by Tom H. — March 28, 2013 @ 12:21 pm

  63. You’re mixing a couple of very distinct concepts. His “true OBP” is no different than measured OBP–what occurred in the season, assuming no measurement errors. That’s very different from creating an estimate of “true talent OBP” that would be expected over a large number of PAs.

    Comment by Blue — March 28, 2013 @ 12:22 pm

  64. Again, the rate has no error and no statistical uncertainty because it is a descriptive statistic that is a full and complete accounting of the entire population of events.

    Comment by Blue — March 28, 2013 @ 12:24 pm

  65. Well, 33.3 WAR + 47.7 replacement level wins = 81

    So, yeah, it seems that way.

    Comment by siggian — March 28, 2013 @ 12:25 pm
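
The per-team version of the same arithmetic, as a small sketch. It assumes the simple additive model the comments use (team wins = replacement-level wins + team WAR); the names are illustrative.

```python
TEAMS = 30
GAMES_PER_TEAM = 162
LEAGUE_WAR = 1000
REPLACEMENT_PCT = 1430 / 4860  # about .294

war_per_team = LEAGUE_WAR / TEAMS                                # about 33.3
replacement_wins_per_team = REPLACEMENT_PCT * GAMES_PER_TEAM     # about 47.7

def expected_wins(team_war):
    """Wins implied by a team WAR total under the additive model."""
    return replacement_wins_per_team + team_war

print(f"{war_per_team:.1f} + {replacement_wins_per_team:.1f} = "
      f"{expected_wins(war_per_team):.1f}")
```

Run in reverse, the same model says a 55-win team, like the Astros mentioned earlier in the thread, finished only about 7 wins above replacement level.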

  66. Fancy. Thanks for the explanation Siggian, that clears up a lot.

    That’s not quite what you see on the positional power rankings though, where it looks like an aggregate 38-39 WAR is what it takes to get to 80 wins. Any guesses?

    Comment by Urban Shocker — March 28, 2013 @ 12:36 pm

  67. I guess we fundamentally disagree then – I indeed believe that true OBP is a distinct quantity from observed OBP. For one thing, true OBP must be able to take any value on the continuum of 0.000 to 1.000, but observed OBP can only take a certain number of discrete values. For example, if a player gets 700 PA in a season, his OBP can only take 701 discrete values – 0/700, 1/700, 2/700, …, 699/700, or 700/700. A player’s true OBP can be defined as the limit of his measured OBP as we approach an infinite number of observations. The fact that we have only a finite number of observations to estimate this true OBP is the origin of the statistical uncertainty we’re trying to measure.

    To summarize: true talent level OBP (or whatever stat: SLG, BA, etc.) is a quantity we can only estimate but never measure with perfect precision; measured OBP is a well-defined, exactly measured quantity, but it only describes a finite number of observed events, not the nature of the underlying distribution which generated those events.

    Comment by Tom H. — March 28, 2013 @ 12:44 pm
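
The distinction drawn above between a fixed underlying rate and the observed rate can be illustrated with a toy simulation. Everything here is hypothetical: the .350 "true" rate, the seed, and the assumption that every plate appearance is an independent trial with the same probability.

```python
import random

def simulate_obp(true_obp, plate_appearances, rng):
    """Observed OBP over a number of independent simulated plate appearances."""
    times_on_base = sum(rng.random() < true_obp for _ in range(plate_appearances))
    return times_on_base / plate_appearances

TRUE_OBP = 0.350           # hypothetical fixed talent level
rng = random.Random(2013)  # seeded so the run is repeatable

for pa in (10, 700, 100_000):
    print(f"{pa:>7} PA: observed OBP {simulate_obp(TRUE_OBP, pa, rng):.3f}")
```

Each observed value is an exact description of its own simulated sample, yet small samples can sit far from .350 while very large ones settle close to it, which is the gap the error bars are meant to describe.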

  68. How is a descriptive statistic of a population not “true”? It is an exact description of what occurred!

    Comment by Blue — March 28, 2013 @ 12:49 pm

  69. That is never what WAR has been, from Tango’s earliest conceptualization to any of the implementations. You’re positing some different stat.

    Comment by Anon21 — March 28, 2013 @ 12:50 pm

  70. What is your preferred method for evaluating MLB talent? Using that, let’s build what would be considered the best possible roster from MLB talent [let’s say based on 2012 stats].

    Then let’s build another team using fWAR.

    I’m sure the two teams will have a lot in common [unless the stat you choose is (H-HR+SB)/AB or some other bizarre metric]. If the two teams end up very similar, then it would certainly seem that WAR isn’t quite as terrible as you seem to think. Comparing the players who are not common would at least be very interesting.

    Comment by Eric R — March 28, 2013 @ 12:52 pm

  71. Tom: What you are talking about is just not OBP. “True talent” is a useful concept in baseball, but mostly it’s useful for predicting future performance. When we look at historical OBP, all we want to know is what it was; the question of whether it was composed of a bunch of dying quails or hard line drives is just totally irrelevant to measuring its impact on the outcome of the games that have been played.

    Comment by Anon21 — March 28, 2013 @ 12:54 pm

  72. They’re only getting together on the value of replacement level. They’re not creating 1 unified WAR that will become THE WAR calculation. That was my initial concern but Dave’s explanation here alleviated that.

    Comment by chuckb — March 28, 2013 @ 12:56 pm

  73. Everyone doesn’t change by the same amount. The replacement baseline changes for everyone but that doesn’t affect everyone’s WAR calculation equally.

    Comment by chuckb — March 28, 2013 @ 12:57 pm

  74. Great work! My gut reaction to David Appelman’s post was concern but, after reading this, I really like what you and the Seans have done.

    Comment by chuckb — March 28, 2013 @ 12:58 pm

  75. That is comedy gold!

    Comment by Blue — March 28, 2013 @ 12:59 pm

  76. It’s easy to point out flaws in WAR, but what’s the point? We know it’s a rough indication of a player’s value, but it’s a pretty good one and it’s objective.
    If you have a better stat to do that job, please reveal it to us. If you have proven ways to improve WAR, please reveal them to us.
    Otherwise, shut up.

    Comment by Baltar — March 28, 2013 @ 1:03 pm

  77. I guess this whole subthread is caught up on whether we want to describe what happened or to estimate the true talent of a player. I come from a more statistical background, so I prefer the latter.

    Here’s an example: if a rookie comes up and gets 3 hits in 10 at-bats, he is hitting .300 – that’s the descriptive rate, and it has no error bars. However, it will take a lot more than 10 at-bats before I’m ready to say that he’s a .300 hitter – the error bars on his true talent level (for batting average) are too big for me to confidently say that. They’re complementary interpretations, not competing or exclusive ones.

    We do, subconsciously, apply error bars to most stats we see, however. We cut the triple-slash stats off at 3 digits because that’s roughly the level at which those rate stats fluctuate during a full season. We cut WAR off after one decimal place because we realize there is estimation in the calculation, and it would be disingenuous to say “Mike Trout had 10.03857 WAR in 2012” because we just don’t know it that precisely. I’m just proposing that these uncertainties be quantified a little better.

    Comment by Tom H. — March 28, 2013 @ 1:06 pm

  78. better?

    Comment by Frank's Wild Years — March 28, 2013 @ 1:06 pm

  79. JAW JAW is always better than WAR WAR. Now that there’s only one WAR, the saber rattlers can focus on JAW JAW.

    Comment by jfree — March 28, 2013 @ 1:10 pm

  80. Yeesh. Meant to say — Now that the WAR War’s over, the saber rattlers can focus on JAW JAW.

    Comment by jfree — March 28, 2013 @ 1:13 pm

  81. We also implicitly include uncertainties by requiring a minimum number of plate appearances in season-long awards like batting titles. We require a player to have more than about 500 plate appearances to qualify because we know that, for a small sample size, statistical fluctuations are much more important and can inflate rate statistics beyond sustainable levels (which is roughly what I mean by “true” talent levels).

    Comment by Tom H. — March 28, 2013 @ 1:14 pm

  82. I do have guesses on that Urban.
    To begin with, that referred to the previous version of WAR, with a lower replacement level.
    I’m guessing the remaining adjustment had to do with injuries and other unknowns, which FanGraphs rightly did not attempt to predict in its rankings.
    The total number of team wins had to come out correct, so an adjustment was made.

    Comment by Baltar — March 28, 2013 @ 1:15 pm

  83. LOL!

    Comment by Baltar — March 28, 2013 @ 1:16 pm

  84. Yes, they will.

    Comment by Baltar — March 28, 2013 @ 1:19 pm

  85. Now maybe Dave can admit that FIP for pitchers is a dumber metric than ERA (or RA). Or maybe we can use batted ball profiles to do hitter war.

    Comment by db — March 28, 2013 @ 1:19 pm

  86. You’re making a huge assumption that some of us don’t “come from a more statistical background.”

    Comment by Blue — March 28, 2013 @ 1:20 pm

  87. LOL!
    That thought occurred to me, not just humorously but seriously. I was thinking of the recent rankings series and whether they would correct it, then realized that if they did that, why not everything? Then the enormity of the task knocked the silly out of me.

    Comment by Baltar — March 28, 2013 @ 1:24 pm

  88. I apologize – I didn’t mean any offense. I only meant that in my real life, I’m a scientist who performs statistical analysis for a living, so I have perhaps a more “scientific” or rigorous view of what statistical analysis actually means.

    Comment by Tom H. — March 28, 2013 @ 1:27 pm

  89. You may be right, but that would be extremely cumbersome. You wouldn’t really want to read an article that showed those extra numbers on every stat that is being used predictively.

    Comment by Baltar — March 28, 2013 @ 1:28 pm

  90. Baseball Prospectus differs not only on replacement level, but also on run per win.

    Comment by commave — March 28, 2013 @ 1:35 pm

  91. ESPN uses Baseball Reference metrics, including WAR.

    Comment by commave — March 28, 2013 @ 1:37 pm

  92. Your comment is sort of double-dumb. There are good reasons for using FIP rather than ERA, which I won’t go into.
    And I would love to have some sort of analysis of a player’s batted balls to use in place of whether they happened to fall in for hits or not. That day may not be far off.

    Comment by Baltar — March 28, 2013 @ 1:38 pm

  93. “Shut up” is an awesome suggestion to someone who wrote something on the internet.

    Comment by Price enforcer — March 28, 2013 @ 2:08 pm

  94. how about aaron sele?

    Comment by tomplatypus — March 28, 2013 @ 2:14 pm

  95. We’ll be eagerly awaiting your more correct system’s publication on your website.

    Comment by commenter #1 — March 28, 2013 @ 2:15 pm

  96. I am so happy that this is being done, and even if Caple’s article earlier this year is getting credit for the catalyst and starting the discussion, I still have to call attention to this article that I wrote in December, 2012 — which as cited in the piece, was inspired by Sam Miller at BP:

    Comment by Joe Peta — March 28, 2013 @ 2:33 pm

  97. It establishes a baseline to measure player value; you can choose to trust things like park factors and defensive metrics more or less than the model does.

    Comment by Frank's Wild Years — March 28, 2013 @ 3:04 pm

  98. I think you’ve made a fundamental error of statistics. The sample OBP may well be known exactly, but we are interested in the underlying “true” OBP, which is known imprecisely due to the limited sample size from which we derive the sample OBP. Thus, the “true” OBP has an uncertainty, which we can estimate using Poisson statistics, as pointed out by the OP.

    Comment by X — March 28, 2013 @ 3:37 pm

  99. Err, I should say our estimate of the true OBP has an uncertainty, not the true OBP itself.

    Comment by X — March 28, 2013 @ 3:39 pm

  100. WAR was so last year. I like that someone ripping on WAR had to point out to you nerds that your model was flawed. FG says themselves it’s a general stat or a big hammer or something. And hey, you can add or subtract a win depending on what you think of the defensive value. So why try to make it exact? Keep tinkering with it and it will never have any credibility. I’ll always know what a RBI is even if it doesn’t tell me anything.

    Comment by Kyle — March 28, 2013 @ 3:40 pm

  101. Bravo. Whenever you compromise and arrive where Tom Tango is, it’s a good indication that you have done something right.

    For those too young to remember Alfredo Griffin, he was a co-winner of the AL Rookie of the Year award after hitting .287 with (a career-high) 40 walks and 21 steals (with 16 CS) as a 21 year old shortstop. If you look where he and Ozzie Smith were at age 21 and see where they ended up, you might think that smarts matter. You would be right.

    Comment by Mike Green — March 28, 2013 @ 3:43 pm

  102. we are interested in the underlying “true” OBP

    No, we are not. Not when constructing a statistic like WAR, which is simply supposed to serve as a descriptive record of what happened.

    Comment by Anon21 — March 28, 2013 @ 3:48 pm

  103. The hammer would be a terrible way to install drywall screws, and you certainly wouldn’t want to ‘claw them out’ once installed.

    Comment by That Guy — March 28, 2013 @ 3:54 pm

  104. I posted on BBTF as well, but I think this would be a good opportunity for both BBREF and FG to report multiple WARs based on the different partitioning of credit between fielders and pitchers:

    FIP-WAR (fWAR) – batted balls are 100% fielder
    RA-WAR – batted balls are 100% pitcher
    ERA-WAR – batted balls are 98% pitcher (mlb fielding % is about .98)
    bWAR – whatever partitioning bbref uses to get the “middling” number.

    This would show that, in fact, the two sites are using the exact same numbers for RV AND would illustrate the kind of assumptions that go into creating a WAR stat and demystify the stat a bit.

    Comment by zenbitz — March 28, 2013 @ 3:59 pm

  105. We do have RA9-Wins (which is RA-WAR) on the site and have for a while. We also have BIP-Wins (portion of wins due to balls in play) and LOB-Wins (portion of wins because of stranded runners & misc other stuff), and finally FDP-Wins, which is the difference between RA9-Wins and WAR.

    Comment by David Appelman — March 28, 2013 @ 4:04 pm

  106. Can someone tell me how they calculate WAR for retired players? I thought they needed to use the data from the Sportvision technology to determine the distribution of balls for the UZR and UBR calculations?

    Comment by Kevin — March 28, 2013 @ 4:10 pm

  107. Poor Alfredo, he will forever be known as the player behind the Alfred Line.

    Comment by Kiss my Go Nats — March 28, 2013 @ 4:44 pm

  108. I feel like you’re missing out on what’s going on here. Everyone knows what an RBI is. These sites are just trying to help us understand the game better. You don’t HAVE to use WAR when you’re having a conversation with your buddy about who the better baseball player is. I wouldn’t say “You know, Andrew McCutchen had a better year than Josh Willingham because McCutchen had 3 more WAR.” That’s a lame conversation. But that’s better than saying “Yeah, I think Willingham was better because he had 14 more RBI.”
    My point is, why would you ever talk about RBI when there are so many better things to bring up? All RBI are good for is for me to get excited when the Twins get one and to get bummed out when it happens against the Twins.
    Why would you NOT want a statistic to tell you something, to help you understand the game better?
    And, as far as tinkering with WAR: what’s wrong with improving something? I don’t understand why you’d slag on something that is trying to get better. That’s just weird.

    Comment by hossenfefer — March 28, 2013 @ 4:48 pm

  109. I may be mistaken, but if the problem with Jack Morris’ WAR differential was only a baseline replacement level issue, wouldn’t his career WAR ranking be closer to the other site? This says he jumped up 70 pitchers because of a difference in baseline. Wouldn’t all 70 of those pitchers’ career WAR jump as well, if the baseline was the only problem?

    Comment by blindbuddysirraf — March 28, 2013 @ 5:14 pm

  110. Remember “bbs”, WAR is a cumulative stat. So if the baseline is moved (say lowered) everyone’s WAR does increase but a player who has played more seasons than others will have his career WAR jump more.

    Comment by Joe Peta — March 28, 2013 @ 5:40 pm

  111. They use Total Zone for pre-2002 (I think that’s the year) defense, which uses play by play data from Retrosheet.

    Comment by Ben Hall — March 28, 2013 @ 5:43 pm

  112. The primary takeaway for me was that Tom Tango is pretty much always right.

    Comment by Clave — March 28, 2013 @ 6:03 pm

  113. A population is not a sample, X. When you describe populations, error terms are not appropriate.

    Comment by Blue — March 28, 2013 @ 6:18 pm

  114. And in my real life I have a copy of SAS on my work machine and many, many statistical programs I’ve written to tease out information from huge data sets of various populations.

    Comment by Blue — March 28, 2013 @ 6:23 pm

  115. I believe this marks the first time that anything Jim Caple wrote contributed positively to the game of baseball.

    Comment by Ray A. — March 28, 2013 @ 6:32 pm

  116. Why is replacement level set at 1,000 WAR per 2,430 games? What is the genesis of that number?

    Comment by chasfh — March 28, 2013 @ 6:44 pm

  117. So players like Rickey Henderson and Pete Rose lost 14-19 WAR, and a player like Joe DiMaggio lost about 7-8 WAR? And the inverse for Bbref?

    Comment by adohaj — March 28, 2013 @ 6:52 pm

  118. Again it is important to understand the flaws in the way we measure things.

    Comment by Bill but not Ted — March 28, 2013 @ 7:01 pm

  119. I always thought fangraphs WAR was a little inflated compared to baseball-reference. Or of course that B-ref was a little deflated compared to fangraphs, as neither was clearly right or wrong about replacement level.

    Comment by Bip — March 28, 2013 @ 7:23 pm

  120. Man, Alfredo Griffin was pretty bad for a long time.

    Comment by Forrest Gumption — March 28, 2013 @ 7:56 pm

  121. The question now is not if, but when the Unification of Replacement Level will be commemorated on a US postal stamp or massive oil painting displayed at The Metropolitan Museum of Art. This is some big-time forefathers shit right here.

    Comment by Choo — March 28, 2013 @ 7:58 pm

  122. OPS has been cited for years (decades?), but what the hell does it mean? Should we have stuck with batting average until something like wOBA came along? Is wOBA good enough? Should we revert back to batting average until we are 100% positive about the linear weights in wOBA?

    Comment by TKDC — March 28, 2013 @ 9:06 pm

  123. I like this, but it’s freaking me out that I woke up today and everyone’s WAR is slightly different.

    Comment by Neil — March 28, 2013 @ 9:10 pm

  124. Cool, that explains Ricky Romero’s huge difference for his 2011 season: FanGraphs 2.4, Baseball Reference 6.3! Are there any larger differences?

    Comment by Joshy — March 28, 2013 @ 9:18 pm

  125. Tom, do you not care about J D’s 56 game hitting streak because it was statistically implausible given his true talent level? Would a modern day .400 hitter not matter to you if he had an inflated babip and therefore his achievement involved substantial luck? If your answers to these questions are yes, I wonder why you like baseball. People honestly really do care about statistics that measure what actually happened. In fact, estimates of true talent are almost always used to project future performance, not to inflate or deflate previous performance.

    Comment by TKDC — March 28, 2013 @ 9:30 pm

  126. I don’t begrudge anyone the ability to enjoy the game in their own way. And yes, I would be excited for any of those records to be broken, but probably more for the history of it than for the pure statistical improbability.

    I’m not saying that descriptive statistics are wrong, or bad, or meaningless – just that there’s another way to look at these things that I think would be fun and interesting.

    (I did not expect to be having to write an argument like that, on FanGraphs of all places. Is this 2013 or 2003?)

    Comment by Tom H. — March 28, 2013 @ 9:40 pm

  127. This is nearly perfect satire, except I believe the first sentence should be, “WAR *is* so last year.” Nice job.

    Comment by Paul — March 28, 2013 @ 11:26 pm

  128. Would I be wrong to surmise this was the question:

    How good were the replacements?

    And the final answer ended up being:
    We split the difference

    Comment by samuelraphael — March 29, 2013 @ 12:45 am

  129. But it doesn’t, because his B-R WAR went up and his FanGraphs WAR went down from this change. This difference is due to the different calculations.

    Comment by Patrick — March 29, 2013 @ 3:26 am

  130. I would assume it is because it is very close to Tom Tango’s original number of 1009 wins and it is a round number. Who doesn’t like round numbers?

    Comment by Pinstripe Wizard — March 29, 2013 @ 9:48 am
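
    For anyone wondering what 1,000 WAR per 2,430 games implies for a single team, the arithmetic can be sketched as below. It assumes a 30-team league playing a 162-game schedule, which is where 2,430 total games comes from; the variable names are just for illustration.

```python
# Assumptions: 30 teams, 162 games each (every game involves two teams).
TEAMS = 30
GAMES_PER_TEAM = 162
TOTAL_WAR = 1000  # wins above replacement distributed across the league

# Each game produces exactly one win, so total wins equals total games.
total_games = TEAMS * GAMES_PER_TEAM // 2

# Wins left over after removing WAR are credited to replacement level.
replacement_wins_per_team = (total_games - TOTAL_WAR) / TEAMS
replacement_win_pct = replacement_wins_per_team / GAMES_PER_TEAM

print(total_games)
print(round(replacement_wins_per_team, 1))
print(round(replacement_win_pct, 3))
```

    Under those assumptions a replacement-level team wins roughly 48 games, a winning percentage of about .294.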

  131. So, to summarize: Mike Trout > combined Astros roster.

    According to Cot’s, the Astros’ 2012 opening payroll was around $60.8M. Given that Trout was 10.2/7.3 = 1.4 times better than the Astros, Boras should ask for a contract with an AAV of $60.8M * 1.4 = $85.1M. I’m thinking 10/850 is a good starting point.

    Comment by Pinstripe Wizard — March 29, 2013 @ 10:06 am

  132. And yes I know Trout isn’t a Boras client, but Boras would probably be the only agent that would ask for 10/850.

    Comment by Pinstripe Wizard — March 29, 2013 @ 10:08 am

  133. Peter repays Paul?

    One underlying problem with all the silly Stat Separatism is that the very best few understanderers of all this are getting paid for it,

    and keeping quiet.

    But okay, at least some good news here.

    Comment by rubesandbabes — March 29, 2013 @ 6:52 pm

  134. Well, graphs are descriptive, are they not?

    Comment by YanksFanInBeantown — March 30, 2013 @ 3:23 pm

  135. It’s nice to have both. Even if I do prefer bWAR for pitchers.

    Comment by YanksFanInBeantown — March 30, 2013 @ 3:34 pm

  136. They could be instructing you to close up your laptop and go watch some baseball.

    Comment by Oh, Beepy — April 4, 2013 @ 4:49 pm

  137. Has this change already taken place? Or does Randy Johnson really have a career WAR of 110?

    Comment by TheSinators — April 25, 2013 @ 8:01 pm

  138. There is still a huge difference between fangraphs WAR and Baseball Reference WAR in some cases. Why?

    Comment by dolbear65 — May 8, 2013 @ 10:22 am
