FanGraphs Baseball

Comments

  1. Excellent, well thought out, and reasoned piece Dave. Thanks for posting.

    Comment by Mike G. — August 26, 2011 @ 12:48 pm

  2. Responding to Buster’s tweets in 140 characters or less:

    1. What do you think makes up WAR?
    2. Most projection systems use past results extensively
    3. They don’t use WAR because they have even better metrics they didn’t come up with in their mother’s basement

    Comment by suicide squeeze — August 26, 2011 @ 12:49 pm

  3. Do you know what a couple great predictive statistics are? ERA, Wins, Saves, RBIs, AVG.

    Comment by theonemephisto — August 26, 2011 @ 12:52 pm

  4. No baseball statistics can predict what a player will do or accomplish in the future. Every player is an injury away from ending a career, no matter how promising. Every decision is based on the “hope”, for lack of a better word, that the player will continue to perform at a certain level. Sometimes you get lucky (Bautista) and sometimes you don’t (Zito and many, many others).

    Comment by Hurtlocker — August 26, 2011 @ 12:58 pm

  5. Sarcasm, I hope?

    Comment by Sean O'Neill — August 26, 2011 @ 12:59 pm

  6. Isn’t this the same Buster Olney who was crying about the fact that people were using FIP, which he himself dubbed a predictive stat, to discredit Trevor Cahill’s Cy Young campaign in 2010?

    Comment by Brendan — August 26, 2011 @ 1:04 pm

  7. Good rebuttal. Truthfully, Buster was trolling, knowingly or not. To criticize something without completely understanding it is… well, pretty useless.

    Some people have this strange mental block where they are incapable of truly understanding the point of sabermetrics… someone said it best, better than I can, that the number one principle of sabermetrics is that every number and stat has a role, and what’s more important than the number is interpreting what that number actually means. It’s so easy to misuse or misunderstand what exactly a number is telling you, and Buster did exactly that.

    Comment by Telo — August 26, 2011 @ 1:07 pm

  8. Why does BABIP factor into WAR at all? Is BABIP useful for anything other than predictive statistics?

    Comment by Eminor3rd — August 26, 2011 @ 1:09 pm

  9. I like Buster and Fangraphs! Can’t we all live in peace?

    Comment by Cloud Computer — August 26, 2011 @ 1:09 pm

  10. As poor as Seidman’s article was this morning, it’s closely related to that idea. Except more than the context of the number among other similar numbers (Bonds’ slugging compared to Williams’ slugging), it’s knowing and understanding what the number represents, and how it fits in the context of the game of baseball.

    Comment by Telo — August 26, 2011 @ 1:10 pm

  11. Buster is great for what he is. But just because he’s one of the most successful baseball beat writers ever, doesn’t mean he thoroughly understands all facets of sabermetrics. We are all equal in the eyes of math and logic, and Buster has shown indications of not totally grasping some major pillars of the sabermetrics curriculum. I think he would probably admit as such.

    Comment by Telo — August 26, 2011 @ 1:13 pm

  12. WAR tells you what happened; it’s a backwards-looking stat. BABIP is what happened. So it makes sense not to ignore it… for hitters, at least.

    [Insert long discussion about FIP and fWAR for pitchers here]

    BABIP is extremely complex – the dark matter of sabermetrics, if you will. We have such a hard time separating the noise from the data. As our data and methods get better and better (HitFx ahem) we will begin to really understand players’ fluctuations in BABIP.

    Comment by Telo — August 26, 2011 @ 1:16 pm

  13. Buster is a great reporter. He is not so good at analysis of value and understanding statistics.

    Comment by theonemephisto — August 26, 2011 @ 1:18 pm

  14. Affect* ;)

    Comment by Ryan — August 26, 2011 @ 1:23 pm

  15. good post, dave, and a point that is often misunderstood about advanced stats

    Comment by jim — August 26, 2011 @ 1:24 pm

  16. If WAR isn’t supposed to be a predictive stat and only pertains to what actually happened in the past, then why do you guys use FIP and such in your pitcher’s WAR? Or if you want to keep the pitcher’s WAR fielding independent and negate the luck factor, how come you don’t do the same for hitters by neutralizing their BABIP’s to some extent? Seems a little strange to me.

    Comment by Jason — August 26, 2011 @ 1:27 pm

  17. Well… it depends on how you define “predict”. Can we know exactly, to the at-bat, what a player will do in the future? Of course not. Not unless you have a map of every atom in the universe, and even then, there’s the human element. (This was an interesting discussion at Tango’s blog a month or two ago).

    But if you use this definition of predict:
    “to foretell on the basis of observation, experience, or scientific reason”

    Then you certainly can. And many, many people do so very well. Every year there are improvements to these models. We are getting close to the ceiling of predictive power using the numbers we currently have access to, but technology is always progressing. Things like Hit and Field FX will revolutionize sabermetrics and the models we use to predict performance… assuming we have access to the data.

    Comment by Telo — August 26, 2011 @ 1:28 pm

  18. I really wish Buster would just stick to scoops, quotes, and classic “beat writer” stuff.

    Whatever “analysis” he brings to the table is usually parroted opinions.

    He’s only showing how threatened he is by stats that he apparently just can’t comprehend by impugning them out of context the way he does.

    ESPN gave him so much false credibility by giving him Peter Gammons’ space on their page and Gammo he is most certainly not.

    Comment by Scott Clarkson — August 26, 2011 @ 1:28 pm

  19. The very very short answer is… a pitcher has less control over his BABIP than a hitter does.

    Long answer:

    It’s helpful to think of the situations as separate, as BABIP means something different to pitchers and hitters.

    Things that affect BABIP:
    – how hard the ball is hit
    – where on the field the ball is hit
    – whether it’s hit on the ground, air, or LD
    – speed of batter
    – fielders
    – park

    Now, let’s look at each of these and see how much control the batter and pitcher have over each, and whether or not they should be penalized for the factor:

    BATTER
    – how hard the ball is hit – lots of control
    – where on the field the ball is hit – some control
    – whether it’s hit on the ground, air, or LD – a good amount of control
    – speed of batter – complete control
    – fielders – zero
    – park – zero

    PITCHER
    – how hard the ball is hit – little/some control
    – where on the field the ball is hit – little control
    – whether it’s hit on the ground, air, or LD – some control
    – speed of batter – zero
    – fielders – zero
    – park – zero

    BABIP is out of the pitcher’s control in a lot of the factors, and he really shouldn’t be penalized by a bad defense, bad park, above average speed batters (if for some reason he faced way more fast hitters over a whole season – not likely, ignore this). And what he does have control of, he doesn’t have complete control.

    Comment by Telo — August 26, 2011 @ 1:37 pm

  20. Doesn’t the fact that Fangraphs WAR for pitchers is largely FIP based mean that it is…at least sort of predictive as well?

    Comment by Kevin — August 26, 2011 @ 1:37 pm

  21. http://www.fangraphs.com/blogs/index.php/why-our-pitcher-war-uses-fip/

    http://www.fangraphs.com/blogs/index.php/why-our-pitcher-war-uses-fip-part-two/

    Comment by Dave Cameron — August 26, 2011 @ 1:37 pm

  22. Posted this before the explanation. Thanks, I’ll read it.

    Comment by Kevin — August 26, 2011 @ 1:37 pm

  23. I assumed this article would be defending the use of FIP in fangraphs WAR rather than runs allowed. FIP is useful because it is a better predictor of future runs allowed than past runs allowed is. But FIP does not do a better job of actually describing past runs allowed, obviously.

    In this sense, fWAR for pitchers is trying to be a predictive stat, along the lines of “if everyone pitched the same on the controllable skills and we ran the season 10000 times, who gets the best results”, rather than “who got the best results, lucky or not”.

    Comment by test — August 26, 2011 @ 1:39 pm

  24. That’s an interesting way to frame the question. The other way to look at it is – we are taking 30% of what the pitcher did in the past, and REALLY nailing it. He was X good. We can do that, or we can say, well this is 100% of what the pitcher did, but this data is noisy as hell, close to useless. What would you rather base your WAR off of?

    Just because FIP happens to be a better predictor of future ERA/RA than ERA/RA is irrelevant, and really just shows you that ERA has serious deficiencies as a stat.

    Comment by Telo — August 26, 2011 @ 1:46 pm

  25. “He’s only showing how threatened he is by stats that he apparently just can’t comprehend by impugning them out of context the way he does. ”

    Yep.

    Comment by Telo — August 26, 2011 @ 1:46 pm

  26. because FIP isn’t a predictive stat, it uses what has happened. xFIP is the predictive stat, it replaces the hr/fb rate that the pitcher has with a standard number and says that if the pitcher continues to strikeout/walk/get flyballs at the same rates as before, and with neutral luck, his ERA should be X.XX.

    Comment by sean — August 26, 2011 @ 1:46 pm
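
A minimal sketch of the FIP/xFIP distinction described in comment 26, assuming the commonly cited FIP weights (13 for HR, 3 for BB+HBP, -2 for K). The sample pitcher line, the league HR/FB rate of 0.10, and the constant of 3.10 are hypothetical values chosen for illustration, not figures from the thread.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """FIP from what actually happened: HR, walks, hit batters, strikeouts."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, lg_hr_per_fb, bb, hbp, k, ip, constant):
    """xFIP swaps the pitcher's actual HR for fly balls times a league HR/FB rate."""
    expected_hr = fly_balls * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical pitcher line (illustrative only).
HR, BB, HBP, K, IP = 12, 50, 5, 180, 200.0
FLY_BALLS = 200
LG_HR_PER_FB = 0.10   # assumed league-average HR/FB rate
CONSTANT = 3.10       # assumed league constant; it varies by season

pitcher_fip = fip(HR, BB, HBP, K, IP, CONSTANT)
pitcher_xfip = xfip(FLY_BALLS, LG_HR_PER_FB, BB, HBP, K, IP, CONSTANT)

print(f"FIP:  {pitcher_fip:.2f}")
print(f"xFIP: {pitcher_xfip:.2f}")
```

In this made-up line the pitcher allowed fewer home runs than his fly-ball total would suggest, so xFIP comes out higher than FIP. Both use only events that already happened; the difference is which of them get replaced with a league-average assumption.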

  27. Buster has always been a great talking head. He’s well connected, can access rumor speculation, and knows how to break news before it becomes public. Any of his non-inner circle scooping insight should be taken very lightly.

    Comment by baty — August 26, 2011 @ 1:53 pm

  28. If you mean predict in some absolute sense, you are correct. Prediction systems can absolutely make probabilistic predictions and place error bands around them.

    Comment by Blue — August 26, 2011 @ 1:59 pm

  29. As Telo says, the problem with ERA or RA/9 is that there’s so much noise and so much outside of the pitcher’s control.

    FIP isn’t really predictive. It’s more of a “what should have happened given league-average defense”, while xFIP is the “what is going to happen” stat. ERA and RA/9 are the pure “what happened” stats, but they also include what the fielders did.

    Comment by theonemephisto — August 26, 2011 @ 2:04 pm

  30. This isn’t true in a meaningful sense. Runs allowed is the bottom line result for pitchers – a skill, like striking people out, not walking them, limiting homeruns, consistently limiting BABIP (for the few exceptions), getting GB instead of FB, is useful only because it turns out that having it helps to lower the number of runs given up. FIP (or xFIP) looks at what it does because they are the best way to relatively simply predict future results (i.e., runs allowed). To do so they of course use things that have actually happened already, but that doesn’t mean they aren’t basically predictive stats.

    As stated in the articles Dave linked to, it’s a matter of personal preference. I prefer that lucky players get credit for the lucky results, even if it’s unlikely they will ever have the same results again.

    I don’t mind checking both sites, and I don’t think bbref does it perfectly either, but in this case, I find their pitchers WAR more useful as a measure of who had the best results in a given year. If I wanted to pick a player for next year, I come back here. Six one, half dozen the other…

    Comment by test — August 26, 2011 @ 2:04 pm

  31. FIP ignores 70% of what happened. It’s primarily used because it predicts ERA better than ERA does.

    It most certainly is predictive, and not descriptive.

    Comment by RC — August 26, 2011 @ 2:08 pm

  32. The only point you can take out of his comments, is that he has information regarding conflict between Organizational opinion of talent/production evaluation. He should stick to relaying what those methods are. His opinion is essentially meaningless.

    Comment by baty — August 26, 2011 @ 2:09 pm

  33. My thoughts on Buster aside, I think his questioning of WAR’s usefulness in evaluating a player is valid. He is saying you can’t just Look at WAR, you have to go deeper. Victorino’s WAR is 6.1. If I look at that alone I have no idea what that means for evaluation purposes. I need to break down the components of WAR and see what’s driving the 6.1 WAR.

    That brings up my point. If I have all the components of WAR in front of me, then why do I need a formula with subjective weights and unreliable statistics (cough I am looking at you UZR) to synthesize the data for me? We all bring our own subjectivity to the table and so does WAR.

    I’d rather use my own preconceptions and assumptions because I know what they are. It’s simply harder to understand WARs

    Comment by The WAR to end all WARS — August 26, 2011 @ 2:15 pm

  34. The problem is that you’re confounding the pitcher’s contribution and the defense’s contribution. Those are impossible to separate out in ERA.

    Comment by theonemephisto — August 26, 2011 @ 2:16 pm

  35. They’re not separated out by FIP either (as it uses IP).

    And while no, you can’t separate out defense just using ERA, it’s silly to completely ignore balls in play. We know that pitchers have some control over BABIP. Just not full control.

    Comment by RC — August 26, 2011 @ 2:18 pm

  36. “FIP isn’t really predictive. It’s more of a “what should have happened given league-average defense”, ”

    No, FIP absolutely is not that. Because of the IP denominator, FIP goes past neutralizing BABIP, and actively punishes players who have skill at depressing BABIP.

    FIP is trying to determine skill level by using what they consider “under a pitcher’s control”.

    Comment by RC — August 26, 2011 @ 2:23 pm

  37. The Fangraphs authors themselves acknowledged that FIP is problematic for use in WAR. You’re still basically ignoring 70% of what actually happened, part of which is still in the pitcher’s control. ERA is problematic, RA/9 is problematic, and FIP is problematic. So we really can’t trust WAR with any exactitude for pitchers.

    It’s for this reason that I dislike seeing Roy Halladay’s case for MVP being based on his 0.5 lead in WAR. If you take pitcher WAR with a grain of salt, and then provide a little margin for error in the UZR portion of position player WAR, then Roy Halladay isn’t any stronger a candidate than Matt Kemp, Joey Votto, or Justin Upton.

    It’s not that I’m categorically against pitchers winning the MVP, but I think they need to be sufficiently dominant that you can feel quite confident in their case. Vintage Pedro, maybe Clemens in ’97, Greg Maddux in ’95, and beyond that, essentially no one else.

    Comment by Bronnt — August 26, 2011 @ 2:24 pm

  38. I know WAR is a very respected statistic. It is useful because it tells us things that we might not know intuitively from looking at unsophisticated statistics. I don’t know how its validity is verified though. Can we total the WARs of each player on a particular team to calculate the expected win total of the team and compare that to the team’s actual win total? I just recall Buster being skeptical that Ben Zobrist was ranked so high with WAR; part of the argument for WAR is that it told us of Zobrist’s value when we might not see it so obviously. But what test did WAR pass that led us to trust it? What standards were used in its development? So far to me, the rationale seems logical yet hand-wavy. So I have a hard time trusting WAR charts where many players are within 1-1.5 WAR of each other in a given season, because I know of no standards that provide some uncertainty of the statistic.

    Sorry for the long comment, but I wanted to make sure I articulated my concern well.

    Comment by drewcorb — August 26, 2011 @ 2:25 pm

  39. “and he really shouldn’t be penalized by a bad defense, bad park, above average speed batters ”

    Right. The problem here with FIP is that by ignoring these things, they absolutely do penalize the pitcher for being in these parks, etc.

    Park factors affect Ks, BBs, HRs, etc.

    Also, because FIP uses IP as a denominator, and 70% of IP is balls in play, it doesn’t ignore defense at all.

    Comment by RC — August 26, 2011 @ 2:27 pm

  40. “if for some reason he faced way more fast hitters over a whole season – not likely, ignore this”

    Another thing. We make assumptions and statements like this all the time… IE “It’ll even out over the course of the season,” and yet, in the vast majority of cases, it doesn’t.

    I remember last year looking at this, and the OPS of players that pitchers had faced varied by more than .100. IE, some pitchers’ average opponent was a .680 guy, while for others it was a .780 guy.

    Comment by RC — August 26, 2011 @ 2:30 pm

  41. This post started good, then turned to shit.

    Comment by Garrett — August 26, 2011 @ 2:33 pm

  42. It should be noted FIP does not describe run prevention. It describes predictive “tools” that impact run prevention. It also fails to account for potential pitcher talent beyond its rate stats (BABIP depression, SLG depression, etc).

    Comment by Garrett — August 26, 2011 @ 2:34 pm

  43. Hrm. I prob should read thread before ranting. RC wins.

    Good effort DC. You’re improving your knowledge of English. Perhaps Eno Sarris will take note.

    Comment by Garrett — August 26, 2011 @ 2:35 pm

  44. lol

    Comment by Garrett — August 26, 2011 @ 2:37 pm

  45. We don’t know enough so we just make terrible assumptions.

    Thanks!

    Comment by Garrett — August 26, 2011 @ 2:41 pm

  46. Perhaps I should start suffering through horrid comments.

    Comment by Garrett — August 26, 2011 @ 2:42 pm

  47. Hai thremp.

    Comment by wiffle — August 26, 2011 @ 2:48 pm

  48. Here’s a simple example RC I’m sure you will understand. Say a pitcher has a 8 K/9, a 3 BB/9, and a 1 HR/9 rate. All of those numbers are descriptive. Now, if I combine them into one stat, those numbers still are descriptive. Does it describe everything, NO, but it is still a descriptive stat. Just because it doesn’t describe everything and ignores certain things, doesn’t make it not descriptive. Just like ERA ignoring things arbitrarily called errors doesn’t make ERA not descriptive, FIP ignoring the small BABIP control a pitcher has does not make it not descriptive.

    Comment by A guy from PA — August 26, 2011 @ 3:09 pm

  49. Well, each baseball player has 25 guys on it, so 25*(1) or 25*1.5 is quite a variance, wouldn’t you say?

    But of course you can calculate that – click on ‘Leaders’, then click on ‘Team’. Scroll down to your team and add. Rinse/Repeat for pitchers. Add…Here are the Blue Jays

    Bautista – 7.8
    Escobar – 4.0 (11.8 team total)
    Molina/Arencibia – 2.5 (14.3)
    EE – 1.3 (15.6)
    Lawrie – 1.2 already! (16.8)
    Thames – 1.0 (17.8)
    Lind – 0.7 (18.5)
    Snider, Nix, McCoy, Davis, Cooper, Hill – (-0.1) (18.4)
    Rivera/Rasmus : Call it even

    Pitchers

    Scrabble, Jo Jo, Dotel, Frasor – ~.7
    Morrow – 3.4
    Romero – 2.6
    Villanueva – 1.1
    Rest – 2.3

    Team total=18.4 + 10.1 = ~28.5 WAR

    I believe that a team of replacement level players would win ~43 games per year, for a per game win rate of .265 (I may have replacement team wins wrong). The Blue Jays have played in 130 games this year, so a replacement level Blue Jays team would have 34.5 wins. Add the Blue Jays’ team WAR of 28.5, and you get 63 WAR-expected wins. Their actual total is 66, so it appears quite close here. I’ll leave the other 29 teams to you…

    Comment by MC — August 26, 2011 @ 3:30 pm
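
The arithmetic in comment 49 can be sketched as follows. This is a minimal illustration that takes MC's figures at face value: the 43-wins-per-162 replacement baseline is his own estimate (which he flags as possibly wrong), and the function name `war_expected_wins` is hypothetical.

```python
# WAR figures as listed in comment 49 (Blue Jays, through 130 games).
HITTER_WAR = [7.8, 4.0, 2.5, 1.3, 1.2, 1.0, 0.7, -0.1]
PITCHER_WAR = [0.7, 3.4, 2.6, 1.1, 2.3]

GAMES_PLAYED = 130
ACTUAL_WINS = 66


def war_expected_wins(team_war, games_played, replacement_wins_per_162=43):
    """Replacement-level wins prorated to games played, plus team WAR."""
    replacement_rate = replacement_wins_per_162 / 162  # about .265
    return replacement_rate * games_played + team_war


team_war = sum(HITTER_WAR) + sum(PITCHER_WAR)
expected = war_expected_wins(team_war, GAMES_PLAYED)

print(f"Team WAR:            {team_war:.1f}")
print(f"WAR-expected wins:   {expected:.1f}")
print(f"Actual minus expected: {ACTUAL_WINS - expected:.1f}")
```

One team is a single data point, so this only reproduces the back-of-the-envelope check; it says nothing about how well the relationship holds across the league.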

  50. Sorry..each baseball *team*

    Comment by MC — August 26, 2011 @ 3:33 pm

  51. Where did you get the ~43 wins or win rate of 0.265 for a team of replacement players? I’m assuming this has been done to verify the WAR formula, or some other verification method. I’m just wondering what was done. I doubt people just developed a formula that “made sense” and called it good. It must have been verified sometime before you just calculated the Blue Jays’ win total today. How was that verification done?

    Comment by drewcorb — August 26, 2011 @ 3:47 pm

  52. hai clam

    Comment by Garrett — August 26, 2011 @ 3:50 pm

  53. The amount of replacement level wins is irrelevant honestly. It’s the control group – what is used to compare each player to. It doesn’t matter if it’s 43 wins per 162 (which is where the .265 comes from, 43/162) or 22 wins or 18 wins, the baseline WAR player is still the same.

    However, all of their information as to where these numbers come from is here on the website, you just have to look for it.

    Comment by MC — August 26, 2011 @ 4:22 pm

  54. I wonder, though, if GMs are using WAR when negotiating contracts — at least when it works in their favor. (I’m certain agents are using WAR in negotiations when it works in their clients’ favor). Even though players are given contracts in the anticipation of future performance, it seems like past and current performance (aka “track record” and “veteran-ness” etc) matter a great deal; certainly, here at fangraphs you see contracts analyzed in terms of (current and projected) WAR.

    Comment by joser — August 26, 2011 @ 4:26 pm

  55. Here is a quick weblink/article quote

    Second what exactly is replacement level? Where did you come to that baseline?

    All it is is the contributions of players that are not part of the 25-man roster, basically. You can also look at it as the best (non-prospect) AAA players. There’s been many studies on this issue, and the consensus is very close to being 2 wins below average. MGL for example uses 18 runs below average per 150 games. Keith Woolner uses 20 runs per 162 games. I use 2.25 wins below average per 162 games.

    From this weblink here…
    http://www.insidethebook.com/ee/index.php/site/article/mike_silva_chronicles_part_2_war/

    Comment by MC — August 26, 2011 @ 4:30 pm

  56. I’ve read the reasoning and basis for WAR before. I hope I’m not coming off as though I don’t appreciate how well thought out and clever it is. However, I can’t see where it is verified against any sort of standards. I’m not sure how this was done, or can be done. If it has not been done, how can WAR be taken with anything but a grain of salt?

    Also, Fangraphs has 157 offensive players between 7.8 and -2.4 WAR. So an uncertainty of 1 WAR is ~10% of the total range, or roughly 15 players. So 1-1.5 WAR is quite a variance, wouldn’t you say?

    Comment by drewcorb — August 26, 2011 @ 4:43 pm

  57. But, in terms of what actually happened, how does it affect on field value? The guy got a hit, and it helped the team. Whether or not it was a lucky hit (IMO) shouldn’t change the fact that it was a hit. Player X provided Y value over Z time, no matter what his BABIP was. I don’t know why his BABIP changes our measure of the value he provided.

    Comment by Eminor3rd — August 26, 2011 @ 5:00 pm

  58. oh good, you’re back around.

    Comment by jim — August 26, 2011 @ 5:17 pm

  59. Here’s a quick study BtB did a little while back plotting team WAR totals against actual wins.

    http://www.beyondtheboxscore.com/2009/9/18/1035183/team-war-vs-actual-wins

    Comment by jonts26 — August 26, 2011 @ 5:40 pm

  60. I wish I could distance myself from these douchey Garrett #2 comments.

    Comment by Garrett — August 26, 2011 @ 5:47 pm

  61. WAR, what is it good for? Absolutely nothing – Buster Olney

    Comment by Don Mynack — August 26, 2011 @ 6:07 pm

  62. DO YOU HAVE ANY IDEA WHO YOU ARE TALKING TO??????!!!!!!?!?!?!?!?!?!!?!!?!?!?!?!??!?

    Comment by wiffle — August 26, 2011 @ 6:09 pm

  63. Most reporters ply their craft by creating a narrative and then trying to make it a centerpiece of making sense of the facts as they see them. Olney may find WAR a challenge to that narrative and his associated interpretations, but I’m skeptical about his ‘they don’t use it so it’s useless’ logic (?). Myself, as a Giants fan, have a single example that convinced me of its usefulness. I looked at Tejada’s performance via WAR and concluded his better days were way behind him, but more importantly, that 2011 wasn’t likely to represent a Renaissance given the trends in his WAR score. That certainly let me adopt a useful POV on what was to come, I’m sorry to say.

    Comment by channelclemente — August 26, 2011 @ 6:14 pm

  64. “Truthfully, Buster was trolling, knowingly or not. To criticize something without completely understanding it is… well, pretty useless.”

    The only way for this quote not to be hypocritical, is if Telo is Buster Olney (or God, I guess).

    Comment by The Real Neal — August 26, 2011 @ 6:43 pm

  65. Just looking at that data, I would be shocked if aggregate war is noticeably better than just looking at OPS X playing time aggregates.

    Comment by The Real Neal — August 26, 2011 @ 6:58 pm

  66. So the slope of wins to date vs. team WAR is not very close to 1, and it has a fairly good correlation to show that WAR undervalues players. So does it undervalue all players equally or is it biased against some sort? I think it must have some systematic flaw since it does not appear to be within noise of a slope of 1. So this brings me to my original question, how can we expect to get a real reflection of a player’s value from WAR?

    Comment by drewcorb — August 26, 2011 @ 8:23 pm

  67. perhaps you are remembering one that I appreciate:

    IF you intend to look at numbers, THEN this is one of the best ways you would look at them.
    totally up to you if you are going to use numbers.

    http://www.insidethebook.com/ee/index.php/site/comments/how_to_sell_a_stat/

    Comment by Tim_the_Beaver — August 26, 2011 @ 8:48 pm

  68. I like WAR and other stats that simply tell what a player did, as these stats are helpful to MVP discussions (not necessarily best player).

    My gripes with WAR are:

    1. Using unregressed UZR. Single season UZR should be regressed 50% per MGL’s instruction.

    2. Park adjustments. Players do what they do at the park they play at, and the good hitters adjust to the park they play at. Why adjust for park, unless you want to predict what they would do at some other park? Also, unless the adjustments take into account how a park plays for RHB/LHB and RHP/LHP they are useless. I mean, does SAFECO really hurt Ichiro? Is CC really hurt by the Yankees’ short RF porch (given teams load up on RHB)?

    3. My other gripe is for pitchers. FIP does not tell what a pitcher did. This is a predictive stat, and one which filters out luck, kind of like BABIP which is not used for offensive WAR. To know how a pitcher actually did, lucky or not, you have to look at actual runs allowed, not theoretical runs allowed.

    So it seems to me that WAR has identity problems. It tries to measure what a player has done, but it also makes adjustments to go past what a player has done to look at context neutral skill, which is somewhat predictive.

    WAR is still useful, but I LOL at those who argue that someone is more valuable to his team because his WAR is 0.5 more than another player’s. The best indicators of what a player has done are the individual counting stats (and rate stats derived from them) that go into WAR, as well as some others that are ignored, like RBI and R scored (adjusted for opportunity) and context-dependent stats (performance late and close, RISP, etc). These of course are subject to some interpretation, but that makes for good discussion as opposed to the science-is-settled approach by some using advanced metrics.

    Comment by pft — August 26, 2011 @ 11:27 pm

  69. Re #3: FIP does tell what a pitcher did. It attempts to base a pitcher’s performance only on those things we can reasonably attribute to the pitcher. If you don’t agree with that, then rWAR would probably be more to your liking.

    Comment by suicide squeeze — August 27, 2011 @ 1:06 am

  70. “What WAR is good for: Absolutely nothing!” Alex Remington

    Comment by The WAR to end all WARS — August 27, 2011 @ 3:55 am

  71. WAR is currently the trendy stat that many like to take out of context. It’s not a perfect stat and needs time to mature and iron out its flaws. I see WAR at this point as BJ Upton, great potential but he’s got some things to figure out. So using it for anything besides ultimately pointless debate seems foolish.

    Which, I am afraid, is what Buster’s point is.

    Comment by Bill but not Ted — August 27, 2011 @ 4:21 am

  72. Your words: “attempt…reasonably…attribute.”

    These are all words/terms/descriptors of something pertaining to predictive function.

    There is no way around it. FIP attempts to normalize fielding factors, which are inherently NOT normalized in the field of play. And how exactly does one normalize said factors?

    …by “attempting to reasonably attribute” x and y in proper variable context of a and b.

    No one is saying FIP is unreasonable…simply that at its most BASIC function, it is inescapably predictive in nature.

    Comment by Romodonkulous — August 27, 2011 @ 5:56 pm

  73. RC, I think you’re overstating the impact of FIP using IP as a denominator. I don’t feel like re-doing the math, so I’m just going to C&P an example I used on HBT last week:

    Justin Verlander this year has 212 K, 48 UBB+HBP, and 17 HR in 803 TBF. Thanks to randomness (and in spite of the Tigers’ decidedly below-average defense), he has a .232 BABIP, meaning he’s gotten outs on 404 of the 526 balls in play he’s allowed (he’s also gotten 13 outs that weren’t on strikeouts or BIP – we’ll add those back in at the end). Let’s say he had a league-average BABIP which this year would be .290. 29% of 526 BIP is 153 H, leaving 373 outs. Verlander now has (212 + 373 + 13)/3 = 199.1 IP. By comparing Verlander’s component FIP to his posted FIP, I get the FIP constant to be 3.00 this year. His adjusted FIP is now ((13*17)+(48*3)-(212*2))/199.1 + 3.00 = 2.70, two hundredths of a point off his actual 2.72 FIP. Much of that insignificant difference can be attributed to randomness, not fielding…

    TBF is more technically sound (and it made me very happy when FG added K% and BB%), but in practical application there is very little gained from using (K + (1-lgBABIP)*(TBF-K-UBB-HBP-HR) + other outs)/3 as the denominator instead of IP.

    Verlander’s numbers may be a start or two out of date, but the point still stands.

    Comment by Kevin S. — August 27, 2011 @ 7:45 pm
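
The Verlander calculation in comment 73 can be reproduced with a short sketch. All inputs are the figures quoted in the comment, including the 3.00 constant Kevin S. backed out himself; the helper names are hypothetical. Note that "199.1 IP" in the comment is baseball notation for 199⅓ innings, so the code works in outs and divides by 3.

```python
K, UBB_HBP, HR, TBF = 212, 48, 17, 803
OTHER_OUTS = 13          # outs not recorded via strikeout or ball in play
FIP_CONSTANT = 3.00      # constant as estimated in the comment
ACTUAL_BABIP = 0.232
LEAGUE_BABIP = 0.290

# Balls in play: everything that was not a K, walk/HBP, or home run.
bip = TBF - K - UBB_HBP - HR


def innings(babip):
    """Innings pitched implied by a given BABIP on the same balls in play."""
    hits = round(bip * babip)
    outs_on_bip = bip - hits
    return (K + outs_on_bip + OTHER_OUTS) / 3


def fip(ip):
    """FIP with the usual 13/3/-2 weights over the supplied innings."""
    return (13 * HR + 3 * UBB_HBP - 2 * K) / ip + FIP_CONSTANT


actual_fip = fip(innings(ACTUAL_BABIP))
adjusted_fip = fip(innings(LEAGUE_BABIP))

print(f"Balls in play: {bip}")
print(f"Actual FIP:    {actual_fip:.2f}")
print(f"Adjusted FIP:  {adjusted_fip:.2f}")
```

Only the denominator changes between the two calls, which is the point of the comment: moving Verlander's BABIP to league average shifts his innings by about ten and his FIP by about two hundredths of a run.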

  74. I’m wondering why WAR doesn’t take WPA or WPA/LI into its equation. From what I understand, WAR is context neutral as it is, which I can appreciate. But when assigning value to past performance isn’t the context important? It would change the way some people look at an MVP race…
    I understand that luck plays into WPA. In WAR a homerun is a homerun, but with WPA putting context to that, it shows that not all homeruns (or any other stat) are made the same. WPA/LI really, in my mind, shows what a player means to a team. Bautista isn’t just the WAR leader but he owns the Win Probability boards.

    Also, what about stolen bases and caught stealing being used in UBR and, subsequently, WAR? I could be mistaken, but I don’t think they are. A stolen base from first to second puts a runner in scoring position, which can be extremely important. Look at two WAR leaders in Pedroia and Granderson. Both have 24 stolen bases, but Granderson was caught 4 more times, creating four additional unnecessary outs (while still advancing into scoring position up to 24 times). Shouldn’t these things factor into WPA and WAR, and also the MVP discussions?

    I’m a newbie to FanGraphs and have been doing my best in educating myself in the intricacies of this great site and everything it provides. The lack of SB/CS in UBR and WPA/LI in WAR do bother me though. Please correct me if I’m off base.

    Comment by Evan — August 27, 2011 @ 11:42 pm

  75. WAR is a comprehensive statistic that includes many facets of a player’s performance. So saying teams will look at “OBP, ERA, defense-independent P numbers” and not WAR, when WAR includes base-running and fielding events, is like saying people don’t like to eat cake; they only eat things with flour in them.

    Comment by R_Magillicutty — August 28, 2011 @ 12:22 am

  76. While I assume it was meant as a joke, over a long time (like 4-5 seasons) ERA actually becomes as predictive as, or more predictive than, things like SIERA, FIP and xFIP.

    http://www.fangraphs.com/blogs/index.php/want-to-write-for-fangraphs-2/

    Comment by Matthias — August 28, 2011 @ 3:20 am

  77. SB/CS are accounted for in wOBA.

    Comment by Kevin S. — August 28, 2011 @ 8:02 am

  78. It’s useless to argue that FIP is predictive, not descriptive, around here so don’t bother.

    But the absurdity of using FIP for WAR is right there on the page in Dave’s “Why our Pitcher WAR uses FIP” post:
    “In the end, we had to choose between two different methods – assuming that the pitcher had no responsibility for the outcome of a ball in play, or attempting to approximate the amount of time that the result was due to the pitcher or the fielder. Ideally, we’d be able to do the latter – which is how Sean approaches it – but I just don’t think we currently have the tools available to make an accurate enough judgment on how to apportion that responsibility.”

    And further in part 2:
    “1. FIP-based WAR, which is what we ended up using, essentially admits that we don’t have enough information about dividing responsibility for the results of balls in play, and so it ignores them.”

    The problem is that once you put FIP into WAR and start arguing that Francisco Liriano or Cliff Lee deserved the Cy Young last year or should even be in the conversation, you are going a step too far. You are no longer “just ignoring” the balls in play. You are emphatically stating that they are not the pitcher’s fault. You’ve jumped the shark. Those balls in play resulted in runs scored and lost games, making fWAR problematic in any fair Cy Young discussion.

    You could just as easily have gone the opposite route and used ERA in WAR and placed all of the blame for balls in play on the pitcher. To me, that’s a much more honest assessment of what actually happened.

    How can one possibly argue that only taking 30% of what happened is a fair description of what happened?

    Comment by noseeum — August 28, 2011 @ 10:50 am

  79. Sorry in advance for my ignorance, but does anyone know if the positional adjustments in WAR are calculated scientifically/mathematically or just subjectively?

    Comment by jts5 — August 28, 2011 @ 10:57 am

  80. This is probably a good time to point out that there is no such thing as a predictive statistic. xFIP, for example, is simply a transformation of another statistic (FIP) with potentially greater value in terms of building predictive models. The notion of “predictive statistics,” from a frequentist perspective anyways, involves putting together probability models and measuring their performance against some criteria (remember p-values, confidence intervals, etc. from your intro to stats courses?). One of the simpler examples of this might be creating a regression model containing the past three years’ xFIP as a predictor for a statistic in the 4th season (RA9, maybe?). You could then evaluate the utility of that model vs. other models using p-values (or a Bayesian maximization method if you’re really up-to-date on your skills).
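
    A rough sketch of that kind of model comparison, under loud assumptions: the pitchers below are randomly generated, not real, every number and name in it is made up for illustration, the three xFIP seasons are collapsed into one averaged predictor, and the models are compared by hold-out RMSE rather than by p-values.

```python
# Sketch only: SYNTHETIC pitchers, not real data. Fits year-4 RA9 on the
# average of the prior three seasons' xFIP and compares hold-out error
# against a baseline that always predicts the training-set mean RA9.
import random

def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return (sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)) ** 0.5

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rows = []
for _ in range(2000):
    talent = rng.gauss(4.0, 0.5)                    # assumed true-talent level
    xfip_3yr = [talent + rng.gauss(0, 0.3) for _ in range(3)]
    ra9_next = talent + 0.3 + rng.gauss(0, 0.6)     # assumed noisy outcome
    rows.append((sum(xfip_3yr) / 3, ra9_next))

train, holdout = rows[:1500], rows[1500:]
slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])

holdout_actual = [y for _, y in holdout]
model_pred = [intercept + slope * x for x, _ in holdout]
baseline_mean = sum(y for _, y in train) / len(train)
baseline_pred = [baseline_mean] * len(holdout)

model_rmse = rmse(model_pred, holdout_actual)
baseline_rmse = rmse(baseline_pred, holdout_actual)
print(f"slope={slope:.2f} model RMSE={model_rmse:.3f} baseline RMSE={baseline_rmse:.3f}")
```

    The same scaffolding would work for comparing, say, a three-year ERA model against a three-year xFIP model: fit each on the same training seasons and see which has the smaller error on seasons it has not seen.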

    Comment by Ray — August 28, 2011 @ 3:37 pm

  81. I think Buster is still sore over his productive out percentage getting laughed off the face of the earth.

    Comment by Notrotographs — August 28, 2011 @ 8:16 pm

  82. I strongly agree with …

    (1) MGL’s suggestion to regress single season fielding Runs — UZR.

    (2) Tango’s suggestion to average fWAR with brWAR for pitchers. Not giving 100% or 0% credit for things like LOB% and BABIP, but giving some credit.

    I’m not painting myself as anywhere near the baseball stat minds of these 2 guys. But given the situations of FIP and UZR, the recommendations from their creators/developers just make a lot of sense in regards to single season WAR.

    We put a lot of stock into WAR, and even use it with a decimal point.

    We know that P have less control over BABIP than batters do. But we also know that they experience more BIP (via BF) than do batters. So 10 points less than league average equates to a lot fewer hits allowed, which is important.

    I also think that “luck” on BABIP can be something pitchers influence in a single season but not necessarily for their career or consistently. It is still difficult for me to believe that for an entire season a pitcher can just catch all the breaks in regards to balls being hit right at fielders or the defense only playing outstanding when that guy pitches. I could be wrong on that, but experiencing that type of luck over 200 IP just seems so extraordinary that the most likely conclusion is that the P must be doing something (location, changing speeds, etc.) that leads to lesser contact, but not necessarily more K’s or swings and misses.

    Comment by CircleChange11 — August 28, 2011 @ 9:34 pm
