FanGraphs Baseball


  1. This is extremely interesting and I would love to see it taken a bit further. What is the average $/WPA for RPs? Is it more in line with other position players?

    Comment by Eminor3rd — December 20, 2011 @ 2:38 pm

  2. Good stuff

    Comment by fjmanuel — December 20, 2011 @ 2:42 pm

  3. I think the reasons teams do it is that when RP fail it is often spectacular and memorable. Much more so than any other position. If a RP has just a few melt downs out of 40 appearances it will look bad as opposed to a hitter who strikes out a few times late in the game with men on. A game that had been mentally already placed in the win column becomes a loss. Emotion takes over and managers/GMs say they aren’t going to go through that again so they overvalue the position and overpay.

    Comment by MikeS — December 20, 2011 @ 2:48 pm

  4. Those trends don’t look very linear to me.

    Comment by Yirmiyahu — December 20, 2011 @ 2:49 pm

  5. Yeah, it’s hard to get past the mentality that a save = a win, and a blown save = a loss.

    Comment by Yirmiyahu — December 20, 2011 @ 2:52 pm

  6. Thank you. This is a much better way of studying how teams operate. Using a rigid $/WAR approach and then declaring all relief pitching contracts to be terrible is so, so lazy.

    Comment by John — December 20, 2011 @ 2:54 pm

  7. I’ve been following the free agent market very closely this year and the thing that sticks out to me the most is the Javier Lopez/Matt Capps/LaTroy Hawkins/Jon Rauch type signings. Those four pitchers might combine for a total of 1.5 wins next year, but they’ll be getting paid 15.5 million dollars.

    It seems like teams want to pay more to know what they’re getting (established veteran) instead of bringing up some young AAA pitcher at the league minimum.

    Comment by CSJ — December 20, 2011 @ 2:56 pm
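
For scale, the arithmetic in comment 7 can be written out in a few lines of Python. The function name is made up for illustration; the $15.5 million and 1.5 wins figures are the commenter's own estimates, not verified totals.

```python
def dollars_per_win(total_salary: float, projected_wins: float) -> float:
    """Return the implied price of one win for a group of signings."""
    if projected_wins <= 0:
        raise ValueError("projected_wins must be positive")
    return total_salary / projected_wins


# Figures quoted in the comment: four relievers, $15.5M combined, about 1.5 wins.
cost = dollars_per_win(15_500_000, 1.5)
print(f"${cost / 1e6:.1f}M per win")  # prints "$10.3M per win"
```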

  8. For all of Neal Huntington’s faults, the poor value (WAR related) in relievers is something he figured out several years ago. He has consistently built solid bullpens on the cheap during his tenure. So, not every GM overpays for relievers. Ours overpays for over the hill veteran position players.

    Comment by PiratesHurdles — December 20, 2011 @ 3:06 pm

  9. I always love it when a reliever blows a lead and then his team bails him out the next inning and he ends up with the win despite a line that reads something like 1 IP, 1 H, 2 BB, 1 HBP, 3 ER.

    Random coworker “Did you see Gagne blow that game last night?”

    Me “I missed the game but when I looked at the box score it said he was the winning pitcher.”

    Comment by grandbranyan — December 20, 2011 @ 3:23 pm

  10. Who fits that bill under the Huntington regime? He’s been a lot smarter about that. Sure he likes collecting middle infielders but he’s never even done that to the extent the NL West teams seem to be doing this year.

    Comment by Los — December 20, 2011 @ 3:37 pm

  11. Correct me if I’m wrong, but since leverage will never be constant across all players, the correlation between WPA and WAR will always be limited, which I believe is one of the underlying premises of the article.

    Comment by James — December 20, 2011 @ 3:53 pm

  12. I think if Jack had used WAA instead of WAR, the scatterplots would look more linear than they currently do.

    Comment by Colin Wyers — December 20, 2011 @ 3:54 pm

  13. I might have missed it, but did this analysis separate closers from RPs? It is really only in a statistical world that Robertson is more valuable to the NYY than Rivera.

    Or take one of the most ridiculous stats in the history of statistics. Valverde came in for a save 49 times. The Tigers won all 49 games. And it wasn’t luck: he pitched great. A Whip of 1.189, OPSa of .580. 5th in C.Y. voting.

    His WAR? 1.0.

    Phil Coke, with his 3-9 record, and proud owner of a 4.47 ERA, and a 1.454 Whip, apparently was twice as valuable as Valverde with his 2.0 WAR.

    My statistics teacher told me that the most important thing he was going to teach was to step back from the model and see if it makes sense from an intuitive perspective.

    Don’t take this personally, but if you continue to use a model that says Coke is twice as valuable as Valverde, my teacher would’ve failed you, expelled you from the class, and had you barred from Yankee Stadium.

    Comment by Joey B — December 20, 2011 @ 3:57 pm

  14. agreed, though i’m not sure that invalidates the concept–i doubt teams are looking at WPA directly, but rather at stats that correlate with WPA better than WAR. what i took from this is that their high-leverage usage plays some role in how late inning pitchers are valued, and WPA beats WAR at explaining this because WPA takes this into account where WAR does not. I also think that there is a good amount of truth to the fact that relievers are overvalued regardless and GMs don’t like being embarrassed by numerous blown saves, which might be an additional factor that increases randomness in this case and makes correlation with any reasonable stat more difficult to find.

    Comment by juan pierres mustache — December 20, 2011 @ 4:00 pm

  15. WAR is a counting stat, phil coke having pitched in 30 more innings at a comparable FIP would do this. Baseball ref numbers would be different

    Comment by Endeav — December 20, 2011 @ 4:15 pm

  16. well, first off, WAR is cumulative and coke pitched 108 innings to valverde’s 77. second off, WAR uses FIP, not ERA, and according to FIP valverde and coke were not that far off (which, perhaps, is what you really would take issue with). if you look at valverde’s HR/FB, LOB%, and BABIP, it’s pretty easy to see why a) he had a very strong year results-wise and b) likely will not repeat it.

    i agree that there is a strong argument to be made that valverde was better last year (and, for the record, there aren’t many people on FG that think WAR=precise amount of goodness). I disagree with the statement “And it wasn’t luck: he pitched great. A Whip of 1.189, OPSa of .580. 5th in C.Y. voting.”. the first stat is middle of the pack, the second is pretty good, the third is irrelevant, and to say that any reliever who goes 49/49 in saves is not lucky is probably incorrect, and to say that having watched valverde “save” games is certainly incorrect.

    Comment by juan pierres mustache — December 20, 2011 @ 4:16 pm
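
Comments 15 and 16 both rest on WAR being cumulative. A minimal sketch of that point, assuming a simplified model in which wins equal runs saved relative to a replacement-level pitcher, divided by a runs-per-win constant. This is not FanGraphs' actual WAR formula. The replacement level (5.00), the runs-per-win value (10.0), and the shared FIP (3.60) are hypothetical placeholders, not either pitcher's real numbers; only the inning totals come from comment 16.

```python
def simple_pitcher_wins(fip, innings, replacement_fip=5.00, runs_per_win=10.0):
    """Wins above replacement under a deliberately simplified model.

    All defaults are hypothetical placeholders chosen for illustration.
    """
    runs_saved = (replacement_fip - fip) / 9.0 * innings
    return runs_saved / runs_per_win


# Same assumed run-prevention rate, different workloads (108 vs 77 innings).
more_innings = simple_pitcher_wins(fip=3.60, innings=108)
fewer_innings = simple_pitcher_wins(fip=3.60, innings=77)

print(f"{more_innings:.2f} vs {fewer_innings:.2f}")
print(f"ratio = {more_innings / fewer_innings:.2f}")
```

With an identical rate, the gap in this toy model is purely the ratio of innings, which is the sense in which WAR behaves like a counting stat.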

  17. My favorite work on FanGraphs or any saber-slanted site comes when writers attempt to attack some of the unchallenged mores of the community, or merely look at them from a different perspective.

    This is a very interesting take, and I’m looking forward to seeing it fleshed out. Great job.

    Comment by Eddie Oropesa — December 20, 2011 @ 4:27 pm

  18. It’s not strictly a question about correlation (and there are non-linear measures of correlation, like the Kendall tau rank or the Spearman’s rank). It’s a question of the linearity of the relationship – the best fit line between WAR and WPA isn’t going to be linear because of the different scales involved.

    Comment by Colin Wyers — December 20, 2011 @ 4:28 pm
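
The distinction comment 18 draws between correlation and linearity can be illustrated with a small standard-library sketch. The data below are made up (the second series is just the cube of the first), and the helper names are hypothetical. Spearman's coefficient is computed here as the Pearson coefficient of the ranks.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation: measures how close the points are to a straight line."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Made-up data: y always rises with x, but not along a straight line.
war = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
wpa = [w ** 3 for w in war]

print(f"Pearson  = {pearson(war, wpa):.3f}")
print(f"Spearman = {spearman(war, wpa):.3f}")
```

A perfectly monotonic but curved relationship gets a rank correlation of 1 while the linear coefficient falls short of 1, so a high rank correlation says nothing about whether a straight line is the right fit.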

  19. What do you think the relationship is then? I suppose you could argue some higher degree polynomial as WPA seems to rise faster than WAR does.

    If you are instead talking about how it looks like a blob, then I am afraid you are confusing variance and linearity. The relationship may not be perfectly linear, but it is very close. However, there is a good amount of variance, as WPA and WAR clearly are not perfectly correlated.

    Also, as a note to the author, when your graphs are that blobby, it is a good idea to look at some other characterizations, such as a series of boxplots. There is a lot of information potentially hidden in that mess.

    Comment by guesswork — December 20, 2011 @ 4:38 pm

  20. Concerning that last point, I decided to go ahead and plot it for WPA/WAR in general for 2011 (no position designation). Click on my name to see the results. In this case, we didn’t learn a lot but it does reinforce the non-linear argument.

    Comment by guesswork — December 20, 2011 @ 5:25 pm

  21. Haha, you’re using wins, WHIP, and Saves, you’re on the wrong site

    Comment by BoSoxFan — December 20, 2011 @ 5:49 pm

  22. I don’t think this will be right. Leverage pushes relievers’ WPA higher than their WAR, and the players have no control over it. You could throw a random starter in the bullpen and his WPA would probably be higher than as a starter because relievers have a much higher leverage and a much lower ERA.

    Comment by BoSoxFan — December 20, 2011 @ 5:54 pm

  23. I hate WPA. Not the stat itself, but I hate how it’s used. People like to use it as a player rating stat, which it simply is not. The problem is that WPA weights high leverage situations way too heavily. A single is a single, a double is a double; why should a player have any control over when he gets them? Simple answer: He can’t. WPA tells you exactly what it should measure: how much the team improved its chance to win when the player came to the plate. It also heavily inflates relief pitchers. They have lower ERAs and higher leverage, and, like I said, it overvalues leverage way too much. On top of that, I did some year-to-year correlations and it looks like pitchers don’t really pitch better in high leverage situations. None of them.

    Comment by BoSoxFan — December 20, 2011 @ 6:03 pm

  24. Position players are given credit for playing multiple positions (If a player splits time between 1B and LF, he gets more credit than just a 1B). If a relief pitcher pitches most his innings in the 8th — shouldn’t those innings get a higher leverage index, and be valued more?

    “Closers” typically come in the 9th, bases empty, clean frame — it’s not like they vulture up gaudy WPA totals like a LOOGy would if he got one out with the bases loaded. Closers come closest to actually earning their WPA for that reason. (No, I’m not Ruben Amaro Jr.)

    Comment by Dekker — December 20, 2011 @ 6:34 pm

  25. Yes, but the teams do control when a reliever has a high leverage situation. Sure, a hitter has to get lucky to get a high leverage situation in the first place, but typically a reliever can be placed into one very easily.

    That said, that’s merely what I understand the argument for WPA to be. I personally dislike the stat as some players are in high leverage situations more often due to luck, as you mention. Things like leverage index, which (I believe) attempt to correct for this (regardless of how often he was in situation A, how well did he do in that situation? Situation B? etc), present a better picture, at least in my mind.

    Comment by guesswork — December 20, 2011 @ 7:45 pm

  26. Leverage explains RP pay, plain and simple. And WPA measures that leverage better, which is why it better corresponds to reliever pay, at least for relievers that will be employed in high leverage situations. It probably won’t apply much for non high leverage relievers who are there just to eat innings.

    Comment by Colin — December 20, 2011 @ 9:41 pm

  27. This is a great start to answering a question that needs to be answered.

    I think the primary problem with using WAR for evaluating relief pitchers is the assumption that they operate in the same context as starters, or pitchers as a whole.

    FIP was developed looking at how much a strikeout, walk, or home run results in runs scored against. Some math was done and run values were effectively assigned to each of those events. While not directly, it’s implicitly a linear weight type of approach.

    And that approach assumes a given context, or run scoring environment. I think the problem with using FIP based WAR for relievers is that the run scoring environment or context for quality relief pitchers is quite different than that for starters.

    Starters, and the group of relief pitchers as a whole, pitch an equal amount of above and below average leverage innings. That is a truism.

    But quality relief pitchers pitch primarily high leverage innings. That means that, in their context, the value of a Walk, Strikeout, or Home Run is different.

    That means that the formula for FIP should be different, and as a result the WAR should be different.

    That is what the divergent relationship for reliever WAR/WPA shows. The bigger question is how to fix it. Relievers as a whole can’t simply be treated differently because many poor relievers pitch lots of below average leverage innings. If you look at the average leverage for relievers as a group, the result will be… average (duh).

    What one would have to do is evaluate relievers as a function of their skill, or perhaps the leverage that they have been asked to pitch in as a proxy for skill. The run scoring environment for good relievers is going to be higher, so a given performance helps the team win more games than the same performance in low leverage innings (duh). However, unlike batters and starting pitchers, this is not randomly distributed. Relief pitchers are selected to pitch in certain leverage situations by their managers. Managers do have significant control over the leverage of the situation in which relievers apply their skills.

    Good relievers help their teams win games more than WAR will suggest. Their performance is NOT context neutral, and the team/manager have direct control over what context they choose to use them in. This is a unique situation in the baseball world and needs to be evaluated accordingly.

    Plain and simple, 100 innings of relief @ 2.5 FIP will result in more won games than 100 innings as a starter @ the same rate, but will be shown as the same WAR. In the case of relief, the manager gets to choose when and where those high quality innings get used. The starters are randomly distributed.

    Figuring out how to use leverage index as part of the WAR calculation for relievers would be a good start.

    Coincidentally, the same principle would probably apply to pinch hitters.

    Drilling down even further, the run environment is going to differ with position in the batting order for hitters, so a wOBA of .350 in the 4 spot is going to result in a different number of wins than the same wOBA in the 8 hole, to some extent.

    Comment by FairweatherFan — December 20, 2011 @ 10:08 pm
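
One way to picture the suggestion in comment 27 is to scale a pitcher's context-neutral wins by the average leverage of the innings he actually threw. A minimal sketch under that assumption; the function names, the straight multiplication, and every number below are hypothetical, and none of it is presented as how FanGraphs computes WAR.

```python
def average_leverage(appearances):
    """Innings-weighted mean leverage index over (innings, leverage) pairs."""
    total_innings = sum(ip for ip, _ in appearances)
    if total_innings <= 0:
        raise ValueError("need at least some innings pitched")
    return sum(ip * li for ip, li in appearances) / total_innings


def leverage_scaled_wins(context_neutral_wins, appearances):
    """Scale context-neutral wins by the pitcher's average leverage (assumed model)."""
    return context_neutral_wins * average_leverage(appearances)


# Hypothetical usage patterns: (innings, leverage index) per outing.
setup_man = [(1.0, 1.9), (1.0, 1.5), (1.0, 2.0)]
mop_up = [(2.0, 0.3), (3.0, 0.2)]

# The same context-neutral performance is worth more in the high-leverage role.
print(f"setup man: {leverage_scaled_wins(1.0, setup_man):.2f}")
print(f"mop-up:    {leverage_scaled_wins(1.0, mop_up):.2f}")
```

Because the manager picks who gets which pairs, the average is not forced back toward 1.0 for an individual reliever the way it is for relievers as a group.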

  28. “WAR is a counting stat, phil coke having pitched in 30 more innings at a comparable FIP would do this. Baseball ref numbers would be different”

    Which is why you can’t use these derived statistics. If you’re telling me it’s because of the 30 extra IPs, actually it’s 36, then those 36 extra innings resulted in 66 extra hits, 6 extra walks, and 36 extra ERs.

    There is no possible logical way to explain that Coke was twice as valuable as Valverde.

    Comment by Joey B — December 20, 2011 @ 10:13 pm

  29. Check out first base.

    Comment by The Real Neal — December 21, 2011 @ 2:23 am

  30. Though I agree with you in general, the “logic” being applied by Fangraphs is that all batted balls, which don’t leave the ballpark are the same, therefore the difference between two pitchers who K the same number of batters per inning (not per plate appearance) and walk the same number of batters per inning (also, not per plate appearance) is negligible if they give up the same amount of home runs.

    Hopefully they’ll grow up at some point and at least start using SIERA as their basis for WAR.

    Comment by The Real Neal — December 21, 2011 @ 3:18 am
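
For reference, the point in comment 30 follows from the shape of the FIP formula, which only looks at home runs, walks, hit batters, strikeouts, and innings. A short sketch using the commonly published form of the formula; the additive constant changes by season, so the 3.10 default is a placeholder assumption, and the stat line is invented.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching in its commonly published form.

    The constant is season-specific; 3.10 is only a placeholder here.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    weighted = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted / innings + constant


# Invented stat line. Hits allowed on balls in play never enter the formula,
# so two pitchers with this line get the same FIP however many singles they allow.
example = fip(home_runs=5, walks=20, hit_by_pitch=2, strikeouts=70, innings=70)
print(f"FIP = {example:.2f}")
```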

  31. No kidding, NL West!

    Comment by Newcomer — December 21, 2011 @ 3:31 am

  32. WPA certainly shouldn’t be a “rating” stat. It’s merely an indication of how a player’s performance in a given year has impacted his team’s wins/losses and heavily weighted by what situations they were put in. Relievers are necessarily going to have bigger swings in WPA simply because they tend to get used in high leverage situations. That isn’t because they are talented in any particular way, just goes with the territory of being a reliever.

    Comment by Eric — December 21, 2011 @ 6:34 am

  33. The difference between WHIP and Wins/saves is that WHIP actually measures a quality of pitcher skill, whereas wins and saves do not. Nothing wrong with using it, even on a site that typically employs more advanced metrics of analysis.

    Comment by Snark — December 21, 2011 @ 7:08 am

  34. Great article!

    Two items to consider:

    1) Usage. For the three groups in the study:

    – Position players – generally bat against the full spectrum of pitchers, except for platoon usage.
    – Starting pitcher – generally face the full spectrum of batters, except for lineups skewed slightly to take advantage of platoon splits.
    – Relief pitchers – three splits:
        – Low leverage long/middle RP – face full spectrum
        – LOOGYs – face mostly lefties, skewing stats in their favor
        – Closers – will either face the meat of the lineup, or will face pinch-hitters instead of bottom-of-lineup guys. As a result, usage skews stats against closers.

    2) Value over Worst Pitcher (VOWP??). Because roster spots are limited, the value of any reliever added to the staff can actually be evaluated relative to the “worst” reliever in the bullpen. For instance, if your worst reliever had a -0.5 WAR, signing a 1.0 WAR veteran will actually improve the bullpen by 1.5 wins. Teams with a lack of pitching depth, or frail starting pitchers, will pay a premium to get middle relievers they consider reliable.

    Comment by tz — December 21, 2011 @ 11:07 am
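
The "value over worst pitcher" idea in comment 34 is simple enough to write down directly. A minimal sketch, assuming the new signing takes the roster spot of the weakest current reliever; the function name and the other bullpen values are hypothetical, while the -0.5 and 1.0 figures are the commenter's own example.

```python
def bullpen_upgrade(current_wars, new_reliever_war):
    """Wins gained when a new reliever replaces the worst current one.

    A negative result means the signing would be a downgrade.
    """
    if not current_wars:
        raise ValueError("bullpen must have at least one reliever")
    return new_reliever_war - min(current_wars)


# The commenter's example: worst reliever at -0.5 WAR, new veteran at 1.0 WAR.
bullpen = [1.2, 0.6, 0.3, -0.5]
print(bullpen_upgrade(bullpen, 1.0))  # prints 1.5
```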

  35. Pretty smart article for a kid from Wisconsin.

    Actually, pretty smart article for a kid from anywhere.

    Nice job, and thanks for sharing.

    Comment by ValueArb — December 21, 2011 @ 11:31 am

  36. @ Snark —

    “WHIP actually measures a quality of pitcher skill,”

    but it does so poorly. BBs are a measure of pitcher skill. Giving up hits is to some degree but hits are, as you doubtless know, in some measure a reflection of homers allowed, defense, and luck.

    Comment by chuckb — December 21, 2011 @ 11:50 am

  37. Good article. I have always been skeptical of fangraphs WAR for relief pitchers. I really don’t look at B-Ref’s WAR enough to know how the results compare to fangraphs. So, I’m glad you will be looking at it next. When I evaluate relief pitchers I like to compare the net difference between meltdowns and shutdowns, which are WPA stats. It’s not the only thing to evaluate, and it’s not a rate stat, which makes it more difficult to use. But I do think that the net of meltdowns and shutdowns provides information on the consistency of relievers. And I think teams pay a premium for consistency in late inning relievers.

    Comment by CJ — December 21, 2011 @ 12:07 pm
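
The net shutdown/meltdown comparison described in comment 37 can be sketched as a simple count over per-appearance WPA values. The 0.06 cutoff below reflects the usual definition of the two stats as far as I know, but treat it as an assumption; the appearance data are invented.

```python
def net_shutdowns(appearance_wpas, threshold=0.06):
    """Return (shutdowns, meltdowns, net) for a list of per-appearance WPA values.

    threshold is the assumed cutoff: an outing at or above +threshold counts as
    a shutdown, at or below -threshold as a meltdown.
    """
    shutdowns = sum(1 for w in appearance_wpas if w >= threshold)
    meltdowns = sum(1 for w in appearance_wpas if w <= -threshold)
    return shutdowns, meltdowns, shutdowns - meltdowns


# Invented WPA values for seven relief appearances.
outings = [0.12, 0.08, -0.31, 0.02, 0.15, -0.07, 0.06]
print(net_shutdowns(outings))  # prints (4, 2, 2)
```

As the commenter notes, this is a counting measure, so two relievers with very different workloads are not directly comparable on it.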

  38. Now if only managers and GMs could actually use their top relievers in the most high leverage situations……

    Comment by Eric W. — December 21, 2011 @ 1:37 pm

  39. Embiggen is not a word.

    Comment by Lisa the Iconoclast — December 21, 2011 @ 2:55 pm

  40. It’s a perfectly cromulent word.

    Comment by Eric W. — December 21, 2011 @ 5:42 pm

  41. The problem with WAR is that it treats a contribution in a 1-0 game the same as in a 10-0 game. Elite relief pitchers, particularly elite closers, on the other hand have what we might call “dense WAR”. Their value can be directed to those situations where a fractional WAR contribution can be the difference between victory and loss (and on the other hand their efforts won’t be wasted in blowouts in either direction).

    Comment by Blue — December 22, 2011 @ 1:03 am

  42. I agree completely. WAR works better when all events are equal. A HR in the 1st inning, 4th inning, 7th inning, are all kind of similar. An IP for a starter is similar to most other IPs.

    Where it unravels is that an IP by a starter is rewarded because it is better than an IP from the replacement level player. Lackey received a high WAR last year for putting up a lot of IPs. There is an intuitive logic that rewards a mediocre talent that puts up 200 IPs rather than a great, but injured SP that puts up 140 IPs. The logic is that the 60 IP difference will be filled in by a minor leaguer with a bad ERA.

    But you can’t penalize an RP in the same manner for only throwing 70 IPs because he is only supposed to throw 70 IPs. The RS are truly penalized by Buch missing all that time, because he is replaced by bad pitching. The RS aren’t penalized by Paps only pitching 70 IPs.

    To take it to the final conclusion, the worst guy in the BP might well pitch more IPs than the best guy in the BP because he does mopup work that you wouldn’t waste on your more valuable RPs. So if he pitches mediocre in 120 mopup innings, being the worst guy in the BP could conceivably give you the best WAR for no other reason than being bad.

    Comment by Joey B — December 22, 2011 @ 12:58 pm
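
The replacement-innings reasoning in comment 42 can be made concrete with a rough sketch: innings a pitcher does not throw, up to the workload his role calls for, are charged at a replacement pitcher's rate. The function names and every ERA and inning total below are hypothetical, so the output illustrates the mechanics only and says nothing about any real pitcher.

```python
def runs_allowed(era, innings):
    """Earned runs implied by an ERA over a given number of innings."""
    return era * innings / 9.0


def role_runs(era, innings, target_innings, replacement_era):
    """Runs allowed across a role's full workload.

    Innings short of target_innings are filled at replacement_era.
    """
    missing = max(0.0, target_innings - innings)
    return runs_allowed(era, innings) + runs_allowed(replacement_era, missing)


# Hypothetical starters, both measured against a 200-inning workload.
durable = role_runs(era=4.50, innings=200, target_innings=200, replacement_era=5.50)
injured = role_runs(era=3.00, innings=140, target_innings=200, replacement_era=5.50)

# Hypothetical closer: 70 innings is the whole job, so nothing is back-filled.
closer = role_runs(era=2.50, innings=70, target_innings=70, replacement_era=5.50)

print(f"durable starter: {durable:.1f} runs")
print(f"injured starter: {injured:.1f} runs")
print(f"closer:          {closer:.1f} runs")
```

The choice of `target_innings` is the whole argument: measured against a 70-inning role the closer is missing nothing, while the injured starter's shortfall has to be covered by someone worse.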
