I’m in a tough position with Smith, because I don’t think he necessarily deserves to be in, but he is also more deserving than people who are already in or will most likely make it in. Personally, I’d put him in, but that’s mainly because he was always a favorite of mine, which is usually enough to push someone over the edge for me.
The reliever who belongs in the Hall of Fame, beyond any doubt, is Hoyt Wilhelm, and he was duly elected in 1985. He was an excellent starter for 3 seasons in the middle of his career, but he pitched as many or nearly as many innings in the years when he was a reliever.
His 2254 IP dwarf the totals of the relievers discussed in this article. His 3.06 FIP (2.54 ERA) must be as good as or better than those guys’ marks, though I haven’t checked.
He was also a good hitter for a reliever (.121 wOBA in 493 PA’s).
Roy Face, a great reliever, is arguably HOF quality for his career, but should definitely have some kind of honorable mention plaque for his 1959 season: 2.70 FIP, 2.60 ERA, 18-1 W-L. I know pitcher wins and losses don’t normally mean much, but 18-1!
Poz did a great bit on the Hall, in which it was found that the miserly BBWAA led the Veterans Committee to overcompensate with crap … especially at the catcher position, which is remarkable since it is so under-represented. I hope that they do not think they have to overcompensate with crap just because someone unworthy is in and because the position does not have much representation. That said, Wagner was awesome!
Just look at that Rivera cumulative WAR graph – maybe it’s just because I have a small screen on my netbook, but it looks almost perfectly linear. I’d call him inhuman, but there would probably be more variation between seasons if he actually WAS a machine than there is.
Fingers got in because he was the all-time saves leader and the only man with 300+ saves at the time. Add in many postseason appearances, and he was voted in, but more as a matter of luck, given that he pitched when saves first became “important” and was used accordingly. Had he come up 10-15 years later, he would not have made the Hall and would have fallen into the second tier of all-time closers.
It seems like it should be much, much easier to maintain top performance if you know that you are only ever warming up once, and then pitching one inning at the most. Giving modern closers credit for a more effective usage pattern that they have nothing to do with seems out of line, which is why I would be fine with a “just Rivera” reliever selection from here on out. The managers/GMs are the ones who should get the credit, such as it is.
I don’t think an analogous situation is even possible for hitters – there doesn’t appear to be an in-game usage pattern that allows hitters a dramatic boost in their performance relative to what you would see in full-time action (platoon advantage is about it).
On those additional facts – how do Fingers and Gossage end up facing so few lefties compared to the other guys? Was it the pinch-hitting rate during their era?
Sutter may not have made it to the Hall on statistics alone, but you can be sure that future voters will be looking at the statistics when they vote. If Hoffman and Wagner stay on the ballot for multiple years, we are talking about an examination that will be done in 10-15 years, 40 years after Sutter’s prime. Gossage getting in the Hall shortly after Sutter also speaks to voters making comparisons between similar players.
Comment by craigjedwards — January 26, 2012 @ 12:20 pm
Re: facing fewer lefties
I think Fingers and Gossage faced fewer lefties for the reason you stated regarding pinch hitting being less popular, but it was also not as feasible if a pitcher was going to pitch a couple innings. If a pitcher pitched two innings and let one runner on, he faced seven batters. There are only so many lefties you can stock your roster with, and the pitchers are more likely to face a team’s heart of the order where hitters would not likely be taken out of the game. Smith pitched in both eras, and his PAs against lefties got bigger in the mid to late eighties. I suspect Rivera and Hoffman’s aren’t as big because the cutter and changeup, respectively, neutralized lefties and did not provide as much of an advantage for hitters.
Comment by craigjedwards — January 26, 2012 @ 12:35 pm
I think Billy Wagner belongs in the Hall of Fame. When you look at pure nastiness, Mariano is his only peer. Had he pitched ~100 more innings, he would have qualified for the ERA+ and K/9 titles. His ERA+ was 187 (2nd all-time) and his 11.90 K/9 would easily be the best mark in baseball history.
It would just be a shame to me to keep someone out of the Hall when he was clearly this historically great. Had he played two or more additional years, he might have had 500+ saves, and his WAR total would be surpassed only by Rivera and possibly Gossage. The guy just belongs.
Lou Whitaker’s career WAR (74) more than doubles all of these closers’ WAR…except Rivera (39). Even adjusting for some of WAR’s deficiencies, it upsets me how famous a closer can become despite not impacting his team’s performance that much.
Comment by OzzieGuillen — February 18, 2012 @ 10:39 pm
Well, you’re only looking at an FIP-based fWAR. You know that Rivera has significantly outperformed his FIP for his entire career, right? That 39 number is a complete joke.
If we’re comparing relief pitchers over an entire career, I think I prefer WPA to WAR. This is because relievers (especially “ace” relievers or closers) tend to rack up small numbers of innings, but they are highly leveraged innings that tend to have disproportional influence on the outcome of a game.
Being highly effective for 1-2 highly leveraged innings is the hallmark of modern reliever effectiveness. WAR doesn’t care at all about leverage; it just cares about quality (performance, specifically FIP) and quantity (innings).
If you look at career WPA, Hoffman (32.98) and Gossage (31.40) soundly defeat Lee Smith (23.97). Rivera trounces them both (54.70).
Sutter sits at 19.61 career WPA. I think that this stat, combined with the WAR data you have presented, makes for a pretty compelling case that perhaps he does not belong in the HOF. But in terms of Lee Smith’s consideration, I think it’s a big mistake when you are thinking about the HOF to always be comparing a player to the *worst* current HOFer at his position. If you don’t think player A deserved to be in the HOF, you shouldn’t vote for player B just because player A wound up getting in.
OzzieGuillen (above) is upset about closers getting famous when they don’t affect a team’s performance that much. But again, this statement relies on WAR, which discounts the leverage of the situation. WAR considers team performance in the “Pythagorean Win” sense, that is, as a function of total runs scored and total runs allowed. And of course, since a reliever only pitches a small number of innings, his influence over a team’s total runs allowed over the course of a long season is going to be small. As well, the difference in WAR between the best and worst relievers in baseball will not be anywhere near as striking as the difference between the best and worst shortstops or starting pitchers.
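To put rough numbers on the “Pythagorean” point, here is a minimal sketch using the classic exponent-2 form of Bill James’s Pythagorean expectation. The team run totals and the 10-run swing credited to a reliever are made-up illustrative values, not figures from this thread, and actual WAR implementations do not necessarily use this exact formula.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Classic Pythagorean expectation: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored: float, runs_allowed: float,
                  games: int = 162) -> float:
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed)


# Hypothetical team: scores 700 and allows 700 runs in a season.
baseline = expected_wins(700, 700)

# Same team, but an ace reliever (hypothetically) saves 10 runs over the year.
with_ace = expected_wins(700, 690)

# The gain in expected wins from those 10 runs is small in this framework.
wins_gained = with_ace - baseline
print(f"baseline: {baseline:.1f} wins, with ace: {with_ace:.1f} wins, "
      f"gain: {wins_gained:.2f}")
```

Under these assumed totals the ten runs are worth a bit more than one expected win, which is the sense in which a low-inning reliever can only move a run-based metric so far.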
But on the other hand, a properly-utilized ace reliever will be injected into games in the most highly-leveraged situations, where the difference between allowing 0 and 1 runs can be the difference between winning or losing. I don’t think it’s naive, archaic, or sabermetrically-challenged for me to make the claim that superior performance in these situations provides value not captured by WAR.
Since a reliever’s perceived value is almost entirely tied up in performance in highly-leveraged situations, if you use a metric that is intentionally agnostic to this context, of course it’s not going to capture the reliever’s full value.
I encourage you to at least consider WPA when evaluating relievers. I’m pretty sure WPA only takes into account a position player’s offense, so take this with a grain of salt, but Lou Whitaker’s career WPA of 28.01 now seems like a saner comparison with the likes of Goose Gossage (31.40) and Trevor Hoffman (32.98).
Just to make another important point, statistics don’t need to be the absolute only criterion one uses when considering a player for the HOF. There will always be a long list of players that are on the very fuzzy borderline (statistically) for enshrinement. But what about the less quantifiable aspects of a player’s career that ultimately leave a lasting positive influence on the game and its fans? Breaking the color barrier (Jackie Robinson), being an iconic player defining baseball in a city for 20 years (Willie Stargell), giving inspiring and dominant postseason performances (here’s a more recent example, in Curt Schilling), being a race-transcending star on and off the field (Roberto Clemente)…
Lee Smith may be a marginal HOF case statistically, as many players are, for his regular season performance. But if I’m on the fence, I’d rather give the nod to a player like Curt Schilling or Chipper Jones on the basis of what they’ve truly meant to the game of baseball over the last couple of decades. I’m sure Lee Smith had his moments, but he’s no icon in any of the 8 cities he played in, and he never left his mark in the postseason.
You make several very good points, especially in relation to the highly leveraged nature of relievers. I also agree that looking at the lowest level of Hall of Famer for comparison is generally not a good idea, although with Hoffman, Smith, and Jones, we are dealing with non-HOFers. A couple of points, though.
1. WAR, I believe, does take leverage index into account for relievers. Those relievers do get credit for pitching in more important situations.
2. WPA is a good tool, but it doesn’t tell you how the outs are made or how the runs are given up. WAR takes the most important factors: Ks, BBs, and homers. WPA is not aware of, nor does it consider, park factors or defense that the pitcher has no control over.
1. I don’t believe WAR takes leverage index into account for relievers. If you can show me a link from fangraphs where it says WAR is calculated that way, please share it. I would be surprised if it did, since one of the theoretical bases of WAR is that performance across situations in terms of leverage is random and therefore not indicative of true value.
2. WAR takes into account ks, bbs, and homers, but it doesn’t take *everything* meaningful into account. It makes assumptions that pitchers aren’t able to systematically induce weak contact (leading to lower BABIP) and that “clutchness” doesn’t exist. It doesn’t account for the pitcher’s own defensive abilities, nor his ability to keep runners from advancing on the basepaths via steal or wild pitch. You’ll notice that Mariano Rivera, who has had all of these fine qualities, routinely outperforms his FIP, and therefore is undervalued by WAR.
You can make the argument that ks, bbs, and homers are important because they tend to correlate well year-to-year, while other stats don’t. Therefore, as tools for *projection* people have had success with them.
That doesn’t mean that when a player’s career is finished you aren’t allowed to consider anything else, especially considering that a lot of that pure luck is going to even out over a 15-year career. WPA is more of a descriptive statistic than a projective statistic, but when we’re looking back at a player’s career we’re trying to describe, not project. You’re allowed to entertain the possibility that a player outperforming his K, BB, and HR rates might not have just gotten lucky for 15 years, and maybe to give that player credit for at least some of the other variance contributing to his performance.
Since it seems at least one person is listening, I’m going to use this opportunity to make another case for WPA, in the context of how one votes for MVP. A good way to define MVP, in my opinion, is: What player has contributed most to the success of his team?
As in the case of rating HOFers, we want to *describe* what happened in a given season, rather than infer the player’s underlying skill. We give credit for what a player did, just like we award the World Series trophy to the team in the series that gets the most wins, not the most Pythagorean wins.
Going with WAR, I think, can really undermine the spirit of the MVP. Consider two players: Player A is an absolute terror at the plate, but only when the game is entirely out of reach in either direction. In situations where good hitting can actually affect the outcome of the game, he strikes out virtually every time. Player B strikes out virtually every time in meaningless situations, but is an absolute terror when it counts. Let’s say both these players finish with identical stats. WAR will say both these players are equally valuable, but it is evident that Player B has contributed much more to the success of his team.
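The Player A / Player B thought experiment can be sketched with a toy calculation. This is not real WPA, which is built from win-expectancy tables. It is only a leverage-weighted proxy, and every number in it (the value of a hit, the value of an out, the leverage indexes, the 100 plate appearances) is invented for illustration.

```python
# Hypothetical context-neutral values for a hit and an out.
HIT_VALUE = 0.5
OUT_VALUE = -0.25


def context_neutral_value(plate_appearances):
    """Sum outcome values while ignoring when they happened."""
    return sum(HIT_VALUE if hit else OUT_VALUE
               for _leverage, hit in plate_appearances)


def leverage_weighted_value(plate_appearances):
    """Sum outcome values scaled by the leverage of each situation."""
    return sum(leverage * (HIT_VALUE if hit else OUT_VALUE)
               for leverage, hit in plate_appearances)


# Each plate appearance is (leverage_index, got_a_hit).
# Player A: hits only in blowouts (low leverage), makes outs when it matters.
player_a = [(0.2, True)] * 50 + [(2.0, False)] * 50
# Player B: makes outs in blowouts, hits when it matters.
player_b = [(0.2, False)] * 50 + [(2.0, True)] * 50

for name, pas in (("A", player_a), ("B", player_b)):
    print(f"Player {name}: "
          f"context-neutral {context_neutral_value(pas):.1f}, "
          f"leverage-weighted {leverage_weighted_value(pas):.1f}")
```

Both players get the same context-neutral total because their stat lines are identical, while the leverage-weighted totals split sharply in Player B’s favor.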
WPA tends to be highly correlated with WAR—you don’t have to be superhumanly clutch to have a very high WPA—but I think it adds something highly desirable.
I believe that is still accurate. All implementations of the framework are going to be better at some things than others. I’m not going to argue that Rivera is underrated according to his WAR. He is an outlier. The rest of the relievers cited don’t have those same issues. Some stats may even out over the course of a career, but if a pitcher pitches half of all his innings in a pitcher’s park, that will not even out. If there is evidence that pitchers have control over the type of contact they induce and that they can sustain it over the course of a career, I have not seen it. There are going to be outliers going both ways over the course of a career, with great pitchers on either side. I feel more comfortable relying on ks, bbs, and homers not just because they are predictive, but also because they are descriptive of the events that the pitcher has the most control over. How much control a pitcher has over the type of contact I’m not sure about, but I’m very confident when there is no contact.
I went to the link. That honestly surprises me that they would even give “half credit” for leverage, as it seems (to me) to go generally against the whole spirit of WAR.
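For anyone curious what “half credit” for leverage could look like in practice, here is a minimal sketch. It assumes the adjustment regresses a reliever’s average leverage index halfway toward the league-average value of 1.0, which is my reading of the idea and not something spelled out in this thread. The function names and the sample numbers are hypothetical.

```python
def half_credit_leverage_multiplier(avg_leverage_index: float) -> float:
    """Assumed form: average the reliever's leverage index with 1.0."""
    return (1.0 + avg_leverage_index) / 2.0


def leverage_adjusted_wins(context_neutral_wins: float,
                           avg_leverage_index: float) -> float:
    """Scale context-neutral wins by the half-credit multiplier."""
    return context_neutral_wins * half_credit_leverage_multiplier(
        avg_leverage_index)


# Hypothetical closer: 1.0 context-neutral wins at an average leverage of 1.8.
closer = leverage_adjusted_wins(1.0, 1.8)
# Hypothetical mop-up man: the same 1.0 wins at an average leverage of 0.6.
mop_up = leverage_adjusted_wins(1.0, 0.6)
print(f"closer: {closer:.2f}, mop-up: {mop_up:.2f}")
```

Under this assumed form the closer is bumped up and the mop-up man is bumped down, but neither by the full amount the raw leverage would suggest.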
Surely you don’t really mean there is no evidence that pitchers have control over the type of contact they induce? GB/FB rates certainly have some stability, even over the course of a career. For example, Hoffman was consistently a flyball pitcher on balls in play.
A pitcher’s park won’t even out over the course of a career (a big park would certainly have played to Hoffman’s strengths). Nor will the contextual effects of league and era (In this case, though their careers overlapped, Smith played in a weaker offensive era than Hoffman, no?). But I was careful to say that “some” things would even out—namely the pure luck aspects.
Basically, my goal here is for neither of us to throw out the baby with the bathwater. Because WAR takes into account things like ballpark effects, it should not be dismissed. Then again, if you’ve got the time (say, you’re voting for the HOF), why even limit yourself to such a coarse summary statistic, right?
Anyway, I hope we both got something to mull over from this exchange.
Was doing some stat perusal and, while we’re on the topic of modern big name relief pitchers, there is one such closer from the 1990s/2000s who actually is 2nd *all-time* in lowest career BABIP against (with at least 300 IP). Any guesses?
Troy Percival (astonishingly) held opponents to a .230 BABIP over 700+ innings of relief. Considering his great (fortune?/skill?) on balls in play, and that he additionally struck out guys at a pretty good clip, and that he didn’t give up an insane amount of walks/homers (though a fair amount of each), one would think he’d have managed an ERA even better than his career 3.17 mark.
This is a great article, bummed I didn’t see it until now.
Basically, after Rivera, there are no real sure things among modern-era relievers for the HoF (Eckersley and Smoltz are hybrids).
I really like shutdowns and meltdowns, especially shutdown/meltdown ratio, but that seems to favor more current relievers and I’m not sure why. Perhaps the longer outings made relievers like Gossage and Fingers less effective?
Looking at the stats, Bruce Sutter is basically Francisco Rodriguez. Not that there’s anything wrong with that, but there’s no way K-Rod gets in the Hall. I say forget the past mistakes and keep ’em all out except Rivera.
WAR has to be the single worst stat in the world to use for closers, bar none. It is totally useless. For example, Valverde was a perfect 49-49 last year, with a 2.24 ERA. But he had a negligible 1.0 WAR. Benoit, who had a good season, but not great, had a 1.3 WAR. Coke, who started and relieved, was maybe average, at best, with a 4.49, and he had a 2.0 WAR.
So in Coke’s case, you are basically rewarding him for throwing an extra 36.1 innings, in far lower leverage situations, and allowing an extra 36 ERs in those extra 36.1 IPs.
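Taking the figures in this comment at face value (an extra 36.1 innings and an extra 36 earned runs, which I have not checked against the actual season lines), the ERA on just those extra innings can be worked out as below. The helper names are my own, and the conversion assumes the usual box-score notation in which “.1” and “.2” mean one third and two thirds of an inning.

```python
def innings_to_float(ip_notation: str) -> float:
    """Convert box-score innings ('36.1' = 36 and 1/3) to a real number."""
    whole, _, thirds = ip_notation.partition(".")
    outs = int(thirds) if thirds else 0
    if outs not in (0, 1, 2):
        raise ValueError(f"invalid innings notation: {ip_notation!r}")
    return int(whole) + outs / 3


def era(earned_runs: float, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


# Figures as quoted in the comment above.
extra_innings = innings_to_float("36.1")
extra_earned_runs = 36

extra_era = era(extra_earned_runs, extra_innings)
print(f"ERA over the extra innings: {extra_era:.2f}")
```

On those quoted numbers the extra innings carry an ERA of nearly 9, which is the substance of the complaint about crediting them.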
The key thing I learned in statistics classes 30 years ago was that, if your model tells you that Coke is a much better pitcher than a guy that saves 49-49, then throw the model away.
The guy that went 49 for 49 in saves walked more than 4 batters per nine innings. There was a lot of luck involved. Coke made 14 starts. Those 36 extra innings are not meaningless. In fact, Coke pitched 50 percent more innings than Valverde. If a hitter had 162 at bats in the ninth inning and was above average, that wouldn’t make him more valuable than an average hitter with 250 at bats that were spaced throughout the game. Coke is not being rewarded for giving up more earned runs. He is simply not being punished for a lot of bad luck, while Valverde is not being rewarded for a lot of good luck.
If the model says that Coke is better than Valverde, perhaps it is not the model that should be thrown out, but the conventional wisdom that closers are as valuable as everyone thinks they are. WAR says that Rivera, Gossage, and Smith have been the most valuable closers of the past thirty years. Seems like it is doing something right.
I’ll give you a +1 on Mo, Goose and Smith being the top-3 according to WAR. From that perspective, it works as a relational tool for closers only.
But sometimes it’s good to let democracy work for you. Is there even one person in the entire world of 7B people who thinks Coke was better or more valuable than Valverde last year?
Here is the weakness of WAR for RPs, particularly closers.
An average player who consistently gets 650 PAs is a more valuable (WAR) player than a slightly better player who only gets 500 PAs. That’s because the loss in value from giving the remaining 150 PAs to a replacement player is larger than the gain in value from the 500 PA player being slightly better.
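The playing-time arithmetic here can be sketched as follows. The 20-run gap between an average player and a replacement player per 650 PAs, and the 5-run edge for the “slightly better” player, are hypothetical round numbers chosen only to show the mechanism. They are not taken from any WAR implementation.

```python
# Hypothetical: an average player is 20 runs better than replacement
# over a full season of 650 plate appearances.
REPLACEMENT_GAP_PER_650_PA = 20.0


def runs_above_replacement(plate_appearances: int,
                           runs_above_average_per_650: float) -> float:
    """Value over replacement accrues per plate appearance actually taken."""
    rate_per_pa = (runs_above_average_per_650
                   + REPLACEMENT_GAP_PER_650_PA) / 650
    return plate_appearances * rate_per_pa


# Exactly average player who plays every day.
average_regular = runs_above_replacement(650, 0.0)

# Slightly better player (+5 runs per 650 PAs) who only gets 500 PAs.
better_part_timer = runs_above_replacement(500, 5.0)

print(f"average regular: {average_regular:.1f} runs above replacement")
print(f"better part-timer: {better_part_timer:.1f} runs above replacement")
```

With these assumed rates the everyday average player comes out slightly ahead, because the part-timer’s missing 150 PAs are implicitly handed to a replacement-level bat.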
But the replacement value doesn’t apply to an RP. There is no replacement for Valverde since he pitched the entire season. Therefore, an RP with 100 IPs should not accrue benefits for value over replacement because he pitched 30 more IPs than Valverde.
Thinking about it from a more intuitive perspective:
As a RS fan, I have always appreciated Wakefield. He was generally a nice #4 SP overall. 200-180 with a 4.41. One might even call him an adequate #3. But no one would call him more than an average pitcher.
As a RS fan, I have a duty to hate Mo, but as a BB fan, he’s about the best closer in history, and no one would rank him any lower than maybe #5.
But Wake has a career WAR of 38.6, while Mo has a career WAR of 39.4, virtually identical. Would anyone consider them to be equals?
I agree with your premise regarding Wakefield and Rivera. Rivera has definitely been the better pitcher, but at the same time all those innings have value.
Valverde did pitch the whole year, but consider if he hadn’t. All the relievers move up an inning, and a AAA pitcher then gets 60 innings in the fifth and sixth of blowout games. The lost impact isn’t that great. Giving a replacement player 180 starter innings is going to hurt a team a lot more than losing a closer.
You’ve uncovered something that people who take pitcher WAR too seriously fail to acknowledge: it’s a counting stat and basically an IP contest. The difference in WAR/inning between elite closers (and often elite starters) is phenomenally small. So, yes, whoever accumulates the most innings is going to finish on top. Also, when using fWAR, which doesn’t properly measure the ability of pitchers who can induce weak contact, you have to be a certain type of pitcher to finish on top. (See Greinke, Zack this year.)