On the Closer Position: The Save and RP Usage
One of the most interesting aspects of roster construction in today’s major league baseball is the bullpen, and how it revolves around the closer. The closer position has reached mythical status in today’s MLB, exemplified by Mariano Rivera. Since 1996, the game for the Yankees has been to find a way to lead after eight innings, and then to turn the ball over to the undisputed best one-inning pitcher in the history of the game.
Rivera may rank behind Trevor Hoffman in terms of career saves, but Mo’s 14-year span of dominance is unprecedented. And yet, he only ranks 76th on Sean Smith’s list of pitchers by WAR. Hoffman is all the way down at number 209. For me, the idea that a role with such seemingly low value can be held in such high regard is a curiosity worth exploring.
Today, we look at how the position of the closer has evolved since the inception of the save, the statistic which will be forever linked with the closer. The save was introduced in 1969, but the idea of the one-inning closer which we are so familiar with did not immediately catch on. Goose Gossage, for instance, is specifically noted as having the ability to earn a multiple-inning save with regularity. In today’s game, on the other hand, it is an event when a closer is called upon to make a two-inning save. Let’s take a look at the average innings per game finished for those pitchers with 30 saves or more since 1969. Games is used instead of saves to account for blown saves as well as games entered that weren’t save situations.

Two things jump out right away. First, the sheer number of 30-save guys ballooned in the 90s and the new millennium. Second, as we already knew, for the most part, “closers” pitched many more innings in the early parts of what we can call the “Save Era.” The correlation between IP per game and year is high, with R^2 = .56. We especially see this decline around 1986, when the average IP/G for these players drops from 1.51 to 1.32. Tom Henke’s 34-save season in 1992, in which he pitched 57 games and 55 2/3 innings, was the first 30+ save season with less than 1 IP/G.
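The IP/G figure used here is just innings divided by appearances, with the one wrinkle that baseball innings are recorded in thirds. A minimal Python sketch of that arithmetic, using only the Henke line quoted above (the function names are made up for illustration):

```python
from fractions import Fraction


def innings_to_fraction(whole: int, thirds: int = 0) -> Fraction:
    """Convert innings written as 'whole and thirds/3' into an exact fraction."""
    if not 0 <= thirds <= 2:
        raise ValueError("thirds must be 0, 1, or 2")
    return Fraction(whole) + Fraction(thirds, 3)


def ip_per_game(whole: int, thirds: int, games: int) -> float:
    """Average innings pitched per appearance."""
    return float(innings_to_fraction(whole, thirds) / games)


# Tom Henke, 1992: 55 2/3 innings over 57 games (figures from the text).
henke_1992 = ip_per_game(55, 2, 57)
print(round(henke_1992, 3))  # 0.977
```

Using exact thirds avoids the common mistake of reading "55.2" in a box score as a decimal.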
Things have been relatively constant since the strike of 1994. From 1995-2008, the average IP/G for 30 save closers ranged from 1.03-1.07, with only two pitchers (Danny Graves in 2002 and Ryan Dempster in 2005) going over 1.25. The role of the closer has now been quite well-defined, and the Goose Gossage style of pitcher is dead.
Here, we can see the undeniable effect that the save has had on the game of baseball. The way teams build rosters is different. The way managers attack game strategy is altered. The market for relief pitchers has changed. Between these and other changes, we’ve seen one simple statistic dramatically effect the way the game is played.
The worst statistic ever invented.
Yup.
Unless you are Mo Rivera, you are basically overrated as a closer. It should be mentioned that even Mo came up as a Starter and was moved to the BP to better use his dominance of one pitch.
yea, Mariano was a failed starter who became a great reliever.
So was Joe Nathan.
Two best relievers of our era were both guys who would’ve washed out of MLB if they weren’t moved to the pen.
Yep. Small sample size alarms should go off but Nathan had a 5.95 FIP in 1999 and 5.65 FIP in 2000 when he was used primarily as a starter.
Well here was his last year in AAA Fresno in 2002:
6-12, 5.60 ERA, 31 G, 25 GS, 117 K, 74 BB, 20 HR, 1.647 WHIP, 1.58 K/BB
Rivera was better in the minors, but his K/9 in the higher levels (AA and AAA) was 5.0 (!).
I don’t believe it’s accurate to characterize Rivera as a failed starter. Rivera started ten games in the majors in 1995, his first year in the big leagues. In one of those starts, Rivera tossed eight shutout innings and struck-out eleven batters.
Rivera was transplanted to the bullpen after the Yankees acquired David Cone. Rivera shined in his playoff role in the ’95 ALDS, where he pitched 5.33 innings, allowing no runs, three hits, and one walk, with eight strikeouts.
Going into ’96, the Yankees had David Cone, Kenny Rogers, Jimmy Key, Andy Pettitte, and Dwight Gooden, among others, in the mix in the rotation. Rivera began the year as a long reliever, and was so dominant that he quickly became the Yankees’ primary set-up man to John Wetteland.
This formula worked, and the Yankees won the WS, after which closer John Wetteland departed for free agency. Mariano Rivera was the natural choice to fill the vacated closer’s role.
The rest, as they say, is history.
I don’t think it’s fair to say Rivera failed as a starter, as he was never given the chance to start. In 1996, the Yankees had starting pitching, and needed help in the bullpen. Rivera filled that need. In 1997, the Yankees were still fairly thick in the starting rotation, but needed a closer. Rivera filled that role.
We’ll never know if Rivera could have succeeded as a starter. Chances are, however, that he could never have equaled the dominance he has achieved as a reliever in the starter’s role.
I don’t think Nathan’s move to the pen was due to his pitching struggles so much as his inability to stay healthy.
Also, if memory serves, this hyper-specialization at the altar of the Save hasn’t meant that late-inning leads are much safer than they were 30 years ago, which is of course the apparent goal of the strategy in the first place.
It’s like going for it on 4th down at this point. Sure, the stats say it’s the right thing to do, but most coaches are afraid of the consequences to failing.
And if they aren’t, and it doesn’t work, well, remember how Belichick got treated by the media?
But Belichick legitimately doesn’t care what the media says, and will make the decisions based on what he feels is best. That’s what makes him a great coach (even though I can’t stand him).
Look at what the Red Sox did a few years back. They said before the season that they wouldn’t rely on one pitcher to close games, because they would use their best reliever when they felt it was appropriate. The media went nuts with this during the season labeling it “closer-by-committee” (which it really wasn’t) and eventually the Red Sox admitted defeat and went to a traditional closer role. Their thinking was sound, but they didn’t execute it well as they had a poor bullpen. If they had strong relievers and used them as they had intended it could have been a breakthrough to stop the idiocy of managing the game to a statistic rather than to best leverage your options.
Yep, this actually just came up in another post (the Moneyball article).
The problem with baseball v. football, though, is that football is something much more easily controlled by strategy. The difference between the best and worst NFL teams is huge. In MLB? Not so much. In 2008, the difference from the best to the worst was 41 wins. The volatility of a single baseball game is ridiculous, yet a failing can become a rallying point immediately, while actual success can take a season or more to justify.
Perfect example: 36 players registered 10 or more saves in 2009. Of those 36 players, 15% of their save opportunities (177 of 1181) were blown. That is huge, yet no one in the MSM seems to care (at least none of those who used that one failing by the Red Sox bullpen).
aweb, I was just going to post asking about this. Has anyone done a study that explores whether or not leads are safer in the closer era versus in the pre-closer era? I would love to see it if so; it really is the crux of the anti-specialization argument.
http://www.hardballtimes.com/main/article/the-closer-and-the-damage-done/
This article covers some of the ground, but I can’t find anyone who directly answered the question – are ninth inning leads safer now than they were 80, 60, 40, 20 years ago? And if they are, has the trade-off been to throw away more leads earlier in the game waiting to use the best reliever in the ninth?
I imagine someone has done this work, somewhere, but my google-fu is not strong enough today.
Whilst I do not doubt that closers are overrated, I do not believe that they are AS overrated as the metrics believe them to be. There is, no doubt, a non-quantifiable effect that plays into the save, as well as the advantages played out by this specialisation as the game has evolved offensively.
@aweb I’m not doubting what you said at all, but I would love to see statistics on that if you have them.
Would it make more sense to look at relief pitchers in terms of WPA and related stats instead of WAR?
This is my thought as well. Given the much higher leverage situations they’re in, using WAR may not be the best approach to valuation here. Of course, if managers were smart, and used them other than just the 9th inning in a save situation, their leverage would be even higher.
I’ve seen a lot of people back that idea of using WPA instead of WAR to evaluate a reliever’s contributions to his team. There is a lot of debate over that.
Well if you believe in the late-game mentality (which I sort of do, I mean, there’s a lot more pressure to get the job done in the 9th than the 2nd), then WPA does work. And your top 5 in reliever WPA are usually pretty awesome pitchers.
I still think K/BB is the best way to judge a reliever year in and year out. The ones that stick are the ones that strike men out and don’t give up freebies.
K/BB kind of breaks down at the extremes, though. Is a 4K, 1BB/9 guy the same as David Robertson, who issues something like 4-5BB/9, but gets half (!) his outs via strikeouts? It’s never kept, but I’ve always felt that K-BB/9 was a little better, sort of like a +/- for non-true outcomes (since we know HR aren’t entirely True Outcomes for pitchers).
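To make that comparison concrete, here is a rough Python sketch of the two measures side by side. The control pitcher's line comes straight from the comment above. The second line is an assumption built from the Robertson description: half of 27 outs by strikeout works out to 13.5 K/9, and 4.5 BB/9 is simply the midpoint of "4-5". The function names are illustrative.

```python
def k_per_bb(k9: float, bb9: float) -> float:
    """Strikeout-to-walk ratio."""
    return k9 / bb9


def k_minus_bb_per_9(k9: float, bb9: float) -> float:
    """Strikeouts minus walks per nine innings."""
    return k9 - bb9


control_guy = (4.0, 1.0)   # 4 K/9, 1 BB/9, from the comment above
power_guy = (13.5, 4.5)    # assumed: half of 27 outs by K, midpoint of 4-5 BB/9

print(k_per_bb(*control_guy), k_per_bb(*power_guy))                  # 4.0 3.0
print(k_minus_bb_per_9(*control_guy), k_minus_bb_per_9(*power_guy))  # 3.0 9.0
```

The ratio prefers the control pitcher while the difference prefers the strikeout pitcher, which is the disagreement at the extremes described above.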
That’s really two extreme ends of the spectrum, in my opinion, and a guy that only throws 4 K/9 out of the pen usually doesn’t make it to the majors in the first place.
Fair, though, since ability to get strikeouts IS more important for a reliever than not giving up walks. But I’d take an 8 K, 4 K/BB guy over a 12 K, 3 K/BB guy any day. Here’s why (assume 1 HR/9 for both):
Guy A, 19 outs recorded in the field, .300 BABIP, 8.14 HIP/9, 9.14 H/9, 1.238 WHIP.
Guy B, 15 outs recorded in the field, .300 BABIP, 6.43 HIP/9, 7.43 H/9, 1.270 WHIP.
So obviously K/BB overstates the difference a bit, but the middle of the road stuff guy w/ control looks to be a bit better than the guy w/ great stuff but iffy control.
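The arithmetic behind those two lines can be reproduced with a short Python sketch. It assumes, as the numbers above imply, that every one of the 27 outs is either a strikeout or a ball in play, that home runs are left out of BABIP, and that both pitchers allow 1 HR/9. The function name and dictionary keys are made up for illustration.

```python
def projected_line(k9: float, k_per_bb: float,
                   babip: float = 0.300, hr9: float = 1.0) -> dict:
    """Per-nine-innings hits and WHIP implied by K/9, K/BB, BABIP and HR/9."""
    bb9 = k9 / k_per_bb
    outs_in_field = 27 - k9
    # BABIP = hits in play / (hits in play + outs in field), solved for hits.
    hits_in_play = outs_in_field * babip / (1 - babip)
    hits = hits_in_play + hr9
    return {
        "bb_per_9": bb9,
        "hits_in_play_per_9": hits_in_play,
        "hits_per_9": hits,
        "whip": (hits + bb9) / 9,
    }


guy_a = projected_line(k9=8, k_per_bb=4)   # 8 K/9, 2 BB/9
guy_b = projected_line(k9=12, k_per_bb=3)  # 12 K/9, 4 BB/9

print(round(guy_a["whip"], 3), round(guy_b["whip"], 3))  # 1.238 1.27
```

The gap is about three hundredths of a baserunner per inning, so the 4-to-3 edge in K/BB does overstate how far apart the two pitchers are.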
Sean Smith’s WAR (the version cited in the article) does incorporate leverage index for relievers, so that’s not really an issue with Sean’s WAR numbers for Rivera.
“incorporate” can mean essentially anything in a complex statistic.
Is LI 1% of WAR for closers or 50%? What does “incorporate” mean in terms of the actual statistic?
Also, I think Sean Smith is the guy that wrote an article stating that WPA is a better measure than WAR for looking at closer’s performance.
There are BIG issues with WAR and closers. The difference between an effective closer and an average one is greater than 1 “win” (a real win, not a calculated win), but WAR only represents small differences between various relievers. It’s not, IMO, a good stat for relievers.
He probably did write an article on that, which is probably why he now incorporates LI into his version of WAR to correct that problem with non-leveraged WAR for closers.
Typically, a reliever’s production has the effective weight over a replacement-level pitcher of about halfway between his actual LI and 1. I believe this is what Sean does, or else something that ends up close to this general rule, which is really a shortcut for dealing with chaining. Sky Kalkman wrote an article on chaining the effects of LI on relievers early in 2009 if you’re interested in reading more on why the full leverage index isn’t used:
http://www.beyondtheboxscore.com/2009/4/29/856308/bullpen-chaining-and-reliever-war
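As a rough illustration of that halfway rule, and not Sean's actual formula, the shortcut can be written in a few lines of Python. The leverage index and the unleveraged win total below are hypothetical inputs chosen only to show the mechanics, and the function names are invented.

```python
def chained_leverage(actual_li: float) -> float:
    """Effective leverage: halfway between the pitcher's actual LI and 1."""
    return (actual_li + 1.0) / 2.0


def leveraged_wins(unleveraged_wins: float, actual_li: float) -> float:
    """Scale unleveraged wins above replacement by the chained leverage."""
    return unleveraged_wins * chained_leverage(actual_li)


# Hypothetical closer: LI of 2.0 and 1.0 unleveraged wins above replacement.
print(chained_leverage(2.0))     # 1.5
print(leveraged_wins(1.0, 2.0))  # 1.5
```

Crediting the full LI would double the hypothetical pitcher's value. The halfway figure reflects the chaining idea that if he were lost, the next-best reliever would move up into those innings rather than a replacement-level arm.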
Because Sean accounts for the chaining effect of replacing a closer’s leverage, his WAR is a more accurate reflection of how leverage affects value than just straight WPA. The other advantage WAR has over WPA is that WAR has a replacement level factored in while WPA doesn’t. The two disadvantages of WPA (that it doesn’t account for chaining and that it doesn’t add in replacement level) are in opposite directions, though (one overvalues the pitcher while the other undervalues him), so WPA can work out pretty well when those effects cancel each other out, and that makes WPA by itself more accurate for closers than non-leveraged WAR. In fact, if I’m remembering correctly, Mariano’s WPA is pretty close to what Sean has for him in WAR (for some reason, his page isn’t showing WPA properly for me right now, so I can’t check this). I don’t know how common it is for the two values to be so close, though.
The difference between a top closer and an average closer is way more than a win in Sean’s WAR. You seem to be thinking of unleveraged WAR with this criticism, which Sean’s version is not. On average this decade, the difference between the top full-time reliever and the 15th best full-time reliever (the average top performing pitcher in a given team’s bullpen) in Sean’s database in a given year is about 1.9 WAR. The average difference between the top reliever and the 30th best reliever (i.e. the minimum spread of value among the top relievers on each team, assuming each team has just one of the top 30 relievers) is 2.5 WAR. That’s not limiting to just closers, and many of those relievers aren’t closers. The average difference between the best and the 15th best pitchers to finish 25+ games in a given year (mostly actual closers, not just the highest valued relievers, though some teams still have multiple qualifying pitchers) is 2.4 WAR. So an elite closer is worth well more than a win more than an average closer in a given year according to Sean’s WAR.
Nevermind on Mariano’s WPA not showing; somehow I was on his batting page instead of his pitching page. His career WPA has him at 48.9 wins, and Sean has him at 49.9 wins.
I’m fairly new to WAR and WPA, so I did not realize that there were “various” versions of WAR. So far, I have only gone by the WAR listed on this site … and it says Ryan Franklin was only 1 WAR different in 08 and 09, where in 08 he was basically a replacement level closer (and one of the worst in one of the worst bullpens), and in 09 he was MUCH better (more than 1 “real win”) for certain, and it was a huge difference for StL.
I’ll have to look and see how Sean rated Franklin’s WAR 08v09.
Thanks for the info.
Sean’s WAR for Ryan Franklin compared to FG’s WAR:
Sean 08: 0.6
FG 08: -0.4
Sean 09: 2.6
FG 09: 0.9
I like Sean’s WAR MUCH better, for closers. I understand the chaining aspect, but IMO it still undervalues closers a bit (but likely not much, and I have no solution that can be applied to all pitchers in all situations).
Sorry if this is a rehashing, but are there statistics that compare the stats (all of them, ERA, WAR, BS, etc…) of the same relievers in both the closer role and non-closer reliever role?
I ask because I remember Phil Coke (I think it was him) talking about trying to pitch the 9th inning. He said it was way more stressful and a totally different animal. Maybe the statistics could shed some light on the psychological effect of pitching the 9th.
Of course, there are a slew of problems you face with pinch hitters, pinch runners, etc…, being more prevalent in the 9th, but I’d still be curious.
How did the save statistic come to be defined the way it is today? Why is a 3 run lead considered a save opportunity and not just a 1 or 2 run lead? Even untrustworthy relievers will save a 3 run lead 95% of the time.
Jerome Holtzman – a beat writer (at the time) – is credited with inventing it. Here he is discussing it: http://findarticles.com/p/articles/mi_m0FCI/is_5_61/ai_84542687/
“The ERA wasn’t a good index because many of the runs scored off a reliever are charged to the previous pitcher; the reliever’s ERA should be at least one run less than a starter. The W-L record was equally meaningless; the reliever, particularly the closer, is supposed to protect a lead, not win the game.
For example, Elroy Face of the Pittsburgh Pirates was 18-1 in 1959, still the one-season record for the most victories by a reliever. Face was immediately acclaimed as the best bullpen artist in all baseball history.
I knew better. He was much more effective the year before when he was 5-2.
In 10 of his 18 victories, Face coughed up the tying or lead run but got the win because the Pirates had a strong hitting team and rallied for the victory while he was the pitcher of record.”
Face:
1958: 5-2, 2.89 ERA, 84 IP, 2.14 K/BB, 134 ERA+, 1.179 WHIP
1959: 18-1, 2.70 ERA, 93 1/3 IP, 2.76 K/BB, 143 ERA+, 1.243 WHIP.
So even the grounds for inventing the save were misguided. Whoopsie.
Rivera is an outlier in this. The guy is a freak and has the best ERA in history.
Last year, I crunched some numbers and found that while the elite RP core does positively impact one’s fantasy pitching statistics, the inferior RP core simultaneously NEGLIGIBLY impacts the pitching line — while the superior RP core lowers ERA well below the projected “what it takes to win” threshold (while also lowering WHIP a sizeable chunk), the inferior RP core only increased ERA by .06 and WHIP by .01.
More detail here:
http://gameofinches.blogspot.com/2009/03/saves_17.html
Yeah closing is made out as “clutch” and since they do not log many innings, WPA makes sense. Saves is such a ridiculous stat. 3 run leads and 1 run leads grouped together sway numbers when comparing.
If a guy comes in with no outs in the 7th with a one run lead, gets 8 straight outs, and then gives up an infield single, gets an error, and then another flukey hit, he gets saddled with a Blown Save.
If a guy comes in with 2 outs in the 9th, with a 3 run lead, proceeds to give up two straight bombs, then walk the bases loaded, and then gives up a line drive only for his CF to rob a triple in the corner, he gets a Save.
Awesome statistic.
“If a guy comes in with 2 outs in the 9th, with a 3 run lead, proceeds to give up two straight bombs, then walk the bases loaded, and then gives up a line drive only for his CF to rob a triple in the corner, he gets a Save.”
No he doesn’t.
Moot is correct. If the reliever comes in with no-outs in the ninth, retires the first two and THEN follows your script, he does however get a save.
My mistake.
Corrected version is bad enough, though.
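For anyone trying to keep the conditions straight, here is a rough Python sketch of the save rule as it is usually summarized: the pitcher finishes a game his team wins, is not the winning pitcher, and either (a) enters with a lead of three runs or fewer and pitches at least one inning, (b) enters with the tying run on base, at bat, or on deck, or (c) pitches at least three innings. This is a simplified reading rather than the official scoring text, and the class and field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Appearance:
    lead_on_entry: int      # runs ahead when the reliever enters
    runners_on_entry: int   # baserunners when he enters
    outs_recorded: int      # outs he records
    finished_game: bool = True
    team_won: bool = True
    is_winning_pitcher: bool = False


def earns_save(a: Appearance) -> bool:
    """Simplified save check; see the assumptions in the lead-in."""
    if not (a.finished_game and a.team_won) or a.is_winning_pitcher:
        return False
    if a.lead_on_entry < 1 or a.outs_recorded < 1:
        return False
    small_lead_full_inning = a.lead_on_entry <= 3 and a.outs_recorded >= 3
    # Tying run on base, at bat, or on deck: lead is at most runners + 2.
    tying_run_close = a.lead_on_entry <= a.runners_on_entry + 2
    three_innings = a.outs_recorded >= 9
    return small_lead_full_inning or tying_run_close or three_innings


# Enters with 2 outs in the 9th, 3-run lead, bases empty, gets the last out.
two_out_version = Appearance(lead_on_entry=3, runners_on_entry=0, outs_recorded=1)
# Enters with no outs in the 9th, 3-run lead, eventually gets all three outs.
full_inning_version = Appearance(lead_on_entry=3, runners_on_entry=0, outs_recorded=3)

print(earns_save(two_out_version))      # False
print(earns_save(full_inning_version))  # True
```

Only the game state on entry and the outs recorded matter here. The runs allowed along the way never appear in the check, as long as the lead survives.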
One issue in the evaluation of closers is that the spread of reliever talents is a lot narrower than the possible value of those relievers in any given season.
Mariano might only be 4 wins better than a replacement reliever, on average, but in a year where he only allows runs when they don’t matter and some scrub reliever gives up his runs when they matter the most, that could cause a ten win gap in their performances. That bad reliever’s performance really did cost the team a crapload of wins, but headed into next season, you would go back to expecting that four-win gap.
In other words, there’s a lot of flukiness in the timing of events for all positions, but because of leverage, flukiness probably has the biggest impact on the value of relievers. And when a mediocre closer has a flukily bad year by value, we tend to include that in our evaluation of his ability, too, when we shouldn’t.
I’ve always loved the argument that closers have this special trait that allows them to pitch in the “high pressure” 9th. Why is it that only failed starters have this trait? And why is it that when successful starters (John Smoltz) need to pitch in relief for one reason or another, they have no problem doing it even though they don’t have the special trait?
But Glen! Clearly they had that trait all along, we just didn’t know about it!
Aren’t all relief pitchers failed starters? I assume that if a pitcher rising through the amateur and pro ranks can succeed as a starter, he will start. If he fails, he will be tried as a reliever before being shown the door.
You clearly haven’t followed the saga of Joba Chamberlain, owner of a career 4.18 ERA and 8.4 K/9 as a starter, all at the tender ages of 22 and 23.
I forgot to add: I was a relief pitcher in Little League.
I don’t really worry about save rules, but a guy could come in in the 6th with no outs and a 30-0 lead, give up 10 runs over the final 4 innings and still get the save.
Wes Littleton, anyone?
Yeah, that happened recently when the Rangers beat the Orioles 30-3 and a guy got a save for pitching the last three innings. That must have been intense pressure.
Hey, that kind of save might have more weight, because it “saves” the bullpen.
To state that we’ve seen “one simple statistic dramatically effect [sic] the way the game is played” is a gross oversimplification. Many things have affected the way the game is played: five-man rotations, free agency, agents, escalating salaries, statistical analysis, specialty relievers, training and exercise regimens, television rights, etc…
To state that the “simple Save statistic” is the primary reason the game has evolved the way it has is to ignore so many other factors in baseball’s evolution.
The Save stat should be abolished. I would like to see the best relievers in the game pitch 2+ innings more often. If starters pitch 32 starts/200 innings and one inning relievers average 60/70 appearances/70 innings, why can’t you physically train pitchers to pitch 40-50 times/100-120 innings a year, coming in when you really have a jam and going 2+ innings in tight ballgames?
Also need to get rid of the DH!
They don’t need to abolish it…just get rid of the 3 run lead save…that’s ridiculous.
It could stand to be tweaked a bit but not eliminated.
I’m so tired of this “closers are just failed starters” type of comment, as if it were an insult or marginalizes what the closer does well. Really, that sounds like something people say if they’ve never played baseball at a decently high level. Just sayin’.
First, closers and starters are two different roles and require different “stuff”. It’s like a pit bull and a greyhound. The closers that didn’t make it as starters could have ended up there because they [1] lack the endurance to go 5-7 IP, [2] have one dominant pitch, but not 2 or 3 good ones, [3] have the mindset to come out and just “throw their balls off” versus pacing themselves over 5+ IP, and/or [4] are more effective pitching in short duration more frequently versus longer duration less frequently. Actually, this is the exact type of thing that screams “Joba is a reliever”, but that’s a whole ‘nother discussion.
Rivera’s cutter, Hoffman’s change-up, Sutter’s splitter, Nathan’s slider, and basically every closer’s “heat” is not something they can throw 50% of the time 2 or 3 times through the lineup and be as successful as they are doing it for 1 IP. It really is that simple.
For example, I started and closed in college (especially on spring trips), and the roles are different in mindset and approach. As a starter, I had to pace myself and not reveal *everything* in the 1st 2 IP, then as the lineup came around again, I’d show some more, a curve and change (using just FB and CUT in the 1st 2 IP), etc. As a closer, I could come in with just throwing my best pitch, a curveball, as much as I wanted to … especially to LHBs. Completely different scenarios. The pressure is also much more dramatic in the last inning in a close game. How could it NOT be? You give up the lead and you either have 0 or 1 chance to get it back. Like, duh.
An article in THT’s 2010 Annual also addresses a question asked in this thread …
The LI has increased in late game situations over the years.
The article did not say why, but I would suggest that it’s due to not having 4 good starters go deep into games and then having the ace reliever come in and pitch no more than 1 time through the lineup (1 or 2 IP). We get 2-3 “other” relievers, which really are the 9th through 11th best pitchers on the team. Anyone who follows a team with bullpen struggles knows that the leads are often either significantly reduced or even lost in these innings. It’s a combination of 5 starters, not going deep enough into games, and the closer being relegated to the 9th. So, we get 2-3 IP of guys that *might* not be MLB quality. Certainly not quality enough to be among the 5 starters or the closer.
Those pitchers also pitched in eras where the 6-9 hitters were not “run producing threats” in general, and you’d basically have to give up 2-5 hits to give up a couple of runs. Not so today. Not that bottom of the order guys are beasts now, but they’re not the .270-7-60 guys of decades past.
I do definitely think there is merit for having the relief ace pitch in the 8th inning if that’s where the 1-run lead, runner on 2nd situation comes into play. In the 7th inning? Naw, that’s like going for it on 4th down late in the 3rd quarter … there’s still time to get the lead back in the 8th and 9th innings.
I don’t want to hear any more about the defense of Zimmerman, Polanco, Utley, and Zobrist until it is prefaced with “even though they are a failed shortstop …” … see? Pretty dumb isn’t it?
If you simply eliminate the ‘regular season total save stat’ for evaluating purposes it helps a great deal. You have many other more meaningful stats to look at in regular and post season with late inning relievers. The total save stat is a marketing vehicle to jazz up fans, sell seats, and help out a team’s reliever if possible. The stat is helped if you never use a certain reliever more than one inning, then it helps if your team is well ahead in the division, or well out of it, so a manager can manipulate cookie save stats. Other teams always know there is a chance Rivera will come in before the 9th inning. And that he hasn’t been sitting on the couch much in October or sometimes even November. If a team is in a pennant race they don’t have the luxury of sculpting a regular season total save stat cookie for him. Goose Gossage is not an issue except that he continually whines about the save stat. The Yankees are not the ones who ever made a big deal about the save stat. Some in the media have, as have other teams who have used it as a marketing angle. The save stat isn’t the problem. If people stopped talking about it and focused on other stats, it could help matters.
For those of you who don’t see why making an out in the 9th inning is usually viewed as more important than an out in the 7th inning, do you also think a penalty kick blocked by the goalie in the 70th minute is as much a game-changer as one blocked in the 90th minute?
You’re talking to (mostly) Americans. Not sure I’d use a soccer reference (grin).
Translation: “do you also think a lead-changing FG blocked by the lineman at the start of the 4th quarter is as much a game-changer as one blocked after the 2-minute warning (end of the 4th quarter)?”
——————————————
The Q & A is obvious, and we could choose numerous analogies to illustrate the point. The obstacle is not realization of the importance of the situations, but that conceding the situation would be an indirect admission of wrongness *gasp*.
Any reliever could likely go out and save 70% of the games. The elite relievers are closer to 90%. The difference between the 70% and 90% is likely the difference between a team being successful and disappointing. While “closing” seems to be up and down, highly unstable or unpredictable, etc … I do think it is undervalued.
People also say a lot of goofy stuff about “closers” or hand-wave off “attributes” closers must have, and I don’t know why because the commentary of anyone who has ever pitched in the 9th with the game on the line is vastly different than what many fans say.
So, whose opinion to take seriously, the person that’s been in the situation or a person who is about as far from it as one can possibly get?
When players that have closed and ‘not closed’ talk about the big difference in pressure involved and/or the amount of concentration required, or the small margin of error, why don’t fans listen? I don’t get it.
I’m not even sure it has anything to do with having pitched. If you’ve watched a game where you have a horribly unreliable closer it’s really not a fun experience. I’m a Yankees fan and as soon as I saw Brian Fuentes come out I knew we were going to win it. Same thing with Brad Lidge. And I knew that Ryan Franklin wasn’t holding their lead over the Dodgers, Matt Holliday error or not. When someone like Mariano or Hoffman comes in it’s game over.
My friends who are Cubs fans were completely miserable during every 9th inning.
That said, I still think Saves are a terrible stat. It doesn’t mean that it has zero value though.
To keep with the football analogies, would you rather your favorite teams have an unreliable kicker or an unreliable closer? I know which one I’d choose…