WAR and Eating Innings

A WAR Carol

Winter has come, baseball season is over, and Ebenezer finishes his analysis and goes home to his cold bed and DVD of Game 4 of the 2004 ALCS since there are no longer any current games to watch.

The Ghost of Pitcher Wins appears and informs him that he will be visited by the Ghosts of Relievers Past, Present, and Future, who will explain to him the errors of his ways.

Mike Marshall appears. “In 1974 I pitched 208.1 innings in relief during the regular season and 12 more in the post season. I accumulated 4.4 WAR. It would have been about 2 wins higher, but I was penalized for being a relief pitcher. It seems that giving my manager over 200 innings of good pitching becomes less valuable if I do it out of the bullpen when and where he needs it rather than as a starter on a schedule. I’m not alone — in MLB history there have been 393 pitcher seasons with over 100 regular season innings and no starts. I don’t even hold the record for most relief innings in a season. Why must I suffer for being a reliever when I carried a starter’s load?”

Next is the Ghost of Relievers Present. Tony Watson appears. “In 2016 I went 67.2 innings as a lefty with a large platoon split (.049 difference in wOBA between lefties and righties). But because I wasn’t totally hopeless against righties I faced well over twice as many righties as lefties (195 to 77) and ended the season with –0.1 WAR. Had I been a worse pitcher so Clint Hurdle used me less I’d have had a positive WAR. Would any manager have actually preferred a LOOGY who faced fewer batters and was inferior against both righties and lefties? Why am I penalized for being too good to be used only as a LOOGY?”

Next comes a group of six — it’s the Ghost of Relievers Future, and they say, “In the distant future a team attempts starting by committee. They have nine pitchers who typically go 18 batters each on a three-day rotation so as to avoid the third time through the order penalty. We come in in the middle of a game, at an unpredictable time, and do the same job for the same length of time as the starters. Occasionally we’re asked to cover additional outs if an earlier pitcher melts down or is injured so we enter early and go long. The starters have a lower replacement level than we do despite having the easier job with greater certainty about both when they will enter and leave a game. How is this fair?”

They fade from view, and The Ghost of Pitcher Wins reappears and says, “Seriously; the reason relievers have a higher replacement level is because their usage is different than that of a starter in ways that affect their value. But different relievers can have drastically different usage, and that also affects their value. Fix this, Ebenezer, or in the long run WAR for relievers will suffer my fate and be superseded by a better tool for reliever evaluation.”

Why the Problem Exists

Why does the replacement level differ between starters and relievers? That’s easy — replacement level is different because it’s easier to find a reliever with a given xFIP, wOBA, RA/9, ERA, or pretty much any other rate stat than it is to find a starter that good. Starters improve when sent to the bullpen; relief is an easier job, so it has a higher replacement level.

But if that were all there was to it then pretty much everyone would do nothing but have bullpen games. Relievers are better and the goal is to win games. So why employ starters at all, much less pay them lots of money?

I’m going to assume that a team has seven roster spots for relievers and five for starters. I’m going to exclude September and October from this analysis as the limit to a 25-man roster doesn’t apply in those months.

In 2016, prior to September, a starter roster spot averaged 151.8 innings (a true decimal fraction, not the .1/.2 notation for thirds of an inning). A reliever roster spot averaged 60.8 innings. A team averaged 1184.6 innings.

With those utilization numbers a team would need about 20 roster spots of typical relief pitching (1184.6 / 60.8 ≈ 19.5) to get to September. This is not viable. A reliever is less valuable than a starter because eating innings has real value. Getting lots of outs has value beyond simple run prevention: the team needs not only someone to retire one or two lefties a game, but also someone to get through a large number of innings, and most relievers provide far less of this value than a starter does.
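The roster arithmetic above can be checked in a few lines. This is only a sketch built from the 2016 pre-September averages quoted in the text; the variable names are mine, and the five-starter, seven-reliever split is the assumption already stated.

```python
import math

# 2016 pre-September averages quoted in the text (innings as decimal fractions).
STARTER_INNINGS_PER_SPOT = 151.8
RELIEVER_INNINGS_PER_SPOT = 60.8
TEAM_INNINGS = 1184.6

# Assumed roster split from the text: five starters, seven relievers.
actual_staff = 5 * STARTER_INNINGS_PER_SPOT + 7 * RELIEVER_INNINGS_PER_SPOT

# Roster spots needed if every spot produced only typical reliever innings.
all_reliever_spots = TEAM_INNINGS / RELIEVER_INNINGS_PER_SPOT

print(round(actual_staff, 1))         # 1184.6
print(round(all_reliever_spots, 1))   # 19.5
print(math.ceil(all_reliever_spots))  # 20
```

The five-and-seven split reproduces the team total, while an all-reliever staff at typical usage would need roughly 20 spots.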

The problems with reliever WAR in the fable above all come from the fact that we’re using reliever or starter status as a proxy for the ability to eat innings and changing replacement level to reflect this, rather than giving an explicit adjustment for being able to eat innings as a thing of value in its own right and otherwise evaluating pitcher results on a common basis.

Not all starters are equally good at eating innings, not all relievers are equally bad at it, and the ability to eat innings per roster spot used on the pitching staff has value.

When Steve Carlton went 346.1 innings of 11.1 WAR ball in ’72, he not only pitched quite well on the batters he faced, but he also gave his managers a lot of added flexibility by eating far more than his share of the innings. This is a source of substantial value not captured in the current methods. When Mike Marshall ate over 208 innings in relief that was again a source of substantial value not captured in the current methods. Marshall is in fact penalized on the assumption that he is failing to do exactly the thing that he clearly did.


One problem with what I’ve been saying is that the value added depends on innings/roster spot over time, and I don’t have good information about roster usage. Even if I did have good information about exactly how long each pitcher spent on a roster, I don’t want to give a pitcher a negative WAR for being called up and never used. For that matter I also don’t want to have to change the formula in September when roster spots drop in value.

I’m going to use appearances as a proxy for roster-spot usage. Appearances are readily available, and this doesn’t penalize a pitcher just for sitting on the bench. Outs/appearance gives an indication of how good a pitcher is at eating innings, or at least of how good his manager thinks he is. Once a pitcher is in, he typically stays in until the manager has a reason to take him out or the game ends. Closers, who are put in only at the end, see such short appearances because their managers don’t want to use them for longer ones.

Note that this is all extremely preliminary; I’m mostly hoping someone else will come up with a better solution than I have below.

Cut to the Chase

I had a bunch of stuff typed up, and reading it puts me to sleep.

I ended up convincing myself that I wasn’t going to do better than a simple linear approach. I’m using outs/appearance as a proxy for efficiency at eating innings; I split this into two terms.

The proposed replacement formula for pitcher WAR is:

WAR = (Runs above League Average)/(R/W) + (C1 × total outs recorded) – (C2 × total appearances)

That’s familiar enough. The first term is wins above (or below) average, the C1 × total outs recorded term simply adds in a replacement level, and the C2 × total appearances term is a penalty that represents eating a roster spot.
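To make the three terms concrete, here is a minimal sketch of the formula as a function. The function name, argument names, constants, and the two pitchers are all hypothetical and chosen only to show the mechanics; none of them are fitted values.

```python
def proposed_pitcher_war(runs_above_average, runs_per_win, outs, appearances, c1, c2):
    """Wins above average, plus a per-out replacement credit,
    minus a per-appearance roster penalty."""
    wins_above_average = runs_above_average / runs_per_win
    replacement_credit = c1 * outs
    appearance_penalty = c2 * appearances
    return wins_above_average + replacement_credit - appearance_penalty


# Placeholder constants (assumptions for illustration, not derived values).
C1, C2 = 0.005, 0.01

# Two hypothetical league-average pitchers (0 runs above average), same 300 outs.
long_man = proposed_pitcher_war(0.0, 10.0, outs=300, appearances=40, c1=C1, c2=C2)
short_man = proposed_pitcher_war(0.0, 10.0, outs=300, appearances=100, c1=C1, c2=C2)

print(round(long_man, 2), round(short_man, 2))  # 1.1 0.5
```

With identical results on a per-batter basis, the pitcher who needed fewer appearances to record the same outs comes out ahead, which is the whole point of the third term.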

What I’m actually doing is reducing the replacement level and adding a small penalty based on number of appearances. Unless you think there is something magical about being the “starter,” the different replacement levels for starters and relievers already add such a penalty. They simply do so in an ad-hoc way by adjusting the replacement level and assuming that relievers are the ones with short appearances.

The elephant in the room is that relievers and starters record different numbers of appearances over the same amount of time and I’m using appearances as a proxy for roster usage. This is where the math I removed comes in. On June 17, 1915, George Washington “Zip” Zabel came in for 18 1/3 innings in relief in a single game. I don’t think he needed less rest than a starter. Outs/appearance is being used as a stand-in for the ability to eat innings, and rest requirements would also be reasonably modeled as a term dependent on Outs/appearance. I don’t need a separate term for the things already being accounted for.

Let’s run the numbers for 2016. I’m going to assume that the total WAR given to starters and relievers in each league is at least approximately correct, and that all I’m doing is redistributing that WAR slightly.

Player Type    WAR     xFIP    Total Outs    Total Appearances
AL starter     155.3   4.34    41,450        2,428
AL reliever    58.8    3.94    23,383        7,301
AL pitcher     214.1   4.22    64,833        9,729
NL starter     171.6   4.14    40,788        2,428
NL reliever    43.8    4.18    24,298        8,002
NL pitcher     215.4   4.16    65,086        10,430

I don’t have league-specific runs/win handy; 9.778 is the combined value, so I’ll use that. I also don’t have a good way to correct for the fact that some fraction of reliever WAR is due to leverage concerns and won’t apply to the average values I’m using here.

For AL starters: 155.3 = 41,450/27 × (4.22 − 4.34)/0.92/9.778 + 41,450 × C1 − 2,428 × C2

For AL relievers: 58.8 = 23,383/27 × (4.22 − 3.94)/0.92/9.778 + 23,383 × C1 − 7,301 × C2

And it follows that for the AL the value of C1 is 0.004906 and C2 is 0.01135.
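The two AL equations are a 2×2 linear system in C1 and C2, and the values above can be reproduced with a short script. This is a sketch, not the original working: the helper names are mine, `outs` in the first term is read as each group's total outs from the table, and the 0.92 divisor is copied from the equations as given.

```python
def solve_2x2(a1, b1, k1, a2, b2, k2):
    """Solve a1*x + b1*y = k1 and a2*x + b2*y = k2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("equations are not independent")
    x = (k1 * b2 - k2 * b1) / det
    y = (a1 * k2 - a2 * k1) / det
    return x, y


def wins_above_average(outs, league_xfip, group_xfip, scale=0.92, runs_per_win=9.778):
    # First term of each equation: outs/27 × (league − group xFIP) / 0.92 / 9.778.
    return outs / 27 * (league_xfip - group_xfip) / scale / runs_per_win


# 2016 AL figures from the table: (WAR, xFIP, total outs, total appearances).
starters = (155.3, 4.34, 41450, 2428)
relievers = (58.8, 3.94, 23383, 7301)
LEAGUE_XFIP = 4.22

rows = []
for war, xfip, outs, apps in (starters, relievers):
    # Rearranged form: outs*C1 + (-apps)*C2 = WAR - wins above average.
    rows.append((outs, -apps, war - wins_above_average(outs, LEAGUE_XFIP, xfip)))

(a1, b1, k1), (a2, b2, k2) = rows
c1, c2 = solve_2x2(a1, b1, k1, a2, b2, k2)
print(round(c1, 6), round(c2, 5))  # 0.004906 0.01135
```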

The AL C1 value gives a replacement level of 0.1325 below league average per 27 outs (27 × C1), or replacement of 0.3675, slightly less than the .38 currently used for starters. Then the AL C2 value penalizes a pitcher an 88th of a win for each time he comes into a game.

The same calculation for the NL comes out with a C1 of 0.004856 and a C2 of 0.009520, or a replacement of 0.3689 and a penalty of one 105th of a win per appearance.

Call it a replacement level of 1 win less than average per 200 outs recorded and a penalty of 1 win per 100 appearances and you’d be close enough for a first cut.
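As a rough illustration of what the rounded rule does, the sketch below applies it to the 2016 AL group totals from the table. The function name and the "per 27 outs" summary are my own framing, not part of the method.

```python
# Rounded constants from the rule of thumb above.
C1 = 1 / 200  # wins credited per out recorded
C2 = 1 / 100  # wins charged per appearance


def usage_credit(outs, appearances):
    """Replacement credit minus appearance penalty, in wins."""
    return C1 * outs - C2 * appearances


# 2016 AL group totals from the table: (total outs, total appearances).
groups = {"AL starter": (41450, 2428), "AL reliever": (23383, 7301)}

for name, (outs, apps) in groups.items():
    credit = usage_credit(outs, apps)
    per_27_outs = credit / outs * 27  # net credit per nine innings' worth of outs
    print(name, round(credit, 1), round(per_27_outs, 3))
# AL starter 183.0 0.119
# AL reliever 43.9 0.051
```

Under these rounded constants the net credit per 27 outs comes out smaller for the reliever group purely because of its shorter appearances, which is the same kind of gap the separate replacement levels currently impose by role.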

I strongly suspect that more detailed analysis with better starting numbers and taking leverage effects into account would work better, but the basic method will give long relievers some credit for what they’re doing, and give exceptionally long or short starters a small amount of credit for their ability (or inability) to eat innings also.

Mark Davidson

That was splendid


Excellent work.

Shirtless Bartolo Colon

What about exceptionally wide starters, and their ability to eat between innings?