Evaluating Pitchers as a Concept: Average, Replacement Level, or Just Totals?

Don’t read further until you’ve already participated in this thought-exercise Cy Young poll.


The intent of this poll was to determine how FanGraphs readers evaluate pitchers with different amounts of innings pitched. Basically, how does a reader balance better rate stats against less playing time?

The Pitcher X that all other pitchers were being compared to had a 13-4 W/L record, 2.00 ERA, and 153 IP. The league-average pitcher had a 4.00 ERA. (You can presume that all the other missing stats would be consistent with that kind of W/L and ERA record.)

The reader was presented with a sample list of pitchers in the league, and was asked to rank Pitcher X in this continuum:
Pitcher A: 20-4, 1.50
Pitcher B: 19-5, 1.80
Pitcher C: 18-6, 2.10
Pitcher D: 17-7, 2.40
Pitcher E: 16-8, 2.70
Pitcher F: 15-9, 3.00
Pitcher G: 14-10, 3.30
Pitcher H: 13-11, 3.65

Each of these pitchers had 216 IP. (You can presume here as well that all the other missing stats would be consistent with that kind of W/L and ERA record.)

The list was chosen so that a person could place Pitcher X behind any pitcher, and be able to justify it, to some extent or other.

Do you believe only in ERA? Then you can place the pitcher as high as between Pitcher B and Pitcher C. Do you believe only in IP? Then you can place the pitcher below Pitcher H. Do you believe only in W? Then you can place him below Pitcher G, maybe Pitcher H. Do you believe only in L? Then you can place him below Pitcher A.

The more you combine the various stats (wins, losses, ERA, and innings pitched), the more decisions you have to make.

Suppose, for example, that you simply believe in comparing to the league average. In that case, you can rely just on the won-loss differential. Since Pitcher X was a +9, you’d slot him between Pitcher D (+10) and Pitcher E (+8). Twenty-nine percent of readers chose that slot as their answer, and therefore implied that their baseline is the AVERAGE pitcher.
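A minimal sketch of that slotting, assuming only the won-loss records listed above (the dictionary and function names are illustrative and not part of the poll):

```python
# Won-loss records from the poll; names below are illustrative only.
RECORDS = {
    "A": (20, 4), "B": (19, 5), "C": (18, 6), "D": (17, 7),
    "E": (16, 8), "F": (15, 9), "G": (14, 10), "H": (13, 11),
    "X": (13, 4),
}

def differential(wins, losses):
    """Wins minus losses: games above a .500 (league-average) record."""
    return wins - losses

# Sort best to worst by differential.
ranked = sorted(RECORDS, key=lambda name: differential(*RECORDS[name]), reverse=True)
for name in ranked:
    print(name, f"{differential(*RECORDS[name]):+d}")
```

Run as written, this lists Pitcher X (+9) between Pitcher D (+10) and Pitcher E (+8).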

Suppose that you set your baseline by awarding 2 points for a win and subtracting 1 point for a loss (that is, your zero point is a pitcher with a 1-2 or 8-16 record). In that case, Pitcher X (13-4, or 22 “points”) would fit snugly between Pitcher E (16-8, or 24 “points”) and Pitcher F (15-9, or 21 “points”). Thirty percent of readers chose that slot as their answer, and therefore implied that their baseline was around a .300-.400 pitcher. That is, they treated that as their personal replacement level.
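The point system can be sketched as follows. The function names and the generalization to arbitrary point weights are my own assumptions for illustration; the poll itself only implies the 2-for-a-win, 1-for-a-loss case.

```python
from fractions import Fraction

def points(wins, losses, win_pts=2, loss_pts=1):
    """Score a record as win_pts per win minus loss_pts per loss."""
    return win_pts * wins - loss_pts * losses

def zero_point_win_pct(win_pts=2, loss_pts=1):
    """Winning percentage at which the score is exactly zero.

    Setting win_pts*W - loss_pts*L = 0 gives
    W / (W + L) = loss_pts / (win_pts + loss_pts).
    """
    return Fraction(loss_pts, win_pts + loss_pts)

print(points(13, 4), points(16, 8), points(15, 9))  # Pitcher X, E, F
print(points(1, 2), points(8, 16))                  # the two baseline records
print(float(zero_point_win_pct()))                  # baseline winning percentage
```

With the default weights, Pitcher X scores 22, between Pitcher E (24) and Pitcher F (21), and the zero point works out to a one-in-three winning percentage, which sits inside the .300-.400 range.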

Another way to say the same thing here is to give each pitcher 0.350 wins for each missing game. So, our Pitcher X, being 7 games short, would get around 2.5 wins and 4.5 losses. That bumps his record from 13-4 to an EQUIVALENT 15.5-8.5. As you can see, that puts him snugly between the 16-8 pitcher and the 15-9 pitcher.

But what if you think that giving a pitcher 0.350 wins for each missing game is still too much? What if instead you want to give a pitcher 0.200 wins for each missing game? In that case, your Pitcher X would get an extra 1.5 wins and 5.5 losses, to bump his 13-4 record to an EQUIVALENT 14.5-9.5 record. That places him right between Pitcher F (15-9) and Pitcher G (14-10). Fifteen percent of readers chose 0.200 for a missed game as their option.
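Here is a rough sketch of the “fill in the missing games” arithmetic. It assumes a full workload is 24 decisions (the total for every 216 IP pitcher in the list); the function and constant names are hypothetical.

```python
FULL_SEASON_DECISIONS = 24  # assumption: every 216-IP pitcher in the poll has 24 decisions

def equivalent_record(wins, losses, baseline_win_pct, full=FULL_SEASON_DECISIONS):
    """Pad a short season with baseline-level decisions up to a full workload."""
    missing = full - (wins + losses)
    extra_wins = baseline_win_pct * missing
    extra_losses = missing - extra_wins
    return wins + extra_wins, losses + extra_losses

for baseline in (0.350, 0.200):
    w, l = equivalent_record(13, 4, baseline)
    print(f"baseline {baseline:.3f}: {w:.2f}-{l:.2f}")
```

The unrounded results are 15.45-8.55 and 14.40-9.60; the 15.5-8.5 and 14.5-9.5 figures in the text are these values rounded to the nearest half game.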

We’ve captured the implied decisions of roughly 75% of the readers.

But what about the rest? Well, 10% of the readers chose to place Pitcher X between Pitcher C (18-6) and Pitcher D (17-7). This would mean that his 13-4 record is equivalent to 17.5-6.5: Pitcher X was given an extra 4.5 wins and 2.5 losses, or about a 0.643 win% for the missing games. These readers are more interested in his rate stats, his 2.00 ERA and its equivalent 0.765 win%. Since the Cy Young is an award for Most Outstanding Pitcher, these readers reasoned (or their answers implied) that only a pretty good pitcher would post a 13-4, 2.00 record, and so they set the extra-games equivalency at something between what he posted and league average.

Then we had 3% of the readers that placed Pitcher X (2.00 ERA) between Pitcher B (1.80 ERA) and Pitcher C (2.10 ERA). Those readers simply ignored the IP, and went completely on the rate stats.

On the other end, we had 5% of the readers that placed him between Pitcher G (14-10) and Pitcher H (13-11). In essence, the readers relied almost entirely on the wins, and ignored the losses almost completely.

The readers are broken down into four fairly distinct groups:
1. they like their rate stats mostly
2. they like to compare to league average
3. they like to compare to replacement level
4. they like their IP (counting) stats mostly

Hence the reason we get so many arguments: each group has decided to use its own perspective. If one person preaches at the altar of rate stats, and the other at the altar of innings pitched, then they won’t be able to compromise. Even those that are closer in concept – choosing between league average and replacement level – have pretty much drawn a line in the sand.

For what it’s worth, the overall average of the respondents was to make Pitcher X the equivalent of 15.8 wins. This means that his missing 7 games were given 2.8 wins, which is a 0.400 win%. And this is extremely close to the replacement level that I choose for starting pitchers.
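Going the other way, any equivalent win total can be turned back into the win% credited to the missing games. This is only a sketch under the same 24-decision assumption, with a hypothetical function name:

```python
def implied_baseline(equiv_wins, wins=13, losses=4, full=24):
    """Win% credited to the missing decisions, given an equivalent win total."""
    missing = full - (wins + losses)
    return (equiv_wins - wins) / missing

# Equivalent win totals taken from the groups described in the article.
for label, equiv_wins in [
    ("between C and D", 17.5),
    ("between E and F", 15.5),
    ("between F and G", 14.5),
    ("overall average", 15.8),
]:
    print(f"{label}: {implied_baseline(equiv_wins):.3f}")
```

The half-game midpoints come back as roughly 0.643, 0.357, and 0.214 (slightly off the 0.350 and 0.200 used earlier because those records were rounded), while the overall average of 15.8 wins comes back as 0.400.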

While I’d like to say we have a consensus, the answers are so wide-ranging that they really limit that conclusion. Only a third of the readers actually evaluated the pitcher in terms of this replacement level.

Anyway, consider this poll my introduction to the concept of replacement level for readers, without actually talking directly about replacement level.

21 Responses to “Evaluating Pitchers as a Concept: Average, Replacement Level, or Just Totals?”

  1. Telo says:

    I like this series. To me, the material difference between hitters and pitchers in this exercise is the effect of a high IP total on the bullpen and other pitchers on the team. I have a feeling that I am overestimating this, and that my personal anecdotal evidence of emptying your bullpen to finish a game isn’t quite as impactful as it feels, but it’s definitely a non-zero factor. Whereas for the hitter in the first exercise, to me it was merely a matter of “how many runs over replacement did this player contribute while he played” – that was simply his value to the team – in this second exercise there is some additional inherent value to innings pitched per game in this scenario. That’s how I approached it, at least. 80% of the pitcher’s value derived from runs saved over replacement, 20% for the “Roy Halladay Factor”.

    • Telo says:

      Which brings up an interesting point in that we don’t know how exactly the 150 IP are distributed versus the 216 IP. They could both go an average of 6 1/3 innings into games, which by my definition would have an equal effect on the bullpens and erase the Roy Halladay Factor from the equation (though you could argue that the replacement level pitcher replacing X would NOT go 6 1/3 deep on average). I just assumed the 216 was going deeper into games, and credited him for it, which looking back on it is a pretty faulty assumption. In any case, not the overriding factor, I voted on replacement level and that probably would’ve kept my answer the same, regardless of how I handled IP.

  2. tangotiger says:

    Your question seems to also imply a difference between Most Valuable Pitcher and Most Outstanding Pitcher.

    Does it matter if a pitcher throws 153 IP over 25 starts in the first 5 months, or in the last 5 months?

    Does it matter if he throws it over 20 starts, but spread out throughout the season?

    So, in answering, for yourself, what “Outstanding” means (and that is what the Cy Young award is for), what additional criteria do you need? Do you need number of starts? Do you need to know number of off days? Do you need to know injury, callup?

    • Doug Lampert says:

      For a reward for outstanding I went for runs prevented above average.

      Average rather than replacement is my baseline because the average is the least outstanding. Among MLB pitchers replacement is actually an outlier itself and thus hardly a good baseline for what is outstanding on the other side.

      Your pitcher was 34 runs better than average ((4.00 − 2.00) × 153 / 9). Pitcher D was 38.4 above average, pitcher E was only 31.2 above average. Pitcher D was further above average and hence more outstanding.

      If it were value I’d have to worry about what a replacement ERA is, but I don’t. I also ignore W-L because it’s not something the pitcher controls and this is an individual award.

    • Telo says:

      Ironically, I’d vote the CYA to the most valuable, not the most outstanding, regardless of what the trophy says. I thought that Strasburg was the most outstanding pitcher last year, in that he literally stood out, and was the most exceptional player in the league when he was on the mound, but he wouldn’t crack my top 10 for CYA.

      Thinking in terms of value just makes everything more simple for me, since there are a slew of minor details that can muddle up MVP/CYA discussions. And plus, that’s how I would vote each award anyway – on field value added to team above replacement, regardless of other factors.

  3. The Ancient Mariner says:

    I suspect that a great many of us actually don’t fall into one of those four camps, but rather are unclear in our own minds how we want to weight rate stats vs. number of innings in comparing pitchers to replacement level (or, I suppose, to league average). It isn’t logical, but there’s some reason for it, given that a) a given pitcher’s effect on games is so much more concentrated than a given hitter’s effect — thus arguably enhancing the value of dominance vs. accumulation — and b) that pitching stresses the body so much more than hitting does.

    Also, I think there was more opportunity for mathematical miscalculation on this question than on the question regarding hitters, so voter error might have skewed the results somewhat.

  4. Padman Jones says:

    I slotted Pitcher X after Pitcher G, but W-L played no role in my decision making. I simply (and arbitrarily) felt that that was the proper discount for throwing so few innings. Which, I suppose, means that I ignored “outstanding,” and instead went with “valuable.”

  5. Dan says:

    On tangotiger’s point, the biggest issue I faced is that I needed a story to accompany the innings total before I could make up my mind.

    Was Pitcher X acquired in a trade in, say, mid-May, and only pitched 153 innings in his new league?

    Was he a rookie who was called up in May, or maybe in the rotation in April but on a stricter pitch/innings count than his veteran counterpart, which is a decision of the manager/GM to take him out of games earlier than he would like?

    Did he start the season off in the bullpen, or as a swingman before being stretched out as a starter; or the opposite; start the season as a starter and move into the closer role for the last month of the season?

    Did he pitch 153 innings out of the bullpen (in which case he wins the Cy Young for me), or as a swingman (still may deserve the award) for the entirety of the season?

    Did he pitch for a team with a 12 game lead in August and was taken out of games early, and given a few starts off and extra days off down the stretch to keep himself fresh for the playoffs?

    Is Pitcher X a hitter in the Mike Hampton/Micah Owings mold, who also hit .310/.415/.500 with 10 HR and 12 SB in 200 ABs, and was seen to be more valuable getting 2 ABs and a few relief innings late in a close game? (This guy also wins the MVP for me.)

    As you can see, my mind began to wander with all the possibilities of Pitcher X.
    My personal favorite involves him being a super intelligent mutant; the love child of Greg Maddux and Professor X if you will. (though he never used his power of telepathy to overcome a batter, that would be cheating and he is too good for that.) This pitcher X missed 6 weeks of the season due to Negotiations with the UN over a Sovereign Mutant Nation.

    As much as I consider myself a stat geek, there is a story outside the stats (traditional or new-fangled metrics) that comes into play; that is the beauty of the game to me. It’s not quality or quantity; it’s somewhere in the middle, and the story that accompanies it can be just as important.

    Also, I’m bored at work and felt like wasting a few minutes coming up with Pitcher X scenarios.

    • So would that mean, then, that you factor hitting into your Cy Young rankings?

      • Dan says:

        No, I don’t factor hitting into Cy Young ranking.
        The point I was trying to make is that if I’m only given three pieces of information (record, innings, and ERA), I need the reasoning for why said Pitcher X “only” pitched 153 innings.
        If it is because he pitched 2ish innings per appearance in 75 games, or 1 inning in 153 games, because he has added value as a pinch-hitter/pinch-runner/high-volume relief arm, then the reason for his 153 innings is valid.
        If he made two stints on the DL or just pitched 5 innings a start because he ran up high pitch counts early on in a game, then I wouldn’t think so highly of 153 innings.

        Generally speaking, you can take Mariano Rivera, the greatest closer (for a career) in the history of baseball, and I’m not sure I would vote him into the Hall of Fame. I don’t consider 70-80 innings a year, or 1400 innings or so over a career (including postseason), as great as they are/were, to be Hall-worthy numbers. I would take the 200 per season or 3000-plus innings in the careers of David Wells, Kenny Rogers, Kevin Brown (none of whom I consider Hall of Famers, maybe Kevin Brown, who should have won two Cy Youngs, but that is a different argument for a different time) before Rivera.

        But this imaginary player is pitching 153 innings out of the bullpen and doing it quite well (he has a 2.00 ERA). He is essentially fulfilling the role of two members of the bullpen, so the hitting may be the reason for the innings total, but the numbers are quite impressive on their own.

        However, for a historical example of pitching/hitting in 1999:
        Randy Johnson 17-9 2.48 ERA in 271.2 innings
        Mike Hampton 22-4 2.90 ERA in 239 innings

        If you want the rest of the numbers you can look them up, but RJ was the superior pitcher and therefore deserving of the Cy Young that he won.
        However, Hampton hit .311/.373/.432, which is a pretty nice line for a pitcher, and he was also one of the better defensive pitchers I’ve ever seen. So I think the total package for Hampton was more valuable than RJ, therefore I would have voted him higher in MVP voting.

        Once again bored at work, and felt like writing a long winded explanation to a simple question.

      • tangotiger says:

        You presume all other things equal.

        So, they are all starters.

        As for whether it was injury, or late callup, or simply being pulled early, I didn’t specify, so apply some odds on each one.

  6. James says:

    Really like this series. Bringing out what we value most in a way.

    I didn’t place any value on wins, but I placed him behind pitcher E. Stats and IP were the two big factors.

    In my mind IP is huge. It’s a big monster and 200 IP is a big threshold. If you can maintain that 200 innings, a good ERA, and a good K/9 (I know we don’t have that here) that’s a good formula for Cy Young to me (Think King Felix).

    153 IP is a nice partial year. Sure you may have come to the party late, but that’s a big handicap to overcome. If you said the guy threw a bunch of CGs in his quest and was a superhero down the stretch and add narrative, who knows. But based purely on stats it’s Rich Harden-ish to me and not a Cy Young vote.

  7. Michael says:

    I like the analysis you have been doing in these pieces, and I think the point of the exercise is important. However, with all the emphasis that has been placed on the pitfalls of placing importance in W-L record, I was disappointed that you immediately resorted to believing that W-L is the measure of a pitcher’s value. I know that for me personally the W-L was almost insignificant in my determination of where to slot the pitcher, but rather I valued the ERA and the number of innings pitched by each of them. You could have given Pitcher A zero wins and twenty losses and I still would have thought him the best one out there. While extremely unlikely, it is possible that a pitcher goes an entire season with a 1.80 ERA and still wins no games (maybe the defense behind him just sucked). I would be very interested to see you analyze this from the angle of how important more IP is compared to ERA instead of W-L. Great article nevertheless.

    • tangotiger says:

      I noted the following:

      (You can presume that all the other missing stats would be consistent with that kind of W/L and ERA record.)

      Basically, I could have given you the K, BB, H, HR, and whatever else numbers you wanted, but I synthesized that into its EQUIVALENT ERA and W/L. Every number there was consistent with each other.

      The value of the W/L is that it presents a two-dimensional number, denominated in the only currency we care about: wins.

      If you looked at the continuum of pitchers I listed, it had a very strong pattern that directly linked W/L to ERA.

      I obviously could have been clearer in telling the voter to presume “hitting and bullpen support are league average”.

    • Elwin says:

      In this case the W-L record is scaled directly with ERA, so it’s an expected W-L record. It’s included to allow you to easily figure out the ERA for an expected winning % to set your baseline for comparison.

  8. Blue says:

    This would be a far more meaningful exercise if, rather than bringing in the Cy Young award, you had simply asked something along the lines of: “You are the GM of an MLB team. Which of these pitchers would you prefer to have on your team?” Adding the post-season hardware to the question only muddies up the analysis.

  9. CC says:

    You’ve divided readers into 4 categories. 130 people do not fit in any of those categories. What can those who chose the Behind Player A option be classified as?

  10. kds says:

    Blue, I think those are 2 different questions. For most outstanding I would use a standard of (at least) league average. But for trade value, free agent signing, etc., I would use replacement level as my comparison. So, for Cy Young I would slot him below pitcher D, but I would not trade pitcher E for pitcher X straight up.
