After Trevor Cahill lowered his ERA to 2.43 last night, Buster Olney tweeted that his numbers made Cahill a top contender for the AL Cy Young award. Keith Law quickly responded, noting that Cahill was 31st among AL starters in WAR and had a 4.07 FIP, suggesting that Cahill was in no way a Cy Young candidate despite the shiny low ERA.
Olney and Law clearly approach the award from different angles. Buster is more traditional and prefers to use the numbers that have always been the standard for evaluating pitchers. Keith just wants to reward the guy who he thinks pitched the best, and doesn’t care about the way things have always been done. But their discussion raises an interesting question: what role should our stats have in the Cy Young discussion?
The award is ostensibly about rewarding the best pitcher in a league in a given year. For most of history, we’ve judged pitchers by how many runs they’ve allowed, but over the last 10 years, there has been a shift toward isolating the actual abilities of the pitcher from those of the teammates who surround him. That is a worthy pursuit, as I don’t think anyone believes that a player should receive an individual award based on the work of others. Metrics like FIP have gained popularity as a result, and that is the primary reason Law is using it in his Cy Young argument.
But FIP was not designed to give us better insight into what actually happened; it was designed to tell us what is likely to happen in the future. It is part of the collection of metrics that do a good job of predicting future performance by focusing on things that are under a player’s control. It was never meant to be a backward-looking metric. And there’s a decent argument to be made that the Cy Young award should be awarded based on what did happen, not on what should have happened or what will happen in the future.
In Cahill’s case, the real sticking point is his BABIP, which currently stands at an absurdly low .217. The next lowest batting average on balls in play for an AL starter belongs to C.J. Wilson, at .263. The gap between Cahill and the rest of the league is enormous, and it is the driving force behind his low ERA. That .217 BABIP is not sustainable in any way, shape, or form. Unlike strikeout rate, BABIP is simply not a skill that a pitcher has much control over, which is why it’s not included in the FIP calculation.
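For readers who want to see why BABIP never enters FIP, here is a minimal sketch of the two calculations as they are commonly published. The function and parameter names are my own, and the `constant` default is only a placeholder: the real FIP constant is recalculated for each league and season so that league FIP matches league ERA.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from home runs, walks, hit batters and strikeouts.

    `ip` must be a true decimal (180 1/3 innings is 180.333, not the box-score 180.1).
    `constant` is a placeholder assumption; the actual value varies by league and year.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: non-homer hits per ball put in play."""
    return (h - hr) / (ab - k - hr + sf)


# Invented counting stats, for illustration only.
print(round(fip(hr=10, bb=40, hbp=5, k=150, ip=180.0, constant=3.0), 2))
print(round(babip(h=150, hr=10, ab=650, k=150, sf=5), 3))
```

Notice that hits on balls in play appear nowhere in the `fip` inputs, so two pitchers with identical strikeouts, walks, hit batters, and home runs get the same FIP no matter how different their BABIPs are.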
But it is highly unlikely that Cahill has had absolutely nothing to do with his low BABIP to date. Yes, that number is driven down by a combination of outside factors, including his home park, his defense, and some bad hitting by his opponents, but it would be folly to assume that Cahill hasn’t had anything to do with hitters having a tough time getting hits off of him so far. We certainly should not give him credit for all of the hit prevention, and we should not expect it to continue, but logically, I think we have to assume that he has contributed, at least in some way, to the number of balls that have found their way into the gloves of his defenders. Perhaps he has just hit his spots really well for several months – history says he can’t keep doing it, but do we want to assume that he’s had nothing to do with the results? I don’t.
So, just like I would not rely solely on ERA to make a judgment about who deserves the Cy Young award, neither would I rely solely on FIP. When trying to evaluate how a pitcher did in the past, ERA includes too many things that aren’t under his control, while FIP strips out too much. If I had to choose one or the other, I’d go with FIP over ERA, because I think it gets you closer to reality, but we don’t have to choose. We can look at the whole picture, and that’s what I suggest people do with their Cy Young picks.
Look at a pitcher’s walk rate, strikeout rate, home run rate, batting average on balls in play, and his left-on-base rate. The summation of how well he has pitched will be found in these five metrics. ERA includes all five and is affected greatly by the last two, while FIP only deals with the first three. I think we should look at all five, but not weight them evenly.
Walk rate and strikeout rate have little outside influence – they should be weighted the most heavily. Those are the two areas where a pitcher has the most control over the outcome. Home run rate is certainly something a pitcher has some control over, but park and luck can interfere here, so I would give it a little less weight than the first two. Batting average on balls in play gets even less credit than home run rate, since there are so many contributing outside factors, but pitchers should get a little bit of credit or blame for the results on balls in play. Left-on-base rate is generally going to track closely with HR/9 and BABIP, as posting low numbers in those areas will allow a pitcher to strand a lot of runners on base, so it gets the least credit, to avoid double counting something a pitcher has already gotten credit for. But some pitchers have historically been better at leaving runners on than others, and they should get credit for that in retrospective awards.
If I had to quantify these weights, I’d suggest it should be something like this:
Strikeout Rate: 40 percent
Walk Rate: 30 percent
Home Run Rate: 15 percent
Batting Average On Balls In Play: 10 percent
Left-On-Base Rate: 5 percent
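To make those weights concrete, here is one rough way they could be turned into a single number. Only the percentages come from the list above; everything else is an assumption of mine. In particular, the five stats live on different scales, so this sketch converts each one to a z-score against the group being compared and flips the sign where lower is better. The pitchers and their stat lines are invented.

```python
from statistics import mean, pstdev

# Weights from the list above, written as fractions of 1.
WEIGHTS = {"k_rate": 0.40, "bb_rate": 0.30, "hr_rate": 0.15, "babip": 0.10, "lob_rate": 0.05}

# +1 where a higher raw number helps the pitcher, -1 where a lower one does.
DIRECTION = {"k_rate": 1, "bb_rate": -1, "hr_rate": -1, "babip": -1, "lob_rate": 1}


def composite_scores(pitchers):
    """Return {name: weighted sum of z-scores} for {name: {metric: value}}."""
    scores = {name: 0.0 for name in pitchers}
    for metric, weight in WEIGHTS.items():
        values = [stats[metric] for stats in pitchers.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, stats in pitchers.items():
            # If every pitcher has the same value, the metric separates no one.
            z = 0.0 if sigma == 0 else (stats[metric] - mu) / sigma
            scores[name] += weight * DIRECTION[metric] * z
    return scores


# Hypothetical stat lines (K/9, BB/9, HR/9, BABIP, LOB as a fraction).
sample = {
    "Pitcher A": {"k_rate": 9.5, "bb_rate": 2.0, "hr_rate": 0.7, "babip": 0.300, "lob_rate": 0.72},
    "Pitcher B": {"k_rate": 5.5, "bb_rate": 3.0, "hr_rate": 0.9, "babip": 0.220, "lob_rate": 0.80},
    "Pitcher C": {"k_rate": 7.0, "bb_rate": 2.8, "hr_rate": 1.1, "babip": 0.290, "lob_rate": 0.70},
}

scores = composite_scores(sample)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this made-up group, Pitcher B has by far the lowest BABIP and the best strand rate, yet Pitcher A comes out on top because strikeouts and walks carry 70 percent of the weight. B still gets some credit for the hit prevention, just not enough to outweigh the things we know a pitcher controls.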
Reward a pitcher the most for what we know he can control, but reward him at least a little for things he may have had some influence on, even if he can’t keep it up. Don’t just go blindly into the discussion quoting ERA or FIP. Neither tells the whole story.