Trying to Vote for Dickey Over Kershaw

Later tonight, the National League Cy Young Award winner will be announced. My own fake awards picks have already been made public, and I am sure everyone was thrilled to read them. The NL Cy Young gave me the most trouble. I ended up voting for the Dodgers’ Clayton Kershaw, but I really wanted to cast my non-ballot for the Mets’ R.A. Dickey.

What’s not to like about R.A. Dickey? He names his bats after fictional swords (only the master smiths of Gondolin could forge a weapon that enables a pitcher to rake to the tune of a career 6 wRC+). He climbed Mount Kilimanjaro during the off-season to raise money to combat human trafficking. He is trying to help others by speaking about being abused as a child. He writes children’s books. He makes an awesome face while pitching. Best of all (from a purely baseball perspective), he is a knuckleballer. Oh, yeah, he also had an awesome season in 2012.

However, when I tried to justify voting for Dickey over Kershaw, I just could not do it. It was not for lack of trying, though.

(I realize that one could make Cy Young cases for other pitchers such as Gio Gonzalez or Johnny Cueto, but I am sticking with the players I think are the two best choices for the sake of simplicity.)

It might seem easy enough to look at the Wins Above Replacement leaderboard for NL pitchers and see that Kershaw was valued at 5.5 wins and Dickey at 4.6 wins in 2012. However, even for those of us who believe that a DIPS-based value metric is, in general, the best alternative, it is not that simple. “In general” is a qualification — although I think that FIP is generally better than RA, it may not work as well in some particular cases. My view is that FIP (and other DIPS metrics like xFIP, tRA, SIERA, and so on) should not be seen as perfect in all cases, but as provisionally better in most cases.

That sort of thing is discussed at length in other places, so I want to focus on how it is relevant in this case. While many elements of DIPS metrics are widely debated, even DIPS’ firmer advocates acknowledge that it does not really work for knuckleballers. Metrics like FIP include a built-in assumption that all pitchers basically have the same amount of control over balls in play. We know this is false, but generally, FIP is seen as doing better because that assumption seems to be closer to the truth than the one ERA reflects: that everything balls in play contribute to a pitcher’s ERA is the pitcher’s own doing.
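
For readers who want to see where that assumption lives, here is a minimal sketch of the standard public FIP formula in Python. The function name, the constant of 3.10, and the stat line are illustrative assumptions, not 2012 values. Note that hits on balls in play never enter the calculation.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from the standard public formula.

    Only home runs, walks, hit batters, and strikeouts are used.
    `constant` rescales the result to the league ERA; 3.10 is a typical
    value and an assumption here, not the actual 2012 figure.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat line, not Dickey's or Kershaw's actual totals.
example = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0)
print(f"{example:.3f}")
```

Because balls in play are absent from the inputs, two pitchers with the same home runs, walks, hit batters, strikeouts, and innings get the same FIP no matter how many balls in play fell for hits.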

Knuckleballers historically have a lower BABIP than the league average, so they are a clear exception. Knuckleballers are, to a certain extent, a population unto themselves. In short, it would be unfair to judge Dickey by something based on FIP. As one would expect, Dickey’s FIP is an excellent 3.27, but his ERA is an even better 2.73. Adjusting for league average and park, those are 87 FIP- and 72 ERA-.

However, basing one’s Cy Young vote for Dickey on ERA will not quite do the trick. (Dickey and Kershaw both pitched around 230 innings, so we do not have to worry about that factor.) Kershaw not only had a better FIP (2.89, 78 FIP-), but a better ERA (2.53, 67 ERA-), too. So that locks it up for Kershaw, right?

Not necessarily. Keep in mind what was said above: FIP and other DIPS-based metrics may not be perfect or universally applicable, but they do the work in most cases. If we should not use them for knuckleballers like Dickey, that does not necessarily mean that we should not still use them for a non-knuckleball pitcher like Kershaw. In other words, maybe one can make Dickey’s case by using ERA for him, and FIP for Kershaw. That would seem to put Dickey (72 ERA-) just ahead of Kershaw (78 FIP-). That might be close enough to go either way, but does give the edge to Dickey.
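
The three ways of lining up the two pitchers can be sketched in a few lines of Python. The figures are the ERA- and FIP- values quoted in this post (lower is better); the dictionary layout and the `leader` helper are just illustrative names.

```python
# Park- and league-adjusted figures quoted in the post (lower is better).
pitchers = {
    "Dickey": {"era_minus": 72, "fip_minus": 87},
    "Kershaw": {"era_minus": 67, "fip_minus": 78},
}


def leader(metric_for):
    """Return the pitcher with the lowest value, given which metric to use for each."""
    return min(metric_for, key=lambda name: pitchers[name][metric_for[name]])


same_era = leader({"Dickey": "era_minus", "Kershaw": "era_minus"})
same_fip = leader({"Dickey": "fip_minus", "Kershaw": "fip_minus"})
mixed = leader({"Dickey": "era_minus", "Kershaw": "fip_minus"})

print("ERA- for both:", same_era)
print("FIP- for both:", same_fip)
print("ERA- for Dickey, FIP- for Kershaw:", mixed)
```

Only the mixed comparison puts Dickey ahead, which is why everything hinges on whether that mix can be justified.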

Can we really justify rigging the comparison in that way? Maybe with some non-knuckleballers, but probably not with Clayton Kershaw. We need to be careful about using a single-season ERA as the go-to metric for most non-knuckleball pitchers, but Kershaw is not most pitchers. If he had managed to outperform his FIP (which has just been used as a stand-in for DIPS metrics in general; going through them all would have made this post too long) via his low BABIP just this season, maybe we could dismiss it on what Phil Birnbaum calls Bayesian grounds. I do not think we can. Let’s compare the two pitchers.

As one would expect from a knuckleballer, Dickey is a low-BABIP pitcher. From 2010 to 2012, his seasonal BABIPs are .276, .278, and .275, respectively. However, check out Kershaw’s over the same seasons: .275, .269, .262. It goes back even further for Kershaw: in 2009, his BABIP-against was .269. For his career as a professional, Kershaw’s BABIP is .275 in 944 innings, the same BABIP as Dickey this season. So while there is still uncertainty and a margin of error with Kershaw’s “true” BABIP, there is a strong body of evidence that, despite not being a knuckleballer, Kershaw may be a low-BABIP pitcher whose contribution is not adequately captured by DIPS metrics, either. So in this case, it would not really be fair to use FIP to evaluate Kershaw and ERA to evaluate Dickey.
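
As a rough check on those numbers, the seasonal BABIPs quoted above can be compared directly. This sketch uses unweighted averages, which is an assumption forced by the fact that the post does not give balls-in-play totals per season; a proper average would weight each season by its balls in play.

```python
from statistics import mean

# Seasonal BABIP-against figures quoted in the post.
dickey = {2010: 0.276, 2011: 0.278, 2012: 0.275}
kershaw = {2009: 0.269, 2010: 0.275, 2011: 0.269, 2012: 0.262}

# Restrict to the seasons both pitchers have listed.
shared = sorted(set(dickey) & set(kershaw))
dickey_avg = mean(dickey[year] for year in shared)
kershaw_avg = mean(kershaw[year] for year in shared)

# Seasons in which Kershaw's BABIP was lower than Dickey's.
kershaw_lower = [year for year in shared if kershaw[year] < dickey[year]]

print(f"Dickey {dickey_avg:.3f}, Kershaw {kershaw_avg:.3f}")
print("Kershaw lower in:", kershaw_lower)
```

On these figures, the non-knuckleballer has the lower BABIP in every shared season, which is the point of the comparison.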

While I suppose there are other ways one could try to justify making an objective choice for Dickey over Kershaw, I just do not see it working. I really tried. I even looked up the relative quality of the hitters they faced, and that favored Kershaw, too. I will not insult your intelligence by making something out of Dickey having six more pitcher wins (20) than Kershaw (14).

I will not be upset if R.A. Dickey wins the Cy Young this year. For reasons outlined at the beginning of this post, I actually would be very happy for him. But I think that Clayton Kershaw outpitched Dickey this year, and thus deserves the honor more. Shucks.






Matt Klaassen reads and writes obituaries in the Greater Toronto Area. If you can't get enough of him, follow him on Twitter.


Garrett
Guest

I’m statistics until I die, but when it comes to season awards, all that matters to me are the results, not what the predicted results should have been. I don’t know why FIP should be used in this sort of context at all. We aren’t predicting who the better pitcher will be for 2013, we want to know who had the better season in 2012.

While I support a guy like Felix Hernandez getting the Cy Young when he blows away the competition in ERA, to me, a race as tight as this one (Dickey was 20-6 with a 2.73 ERA and Kershaw was 14-9 with a 2.53 ERA) goes to the guy with the better win total, especially when their HR/BB/K splits are relatively similar (Dickey trails Kershaw in HR by 8, a significant number but not earthshattering).

That may be the only time I ever advocate wins as an important statistic, but I think it matters here. I can see votes for both sides, but I think if anything it’s a toss up, not a Kershaw victory.

FIP, wOBA, and UZR
Guest

We’re statistics too!

O-Swing%
Guest

I’m statistics and so’s my wife!

Joe
Guest

FIP and UZR are models based on statistics. There is a difference.

vivaelpujols
Guest

ERA is a model as well.

Kogoruhn
Member

FIP is not meant to be predictive. FIP tells us what happened if you strip out components that DIPS theory tells us are out of the pitcher’s control.

Garrett
Guest

A good point. I guess I meant predictive in the sense that, “we knew X pitcher was better than Y pitcher by his bottom line results this year, so we have a better grasp on which may be better going forward.” Not that cut and dry obviously. Maybe I need to think of the Cy Young the way I do the MVP (RBIs being irrelevant), but I can’t get past the idea that the pitcher’s/team’s record in his games started should matter for that one award when the ERAs are similar.

That being said, I went back and forth in my head on which side I was taking as I wrote this, so who knows. :)

Bip
Member

It is predictive. In some cases, such as when a pitcher has a consistently below-average HR/FB rate, it is more predictive than an expressly predictive stat like xFIP.

Joe
Guest

Wins are influenced even more by the offense and bullpen than by what the pitcher actually does. I understand wanting to look at what actually happened rather than what really should have happened in terms of DIPS, but wins just don’t have that much to do with the pitcher. This line of thinking seems to make more sense with stats like ERA, WHIP, BABIP, or HR/FB where the pitcher can have fluky numbers due to luck but they are still things that his pitching limited.

vivalajeter
Guest

To me, this is a great example of when people take the “I’m smarter than you!” attitude too far. Instead of saying pitcher wins are overrated, you’re saying they don’t have that much to do with the pitcher. That’s absurd. Look at the top-10 Wins leaderboard over the last few years, and you’ll see a list that consists primarily of some of the best starters in the league. Look at the top winning % of all-time, and you’ll see a list of great starters. Don’t act like that doesn’t have much to do with the pitcher.

Yes, you’ll see some fluky years where a decent (but not great) starter wins 18 or 20 games, and you’ll see the reverse (like Cliff Lee this year, with only 6 wins), but by and large, the Wins-leaderboard is filled with some mighty fine pitchers.

vivalajeter
Guest

I’m with you. Both pitchers were very similar this year, and I have no problem giving it to the pitcher who went 20-6 over the pitcher who went 14-9. I realize that pitcher wins aren’t nearly as important as people thought they were 20 years ago, but I disagree that they’re completely irrelevant, and I have no issues using it to tip the scales in Dickey’s direction.

Bip
Member

The problem with pitcher wins is not that they are totally useless and unreliable. The problem is that they are a less reliable way of getting at the same information. If you want to compare two pitchers, you want to use a tool that most reliably measures their pure performance. If that tool doesn’t yield a winner, you wouldn’t then use another statistic that less reliably measures the same thing. That is the problem with wins. It measures the same thing as ERA, IP, FIP, WAR, SIERA and others, except with more confounding factors. So if those don’t give you a winner, then until you come up with a stat more reliable than those, you may just have to settle for “too close to call” because you’ve already consulted the most reliable measurement of pitcher value.

If you could somehow show that a pitcher has control of a game outcome outside of direct run prevention and that that ability is captured by wins, then wins now are a source of new information not measured by FIP or WAR, but that has not been demonstrated.

vivaelpujols
Guest

This is crap, dude. I agree that wins are strongly correlated with pitcher talent, but once you already know a pitcher’s IP, ERA, FIP, etc., wins add absolutely zero value because all they are capturing are run support and bullpen effects. Maybe think a little before you rationalize.

Haishan
Guest

Okay, but let’s try and reductio that argument ad absurdum. If you’re really interested in results, or how a pitcher affected his team, shouldn’t you look first at something like pitcher wins, or W-L record in games started, or, probably the best option, WPA? I don’t think you would — essentially because none of those statistics are all that predictive or reliable. So already we’re striking a balance between reliability and actual results. ERA already adjusts for fielding, albeit in a 19th-century way, by discounting runs due to errors. Presumably you don’t think that adding those back into a pitcher’s run total would lead to a better statistic. So why not adjust for fielding in a much more statistically valid way? But that gives you FIP.

C
Guest

*argumentum ad absurdem reducemus

B says
Guest

* Expecto Patronum!
