## Basic Pitching Metric Correlation 1955-2012, 2002-2012

Last week, I took a look at year-to-year correlations for hitting metrics. This post follows up by doing the same thing with pitching metrics. Here, with a bit of commentary, are the results.

I am not presenting this as terribly original or groundbreaking. I do think this can at least be a helpful reference. I made a number of qualifications in the post on hitters, so if you have not read that post, I recommend you take a look at it.

Some metrics correlate better than others, but that does not necessarily tell us that one is “better” than the other, as I have included various metrics that have different uses. It is more precise to write that this tells us about relative sample size in relation to true talent. A metric with lower year-to-year correlation likely needs more regression to the mean when we are trying to estimate a player’s true talent. Finally, keep in mind that year-to-year correlation is not necessarily the only or always the best way to establish this sort of thing. It is, however, relatively easy to do for a basic study.
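As a rough illustration of the regression point, here is a minimal sketch. It assumes both seasons share the same mean and spread, in which case the year-to-year correlation is the slope of the least-squares prediction of next year's value. The function name `regress_to_mean` and the pitcher and league values are made up; the two correlations are the 1955-2012 figures reported in this post.

```python
def regress_to_mean(observed, population_mean, r):
    """Shrink an observed value toward the population mean by (1 - r).

    Assumes both seasons have the same mean and standard deviation,
    so r is the slope of the least-squares line from year 1 to year 2.
    """
    return population_mean + r * (observed - population_mean)


# Hypothetical pitcher and league values, for illustration only.
league_era, observed_era = 4.00, 3.00
league_so9, observed_so9 = 6.5, 9.0

# Correlations taken from the 1955-2012 results in this post.
era_estimate = regress_to_mean(observed_era, league_era, r=0.409)
so9_estimate = regress_to_mean(observed_so9, league_so9, r=0.829)

print(f"ERA estimate:  {era_estimate:.3f}")
print(f"SO/9 estimate: {so9_estimate:.3f}")
```

The ERA figure moves most of the way back to the league mean, while the strikeout rate keeps most of its distance from it. That is all "needs more regression" means here.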

I neglected to include even a basic explanation of correlation in last week’s post, probably because I typically assume I am the least mathematically knowledgeable person in any group. Better explanations are certainly Google-able, but for the lazy and non-picky: the range of possibilities for correlation is between 1 and -1. Results closer to 1 or -1 indicate that the two sets of numbers are strongly correlated (in the case of negative numbers, inversely correlated). Results closer to 0 indicate less of a relationship.
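For readers who want to see the arithmetic, here is a minimal sketch of the standard Pearson correlation coefficient in plain Python. The post does not say which tool was used for the actual calculations, and the strikeout rates below are invented for five hypothetical pitchers.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations, then the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up strikeout rates (SO/TBF) for five hypothetical pitchers.
year_1 = [0.15, 0.18, 0.22, 0.25, 0.30]
year_2 = [0.16, 0.17, 0.24, 0.23, 0.29]

r = pearson_r(year_1, year_2)
print(f"year-to-year r = {r:.3f}")
```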

Deciding on what limits to put on my data was a bit more complicated with pitchers than with hitters. Last week I simply used batter seasons with at least 400 plate appearances. For pitchers, part of the problem was that many of them switch roles in and between seasons. We know that relieving almost always improves a pitcher’s across-the-board performance, so if a pitcher sees significant time relieving one season and not another, it could mess with the sample.

Without going through all my thought processes and the options I considered, I ended up simply setting the minimum innings for a single season at 140, and placing a strict limit on the relative proportion of non-starting appearances the pitcher made. Yes, that means that for all practical purposes, this is about starting pitchers, but we already know that relievers present their own set of issues (primarily around sample size). I wanted to keep this basic without shrinking my sample by excluding starters who made a few relief appearances during a season. As I did last week, I matched on team to mitigate the effect of home park switches.
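The sample construction described above can be sketched roughly as follows. This is only an illustration: the 140-inning minimum and the same-team match come from the post, but the record layout, the function names, and the `MAX_RELIEF_SHARE` cutoff are hypothetical, since the post does not state the exact relief limit.

```python
MIN_IP = 140             # minimum innings per season, as stated in the post
MAX_RELIEF_SHARE = 0.10  # hypothetical cutoff; the post gives no exact number


def qualifies(season):
    """True if a season meets the innings minimum and the relief limit."""
    relief_share = (season["g"] - season["gs"]) / season["g"]
    return season["ip"] >= MIN_IP and relief_share <= MAX_RELIEF_SHARE


def paired_seasons(seasons):
    """Pair each qualifying season with the same pitcher's next season,
    keeping the pair only if the pitcher stayed with the same team."""
    by_key = {(s["pitcher"], s["year"]): s for s in seasons if qualifies(s)}
    pairs = []
    for pitcher, year in sorted(by_key):
        first = by_key[(pitcher, year)]
        second = by_key.get((pitcher, year + 1))
        if second is not None and second["team"] == first["team"]:
            pairs.append((first, second))
    return pairs


# Made-up seasons for three hypothetical pitchers.
seasons = [
    {"pitcher": "A", "year": 2010, "team": "X", "ip": 200, "g": 32, "gs": 32},
    {"pitcher": "A", "year": 2011, "team": "X", "ip": 190, "g": 33, "gs": 33},
    {"pitcher": "A", "year": 2012, "team": "Y", "ip": 180, "g": 30, "gs": 30},
    {"pitcher": "B", "year": 2010, "team": "Z", "ip": 150, "g": 40, "gs": 20},
    {"pitcher": "B", "year": 2011, "team": "Z", "ip": 160, "g": 30, "gs": 30},
    {"pitcher": "C", "year": 2010, "team": "Z", "ip": 120, "g": 25, "gs": 25},
]

pairs = paired_seasons(seasons)
for first, second in pairs:
    print(first["pitcher"], first["year"], "->", second["year"])
```

Only pitcher A's 2010 and 2011 seasons survive here: A changed teams for 2012, B relieved too often in 2010, and C fell short on innings.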

People naturally will want to compare the correlations of metrics held in common between this post on pitchers and the hitters post from last week. That is fine, but be cautious. The samples are not equivalent: I somewhat arbitrarily picked the minimums (400 plate appearances for hitters, 140 innings pitched for pitchers).

Okay, let’s look at some numbers, starting with some basic metrics from 1955 through 2012. As with last week, some of these metrics will look redundant, but I wanted to see how much difference the small differences make. “TBF” is “total batters faced,” the pitcher equivalent of plate appearances.

Pitching Metric | Year-to-Year Correlation 1955-2012 |
---|---|
SO/9 | 0.829 |
SO/TBF | 0.823 |
uBB/TBF | 0.721 |
BB/9 | 0.688 |
K/BB | 0.674 |
FIP | 0.620 |
HBP/TBF | 0.513 |
HR/(TBF-BB-HBP-SO) | 0.480 |
WP/TBF | 0.474 |
HR/TBF | 0.471 |
HR/9 | 0.470 |
WHIP | 0.442 |
ERA | 0.409 |
IBB/TBF | 0.382 |
BABIP | 0.351 |
LOB% | 0.226 |

For some, this will contain few surprises. As we found with hitters, so with pitchers: strikeouts are the most consistent year-to-year. I guess it all starts with making contact (or some other empty platitude). However, while home runs followed strikeouts pretty closely in correlation for hitters, that is not the case for pitchers. This sort of thing is sometimes parsed as meaning hitters have “more control” over home runs (or whatever event) than pitchers, but that is not quite correct. Hopefully, I can explain this in a way that is not too distorting or confusing. It is more accurate to say that there is less variation in skill among pitchers than among hitters in this respect (demonstrating this requires a different and more complex sort of mathematical investigation, which is certainly above my pay grade). In a given match up, the hitter and the pitcher equally contribute to the possibility of, say, a home run happening on contact. However, there is probably more variation in this skill among hitters than among pitchers. That shows up here as less year-to-year correlation for pitchers.
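The link between spread of skill and year-to-year correlation can be illustrated with a toy simulation. This is not the "more complex sort of mathematical investigation" mentioned above, just a sketch under simple assumptions: each player has a fixed true talent, each season adds independent random noise of the same size, and all the numbers and names (`simulated_yty`, `talent_sd`, `noise_sd`) are invented.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def simulated_yty(talent_sd, noise_sd, n_players=2000, seed=0):
    """Year-to-year correlation for a population whose true talent has
    standard deviation talent_sd, observed twice with independent noise."""
    rng = random.Random(seed)
    talents = [rng.gauss(0, talent_sd) for _ in range(n_players)]
    year_1 = [t + rng.gauss(0, noise_sd) for t in talents]
    year_2 = [t + rng.gauss(0, noise_sd) for t in talents]
    return pearson_r(year_1, year_2)


# Same amount of seasonal noise, different spread in underlying skill.
wide = simulated_yty(talent_sd=2.0, noise_sd=1.0)    # "hitter-like" spread
narrow = simulated_yty(talent_sd=0.5, noise_sd=1.0)  # "pitcher-like" spread

print(f"wide talent spread:   r = {wide:.2f}")
print(f"narrow talent spread: r = {narrow:.2f}")
```

In this setup the noise is identical in both populations, so the lower correlation for the narrow group reflects nothing except the smaller variation in talent.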

FIP correlating better year-to-year than ERA or WHIP is pretty much what we would expect, and seeing strikeouts, walks, home runs, and BABIP broken down separately just highlights that on the component level. Wild pitches and hit by pitches per plate appearance correlating more strongly than BABIP is at least kind of funny to me for some reason.

I wanted to get a nice big sample for the basic metrics, and thus went back to 1955 (the data set has some gaps before that), but I also wanted to compare more recently available metrics, such as those built on batted ball data, to each other and to the older metrics. As with last week, these results should not be taken as a commentary on their quality either way, and the different sample naturally means that the results for some metrics found in both tables will be different. Here are the 2002-2012 results.

Pitching Metric | Year-to-Year Correlation 2002-2012 |
---|---|
GB/FB | 0.871 |
GB% | 0.839 |
FB% | 0.817 |
SwStr% | 0.804 |
SO/TBF | 0.803 |
SO/9 | 0.803 |
Contact% | 0.789 |
O-Contact% | 0.782 |
Swing% | 0.747 |
Zone% | 0.744 |
O-Swing% | 0.730 |
uBB/TBF | 0.711 |
Z-Swing% | 0.701 |
xFIP | 0.699 |
BB/9 | 0.692 |
Z-Contact% | 0.664 |
F-Strike% | 0.663 |
K/BB | 0.630 |
tRA | 0.589 |
FIP | 0.584 |
WP/TBF | 0.458 |
WHIP | 0.430 |
IFFB% | 0.422 |
HBP/TBF | 0.404 |
HR/TBF | 0.390 |
HR/9 | 0.390 |
ERA | 0.373 |
IBB/TBF | 0.358 |
HR/(TBF-BB-HBP-SO) | 0.349 |
LOB% | 0.238 |
BABIP | 0.235 |
LD% | 0.088 |
HR/FB | -0.029 |

In what might be a bit of an upset, strikeout and contact rates are dethroned not just by ground ball and fly ball rates, but by ground ball/fly ball ratio. A lot of people love swinging strike percentage, and stuff like this shows why, although further steps would be needed to show whether and how swinging strike rate would help better predict strikeout rate. Another point of interest might be that ground ball and fly ball rates correlate more strongly for pitchers than we found for hitters, while line drive rate correlates even less strongly; indeed, it can barely be said to correlate at all for pitchers. Much has been written about this sort of issue. I will simply ask that people remember the distinction discussed above between how much variation there is in a population and how much control each participant in a plate appearance has over the result.

The other metric that seems to basically not correlate at all is home runs per fly ball. This is another issue that has been long discussed, and it is also the reason Studes came up with xFIP, as well as the reason for xFIP’s high correlation relative to other pitching metrics. In addition to the control issue, remember that just because something does not correlate does not necessarily mean no control is involved. Correlation is just one way of trying to get at this issue. Furthermore, just about everything measurable in baseball involves some skill; there is just more variation between players in some skills than in others. Some individuals may have this skill, but we simply cannot measure it (yet?) in the population as a whole.

This is a simple study, but there is much more that can be said and discussed. I hope that these findings are helpful and interesting to some.


Can you also report on the correlation for BB/K and GB/FB (and compare them to K/BB and FB/GB, respectively)?

Tango- Would you mind explaining why you are interested in those differences (or what this comparison will tell us)? Just curious. Thanks

It’s the ratio v rate discussion that I have all the time. What you choose as the denominator makes a huge difference, especially if it’s something that’s at a 10:1 ratio (or 1:10).

This is why you should not do a/b but a/(a+b). Because correlating a/(a+b) to a/(a+b) in two different sets will give you IDENTICAL results to correlating b/(a+b) to b/(a+b) in two different data sets.

But, a/b to a/b and b/a to b/a will not give you the same results, especially if a:b ratio is extreme.
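The rate-versus-ratio point can be checked numerically with a minimal sketch like the one below. The counts are invented for six hypothetical pitchers, with `a` the rarer event and `b` the more common one. Because b/(a+b) is just 1 minus a/(a+b), the two rate correlations have to match, while the two ratio correlations are free to differ.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Made-up (a, b) event counts for six hypothetical pitchers in two seasons.
year_1 = [(20, 200), (35, 180), (15, 220), (50, 150), (28, 190), (40, 160)]
year_2 = [(25, 190), (30, 185), (22, 210), (45, 170), (18, 205), (42, 150)]


def yty(metric):
    """Year-to-year correlation of metric(a, b) across the two seasons."""
    return pearson_r([metric(a, b) for a, b in year_1],
                     [metric(a, b) for a, b in year_2])


r_rate_a = yty(lambda a, b: a / (a + b))   # a / (a + b)
r_rate_b = yty(lambda a, b: b / (a + b))   # b / (a + b)
r_ratio_ab = yty(lambda a, b: a / b)       # a / b
r_ratio_ba = yty(lambda a, b: b / a)       # b / a

print(f"a/(a+b): {r_rate_a:.4f}   b/(a+b): {r_rate_b:.4f}")
print(f"a/b:     {r_ratio_ab:.4f}   b/a:     {r_ratio_ba:.4f}")
```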

Thanks

Really curious to hear how SIERA correlates compared to the other estimators.

The HR/FB correlation is really odd. No Y/Y correlation for pitchers but a pretty high correlation for batters. At first glance, this seems like it should be closer since pitchers pitch a large portion of their games in their own divisions facing the same batters year after year.

OTOH – maybe this particular metric indicates an underlying reason for trades and other team changes. Teams get rid of pitchers who give up HR – and acquire bats who hit HR. And if this is really a pervasive enough practice throughout MLB, then is there a “Moneyball” inefficiency here somewhere?

I interpret it as, there is much more variation in power (Stantons vs Scutaros) than in HR-suppression ability.

Batter variation in power among individual batters wouldn’t affect this. If batters themselves are, relatively consistently, power hitters or slap hitters; then the only thing that can account for widely different pitching correlations (apart from some data quirk) is change in the individual batters a pitcher faces from one year to the next. ie they face more Scutaros this year and more Stantons next year etc.

For my first article, I found a 0.262 year-to-year correlation in HR/FB over 2002-2012, but I had a different sample (I used qualifying IP as the minimum). I also tried it at a 50 IP minimum, and got an average YTY correlation of 0.163. One of us probably made a mistake. That’s still very low and a big difference from the batter correlation on HR/FB: .74

Very true. I think it goes along with what I said earlier (down below) — pitchers can’t control whether the ball will be hit, or how hard — all they can do is make things difficult on the batter.

As in the case of the Angels…they traded off Santana who last year yielded 39 home runs and picked up Hamilton who launched 43.

I really liked these two articles. It is very thought-provoking in terms of ways I can imagine the projection systems could be tweaked to convey an idea of the range of possible results by taking into account these types of correlations.

Random thoughts:

It would be interesting to see if edge% correlates with HR/FB or LD%

It would be interesting to see if the batters faced impact HR/FB or LD% — maybe by looking at HR/LD totals/rates from the prior year.

“In a given match up, the hitter and the pitcher equally contribute to the possibility of, say, a home run happening on contact.”

Why do you say that? A pitcher can throw a perfect pitch and still give up a HR on it (I mean, it’s a lot less likely than on a meatball, but it can still happen).

I think that the hitter basically controls the outcome, and all the pitcher does is make things more difficult on him. Any pitch in the zone can be hit, whether by skill or luck.

Couldn’t the same be applied in reverse? A pitcher can throw meatballs, but the hitter can still drive it into the ground, miss or be fooled by speed differences.

Well, yes, but I say it’s the hitter’s fault if he gets fooled or puts a bad swing on a meatball. It’s not the pitcher’s fault if he gives up a HR on a perfect pitch.

If a pitch is thrown in the zone, and the hitter fails to put solid wood on it, it’s his own fault, whether due to misjudgment or some technical or physical flaw (swinging at a pitch outside the zone is a misjudgment as well). I’m not saying he should be *expected* to make good contact, because nobody’s perfect. However, I say a hypothetical *perfect* hitter would make solid contact on any pitch in the zone (even those thrown by a “perfect” pitcher).

I think this is really the main reason things like ERA and pitcher BABIP are so inconsistent. It’s not that there’s less variation in skill amongst pitchers; it’s that their skill is less relevant to the outcomes.

I’d say on any individual “perfect pitch”, it’s not the pitcher’s fault if he gives up a home run. But I’d be willing to bet that over time the pitchers who are best at throwing “perfectly” will reflect that in their results.

I agree, Drew — some pitchers make things a lot more difficult on hitters than others. But I also think if there were such a thing as a perfect pitcher, who threw 110 mph heaters and amazing breaking pitches with pinpoint accuracy, he would still give up runs here and there. Once he releases that ball, it’s out of his hands, as the saying goes.

I guess in terms of what order things happen, the outcome is out of the pitcher’s hand as soon as the ball is thrown. But as far as analyzing large sets of data, I don’t see anything wrong with either perspective. I say this mostly because pitchers see batters of all calibers, so the pitcher’s stats should reflect his skill level, and we would be right to say that the pitcher is in control of his outcome.

You make a good point that ERA and BABIP are inconsistent. That could be a reflection of pitchers controlling a smaller relative fraction of the result. However, I think things like AVG and HR from hitters are also inconsistent, so I don’t know if we are better off saying the hitter has more control than the pitcher. As far as I can tell, we just don’t gain anything by assuming batters have more control.

OK, well, according to Matt’s article on hitting correlations, there are extremely strong year-to-year correlations for both HR and unintentional walks (about 0.8). Hitters deserve most of the credit for each of those events, IMO (particularly HR).

Hitters’ YTY for strikeouts is even stronger. Their inconsistencies in avg (0.48) are almost all due to inconsistencies in their BABIPs.

BABIP’s YTY for hitters was only 0.46, which is still a heck of a lot better than it is for pitchers — 0.235, in his sample above. Whether a fielder will catch a batted ball is mainly out of the hitter’s hands, though — all they can control is that they can make things hard on the fielders by hitting the ball sharply (or by executing a perfect bunt) and by running fast to first base. Unless they somehow can actually aim for gaps between fielders (as some have said Ichiro can do).

Anyway control vs. influence is the key here, IMO: each person in the three steps only controls what they themselves do, not what the person down the chain does; they can only influence the level of difficulty for those down the chain.

If I’m correct in that FB=OFFB+IFFB in FanGraphs terminology, how do the year-to-year correlation values for GB/OFFB, HR/OFFB, and OFFB% look compared to those for GB/FB, HR/FB, and FB% respectively? I wouldn’t expect a huge difference given the relative magnitudes of IFFB% and OFFB%, but in which direction do the coefficients move if at all?

love these matt, loving having an easy repository for correlations

WHAT ABOUT WINS???

but really, just for kicks, how bad is wins? worse than ERA?

I actually wanted to run this for Quality Start percentage (QS/GS). I realize the correlation should be small but am still curious.

If I have 12 years of data, do I just split it into two data sets (odd years and even years)? Or are you breaking this out, game by game in your analysis? So game 1 goes into column A, game 2 into column B, game 3 into A, game 4 into B, etc.?