Reliever Pitching Metric Correlations, Year-to-Year

A little over a year ago I published the results of a study that examined which metrics were most consistent on a year-to-year basis for starting pitchers. My colleague, Matt Klaassen, followed up and expanded on that study recently here at FanGraphs. Matt’s study also focused on starting pitchers–those with a minimum of 140 innings pitched in consecutive years.

Recently I was asked the following on Twitter:

I can’t speak specifically to the common wisdom Justin is referring to, but I can certainly run the correlations for relief pitchers and compare them to what I found for starters.

I pulled data for all qualified relief pitchers from 2002-2012. I then matched up pitchers who qualified as relievers in consecutive seasons and ran basic correlations between each pitcher's value in one season and his value in the next for a variety of statistics. Here are the results:
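The matching step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used for the study: the field names (`player`, `season`, `gb_pct`) and the sample rows are made up, and the correlation is assumed to be a plain Pearson correlation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def consecutive_pairs(rows, stat):
    """Pair each qualified pitcher-season with the same pitcher's next season.

    A season with no qualifying follow-up season (or a gap year) yields no pair.
    """
    by_key = {(r["player"], r["season"]): r[stat] for r in rows}
    pairs = []
    for (player, season), value in by_key.items():
        next_value = by_key.get((player, season + 1))
        if next_value is not None:
            pairs.append((value, next_value))
    return pairs

# Hypothetical qualified reliever seasons (made-up numbers, illustration only).
rows = [
    {"player": "A", "season": 2010, "gb_pct": 50.0},
    {"player": "A", "season": 2011, "gb_pct": 52.0},
    {"player": "A", "season": 2012, "gb_pct": 49.0},
    {"player": "B", "season": 2010, "gb_pct": 40.0},
    {"player": "B", "season": 2011, "gb_pct": 41.0},
    {"player": "C", "season": 2011, "gb_pct": 60.0},
    {"player": "C", "season": 2012, "gb_pct": 58.0},
    {"player": "D", "season": 2010, "gb_pct": 45.0},  # did not qualify in 2011
    {"player": "D", "season": 2012, "gb_pct": 47.0},  # gap year, so no pair
]

pairs = consecutive_pairs(rows, "gb_pct")
r = pearson([a for a, _ in pairs], [b for _, b in pairs])
print(f"{len(pairs)} season pairs, year-to-year r = {r:.3f}")
```

Note that a pitcher who qualifies in three straight seasons contributes two pairs, so the pairs are not fully independent of one another.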

Statistic Year-to-Year Correlation
GB% .795**
GB/FB .781**
FB% .769**
Swing% (pfx) .749**
pLI .685**
K% .666**
Zone% (pfx) .666**
Contact% (pfx) .663**
SwStr% .663**
Z-Contact% (pfx) .653**
Z-Swing% (pfx) .646**
inLI .635**
exLI .603**
O-Swing% (pfx) .591**
O-Contact% (pfx) .557**
gmLI .551**
BB% .536**
SIERA .529**
SD .524**
HLD .519**
xFIP .519**
xFIP- .491**
FIP .418**
FIP- .413**
tERA .384**
AVG .366**
WHIP .278**
IFFB% .276**
WPA/LI .273**
HR/9 .237**
MD .216**
ERA- .178**
WPA .175**
ERA .174**
LOB% .117*
BABIP .099
HR/FB .071
LD% .033

**Indicates correlation is significant at the .01 level. *Indicates correlation is significant at the .05 level.
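For readers wondering what "significant" means here, one common reading is a two-sided test against a null hypothesis of zero correlation. The sketch below uses the Fisher z approximation for that test. It is an assumption that the study used something along these lines, and because the number of matched season pairs is not reported, the sample sizes in the example are hypothetical.

```python
from math import atanh, erfc, sqrt

def correlation_p_value(r, n):
    """Approximate two-sided p-value for H0: true correlation = 0.

    Uses the Fisher z transform: atanh(r) * sqrt(n - 3) is roughly
    standard normal under the null when n is reasonably large.
    """
    z = atanh(r) * sqrt(n - 3)
    return erfc(abs(z) / sqrt(2))

# The same observed correlation can clear or miss a significance threshold
# depending on how many season pairs sit behind it (n values are hypothetical).
for n in (100, 400, 1000):
    p = correlation_p_value(0.117, n)
    print(f"r = 0.117, n = {n}: p = {p:.4f}")
```

The practical point is that the significance flags in the table depend on the sample size as much as on the size of the correlation itself.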

The order of the strength of the correlations for relievers is quite similar to that for starting pitchers–batted ball data tends to bring the greatest consistency, year-to-year. So, if a reliever has a tendency to induce a large percentage of ground balls, we would expect that pattern to continue from season to season. Additionally, many of the leverage metrics show average to above-average correlation, with pLI garnering the strongest relationship.

While the order of the strength of each correlation is pretty consistent with starters, the strength of those correlations differs in some significant ways.

Here is a comparison of reliever and starter statistic correlations–the final column shows the difference between reliever correlation and starter correlation for each statistic:

Statistic Relievers Starters Difference
GB% 0.80 0.85 -0.05
GB/FB 0.78 0.87 -0.09
FB% 0.77 0.86 -0.09
K% 0.67 0.82 -0.15
SwStr% 0.66 0.81 -0.15
BB% 0.54 0.67 -0.13
SIERA 0.53 0.72 -0.19
xFIP 0.52 0.68 -0.16
xFIP- 0.49 0.70 -0.21
FIP 0.42 0.59 -0.17
FIP- 0.41 0.58 -0.17
tERA 0.38 0.61 -0.23
AVG 0.37 0.53 -0.16
WHIP 0.28 0.41 -0.13
IFFB% 0.28 0.37 -0.09
WPA/LI 0.27 0.42 -0.15
HR/9 0.24 0.42 -0.18
ERA- 0.18 0.36 -0.18
WPA 0.18 0.33 -0.16
ERA 0.17 0.38 -0.21
LOB% 0.12 0.22 -0.10
E-F 0.11 0.11 0.00
BABIP 0.10 0.20 -0.10
BUH% 0.07 0.18 -0.11
HR/FB 0.07 0.29 -0.22
IFH% 0.06 0.11 -0.05
LD% 0.03 0.11 -0.08

In terms of basic batted ball data, the correlations for both sets of pitchers compare quite well. But after the first three, things really begin to separate.

For example, relievers have a year-to-year correlation of .67 in terms of their strike out rate. That’s a pretty solid correlation, but it’s .15 less than for starters. The same goes for swinging strike rate (-.15) and walk rate (-.13). Even an outcome such as HR/FB rate, which has a low correlation for starters (.29), is significantly less reliable for relievers (.07).
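One way to check whether a gap like this is larger than chance would produce is to compare the two correlations directly. The sketch below uses the standard Fisher z test for two independent correlations. It assumes the reliever and starter samples are independent, and the sample sizes are hypothetical because the article does not report them.

```python
from math import atanh, erfc, sqrt

def compare_correlations(r1, n1, r2, n2):
    """Fisher z test for the difference between two independent correlations.

    Returns (z, two-sided p-value) for H0: both true correlations are equal.
    """
    z1, z2 = atanh(r1), atanh(r2)
    se = sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    return z, erfc(abs(z) / sqrt(2))

# HR/FB correlations from the table above; n = 500 pairs per group is a
# made-up sample size used only to show the calculation.
z, p = compare_correlations(0.07, 500, 0.29, 500)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A test like this only says whether the observed correlations differ. It does not separate a real difference in talent stability from the effect of relievers having fewer batters faced per season.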

This lines up well with what Jeff Zimmerman and I found regarding pitcher aging and how it differs depending on a pitcher’s role.

Let’s take the example of strike outs. Jeff and I found that while starting pitchers were able to mitigate against their decline in velocity–and therefore experienced a less drastic decline in their strike out rate–relievers were far more dependent on their velocity. As a result, relievers generally were more likely to see sharper declines in strike out rates from year to year.

So, if the common wisdom says that reliever performance is more erratic from season to season than starter performance, then I’d say it’s pretty solid wisdom at this point.







Bill works as a consultant by day. In his free time, he writes for The Hardball Times, speaks about baseball research and analytics, consults for a Major League Baseball team and appears on MLB Network's Clubhouse Confidential. Along with Jeff Zimmerman, he won the 2013 SABR Analytics Research Award for Contemporary Analysis. Follow him on Tumblr or Twitter @BillPetti.


17 Responses to “Reliever Pitching Metric Correlations, Year-to-Year”

  1. rustydude says:

    This is some great research and information. Pieces like this one are what keep me coming back to FanGraphs.


    • Baltar says:

      I agree, mostly, but there is one important omission. Much of the reason for higher year-to-year correlation for starters must be due to the much larger sample sizes.


      • MGL says:

        Yes, that is correct. Without matching up the underlying number of TBF (so that it is the same for starters and relievers), a comparison of correlations is worthless.

        As it turns out, if you use the same number of underlying TBF (which is difficult obviously, since relievers max out at around 300 or so), you will find that relievers have a HIGHER correlation than starters. My guess is that their true talent changes less from season to season for a variety of reasons, perhaps one of them being more stable health due to fewer pitches thrown.


  2. BSLJeffLong says:

    I ran some analysis similar to this a while back, but focused on overall performance rather than individual statistics.

    My goal was to build a model that would let me accurately predict future performance based on how a relief pitcher performed in the past season. Unfortunately, I could never get the R^2 above .55. Ultimately I guess this confirms your findings above (or at least fails to refute them).


  3. Bill Josephson says:

    When you say significant, are you saying significant compared to a null of correlation = 0?


  4. question says:

    You did holds but not saves?


  5. Anon says:

    The last chart has potential sampling bias issues. The linked article referred to it as ‘survivor issues’.

    I mention this because you said, “starting pitchers were able to mitigate against their decline in velocity–and therefore experienced a less drastic decline in their strike out rate“. This isn’t necessarily true. I suspect many starters who can’t mitigate a velocity decline lose their starter role, which would leave a biased sample when comparing the population of starters at different ages.


  6. Adam says:

    Great research. I would’ve liked to see RE24 on this list though.


  7. Jim says:

    How does this impact Mariano Rivera?


  8. Bob says:

    CW is that relievers are fungible. The answer is probably, it depends.


  9. MGL says:

    The idea of relievers being “fungible” (because they fluctuate so much from year to year) is merely an illusion based on their small BF sample sizes.

    As I said above, if anything, relievers are MORE reliable than starters given the same sample sizes of TBF.

    Most teams do a poor job of constructing and paying a bullpen because they too are fooled by the small yearly sample sizes. There are not many teams (maybe none) that go exclusively by projections for players plus scouting reports, which is really all you should do.

    What am I leaving out (isn’t that all you CAN go by)? Many, if not all, teams, are fooled by placing too much emphasis on small samples (like one or even two years for a reliever) and by garbage stats (saves, wins, ERA, etc.).


    • Ruki Motomiya says:

      I agree with the general idea that the bullpen is less fungible than people think and that part of the fungibility is improper usage of bullpen and improper evaluation.


  10. MGL says:

    Note: ERA (or ERA+ or ERA-) is not quite a garbage stat, as it does somewhat approach a pitcher’s true talent at preventing runs in the long run. But in its raw version it includes sequencing, does not include unearned runs, and is not adjusted for park, defense, catcher, and opponents.


  11. LD50 club says:

    What am I missing? It seems if GB% and FB% are some of the highest correlations then LD% should also be a high correlation, but it is the lowest.


    • Kris says:

      LD% is pulled from the smallest sample of outcomes for a pitcher. If an RP has 3 GB and 3 FB in a year turn into LD with all else being equal, his GB% and FB% will not change much (because 3 doesn’t constitute a very big proportion of either of those outcomes), but 6 new LD will create a pretty big difference in LD% (because 6 constitutes a rather big proportion of line drives).


  12. Pedro Martinez says:

    Karim Garcia, who’s Karim Garcia?

