Plate Discipline Correlations, 2008-2013

In fall 2008 FanGraphs was kind enough to release new plate-discipline metrics, including first-pitch strike percentage (F-Strike %), outside-the-zone swing rate (O-Swing %), and inside-the-zone swing rate (Z-Swing %).  At the time, Eric Seidman was even kinder when he investigated the correlation of these plate-discipline statistics with standard pitcher metrics like WHIP, FIP, BB/9, and K/9. Very thoughtful indeed.

Now we have another 4.5 years of plate discipline data, compiled by PITCHf/x rather than Baseball Info Solutions. It may be worthwhile to see how these numbers compare with Seidman’s, and to add a measure of uncertainty to the correlations. Two factors can have a strong relationship, but because of small sample sizes or other sources of variability, the correlation estimate may not be as precise as a high R-value suggests.

Bootstrapping

Correlation coefficients, which fall between -1 and 1, allow us to measure the strength of linear dependence between two variables, such as O-Swing % and K %. We can use bootstrapping techniques to obtain 95% confidence intervals for these correlation coefficients. Calculating confidence intervals for correlations adds a measure of uncertainty to the process—narrow intervals indicate we can have greater confidence that the R-value we obtain represents the true correlation between the two metrics.
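For readers who want to see what goes into an R-value, here is a minimal Python sketch of the Pearson correlation coefficient. It is not the R code used for this article. The function name `pearson_r` and the sample numbers are made up for illustration and do not come from the FanGraphs data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squared deviations and of cross-products about the means.
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Made-up numbers for illustration only; not taken from the FanGraphs data.
o_swing = [0.25, 0.28, 0.30, 0.33, 0.35]
k_pct = [0.17, 0.19, 0.18, 0.23, 0.24]
r = pearson_r(o_swing, k_pct)
print(f"r = {r:.2f}")
```

A value near 1 or -1 means the points fall close to a straight line, and a value near 0 means there is little linear relationship.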

Bootstrapping is a statistical technique in which we repeatedly resample our current sample with replacement, in this case 500 times. This repeated process allows us to assign measures of accuracy to sample estimates, such as medians, means, or correlation coefficients. For our purposes here, it is only important to note that we can be 95% confident that the true R-value lies within the interval. If the interval includes 0, meaning absolutely no correlation, we can conclude that there is not enough evidence to indicate any relationship between the two variables.
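To make the resampling idea concrete, here is a minimal Python sketch of a bootstrap confidence interval for a correlation coefficient. It is not the R code behind this article. It assumes the simple percentile method, which may differ from the interval type actually used here, and the names `bootstrap_corr_ci`, `n_boot`, and `seed`, along with the simulated data, are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def bootstrap_corr_ci(x, y, n_boot=500, level=0.95, seed=None):
    """Percentile bootstrap interval for the correlation of paired data."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    if len(set(x)) < 2 or len(set(y)) < 2:
        raise ValueError("correlation is undefined when a variable is constant")
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    while len(estimates) < n_boot:
        # Resample (x, y) pairs with replacement, keeping each pair together.
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        # Redraw the rare resample in which one variable is constant.
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue
        estimates.append(pearson_r(xs, ys))
    estimates.sort()
    tail = (1 - level) / 2
    lower = estimates[math.floor(tail * (n_boot - 1))]
    upper = estimates[math.ceil((1 - tail) * (n_boot - 1))]
    return lower, upper


# Simulated data for illustration only; not the FanGraphs sample.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(100)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]
low, high = bootstrap_corr_ci(x, y, n_boot=500, seed=7)
print(f"r = {pearson_r(x, y):.2f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Resampling pitchers as whole pairs matters: shuffling one column independently of the other would destroy the relationship we are trying to measure.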

First Strike %

The correlations for F-Strike % correspond well enough to the values obtained by Seidman, with one exception worth noting. While he used K/9 and BB/9 to correlate with F-Strike %, here we examine the correlation with strikeout and walk percentages (K % and BB %). Our correlation coefficient for K % is similar in magnitude at .24 versus his .19, but its wide confidence interval approaches the null value of 0 and suggests the estimate is not very precise. This is worth noting, especially considering that BB % has such a strong correlation with F-Strike %, -.72, with a relatively narrow confidence interval. Seidman observed a similar pattern: pitchers who get into an 0-1 count are more likely to avoid walks than to rack up strikeouts.

First Strike %   R-Value   (95% CI)
K%               0.24      (.024, .455)
BB%              -0.72     (-.848, -.604)
WHIP             -0.52     (-.649, -.376)
FIP              -0.41     (-.576, -.237)

O-Swing %

O-Swing % is the percentage of a pitcher's pitches outside the strike zone that batters swing at anyway. Think anyone facing Pablo Sandoval. Here we again see moderate correlations with relatively tight confidence intervals ranging from 0.30 to 0.19. Pitchers who induce swings at pitches outside the zone may be especially tricky for hitters to do damage against. So far this season Adam Wainwright and Matt Harvey are both in the top three in O-Swing %, and in the top two in both WHIP and FIP.

O-Swing %   R-Value   (95% CI)
K%          0.39      (.274, .548)
BB%         -0.44     (-.637, -.254)
WHIP        -0.50     (-.677, -.317)
FIP         -0.45     (-.650, -.283)

Z-Swing %

We can see from the results below that Z-Swing %, the rate of inducing swings at pitches in the zone, bears little relationship with any of these metrics. Seidman’s analysis also showed that the correlations were negligible at best. The confidence intervals for all of these metrics include 0, meaning that we cannot be 95% confident that there is any relationship present. A quick glance at the leaderboards shows that Ian Kennedy and Miguel Gonzalez are near the top of the list this season, and these guys aren’t exactly shoving.

Z-Swing %   R-Value   (95% CI)
K%          -0.17     (-.370, .035)
BB%         -0.17     (-.381, .048)
WHIP        -0.09     (-.276, .111)
FIP         0.10      (-.090, .286)
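The "does the interval include 0" rule is easy to apply mechanically. Here is a small Python sketch that runs it on the intervals reported in this article. The helper name `includes_zero` is hypothetical, and the check only restates the tables; it does not recompute anything from the raw data.

```python
def includes_zero(lower, upper):
    """True when a confidence interval spans 0 (no detectable correlation)."""
    return lower <= 0.0 <= upper


# 95% confidence intervals as reported in the article's tables.
z_swing_ci = {
    "K%": (-0.370, 0.035),
    "BB%": (-0.381, 0.048),
    "WHIP": (-0.276, 0.111),
    "FIP": (-0.090, 0.286),
}
f_strike_ci = {
    "K%": (0.024, 0.455),
    "BB%": (-0.848, -0.604),
    "WHIP": (-0.649, -0.376),
    "FIP": (-0.576, -0.237),
}

for label, table in (("Z-Swing %", z_swing_ci), ("F-Strike %", f_strike_ci)):
    for metric, (lower, upper) in table.items():
        verdict = "includes 0" if includes_zero(lower, upper) else "excludes 0"
        print(f"{label} vs {metric}: ({lower:.3f}, {upper:.3f}) {verdict}")
```

Note how close the F-Strike % interval for K% comes to 0 at its lower end, which is why that estimate is treated with caution despite technically excluding 0.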

All data courtesy of FanGraphs.

Because I’m a believer in open data, you can find my R code here.

9 Responses to “Plate Discipline Correlations, 2008-2013”

1. stich09 says:

Can’t find your R code… I’m learning to play around with R and it would be fun to see your code.

2. Filip Piasevoli says:

I’m learning R as well and I’m looking forward to examining your code. Very interesting stuff!!

3. How did you get the correlation graphs into the article? I’ve written a few things for this section and haven’t had any luck with images. Thanks.

• Simon says:

In the HTML tab in WordPress there’s an “img” button where you can copy-paste the URL of an image to embed it in the article. Worked for me.

• Excellent thanks. Can’t believe I couldn’t find that.