A few weeks ago, Eno Sarris took a look at a few batters with high swinging-strike rates and average strikeout rates, showing that a batter with a penchant for (or weakness in) whiffing on pitches doesn’t necessarily post as high a number of strikeouts as you would expect. Josh Hamilton, Delmon Young, and Vladimir Guerrero were identified as players who combine decent strikeout rates with high swinging-strike rates. These batters are known as free swingers and post below-average walk rates. Their aggressive approach leaves fewer opportunities for both strikeouts and walks, since they try to put the ball in play early in the count.

This got me thinking: If some batters avoid strikeouts by (presumably) swinging early, are there batters who strike out too often because they don’t swing enough? Swinging strikes are clearly not the only way to strike out, and a batter who leaves the bat on his shoulder too often will take a lot of called strikes. A conservative approach, with few swings at anything in the hopes of drawing a walk, could backfire. Such batters do exist; it’s just a matter of identifying who they are.

I plotted K/PA against SwStr% for all 2010 batters with 200+ PA. Take a look below:

Please click here for an adjusted and embiggenzified image if you want to see the names. It may be more useful to *right click* and open the link and view it in a new window or tab.

There are plenty of outliers worth looking at. Based on their SwStr%, other batters with lower K rates than you would expect (a la Vlad) include Jake Fox, Juan Uribe, A.J. Pierzynski, and Pedro Feliz. On the other side, batters with higher K rates than expected include Brett Gardner, Eric Patterson, and Wes Helms.

Just eyeballing this scatter plot tells us that there is indeed a decent positive relationship between SwStr% and K rate for batters (and why not?). Note also that if we ignore outliers such as Mark Reynolds (chuckle), Rick Ankiel, Miguel Olivo, Fox, Guerrero, and Patterson, the spread of K/PA around the trend looks about the same at every value of SwStr% (that is, as SwStr% increases, the variance in K/PA stays approximately constant). In the statistics world, data that behaves this way is said to exhibit homoscedasticity, as opposed to heteroscedasticity, where the variance changes dramatically with the *x* value.

A regression on this relationship shows a positive trend between the two stats with a decent correlation coefficient of 61.6%. Using the regression model to predict K/PA, I found the “expected K rate” or expected K/PA based on SwStr%.
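Written out, the model described here is a simple one-variable linear regression. The symbols β₀ and β₁ below are placeholders for the fitted intercept and slope, which this post does not list; the sign convention for Diff is taken from the tables that follow.

```latex
\[
\mathrm{xK/PA} = \beta_0 + \beta_1 \cdot \mathrm{SwStr\%},
\qquad
\mathrm{Diff} = \mathrm{K/PA} - \mathrm{xK/PA}
\]
```

Under this convention, a negative Diff marks a batter who struck out less often than his SwStr% alone would suggest.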

Here are the top batters with 500+ PA who struck out “less” than expected, sorted by the difference between actual and expected K rate. K/PA is the actual strikeout rate, xK/PA is the expected rate from the regression, and Diff is K/PA minus xK/PA:

| Name | PA | Swing% | Contact% | SwStr% | K/PA | xK/PA | Diff |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Vladimir Guerrero | 643 | 60.6% | 80.3% | 11.3% | 9.3% | 22.2% | -12.8% |
| A.J. Pierzynski | 503 | 56.7% | 86.3% | 7.5% | 7.8% | 16.8% | -9.0% |
| Josh Hamilton | 571 | 55.3% | 75.1% | 13.3% | 16.6% | 25.0% | -8.4% |
| Juan Uribe | 575 | 54.8% | 76.8% | 12.4% | 16.0% | 23.7% | -7.7% |
| Delmon Young | 613 | 59.0% | 82.4% | 10.2% | 13.2% | 20.6% | -7.4% |
| Brandon Phillips | 687 | 52.7% | 81.9% | 9.3% | 12.1% | 19.3% | -7.2% |
| Vernon Wells | 646 | 50.8% | 81.1% | 9.6% | 13.0% | 19.7% | -6.7% |
| Pablo Sandoval | 616 | 57.8% | 82.8% | 9.3% | 13.1% | 19.3% | -6.2% |
| Jeff Francoeur | 503 | 60.4% | 80.5% | 11.3% | 16.1% | 22.2% | -6.1% |
| Carlos Quentin | 527 | 50.6% | 77.5% | 11.0% | 15.7% | 21.7% | -6.0% |

And here are the batters who struck out “more” than expected:

| Name | PA | Swing% | Contact% | SwStr% | K/PA | xK/PA | Diff |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Brett Gardner | 569 | 31.0% | 90.6% | 2.9% | 17.8% | 10.2% | +7.6% |
| Casey Blake | 571 | 41.8% | 80.2% | 8.0% | 24.2% | 17.5% | +6.7% |
| Colby Rasmus | 534 | 46.7% | 75.7% | 10.9% | 27.7% | 21.6% | +6.1% |
| Drew Stubbs | 583 | 43.9% | 72.3% | 11.7% | 28.8% | 22.7% | +6.1% |
| Bobby Abreu | 667 | 32.9% | 83.1% | 5.4% | 19.8% | 13.8% | +6.0% |
| Justin Upton | 571 | 41.5% | 74.3% | 10.2% | 26.6% | 20.6% | +6.0% |
| Adam LaRoche | 615 | 45.2% | 74.1% | 11.3% | 28.0% | 22.2% | +5.8% |
| Austin Jackson | 675 | 47.0% | 79.4% | 9.4% | 25.2% | 19.5% | +5.7% |
| Adam Dunn | 648 | 45.0% | 68.2% | 13.8% | 30.7% | 25.7% | +5.0% |
| Mark Reynolds | 596 | 47.0% | 62.2% | 17.1% | 35.4% | 30.4% | +5.0% |
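The regression coefficients are not printed anywhere in this post, but since xK/PA is a linear function of SwStr%, they can be backed out from the two tables above. The following is a minimal Python sketch, not the original calculation: the helper names `fit_line` and `expected_k` are made up for illustration, and the inputs are the rounded (SwStr%, xK/PA) pairs from the tables, so the recovered numbers are only approximate.

```python
# (SwStr%, xK/PA) pairs copied from the two tables above, in percent.
# Batters sharing an identical pair (e.g. Sandoval and Phillips) appear once.
published = [
    (11.3, 22.2),  # Vladimir Guerrero
    (7.5, 16.8),   # A.J. Pierzynski
    (13.3, 25.0),  # Josh Hamilton
    (12.4, 23.7),  # Juan Uribe
    (10.2, 20.6),  # Delmon Young
    (9.3, 19.3),   # Brandon Phillips
    (9.6, 19.7),   # Vernon Wells
    (11.0, 21.7),  # Carlos Quentin
    (2.9, 10.2),   # Brett Gardner
    (8.0, 17.5),   # Casey Blake
    (10.9, 21.6),  # Colby Rasmus
    (11.7, 22.7),  # Drew Stubbs
    (5.4, 13.8),   # Bobby Abreu
    (9.4, 19.5),   # Austin Jackson
    (13.8, 25.7),  # Adam Dunn
    (17.1, 30.4),  # Mark Reynolds
]


def fit_line(points):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(published)


def expected_k(swstr_pct):
    """Expected K/PA (percent) for a given SwStr% (percent)."""
    return intercept + slope * swstr_pct


# Largest gap between the recovered line and the published xK/PA values.
max_residual = max(abs(y - expected_k(x)) for x, y in published)

print(f"xK/PA ~= {intercept:.2f} + {slope:.2f} * SwStr%")
print(f"largest mismatch vs. published xK/PA: {max_residual:.2f} points")
```

Run on these pairs, the fit lands at roughly 1.4 points of K/PA per point of SwStr%, with an intercept near 6%. Any leftover mismatch comes from the tables being rounded to one decimal place.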

One consequence of homoscedastic data shows up in the tables above: SwStr% appears to have no bearing on whether a batter struck out more or less than expected. It also should have no bearing on how closely the expected K rate predicts the actual K rate.

Another trend to note is that the first group of batters swing at pitches a lot more often than the second group of batters. Guys like Brett Gardner and Bobby Abreu in the second group swing so rarely that merely an average or slightly below average K rate will place them high on this list (low Swing% leads to low SwStr%, which predicts a low xK/PA via the regression model).
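The link between Swing% and SwStr% in that parenthetical can be checked with quick arithmetic. The sketch below assumes SwStr% is swinging strikes per pitch and Contact% is contact per swing; those definitions are an assumption on my part, not something spelled out in this post, and the function name `approx_swstr` is made up for illustration. Under that assumption, SwStr% should be roughly Swing% × (1 − Contact%).

```python
def approx_swstr(swing_pct, contact_pct):
    """Approximate swinging strikes per pitch (percent), assuming
    SwStr% ~= Swing% * (1 - Contact%)."""
    return swing_pct * (100.0 - contact_pct) / 100.0


# (Swing%, Contact%, listed SwStr%) copied from the tables above.
rows = {
    "Brett Gardner": (31.0, 90.6, 2.9),
    "Bobby Abreu": (32.9, 83.1, 5.4),
    "Vladimir Guerrero": (60.6, 80.3, 11.3),
    "Mark Reynolds": (47.0, 62.2, 17.1),
}

for name, (swing, contact, listed) in rows.items():
    approx = approx_swstr(swing, contact)
    print(f"{name}: approx {approx:.1f}% vs. listed {listed}%")
```

The product tracks the listed SwStr% closely for Gardner and Abreu but runs a bit high for Guerrero and Reynolds, so treat it as an approximation rather than an exact identity. It does illustrate the point: a batter who rarely swings has a hard ceiling on his SwStr%, and therefore on his xK/PA.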

So what does this all mean? I’m not exactly sure yet. This is a running commentary, and there is clearly more work to be done. Plenty of interesting follow-up studies come to mind:

– How do low-swing and high-swing batters distribute swings based on the count?

– Which batters strike out via swinging strikes the most? Via called strikes?

– Can swing rate and swinging-strike rate (and others) predict strikeout rate?

– Is there such a thing as batters who should *swing more* in order to avoid strikeouts?

– Aggressive vs. conservative approach: Which to use based on ability to make contact?

– And how about pitchers?

– Etc. (Any other thoughts?)

Concerning the third point, you might expect that batters who swing a lot also tend to strike out a lot. It turns out that there is little correlation between the two once you consider how widely Major League hitters vary in their ability to make contact and put the ball in play. Multicollinearity would also be a concern in a multiple regression model that uses both swing rate and swinging-strike rate to predict strikeout rate, since the two predictors are related to each other.

At this point, I suppose the end goal is to find out which batters swing at pitches purposefully and which do so recklessly, and how each approach helped or hurt the batter in terms of strikeout rate. More on this to come. Feel free to post your ideas, thoughts, or criticisms below on investigating the relationship between plate discipline statistics and strikeout rate.