Last week, I examined the factors that affect a pitcher’s HR/FB rate and constructed a model that can be used to predict the pitcher’s future rate of allowing home runs. Using that model, we can examine which pitchers truly have good and bad HR/FB rates, find the realistic range of HR/FB rates, and analyze the pitchers who over- or under-performed their projections.
With this new perspective, Matt Cain no longer looks like a pitcher with completely unexplainable HR/FB numbers. Instead, he looks like a pitcher who has the ideal skill set and ballpark to minimize HR/FB rate.
One quick housekeeping note before we get to the charts. Many of the comments on last week’s post asked about including a lagged HR/FB term in the equation. This is an excellent point that I forgot to touch on previously. A lagged HR/FB variable would be highly significant and influential in the model, but there is a reason why it was excluded: the goal of the model is to examine what factors could help a pitcher exert control over his HR/FB rate. Truthfully, including a lag would improve the predictions of the model on the whole, but it would create problems for its use. By excluding a lag, it’s possible to make HR/FB projections for rookies, imports, pitchers who change teams, and pitchers with evolving skill sets, rather than relying on the pitcher’s historical rate.
It’s also important to remember that these projections include only starters who threw at least 80 innings and did not switch teams during the season. Also note that, while much of this article is about where the projections differ from the actual results, overall the model’s predictions were accurate and had a lower forecast error than simply using the league average rate.
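For readers who want to run the same kind of check, here is a minimal sketch of comparing a model’s forecast error against the league-average baseline. It assumes root-mean-square error as the error measure, which is one common choice and not necessarily the one used for this article. The pitcher rates below are made-up illustrative numbers, not real 2010 data; only the 10.6% league average comes from the article.

```python
from math import sqrt

LEAGUE_AVG = 0.106  # league-average HR/FB rate quoted in the article


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of rates."""
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical actual and projected HR/FB rates for four pitchers
# (illustrative only, not taken from the article's data set).
actual = [0.070, 0.095, 0.120, 0.110]
model = [0.080, 0.100, 0.112, 0.104]

# The naive baseline predicts the league average for every pitcher.
baseline = [LEAGUE_AVG] * len(actual)

model_rmse = rmse(model, actual)
baseline_rmse = rmse(baseline, actual)

print(f"model RMSE:    {model_rmse:.4f}")
print(f"baseline RMSE: {baseline_rmse:.4f}")
```

If the model is adding information, its error should come in below the baseline’s, as it does for these illustrative numbers.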
With that out of the way, on to the leaderboards. The first chart shows the 10 pitchers with the lowest projected HR/FB rate.
The biggest thing to note above is that Matt Cain has the third-lowest projection of any pitcher. While he still out-performed his projected rate, he did so by a far smaller margin than if we had simply used the 10.6% league average. Cain was extremely good at keeping fly balls in the park last year, but only about one percentage point better than the model predicts.
The most surprising name on this list has to be Josh Beckett. The Red Sox starter is somewhat notorious for giving up a lot of home runs, but he fits many of the characteristics of a pitcher with a low HR/FB rate. Beckett throws hard, has a high strikeout rate, and plays in a park which, contrary to popular belief, depresses home runs. Although his 2010 HR/FB rate was in the teens, Beckett has been all over the charts during his career. Beckett posted a 7.2% HR/FB in 2003 and an astronomical 15% in 2006. The model suggests that Beckett’s true rate is between those two extremes, somewhere closer to 9%.
Every pitcher on the list of highest projected HR/FB rates has at least two of these three characteristics: a soft-tossing style, control problems, and an extremely hitter-friendly ballpark. It is no surprise that those three variables matter, but the model provides validation for those long-held assumptions.
Mark Buehrle is the surprising name on this list, but with his ballpark, soft-tossing style, and a career 10.3-percent HR/FB rate, all signs point to 2010 as a fluke.
The two preceding charts provide a floor and ceiling for expected HR/FB variation. According to these predictions, any pitcher with an HR/FB rate below 8.5% or above 11.6% should expect to see some regression to the mean in the near future. This also indicates that pitchers with HR/FB rates between those thresholds might be able to sustain that performance long-term.
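That rule of thumb can be written as a small helper. This is only a sketch: the two thresholds come from the paragraph above, while the function name and the labels it returns are made up for illustration.

```python
LOW_THRESHOLD = 0.085   # 8.5%, lower bound quoted in the article
HIGH_THRESHOLD = 0.116  # 11.6%, upper bound quoted in the article


def regression_outlook(hr_fb_rate):
    """Label an HR/FB rate by where it sits relative to the two thresholds.

    The labels are illustrative wording, not terms used by the model.
    """
    if hr_fb_rate < LOW_THRESHOLD:
        return "likely to rise"
    if hr_fb_rate > HIGH_THRESHOLD:
        return "likely to fall"
    return "potentially sustainable"


# Hypothetical rates, not tied to any real pitcher.
for rate in (0.070, 0.100, 0.150):
    print(f"{rate:.1%}: {regression_outlook(rate)}")
```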
Next, let’s look at the pitchers who under- or over-performed what the model predicted for them. Of the 81 starters who met the selection criteria in 2010, only four had a difference between the model’s projection and their actual rate that was significant at the 95-percent level, meaning that the model predicted a rate that was more than two standard deviations away from their actual rate.
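The two-standard-deviation test can be sketched as follows. The article does not spell out which standard deviation is used, so this sketch assumes a single standard deviation for the gap between actual and projected rates is available (for example, the model’s residual standard deviation). The function names and the numbers in the example are hypothetical.

```python
def deviation_in_sds(actual, projected, std_dev):
    """Number of standard deviations separating actual from projected."""
    return (actual - projected) / std_dev


def is_significant(actual, projected, std_dev, threshold=2.0):
    """True when the gap exceeds the threshold in either direction.

    threshold=2.0 follows the article's 'more than two standard
    deviations' wording for the 95-percent level.
    """
    return abs(deviation_in_sds(actual, projected, std_dev)) > threshold


# Hypothetical pitcher: projected 9.0%, actual 6.0%, assumed SD of 1.2 points.
print(is_significant(actual=0.060, projected=0.090, std_dev=0.012))
```

A negative value from `deviation_in_sds` means the pitcher allowed fewer home runs per fly ball than projected, and a positive value means he allowed more.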
There are numerous theories as to why these pitchers outperformed their projected HR/FB rates. It is quite possible that a variable could be added or tweaked to help explain these outliers. Outside of that, it is most likely the case that these results are just one-year flukes, and these pitchers’ HR/FB rates will fall closer to their predictions in 2011. All three pitchers who significantly outperformed their predictions have career rates far higher than their 2010 number.
Likewise, these pitchers may represent a flaw in the model that can be revised in future iterations. Even with a new variable or two, these pitchers likely had fluky 2010 HR/FB rates and are due to regress next season.