Yesterday, we took a look at the starting pitchers with the biggest difference between their ERAs and their Expected Fielding Independent ERAs, attempting to find which hurlers performed above or below their peripheral stats in 2009.
Today, let’s turn our attention to the hitters. I compiled a list of the batters (minimum 350 plate appearances) with the biggest gap between their batting average on balls in play (BABIP) and their expected batting average on balls in play (xBABIP).
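For readers who want the arithmetic behind BABIP itself, here is a minimal sketch. It assumes the commonly used definition, (H - HR) / (AB - K - HR + SF), which this article does not spell out. The function name `babip` and the stat line are made up for illustration.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play, assuming the common definition:
    (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Made-up stat line, for illustration only (not a real player).
example = babip(hits=150, home_runs=20, at_bats=500, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")  # 130 / 385, prints 0.338
```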
What’s xBABIP? Last winter, Chris Dutton and Peter Bendix sought to find which variables were most strongly correlated with a batter’s BABIP. Using data from the 2002-2008 seasons, Dutton and Bendix found that a hitter’s eye (BB/K ratio), line drive percentage, speed score and pitches per plate appearance had a positive relationship with BABIP (the better a batter rated in those areas, the higher his BABIP). Pitches per extra-base hit, fly ball/ground ball rate, spray (distribution of hits to the entire field) and contact rate had a negative relationship with BABIP. From this research, they created a model for predicting a batter’s BABIP.
Prior to Dutton and Bendix’s work, a lot of people used to calculate a hitter’s expected batting average on balls in play by taking line drive rate and adding .120. It made some sense: line drives have the highest batting average of any batted ball type by far, falling for a hit well over 70 percent of the time.
However, line drive rates don’t show a high correlation from year to year. That makes the “LD% plus .120” method unreliable. Dutton and Bendix’s model showed a 59 percent correlation between actual and expected BABIP. The “LD% plus .120” method showed just an 18 percent correlation.
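The older rule of thumb is simple enough to write out. This sketch assumes line drive rate is expressed as a fraction (19 percent becomes 0.19); the function name `quick_xbabip` and the sample rates are illustrative only.

```python
def quick_xbabip(line_drive_rate):
    """Old rule of thumb: expected BABIP = line drive rate + .120.
    line_drive_rate is a fraction, e.g. 0.19 for a 19 percent LD%."""
    return line_drive_rate + 0.120


# Illustrative rates only: a two-point swing in LD% moves the estimate
# by the same two points, which is why a noisy LD% makes it unreliable.
for ld in (0.17, 0.19, 0.21):
    print(f"LD% {ld:.0%} -> xBABIP {quick_xbabip(ld):.3f}")
```

Because the estimate moves one-for-one with line drive rate, any year-to-year noise in LD% passes straight through to the expected BABIP.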
Some of the numbers used in Dutton and Bendix’s study are not readily available. However, Derek Carty of The Hardball Times and Slash12 of Beyond the Box Score have both come up with expected batting average on balls in play calculators based on the new findings.
For the purposes of this article, I used Slash12’s calculator. It uses the following variables:
- Line Drive Percentage (LD%)
- Ground Ball Percentage (GB%)
- Fly Ball Percentage (FB%)
- Infield Fly Ball Percentage (IFFB%)
- Home Run/Fly Ball Percentage (HR/FB%)
- Infield Hit Percentage (IFH%)
While not identical to the variables used by Dutton and Bendix, these batted ball numbers do a good job of taking into account the aspects that lead to a higher or lower BABIP.
First, a disclaimer. Like the ERA-xFIP charts from yesterday, these lists of “lucky” and “unlucky” hitters are based on just one year of data. To get a better feel for how a hitter will perform in the future, it’s vital to take a good hard look at multiple seasons’ worth of performance. This is just a quick-and-dirty exercise.
To provide a little more context, I also included each batter’s actual BABIP since 2007, when possible. The three-year averages help us get a better picture of each hitter, and help us figure out which batters might be “tricking” the xBABIP calculator based on one year of aberrant batted ball numbers.
Take Jason Kendall, for instance. Kendall had a 12 percent infield hit rate in 2009, compared to a 7.6 percent career average. The calculator doesn’t know that Kendall’s ankle exploded like a cheap Acme bomb a decade ago, and that he’s a 35-year-old catcher who has posted a BABIP under .270 since 2007. It thinks he has speed due to the infield hit rate. That’s why you need to look at multi-year numbers.
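The comparison itself is just a subtraction and a sort. The sketch below uses invented hitters and numbers, not the real 2009 data, and the .050 cutoff for flagging a possibly fooled calculator is an arbitrary assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HitterLine:
    name: str
    babip: float                 # actual single-season BABIP
    xbabip: float                # calculator's expected BABIP
    babip_3yr: Optional[float]   # actual BABIP since 2007, if available


def gap(h):
    """Positive: BABIP beat xBABIP. Negative: BABIP fell short."""
    return h.babip - h.xbabip


# Invented numbers for illustration only.
hitters = [
    HitterLine("Hitter A", 0.360, 0.320, 0.325),
    HitterLine("Hitter B", 0.270, 0.310, 0.305),
    HitterLine("Hitter C", 0.300, 0.335, 0.268),
]

over = sorted((h for h in hitters if gap(h) > 0), key=gap, reverse=True)
under = sorted((h for h in hitters if gap(h) < 0), key=gap)

# Arbitrary cutoff (an assumption): xBABIP far above the multi-year BABIP
# suggests one odd season of batted ball data may be fooling the calculator.
CUTOFF = 0.050
suspicious = [
    h for h in hitters
    if h.babip_3yr is not None and h.xbabip - h.babip_3yr > CUTOFF
]

for h in over + under:
    print(f"{h.name}: gap {gap(h):+.3f}")
print("Check multi-year numbers:", [h.name for h in suspicious])
```

In this made-up sample, Hitter C plays the Kendall role: the calculator expects a rebound, but the multi-year BABIP says otherwise.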
Here are the hitters with actual batting average on balls in play figures exceeding the expected batting average on balls in play numbers. These are the guys who might see their batting averages fall in 2010:
Higher BABIP than xBABIP
And here are the batters with actual BABIPs falling well short of the xBABIP totals. These hitters could experience a bounce-back in 2010:
Lower BABIP than xBABIP