## Updated xBABIP Values

This past offseason, I found an xBABIP equation which correlated better than BABIP alone when looking at season 1 to season 2 values. By using the new Inside Edge data and player speed score, I kept the process simple yet accurate. I have tweeted out the results a few times during the season, but it is time for another full updated list.

The Inside Edge data tracks the normal bunts, grounders, fly balls and line drives. In addition to these four groups, they classify each batted ball as weak, medium or hard contact. For the xBABIP equation, I looked at all the line drives and all the hard hit fly balls and grounders. Additionally, speed is a component of getting on base, so Bill James’s Speed Score is also added into the calculation.
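To make the inputs concrete, here is a minimal sketch of how they could be assembled for one player. It assumes a plain linear form and rates measured per ball in play, neither of which is spelled out above. The field names, the `xbabip_inputs` and `linear_xbabip` helpers, and every number in the example are hypothetical placeholders, not the actual equation or its coefficients.

```python
from dataclasses import dataclass


@dataclass
class BattedBalls:
    """Hypothetical per-player counts; names are illustrative, not Inside Edge's schema."""
    line_drives: int
    hard_fly_balls: int
    hard_grounders: int
    balls_in_play: int
    speed_score: float  # Bill James Speed Score


def xbabip_inputs(p):
    """Return (line drive rate, hard fly ball rate, hard grounder rate, speed score)."""
    bip = p.balls_in_play
    return (
        p.line_drives / bip,
        p.hard_fly_balls / bip,
        p.hard_grounders / bip,
        p.speed_score,
    )


def linear_xbabip(inputs, intercept, weights):
    """Assumed linear form; intercept and weights would come from a fitted regression."""
    return intercept + sum(w * x for w, x in zip(weights, inputs))


# Placeholder values that only show the call shape; NOT the real coefficients.
player = BattedBalls(line_drives=60, hard_fly_balls=30, hard_grounders=45,
                     balls_in_play=300, speed_score=5.0)
inputs = xbabip_inputs(player)
estimate = linear_xbabip(inputs, intercept=0.2, weights=(0.3, 0.1, 0.2, 0.002))
print(f"{estimate:.3f}")
```

Passing the intercept and weights in as arguments keeps the sketch honest: it only shows which quantities feed the estimate, and any real values would have to come from fitting against actual season data.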

Here are the xBABIP and BABIP values as of yesterday, ranked from the highest xBABIP down.

Thoughts on a few players

Josh Harrison – I am sure people expect his BABIP and then his AVG to decline, but right now he may be undervalued. While he had a .297 AVG in the season’s first half, it has dropped to .211 so far in the second half. With at least 10 games at 2B, 3B and OF, he looks like he may be a poor man’s Zobrist. He is a nice bench player in daily lineup leagues and can be plugged in at several positions on other players’ off days.

Didi Gregorius – He has never really been a great hitter. This season he has used some patience and power to almost be a league average hitter (97 wRC+). With his xBABIP (.328) being 68 points higher than his actual BABIP, he could see his AVG start to creep up.

Matt Holliday – Talk about a sell low candidate. He is already having his worst season since 2005. Looking at his xBABIP (.271) vs BABIP (.304) values, his production could get even worse. Besides a possible lower AVG, his average HR and fly ball distance is at a six-year low:

2009: 296 ft

2010: 299 ft

2011: 295 ft

2012: 305 ft

2013: 289 ft

2014: 283 ft

You may want to see if someone in your league is looking to buy low on him and move him before his value declines even more.


Can you share some thoughts on AL hitters?

Well, that is a first. Usually people claim I don’t do enough NL players.

These three were just the ones that stood out the most to me.

Is there a way to incorporate balls that are pulled, hit to center, or hit to opposite field? For example, Austin Jackson’s average on fly balls this year is .180 whereas his career average is .232 on fly balls. If you look at his spray chart, you’ll notice that a higher percentage of his fly balls are going to center and right field this year. Earlier in the season he seemed to work on hitting more fly balls, while also letting the ball get deeper into the zone. This helped him reduce his strikeout rate early, but it was at the expense of hitting lazy fly balls to the opposite field. Now he’s sped up his timing and is pulling more balls in the last month which has increased his BABIP from .280 early this season to .333 now.

These data may be picked up in the regression analysis already, but I would think there’s at least a chance they could be statistically relevant.

First off, his xBABIP and BABIP are off by .001, so it seems like the data is getting picked up already. I am only looking at hard hit balls and line drives, so if he is hitting his fly balls harder, it is taken into account.

I am not really looking at why their BABIP is at a given level, but at what it is supposed to be.

I figured there would be substantial overlap. There definitely appears to be a correlation in the numbers of who is hitting the ball hard and the ratio of balls pulled rather than hit to opposite field. Yet I still think there could be a chance that there is some statistical relevance to using the location of where balls are hit and not just how hard they are hit.

Someone below mentioned defensive shifts and how it is affecting left handed hitters. To encompass that, I suspect that batting average on grounders is higher for right handed batters than left handed batters for a couple reasons. (1) I do think the defensive shifts are denting the batting averages of at least some left handed hitters, and (2) perhaps more importantly, nearly 63% of balls in play are pulled, and grounders hit to the left side of the infield are more likely to be hits simply by nature of the distance from first base. If a third baseman or shortstop bobbles a grounder, the runner is likely safe. If a first baseman or second baseman bobbles a ball, he is still likely able to make the play if the ball does not kick far away from him.

If you look at the players with batting averages above .300 on ground balls you’ll notice a couple of things. (1) Not surprisingly, most of the players are speedy, and (2) 86% of them are right handed. The only left handed hitters hitting above .300 on grounders are Brett Gardner, Adam Eaton, James Jones, and Dee Gordon. All are speedy and mainly singles hitters. The right handed hitters are generally speedy too, but the list also includes Miguel Cabrera, Jonathan Lucroy, Casey McGehee, Marlon Byrd, Troy Tulowitzki, J.J. Hardy, Paul Goldschmidt, and Marcell Ozuna. These players are not necessarily slow, but they aren’t known for their speed either.

My feeling is that if you just use grounders and judge whether they’re hit hard or soft, your data could be improved by also including the location of where each one was hit. I don’t expect this to dramatically change your results, but I think it’s possible it could improve them. I really appreciate this info you’ve provided because I have been wanting to see stats that could more accurately predict batting average, similar to the FIP and xFIP stats we have that can better predict ERA. Thanks for your work!

MCab, Tulo, Goldy, Lucroy and Byrd all hit the ball hard, which increases the likelihood that they get through. McGehee and Hardy have been very fortunate this year, while Ozuna has a history of high BABIPs in the minors.

Bobbled balls that allow runners to reach first are typically classified as errors.

If it’s a hard grounder that takes a bad hop, the third baseman or shortstop is often not charged with an error, resulting in a hit, whereas the same type of bobble by a first or second baseman often still results in an out.

Moreover, there are many balls that a shortstop ranges to get to that do not result in errors because the throw simply has no chance of making it in time. It is exceedingly rare for a second baseman to not be able to throw out a runner on any batted ball he can get to.

My inclination might be wrong. I don’t think it would be too difficult to test whether ground balls to the left side of the infield have a higher average than ground balls to the right side. However, I don’t know where to find this data.

Update: I found this article pretty interesting, and it gives the batting averages by type of hit and location on the field. http://www.crawfishboxes.com/2013/9/9/4677304/outcomes-of-flyballs-line-drives-and-groundballs-by-direction-2013

Ground balls that are hit to the opposite field are much more likely to be hits, and I assume this is heavily correlated with how hard the ground balls are hit. When a hitter rolls over on a ball, he hits a weak grounder. But if he doesn’t roll over, he is more likely to make better contact. To this extent, this will already be reflected in this xBABIP formula since it accounts for how hard balls are hit.

However, there does seem to be a statistically relevant characteristic for balls hit to the left side of the field. Overall, from the data in the linked article, left handed hitters have a .226 average on ground balls and righties have a .244 average (extrapolated by assuming a 75% rate of ground balls being pulled). If we assume that ground balls hit to the opposite field are generally hit harder than pulled ground balls, we see that lefties have the advantage in ground ball batting average (.336 vs. .329). Making the opposite conclusion for pulled ground balls, righties have the advantage (.216 vs. .189). In both cases, the left side of the field yields the higher ground ball average. Thus, weak grounders to left are more likely to be hits than weak grounders to right, and hard ground balls to left are more likely to be hits than hard grounders to right.
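The overall ground ball averages quoted above can be reproduced from the pulled and opposite-field splits. This is just a sketch of that arithmetic, using the four split averages and the assumed 75% pull rate from the paragraph above; the `blended_average` helper name is made up for illustration.

```python
def blended_average(pulled_avg, oppo_avg, pull_share=0.75):
    """Weight pulled and opposite-field ground ball averages by the assumed pull share."""
    return pull_share * pulled_avg + (1 - pull_share) * oppo_avg


# Split averages as quoted above; the 75% pull share is an assumption, not measured.
lefty = blended_average(pulled_avg=0.189, oppo_avg=0.336)
righty = blended_average(pulled_avg=0.216, oppo_avg=0.329)
print(f"LHH: {lefty:.3f}  RHH: {righty:.3f}")
```

Changing `pull_share` shows how sensitive the lefty/righty gap is to that assumption.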

This isn’t necessarily conclusive (somewhat small sample size), but I do think it is compelling evidence that including the location of ground balls hit would yield a higher R² for the regression analysis.

Steamer is still very high on Holliday. What do I trust if not the projections?!

Hi Jeff,

I think your formula could probably use some adjusting for the shift. Judging by your post, it looks like you’d have to use data other than just Inside Edge, but a lot of the people near the bottom of the Actual – Expected column are heavily shifted LH hitters, enough that it’s worth accounting for.

Although his batted ball mixture has definitely changed this year, I think I trust Trout’s career .366 BABIP in 1932 PAs more than this xBABIP calculation of .347. After all, he has added 10 feet of HR/FB distance this year, so maybe he’s hitting the ball harder to make up for fewer line drives, and this is somehow not getting fully accounted for in xBABIP.

I had been thinking that Solarte was actually underrated due to a low babip and assumed his xbabip would be higher. I think I misread his line drive rate, which I thought was in the mid 20s but evidently is 19.7%. And it turns out he has a double digit IFFB%, so his xbabip is .254. Yikes.

His minor league numbers weren’t exactly awful, in fact in 2011 they were downright strong.

I see him as probably a 90 wRC+ guy, or perfectly acceptable as an MLB middle infielder. If he takes Headley’s place at 3B for the rest of the season, he’s still probably gonna outhit Frenchy.

He’s been playing at a 2 WAR pace so certainly acceptable. But I had suspected his babip was unluckily low when it’s actually high per xbabip.

Nice to see Dickerson, while a little lucky, has the highest xbabip. Guy sprays the ball all over.

Is there any hope that xBABIP can be added into the existing sortable stats? This is extremely valuable.

Unluckiest (min. 300 ABs)

C Carter

B Roberts

N Schierholtz

M Teixeira

D Ortiz

C Davis

M Zunino

E Encarnacion

J Donaldson

B Dozier

Lots of shift candidates on that list.

many in the AL east (roberts, tex, ortiz, davis, edwin)

Couldn’t find this anywhere so sorry if I missed it. Is this ROS or expected season total?

Is there a system that tracks batted balls that are just foul? I realize applying such data would require a qualitative overlay but it could be an interesting and potentially rewarding exercise. I’d be interested to know which, if any, players are a minor adjustment away from reaping the rewards of better timing. This is partly inspired by Gabe Kapler’s recent Oscar Taveras writeup in which he suggested that a lack of rhythm is the source of his struggles.

I think my first reaction to this was surprise at how bad David Wright has been this season. I wonder if this is just a fluke season for him or the start of a major and rapid decline.

Jeff, IMO the fact that your xBAbip equation spit out a .290 for Chris Johnson (off by 66 points) is NOT an indictment of your system.

If anything, the Inside Edge data gives us some insight and proves once and for all that CJ’s high BAbip is NOT caused by him making hard contact consistently. Really, I think it’s his skill at placing soft fliners right over the second baseman’s head that is the main cause.

I haven’t seen a hitter so skilled at doing that since Rod Carew. (NOT comparing CJ to Rod!)