## Expected BABIP for Pitchers

Recently on FanGraphs, we’ve been referring to a stat called xBABIP, or Expected Batting Average on Balls in Play, to help justify a pitcher’s current BABIP. There have been a few questions about what this stat means, so I thought it’d be as good a time as any to explain the ins and outs of this particular metric.

The original idea behind BABIP is that pitchers do not have control over what happens to balls once they are hit into the field of play.

BABIP typically fluctuates from year to year with a baseline of around .300. If a pitcher has a particularly high or low BABIP, we may say he’s been lucky or unlucky. Things are of course not quite this simple, but for the most part the rule holds true.

Enter batted-ball data: we know how many line drives, fly balls, and ground balls a pitcher allows into play. Line drives fall for hits most often, and ground balls fall for hits more often than fly balls. The types of batted balls a pitcher allows into play are going to affect his overall BABIP.

BABIP by Type (2007):
Fly Balls – .15
Ground Balls – .24
Line Drives – .73

To find a player’s expected BABIP, the formula is going to look something like this:
expected BABIP = .15 * FB% + .24 * GB% + .73 * LD%
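
As a rough sketch, the formula above can be written as a small Python function. The function name, the dictionary of weights, and the choice to pass rates as fractions (0.20 rather than 20) are assumptions made for illustration; the weights themselves are the 2007 values listed above, and the sample pitcher is hypothetical.

```python
# 2007 BABIP by batted-ball type, taken from the list above.
BABIP_BY_TYPE = {"fb": 0.15, "gb": 0.24, "ld": 0.73}


def expected_babip(fb_pct, gb_pct, ld_pct):
    """Weight each batted-ball rate by that type's BABIP.

    Rates are fractions of balls in play (0.20, not 20) and
    should sum to roughly 1.0.
    """
    return (
        BABIP_BY_TYPE["fb"] * fb_pct
        + BABIP_BY_TYPE["gb"] * gb_pct
        + BABIP_BY_TYPE["ld"] * ld_pct
    )


# Hypothetical pitcher: 37% fly balls, 43% ground balls, 20% line drives.
print(round(expected_babip(0.37, 0.43, 0.20), 3))
```

Because the line drive weight is so much larger than the other two, small changes in LD% move the result far more than the same change in FB% or GB%.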

For more accuracy you could remove home runs from the batted ball percentages at a rate of 92% from fly balls and 8% from line drives. You could even account for infield fly balls and remove that from total fly balls, but the formula above will get you pretty far.
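
Here is one possible reading of the home run adjustment, sketched with raw batted-ball counts rather than percentages. The function name, the count-based inputs, and the decision to recompute the rates over the remaining balls in play are all assumptions; the text only gives the 92%/8% split. The sample season is hypothetical.

```python
# 2007 BABIP by batted-ball type, taken from the list above.
FB_BABIP, GB_BABIP, LD_BABIP = 0.15, 0.24, 0.73


def expected_babip_hr_adjusted(fb, gb, ld, hr):
    """Expected BABIP after removing home runs from batted-ball counts.

    Assumption: 92% of home runs are subtracted from fly balls and 8%
    from line drives, then the rates are recomputed over what remains.
    """
    fb_in_play = fb - 0.92 * hr
    ld_in_play = ld - 0.08 * hr
    total = fb_in_play + gb + ld_in_play
    if total <= 0:
        raise ValueError("no balls in play after removing home runs")
    weighted_hits = (
        FB_BABIP * fb_in_play + GB_BABIP * gb + LD_BABIP * ld_in_play
    )
    return weighted_hits / total


# Hypothetical season: 200 fly balls, 250 ground balls, 110 line drives, 20 HR.
print(round(expected_babip_hr_adjusted(200, 250, 110, 20), 3))
```

Infield flies could be handled the same way, by subtracting them from the fly ball count before the rates are recomputed.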

A couple of years ago, Dave Studeman calculated that adding .12 to LD% was good enough for a ballpark estimate of a player’s expected BABIP. This is what you’ll often see writers on FanGraphs refer to as xBABIP.
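
The shortcut is simple enough to fit in one line. In this minimal sketch the function name is an assumption, and LD% is passed as a fraction; the 22% input is the same example used later in this post.

```python
def quick_xbabip(ld_pct):
    """Ballpark expected BABIP: line drive rate (as a fraction) plus .12."""
    return ld_pct + 0.12


print(round(quick_xbabip(0.22), 3))  # 22% line drives -> 0.34
```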

The best way to use this statistic is to attempt to validate a pitcher’s current BABIP. For instance, a pitcher might have a high line drive percentage and a high BABIP. This would give the pitcher a high xBABIP as well, and you could say: “Yes, his high line drive percentage is responsible for his high BABIP.”

While this is useful for looking at past performance, the difference between xBABIP and BABIP should not be used to evaluate future performance. This is because LD% and BABIP are somewhat independent of each other: there is some correlation between them, but not enough to suggest that they will always track each other.

LD% itself is highly variable. It would be difficult to say that a pitcher with a BABIP of .300 and a LD% of 22% (xBABIP of .340) should do considerably worse going forward, because you really don’t know what his LD% is going to be for the rest of the season. His xBABIP of .340 describes what his BABIP was expected to be given the batted balls he has already allowed; it is not his expected BABIP going forward. Typically, a pitcher’s expected BABIP in the future will be around the original baseline of .300.

David Appelman is the creator of FanGraphs.

### 8 Responses to “Expected BABIP for Pitchers”

1. Brett says:

Cool.

So one of the main points Voros originally made is that BABIP correlates very poorly from year to year. But we see that there is a relationship between BABIP and hit type, specifically LD%, as its coefficient makes it the dominant term in your equation. Does this mean that LD% doesn’t correlate well year to year?

2. Brett: Exactly.

3. MrLomez says:

I get that LD% doesn’t stay consistent year to year, but why? If certain pitchers have extreme FB tendencies and others GB tendencies, why wouldn’t still other pitchers be more prone to a consistently high LD%?

Do any variables correlate to LD%? I’m thinking maybe pitch-type stats, like fastball%.

4. Dave Cameron says:

My theory – groundball and flyball rates are a function of fastball location. Groundball pitchers pitch down in the zone (usually with a two-seam fastball) and flyball pitchers pitch up in the zone (always with a four seam fastball). Both of these are by design – the guys who pitch up generally have good enough fastballs to get by hitters and rack up the strikeouts, and the guys who pitch down are just getting as many ground balls as they can.

To give up a lot of line drives, you’d have to consistently pitch in the middle of the strike zone, and really, that’s a pretty terrible idea, so pitchers don’t do it. Line drive rate is a function of missing location (either up or down), and mistakes have more variance.

5. Dave, have you ever seen these posts I did a couple years ago: Pitch Location and Ground Balls, Pitch Zone Charts

Think they definitely help support your theory.

6. Dave Cameron says:

Yea, those posts were great. Perhaps it’s more your theory than mine…

7. MrLomez says:

David and Dave,

Thanks for those insights.

And where can I look at those posts with the Pitch Zone Charts?

8. You can click on the links in that post. I guess they’re not really highlighted, but just click on “Pitch Location and Groundballs” or “Pitch Zone Charts” in the above post.