# Hitting ‘Em Where They Are

*At the MIT Sloan Analytics Conference, Dan Rosenheck offered a presentation on the effects of infield fly rate and in-zone contact rate on predicting BABIP. He has generously let us republish his talk here. The full text of his presentation is published below. You can find more of Dan’s work at The Economist and The New York Times, or check out his previous project on Wins Above Replacement*.

Long before baseball’s statistical revolution entered the mainstream, Hollywood writers showed an impressive grasp of one of the game’s most confounding nuances. In the 1988 movie “Bull Durham,” Kevin Costner plays Crash Davis, a career minor league catcher tasked with tutoring the pitching prospect Nuke LaLoosh. Explaining the cruel randomness that determines the fates of so many aspiring major leaguers, Crash gives Nuke a brief math lesson at a bar.

Crash’s speech actually contradicted much of conventional wisdom about baseball at the time. Ever since the Hall of Famer Wee Willie Keeler explained the secret to his success as “hit ‘em where they ain’t,” fans and sportswriters have generally given hitters credit for expertly placing their seeing-eye grounders and Texas Leaguers into the gaps between opposing fielders. Similarly, they have praised pitchers who seem to induce opposing hitters into serving up a steady stream of routine plays for the defense. At one point, even Crash himself preaches the value of pitching to contact. “Don’t try to strike everybody out,” he advises Nuke. “Strikeouts are boring. Besides that, they’re fascist. Throw some ground balls. It’s more democratic.”

That school of thought went unquestioned until 1999, when Voros McCracken, then a 28-year-old paralegal in Chicago, unveiled what was arguably the most important discovery in baseball analytics of the last 20 years. Voros, if you’re in the audience, please tip your cap. Seeking an edge in his fantasy baseball league, he began investigating how well one could predict pitching statistics by looking at past performance. He broke up the events on a baseball field into two groups: those determined entirely by the pitcher and hitter, and those that required the defense to make a play. The former, what Crash would call “fascist” plays, included strikeouts, walks, and home runs—later dubbed the “Three True Outcomes” by McCracken’s devotees—as well as hit by pitch. The latter, Crash’s “democratic” results, incorporated singles, doubles, triples, and all fielded outs.

The results were stunning.

The defense-independent numbers were fairly steady: pitchers who struck out a lot of batters one year were extremely likely to do so in the following season, and those who walked too many guys once tended to do it again as well. Home runs seemed to be a bit messier, but there was no doubting that some pitchers had severe, incurable gopheritis year in and year out. But batting average on balls in play—the share of fielded balls that became singles, doubles, or triples—was a crapshoot. Using a standard measure of correlation known as r, its consistency from year to year came in at 0.153—barely half the level that social scientists have long mocked as meaningless with the quip, “The world is correlated at 0.3.” There seemed to be no rhyme or reason to the leaderboards. In 1999, Pedro Martínez and Greg Maddux had among the worst BABIP’s in baseball; the following year they posted two of the best.
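For readers who want to see what a year-to-year correlation looks like in practice, here is a minimal sketch of the calculation. The five-pitcher samples are hypothetical numbers invented for illustration, not McCracken's data, and `pearson_r` is just an illustrative helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient r between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: the same five pitchers in two consecutive seasons.
k_rate_year1 = [0.15, 0.18, 0.21, 0.24, 0.27]
k_rate_year2 = [0.16, 0.17, 0.22, 0.23, 0.28]
babip_year1 = [0.275, 0.310, 0.290, 0.265, 0.300]
babip_year2 = [0.285, 0.296, 0.280, 0.292, 0.287]

r_strikeouts = pearson_r(k_rate_year1, k_rate_year2)  # close to 1: a stable skill
r_babip = pearson_r(babip_year1, babip_year2)         # much weaker: mostly noise
print(f"strikeout rate r = {r_strikeouts:.3f}, BABIP r = {r_babip:.3f}")
```

In this made-up sample the strikeout rates line up almost perfectly from one year to the next while the BABIP figures barely track each other, which is the pattern the paragraph above describes.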

McCracken did not shy away from the intellectual consequences of his discovery. As hard to believe as it seemed, Wee Willie Keeler was wrong and Crash Davis was right: pitchers had little to no control over the outcomes of balls hit into play against them. The only difference between Pedro Martínez and Pedro Borbón was the former’s superior strikeout, walk, and home run rates. Everything else—everything from screeching liners to gorps and dying quails—was up to the fielders and Lady Luck.

To make use of the finding, McCracken developed a formula to estimate a pitcher’s ERA based exclusively on the factors he could influence. Called “Defense Independent Pitching Statistics,” or DIPS for short, it presumed that all pitchers would have a league-average BABIP. And it predicted next year’s ERA better than this year’s ERA did. Bill James, widely recognized as the father of quantitative baseball analysis or “sabermetrics,” later wrote of DIPS, “I feel stupid for not having realized it 30 years ago,” and commented that “Voros’s realization has become one of the pivotal points in the history of sabermetrics.”

McCracken’s counterintuitive research prompted an uproar in the nascent online baseball analysis community. The game’s finest minds soon began combing through the data. Most studies found that McCracken had somewhat overstated his case. One obvious exception was knuckleball pitchers like Tim Wakefield, who consistently posted BABIP’s far below the league average. Superstar pitchers like Johan Santana also tended to have somewhat better-than-average BABIP’s over the course of their careers. At the other extreme, as a group, minor-league pitchers with brief appearances in MLB compiled BABIP’s well above the league average.

That suggests that preventing hits on balls in play is indeed a real skill—but that pitchers who are not major league-caliber in this department give up so many hits that they simply don’t last long in The Show. Finally, pitchers do exert a strong influence on their ratio of ground balls to fly balls. Grounders go through for hits more often than non-home-run fly balls do, though they are much less likely to yield extra bases.

Nonetheless, the core of McCracken’s research held up. Yes, there were some statistically significant differences among major league pitchers in their ability to prevent hits on balls in play. However, in the non-knuckleballing division, the effect was so small, and the yearly fluctuations so big, that by the time you had enough data to determine that a pitcher possessed such a skill, he was probably on the verge of retirement. You’d get better projections just by assuming that everyone had major league average BABIP skill than you would by trying to decipher the BABIP mystery.

13 years after DIPS took the baseball world by storm, today’s most popular ERA estimator is the Fielding Independent Pitching equation developed by Tom Tango. Used by the statistical website Fangraphs.com to calculate its all-in-one value stat for pitchers, it is simply a stripped-down version of McCracken’s venerable Three True Outcomes: homers times 13, plus walks and hit by pitch times 3, minus strikeouts times 2, divided by innings pitched and adjusted for the league average. It doesn’t get much simpler than that.
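As a rough sketch, the equation described above can be written out as follows. The function names and every number in the example are hypothetical. The league adjustment is shown as the usual constant that makes league-wide FIP equal league ERA, and innings are assumed to be true decimals (200⅓ innings entered as 200.333, not 200.1).

```python
def fip_core(hr, bb, hbp, k, ip):
    """Unadjusted part of FIP: (13*HR + 3*(BB + HBP) - 2*K) / IP."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip

def fip_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_k, lg_ip):
    """League adjustment: the constant that puts league-wide FIP on the ERA scale."""
    return lg_era - fip_core(lg_hr, lg_bb, lg_hbp, lg_k, lg_ip)

def fip(hr, bb, hbp, k, ip, constant):
    """Fielding Independent Pitching for one pitcher."""
    return fip_core(hr, bb, hbp, k, ip) + constant

# Hypothetical league totals and a hypothetical pitcher's season line.
league_constant = fip_constant(lg_era=4.00, lg_hr=5000, lg_bb=15000,
                               lg_hbp=1500, lg_k=36000, lg_ip=43000)
pitcher_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0, constant=league_constant)
print(f"constant = {league_constant:.2f}, FIP = {pitcher_fip:.2f}")
```

Note that nothing about balls in play enters the calculation, which is the same as assuming every pitcher has a league-average BABIP.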

Yet one important thing has changed since McCracken published his groundbreaking work: the Internet has made a treasure trove of statistics available to anyone with a computer. Today, we can download reams of data on pitch type, velocity, swing rates, batted ball trajectories, and any number of other variables instantly. I never had any intention of wading into the DIPS debate. But like McCracken, I am an avid fantasy baseball player, and every year I compile my own projections in the hopes of getting a leg up on the competition. And last year, in the course of my annual fantasy draft preparation, I stumbled on some data that seemed to lie at the start of the path towards sabermetrics’ holy grail.

My calculations confirmed McCracken’s contention that a pitcher’s BABIP in one season provides little information about what it will be the next year. However, I noticed that two other statistics did seem to offer helpful clues about hit prevention. The first was popup rate — the share of batted balls that are flies to the infield. Popups have virtually the same effect as strikeouts, since they are almost always caught and runners cannot advance on them. All other things equal, a pitcher who induces more popups should have a lower BABIP. And unlike some other batted ball types, such as line drives, popups do show a strong year-to-year correlation — pitchers who get lots of them in one year are likely to induce an above-average total the next year as well. From 2010 to 2011, popups correlated just as well as walks, significantly better than homers and of course far better than BABIP. Even more encouragingly, a pitcher’s popup rate in 2010 had a negative correlation with his BABIP in 2011. Although analysts have long known that inducing popups is a repeatable skill, I had never seen a publicly available equation that used it for the purpose of BABIP prediction.

The second promising variable was a statistic provided by Fangraphs.com called Z-Contact. It measures how often opposing batters make contact when they swing at pitches thrown within the strike zone. Since almost all strikes are hittable, and most of them are easy to square up, if a pitcher gets batters to miss a high proportion of his easier-to-hit pitches, then the balls that do get put into play against him will tend to come on harder-to-hit pitches. All other things equal, such batted balls should be more likely to be hit weakly and become outs. Z-Contact also has a very strong year-to-year correlation, and again seemed to have some predictive power regarding BABIP. At the time, I had never come across any reference to the relationship between Z-Contact and BABIP, although about seven months later Steve Staude of Fangraphs.com independently reported it.

This piqued my interest enough to do a longer-term study. I started by looking at every pitcher who had thrown at least 100 innings in four consecutive years between 2002 and 2011, giving me a sample of 387 pitcher-seasons. In order to get the best possible forecasts for popup rate and Z-Contact for the years in question, I first examined how they correlate from year to year, to calculate the correct weights for predicting future performance from past performance. For example, I project Z-Contact for a given year by using 39% of the previous year’s Z-Contact, 21% of the number from the year before that, 14% from three seasons ago, and 26% of the league average.
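The Z-Contact weighting just described amounts to a weighted average that regresses partway toward the league mean. A minimal sketch, using the four weights quoted above; the function name and the sample inputs (including the league average) are hypothetical:

```python
# Weights quoted in the text: previous year, two years back, three years back.
Z_CONTACT_WEIGHTS = (0.39, 0.21, 0.14)
LEAGUE_WEIGHT = 0.26  # share of the projection that comes from the league average

def project_z_contact(prev1, prev2, prev3, league_avg):
    """Project next season's Z-Contact from three prior seasons plus the league mean."""
    history = (prev1, prev2, prev3)
    weighted_history = sum(w * v for w, v in zip(Z_CONTACT_WEIGHTS, history))
    return weighted_history + LEAGUE_WEIGHT * league_avg

# Hypothetical pitcher: 84%, 86%, and 85% Z-Contact over the last three years,
# with an assumed league average of 88%.
projection = project_z_contact(0.84, 0.86, 0.85, league_avg=0.88)
print(f"projected Z-Contact: {projection:.3f}")
```

Because the four weights sum to 1, a pitcher who matched the league average in all three prior seasons is projected at exactly the league average.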

Next, I had to factor out the confounding effects of ballpark and the quality of fielders, which together influence BABIP far more than pitchers themselves do. For example, in 2011, the Rays’ Wade Davis posted a .280 BABIP, just below the AL starters’ average that year of .283. However, the Rays’ rotation collectively compiled a .265 BABIP, thanks to the team’s expert positioning of its infielders and outstanding defense from Evan Longoria, Sam Fuld, and Ben Zobrist. Davis’s BABIP was actually 19 points worse than that of Tampa Bay’s other starters. So for each pitcher-season, I measured the gap between their BABIP and that of the rest of their team’s rotation.
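The teammate adjustment can be sketched as below. The hit and ball-in-play counts are hypothetical, chosen only so that the rates match the Wade Davis figures quoted above (.280 for Davis, about .265 for the rotation as a whole); the function name is illustrative.

```python
def babip_gap_vs_teammates(p_hits, p_bip, team_hits, team_bip):
    """Pitcher's BABIP minus the BABIP of the rest of his rotation.

    Hits and balls in play exclude home runs; the team totals include the
    pitcher, so his own numbers are subtracted before computing the rest.
    """
    rest_bip = team_bip - p_bip
    if p_bip <= 0 or rest_bip <= 0:
        raise ValueError("both the pitcher and his teammates need balls in play")
    pitcher_babip = p_hits / p_bip
    rest_babip = (team_hits - p_hits) / rest_bip
    return pitcher_babip - rest_babip

# Hypothetical counts: 154 hits on 550 balls in play for the pitcher (.280),
# 676 hits on 2,550 balls in play for the whole rotation (about .265).
gap = babip_gap_vs_teammates(p_hits=154, p_bip=550, team_hits=676, team_bip=2550)
print(f"gap vs. rotation-mates: {gap * 1000:+.0f} points")
```

A positive gap means the pitcher allowed more hits on balls in play than the other starters working in front of the same defense and in the same park.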

Finally, I tested the relationship between my projections for popup rate and Z-Contact for each pitcher-season—derived exclusively from prior years’ data—against the gap between that pitcher’s BABIP for that season and that of his teammates. Both statistics showed a very strong effect, and a similar pattern of impact. More popups are always good, and a higher Z-Contact is always bad. However, not all changes in popups or Z-Contact rates are created equal: the BABIP’s of pitchers who were worse than average in those categories suffered only modestly, whereas pitchers who were among the best in the league in them consistently had BABIP’s far below those of their teammates.

If we plot forecast popup rate against BABIP relative to team, you can see that the slope is flat at one end and steep at the other. By using a curved fit, we can give appropriate credit to the game’s finest popup machines, without projecting utterly ghastly BABIP’s for pitchers who induce very few popups. The same trend is apparent for Z-Contact, though to a lesser degree. There’s a chance this pattern is a mere product of selection bias—perhaps only the pitchers who lucked into tolerable BABIP’s despite having low popup rates and high Z-Contact scores got the chance to pitch enough innings to qualify for the study. But the existence of a zero lower bound for popups, and a 100% upper bound for Z-Contact, provides intuitive support for the curved fit.

Combining these two equations, I was able to produce retroactive forecasts for all 387 pitcher-seasons. On the whole, projected popup rate and Z-Contact were able to explain 15% of the variance in pitchers’ BABIP relative to their teammates. That might not sound like much—indeed, it largely confirms McCracken’s original finding about the randomness of BABIP. But it leads to far, far more accurate forecasts than simply assuming that all pitchers will have a league-average BABIP, as Fielding Independent Pitching does.
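"Explaining 15% of the variance" refers to the R² statistic. The sketch below shows one common way to compute it and compares a forecast against the baseline of assuming no gap at all, which is the analogue of giving everyone an average BABIP. All six data points are hypothetical and happen to produce a much higher R² than the study found.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot

# Hypothetical BABIP gaps relative to teammates for six pitcher-seasons.
actual_gap = [-0.020, 0.005, 0.012, -0.008, 0.015, -0.004]
projected_gap = [-0.010, 0.002, 0.004, -0.003, 0.003, 0.004]
no_skill_guess = [0.0] * len(actual_gap)  # everyone matches his teammates

r2_formula = r_squared(actual_gap, projected_gap)
r2_baseline = r_squared(actual_gap, no_skill_guess)
print(f"formula R^2 = {r2_formula:.2f}, baseline R^2 = {r2_baseline:.2f}")
```

The baseline explains essentially none of the variance by construction, so any R² meaningfully above zero is an improvement on it.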

The equation correctly nails every one of the major BABIP outliers of the past decade. Of the 43 pitcher-seasons it projected to have a BABIP at least 15 points below their teammates’ average, seven were by Tim Wakefield, the game’s premier knuckleballer. Seven more were by Ted Lilly, whose .268 career BABIP is the second-lowest among active starting pitchers. (The average is around .290.) Johan Santana, whose .272 lifetime BABIP ranks fifth among active pitchers, shows up third on this list, with six seasons forecast to be 15 points or more below the team average. Next come Jered Weaver and Barry Zito with three seasons each. Their career BABIP’s of .269 and .271 are currently the game’s third- and fourth-lowest. Matt Cain, baseball’s active career BABIP leader, also shows up twice on the list, as does Pedro Martínez, whose lifetime mark was an outstanding .279.

The formula’s single favorite pitcher-season was Chris Young going into 2008, whom it projected to best his teammates’ BABIP by a whopping 43 points. The 6’10” soft-tosser fell just short of that forecast that year, posting a BABIP merely 38 points lower than the rest of the Padres’ staff. His .254 lifetime BABIP would easily be baseball’s best if he had surpassed my cutoff of 1,000 career innings pitched.

The equation also does a good job of spotting BABIP laggards. Of the five pitchers with the most seasons projected for a BABIP at least 7 points above their teammates’ average, three—John Lackey, Liván Hernández, and Paul Maholm—have career BABIP’s among the league’s worst. A fourth, Jeff Suppan, is also a significant underperformer. Of the five, only the formula’s distaste for Jason Marquis is not justified. The equation also correctly singles out Zach Duke, who has the highest lifetime BABIP among active pitchers. He only qualified for the study in two seasons, but they were both among the nine worst projected single-season BABIP’s the equation spit out.

Despite these seemingly impressive results, retroactive projections like these must always be taken with a hefty grain of salt. Historical data sets are full of spurious relationships between variables as well as genuine ones—noise as well as signal, as Nate Silver, who may also be here, would put it. Science is rife with purported discoveries leading to predictions that fail miserably in the real world, because they reflect nothing more than random variation within the researcher’s specific sample of information. Indeed, Silver says that this phenomenon, known as overfitting, is “the most important scientific problem you’ve never heard of.” How can we determine whether the correlation between popup rate, Z-Contact, and BABIP is legitimate, or a mere quirk that happened to show up in recent years but will soon disappear?

The best way to avoid falling victim to the fallacy of overfitting is to test one’s equations not just on the data set from which they were derived but also on an entirely separate sample. If the formula retains its predictive power even when confronted with cases that it doesn’t already “know,” there’s a good chance you’re onto something real. Fortunately, baseball generates fresh new data sets with every season, giving researchers endless opportunities to distinguish signal from noise. We know that we can predict BABIP from 2005 to 2011 with some accuracy using other data from 2005 to 2011. But what happens when we use the 2005-11 data to predict 2012?

The answer is that not only does the formula continue to be useful, it actually improved. The equation was able to account for 17% of the variance in 2012 BABIP performance, slightly better than its 15% mark for the 2005-11 period. To be fair, the 2012 sample is small—just 51 pitchers—and the results are heavily influenced by a single outstanding prediction. The formula expected Jered Weaver to have a BABIP 41 points lower than that of his teammates. The Angels’ ace kindly obliged by posting a .241 mark, his lowest since 2006 and precisely 42 points lower than the rest of the Angels’ staff. But even if you remove Weaver from the sample, the equation still retained much of its predictive power when confronted with unfamiliar data. Its five favorite pitchers were all significantly better than average, and four of its five least-favorite were worse. The equation’s success not only helped me to win the annual forecasting competition published at Fangraphs.com this year, but far more importantly, gave me the extra edge I needed to eke out the championship in my fantasy league.
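The out-of-sample check described above follows a simple pattern: fit on one set of seasons, then score the frozen formula on seasons it has never seen. The sketch below is a simplification, not the actual model: it assumes a single predictor (projected popup rate) and a straight-line fit, whereas the study used two variables and a curved fit, and every number in it is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(actual, predicted):
    """1 - SS_res / SS_tot; can go negative out of sample if the fit is useless."""
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot

# Hypothetical training sample: projected popup rate vs. BABIP gap to teammates.
train_popup = [0.02, 0.03, 0.04, 0.05, 0.06]
train_gap = [0.008, 0.004, -0.002, -0.004, -0.012]
# Hypothetical holdout season, never used when fitting.
holdout_popup = [0.025, 0.045, 0.055]
holdout_gap = [0.006, -0.001, -0.009]

slope, intercept = fit_line(train_popup, train_gap)
holdout_pred = [slope * x + intercept for x in holdout_popup]
holdout_r2 = r_squared(holdout_gap, holdout_pred)
print(f"slope = {slope:.2f}, holdout R^2 = {holdout_r2:.2f}")
```

The key design choice is that the coefficients are estimated once, on the training seasons only, so a good holdout score cannot come from the formula having already seen the answers.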

The next step in this analysis would be to study what types of pitchers tend to have extremely high or low popup rates and Z-Contact scores, in order to understand better what hurlers can do to reduce their BABIP. I had a few guesses, beyond the previously well-known discrepancies between flyball and groundball pitchers. Two of the biggest positive outliers, Jered Weaver and Chris Young, are extremely tall at 6’7” and 6’10”, and I thought it might make sense that batters would be more likely to pop up balls thrown from a higher release point. And Johan Santana, Cole Hamels, and Pedro Martínez are all renowned changeup artists, which made me think that a Bugs Bunny offspeed offering might lead to fewer hits on balls in play.

Unfortunately, when I compared groups of pitchers with the highest and lowest projected BABIP’s in the study, neither of these hypotheses held up. Both sets averaged 6 feet, 3 inches, and their changeup percentages differed by only two points. So if anyone in the audience has any ideas on this front—what distinguishes, say, Ted Lilly and Barry Zito from Mark Buehrle and Tom Glavine—perhaps you can move forward our understanding of baseball’s most enduring statistical mystery much further than I have.

*To see the slides from the presentation, click here.*

Nice job, particularly on the Z-Contact front. But popups have been part of predicting BABIP for a long time. As one example, check out the xBABIP calculator from 2009:

http://www.hardballtimes.com/main/fantasy/article/simple-xbabip-calculator/

But not for pitchers

This was my reaction as well. I’ve been using that tool for years now.

Cliff, why not pitchers? You can use that same tool for a pitcher. The interpretation of the result is subtly different.

It doesn’t work to predict pitchers’ BABIP, because pitchers don’t appear to actually have control over all the variables that go into that xBABIP equation.

I’m a little surprised that Mr. Rosenheck did not mention Jeremy Hellickson at any point. Over the last two years Hellickson has the 9th worst FIP (4.52) among 79 qualified starters, but he has the best BABIP (.242), 12th best Z-Contact (84.9), and 6th most infield fly balls (63). I suppose folks can continue to wait for the other shoe to drop, but Hellickson is the perfect exemplar of everything that Mr. Rosenheck is talking about here. As I showed here: http://dockoftherays.wordpress.com/2013/03/07/babip-insight-and-the-rays/

That’s not really comparable. That formula was a) for hitters, not pitchers and b) was designed for use on same-year data. The “discovery” here, to the extent that there is one, isn’t that popups reduce BABIP. That’s a truism. It’s that popups are one of just two variables I could find whose year-to-year correlations for starting pitchers are strong enough that they can be used to predict future BABIP relative to rotation-mates, and the proper measuring and weighting of this effect.

Thanks for the clarification. Makes sense. But when you say that popups and Z-Contact are the only two variables you could find that predictably impact pitcher BABIP, did you find that ground ball rates don’t?

Tango’s shown that GB pitchers and FB pitchers have the same BABIP.

Same BABIP or same BA on contact (including HR)? The latter I think is true; the former I believe is not (though it’s not a huge difference).

No, I’ve never shown that. What I have shown is that after you exclude HR, GB and FB pitchers have the same *run value*.

Sorry, I guess I read it wrong.

So, on average, GB pitchers allow the same number of fielding dependent runs as FB pitchers, despite allowing more baserunners? Is that because GB pitchers are better at inducing GIDPs on ground balls than FB pitchers?

YanksFan: I think it’s mainly because, while groundballs are less likely than flyballs to turn into outs, they’re also less likely to turn into doubles and triples.

Hell boy doesn’t have enough seasons to qualify for the study since a pitcher must have logged 100 innings for at least four years.

I agree that he is a perfect example of the type of pitcher he’s talking about though. As he began the article Cain and Hellickson were the first two people I thought of.

Just make sure you ignore the actual play on the field – it might get in the way of all the number crunching.

Just make sure you shut off your brain when watching baseball – it might cause you to enjoy the game in a deeper way and the world could explode.

Very good point! Infield popups and hitters making contact with balls in the strikezone are definitely not part of the “actual play on the field.”

Please don’t demean the good name of Yogi Berra with your sarcasm.

I recognize this is sarcasm, as if you were channeling Yogi Berra’s view.

Well struck.

“It’s the plays on the field that count, and I don’t know how those guys count ’em.”

(Just trying to come up with something Yogi might actually say.)

Yogi might say “you can count everything that happens all you want, that doesn’t mean it really happened that way”. Although most of what Yogi says you can derive a philosophical, life changing lesson from.

Hellickson didn’t qualify for the study–I required four straight years of 100+ IP to make sure I got robust data. I’m sure he’ll be one of the outliers people focus on going forwards (though some of his low BABIP is just the TB defense).

All these years I’ve been ridiculing those who didn’t believe in DIPS theory. Now I feel like a complete ass-hat. I suppose one-hop choppers have about the same relation that pop-ups do, but there is no data for those easy to field ground balls. Seems like some pitchers have a knack for getting double play ground balls also.

I’d be curious if you can correlate movement – pop-ups may come from “rising” fastballs, while grounders can come from sinking pitches.

Absolutely: http://www.fangraphs.com/community/index.php/proejcting-babip-using-batted-ball-data/

In November I offered up a formula with a .627 correlation to a pitcher’s average popup rate (FB%*IFFB%) that was based only on vertical 4-seam fastball movement, percentage of 4-seamers thrown, % of sinkers thrown, Zone%, and the speed differential between the fastball and changeup: http://www.fangraphs.com/community/index.php/babip-and-innings-pitched-plus-explaining-popups/

I integrated both Z-Contact% and popup rate into an ERA estimator formula in January: http://www.fangraphs.com/community/index.php/introducing-bera-another-era-estimator-to-confuse-you-all/

Saberguy–You shouldn’t feel that way. Pitchers do have some control over balls in play, but with a very small handful of exceptions–the Weavers and Lillys of the world–the vast majority cluster very close to the league average. And even the outliers are much closer to the mean in BABIP than the outliers in any of the fielding-independent categories are. Look, if pre-Voros, people thought pitchers controlled 100% of BABIP, and the strongest interpretation of Voros’s finding was that it was 0%, my research suggests the answer might be 10% or 15% or something? McCracken is still basically right.

You can somewhat infer routine ground balls from the fact that the very extreme groundball pitchers (Lowe, Hudson, Webb etc.) tend to have lower BABIP’s than their popup and Z-Contact% rates would indicate (I have a more complicated version of the formula that incorporates this effect). I presume that’s because they get an above-average share of easy-to-field grounders.

Duke–yes, you certainly can. Look up Steve Staude’s Community Research post on “explaining popups.” I’m going to try to work that into my projections this year.

Thanks. So you’re saying that you didn’t find that ground ball rates correlated positively with lower future BABIP? (Given that ground balls are fielded for outs more frequently than fly balls?)

I believe they do when taken alone. But popup rate and groundball rate are themselves inversely correlated, and once you know a pitcher’s popup rate, his groundball rate doesn’t give you any extra useful information unless it’s extremely high (>55% or so).

Gotcha. Makes perfect sense. Thanks.

I believe the exact opposite is true (more GB go for hits than FB): http://www.fangraphs.com/blogs/index.php/expected-babip-for-pitchers/

Hence some of the babip suppression seen by extreme FB pitchers like Lilly, Weaver, and Young

Yes, I imagine studes just accidentally got them backwards. Obviously the BABIP on GB is higher than FB (though the BA is not, at least not by much, because HR are hits).

Can we also conclude from this new look under the hood that, among other things, pitchers who have a tendency to induce high rates of IFFBs are undervalued to an extent? Maybe that isn’t news, but quantifying it into some easily usable stat might be the next step. For example, I’ve gone back through 4 years of SP data (’09-’12) and simply revised K/BB ratios to include IFFB rate ((K rate plus IFFB rate) / BB rate). The resulting rankings have some surprises in them, like Colon 4th, Blanton 7th, and Lilly 8th.

Yes, somewhat undervalued is right. Popups don’t correlate quite as well as K’s, but they’re about the same as walks, so if BB are good enough for your metric then I suppose IFFB should be as well. Remember that K and IFFB have different denominators, though (batters faced and batted balls).

ok, so when you expand the denominator the effect is not as great but it’s still something undervalued I think.

A system that gave proper credit to popup-inducing pitchers was my main goal in this article: http://www.fangraphs.com/community/index.php/introducing-bera-another-era-estimator-to-confuse-you-all/

Adding to what Dan said, popup rates match up year-to-year a lot better than HR rates.

You may want to consider (K + IFFB – BB – HBP)/PA or IP instead, by the way — K – BB is a better predictor of ERA than K/BB. But I’ve said before that I think IFFBs are more important than just their face value, because I think they’re also a sign of less-sharply hit outfield fly balls.

the question is, is ERA what we should be striving to forecast?

I maintain that inducing outs is a pitcher’s primary objective and that measurement, obpA, is one of the best measures of a pitcher (much like it is for a hitter), and minimizing bases given up when allowing hits, slgA, is also an important indicator of effectiveness and the combo, opsA, also has fairly good year to year reliability.

Runs are an indirect effect of a pitcher allowing runners on base. Runs score in a variety of different ways, but can’t score if they don’t reach base.

I wonder how closely the above formula (K+IFFB-BB-HBP)/PA correlates to opsA rather than to ERA.

Yes, ERA is what we should strive to forecast, because it corresponds directly to actual runs unlike outs. Why would you want to predict opsA when you can predict runs directly?

well, as someone else pointed out, runs are more or less a function of unfortunate sequencing of batters reaching base. The less frequently batters reach, the fewer the run scoring opportunities.

If we judge batters by the frequency they reach base and the quality of their hits and not by the number of runs they score, we should judge pitchers similarly, no?

Also, forecasting ERA is a worthy goal in the context that Dan mentioned – namely, trying to get a leg up on his fantasy competition. ERA likely correlates pretty strongly with WHIP also, so getting out ahead of the competition in 2/5 of the standard 5×5 pitching categories seems an admirable goal.

No, I don’t think so. I think knowing that a pitcher is projected to have a 3.52 ERA (for example) is more important than knowing he is projected to have a .726 opsA. Knowing projected opsA is great and all but it still doesn’t give me any idea of the number of runs he will allow which, again, I think is more important.

One more thing: I don’t really care about sequencing in this context. If a pitcher has a projected 3.52 ERA, that corresponds to actual runs allowed, so the order of hits, BB, etc. doesn’t really matter.

but do you agree that the lower the opsA the lower the ERA is likely to be?

Sure, but instead of turning this into a verbal war, how about forecasting both alongside each other? There is no reason why we couldn’t, and it might provide interesting (although perhaps largely useless) insight into a pitcher’s expected sequencing. For example, if a pitcher was projected to have a .690 opsA and 4.50 ERA it may mean that he is due for bad luck with sequencing next year.

I’m not against that. I’m more excited about the “discovery” about IFFBs being an undervalued skill. Rather than using it or incorporating it into a prediction measure, I’d like to have a raw metric (akin to WHIP for example) that I can point to as another data point which can be interpreted in valuing pitchers.

To me, the objective of pitching is to induce outs. And we know that pitching outcomes are not all controllable by pitchers (balls in play), but it would seem to me that the higher the percent of controllable outcomes (K + IFFB – BB – HBP)/PA the better the pitcher is since bip outcomes are pretty variable.

10-15% is enough to make FIP obsolete. We need something better. Mike Fast has done some nice research on how batted ball velocity correlates highly with BABIP. Seems like we should head in that direction. We should itemize ground ball velocities, along with fly ball hang times.

I’m not sure how much control pitchers have over batted ball velocities–you’d think that would show up in greater year-to-year persistence of line-drive percentage. But yes, GB velocity and FB hangtime are all in FieldFX, and I think BIS is doing something with that too. The issue is getting the info out in the public domain.

Simple. Line-drive rates probably only tell 1/3 of the story. There are weak line drives and hard-hit line drives. Weak GB’s and hard-hit GB’s. Weak FB’s and hard-hit FB’s. There’s everything in between also. Getting that batted ball velocity data is critical for any BABIP study. Also, in a 3-year study I did, I discovered that some pitchers have control over avoiding hitters’ counts (3 balls) and getting into pitchers’ counts (2 strikes). Some of those ball-strike counts correlated well with BABIP. I turned it into a stat that was like a precursor to FIP. It would be nice to see something like that here.

Quite a bit – see Mike Fast’s study that Saberguy mentions. He also did a follow-up discussing how it affects BABIP (here.)

Saberguy – Do you have a link to that hitters/pitchers count study? I’ve always thought that might hold part of the key to unlocking BABIP fluctuations, so it’d be interesting to read what you came up with.

Ah, Steve beat me to it. For the purposes of clarification, the difference between my equation and the BABIP component of his is that mine is a forward-looking, predictive formula, whereas I believe his was based on same-year data. Obviously, guys who allow a lot of line drives in a given year will have a high BABIP, but giving up line drives is also pretty much random (or at least very difficult to predict).

Well, before writing the BERA article, I did come up with a predictive BABIP formula, but I’ve yet to write about it — I skipped ahead to ERA. BERA was somewhat based upon that predictive BABIP formula, except the BABIP formula is based on 3 previous years (like yours) instead of just 1 (like BERA).

Well, same here–I found this a year ago, but waited a season to test it out of sample. It’s Newton and Leibniz all over again! :)

Haha, no, I believe you — just trying to set the stage for when I do release an article on my predictive BABIP formula, since I have been promising to release it for a while, and it does indeed exist :)

So is the reason GBs go for more hits than FB’s primarily due to FB’s including these “popups” or is it just a small factor?

The big reason BABIP is higher on GB than FB is that a high percentage of hits on FB are HR, and thus not in play. Popups are a secondary reason. I think the overall BA (including homers) on both ball types is similar.

“Since almost all strikes are hittable, and most of them are easy to square up, if a pitcher gets batters to miss a high proportion of his easier-to-hit pitches, then the balls that do get put into play against him will tend to come on harder-to-hit pitches.”

That doesn’t necessarily follow, and I think there’s a useful implication. You will probably get a better result using O-contact / (Z-contact + O-contact), the percentage of contact which is outside the zone.
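A rough sketch of the suggested variable, assuming the inputs are raw counts of pitches the batter made contact with (not the Z-Contact% and O-Contact% rates, which are per swing and cannot be added this way). The function name and the numbers are hypothetical.

```python
def out_of_zone_contact_share(o_contact, z_contact):
    """Share of all contact that came on pitches outside the zone.

    o_contact: count of out-of-zone pitches the batter made contact with
    z_contact: count of in-zone pitches the batter made contact with
    """
    total = o_contact + z_contact
    if total <= 0:
        raise ValueError("need at least one pitch with contact")
    return o_contact / total


# Invented counts, for illustration only.
share = out_of_zone_contact_share(o_contact=300, z_contact=900)
print(f"{share:.2f}")
```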

That’s a great idea. I’ll re-do the study with that variable and see if it improves my results.

Though there’s also a chance that Z-Contact works just as a measure of stuff. If the most extreme form of weak contact is no contact at all, then guys who induce a lot of misses on their strikes might also induce more weakly hit balls when batters do make contact.

First of all congratulations!

I really think you should examine those pitchers’ mechanics and, more importantly, their pitch mix and the velocity differences between those pitches. Weaver and Hamels, for instance, have upwards of 15 mph difference between their fastballs and curveballs.

The pitchers you mention above are all exceptional and they must be keeping the hitters off-balance enough to produce some of the results you are seeing.

LaLoosh–the difference between OPS allowed and ERA is the sequencing of offensive events/strand rate. I’m not aware of anyone who’s been able to forecast above- or below-average runner-stranding ability, so for the purpose of prediction OPS allowed and ERA are the same.

Besides runner-holding ability, of course, which is obviously real and is a significant part of the effectiveness of guys like Mark Buehrle (along with pitcher fielding).

GB/FB rate probably has some influence on strand rate. It’s not going to be highly correlated, but I think you’ll find that groundball pitchers overall strand fewer runners than flyball pitchers do. So to the extent you can forecast GB/FB rate, you should be able to forecast strand rate, although it would be very messy.

whaa? shouldn’t it be the opposite? more GIDP and fewer SF.

I agree — correlations to LOB% (500 IP min, 2002-2012):

GB%: -0.19

FB%: 0.26

IFFB/Batted Ball: 0.33

LD%: -0.25

BABIP: -0.57

K%: 0.64

HR/PA: -0.31

Z-Contact%: -0.56

BABIP and K% dominate the mix (and Z-Contact% contributes to both low BABIP and high K%). Low Z-Contact% pitchers also tend to be flyball pitchers, so I think that’s why you’re seeing that.

*low Z-Contact% leads to a low BABIP and high K%, I meant to say.

The GIDP and SF advantages of GB pitchers that Dan brings up are apparently overcome by the BABIP and K% advantages associated with FB pitchers.

I think it’s probably also got something to do with batted ball outs, although that’s just guesswork on my part. Groundball outs (including errors, which aren’t counted in BABIP) are more likely to advance runners than flyball outs. At least that’s my supposition.

Wasn’t that the point of the most recent FG community article?

In 2012, Weaver’s Z-Contact rate was the highest of his career, and his IFFB% was the lowest of his career. Does that suggest his BABIP for 2013 to be in line with the rest of his rotation mates, or is that outweighed by the rates from earlier in his career?

It cuts his forecast advantage in half, from about 40 points below teammates to about 20.

In regards to the changeup hypothesis, instead of the % of changeups thrown, what about the average speed differential between the pitcher’s fastball and changeup? Also, is there a way to account for pitch movement in the analysis?

I did both of those things here: http://www.fangraphs.com/community/index.php/babip-and-innings-pitched-plus-explaining-popups/

“At the other extreme, as a group, minor-league pitchers with brief appearances in MLB compiled BABIP’s well above the league average.”

Sample bias? Those that had high BABIP have worse results in their small sample, and get sent down, while those with a lower BABIP get a longer look and get a chance to stick. This may not necessarily be due to skill.

This is great stuff. I’ve really started to rethink DIPS, BABIP and TTO stats in the last week or so.

One note on Weaver: while his BABIP matches your projection, his z-contact rate jumped to near league average in 2012. This seems to show that you got a little lucky on your Weaver projection. I’m still impressed by the results, in any event.

Well, the equation is designed to predict next year’s BABIP, not this year’s BABIP. I have no idea how Z-Contact relates to BABIP within the same season; I’ve never studied it. I do know that his higher Z-Contact% and lower popup rate, relative to previous years, have reduced his forecast BABIP advantage for 2013.

BABIP also varies by pitch count and pitch type so that is probably what will explain a major portion of the rest of BABIP. Another big portion is going to be when we can more accurately measure how hard balls are hit.

Is the difference massive curveballs? Or sinkers? Basically out pitches with severe downward movement, a la Chris Young in that ridiculous season of his?

http://baseballanalysts.com/archives/2010/04/some_research_o.php

Curveballs against same side hitters do suppress BABIP. Fastballs show the highest BABIP and in particular players with no movement on their FB like a Manny Parra really seem to spike their BABIP.

Thank you for the informative read. Despite the overwhelming evidence that pitchers have difficulty controlling balls in play, I still find my college coach stubbornly unwilling to acknowledge it. Maybe I’ll print this out.

I privately told Dan how much I thought his work was stupendous. I will do so publicly herein. Outstanding work! I have not read Mr. Saude’s articles yet, but I will soon. Sounds like his work was similar and fantastic as well.

Question for Dave A.: Does the definition of Z (or O) contact include foul balls? When I asked Dan, he was not sure whether “swings and misses” included foul balls or not. Hopefully it does (since a foul ball, by definition, is a ball that is not squared up either in location – on the bat – or timing), but I sort of doubt it, since contact is contact…

Does there have to be an explanation? Given a sample of 387 assuming normal distribution (I don’t know what the standard deviation of the BABIPs is) wouldn’t you expect that a handful of guys would overperform a couple of SDs based on simple probability?

Of course you would. But if it were pure random binomial variation, BABIP probably wouldn’t correlate so strongly to other variables (Z-Contact% and popup rate) from preceding seasons. And the equation based on those relationships certainly wouldn’t maintain its accuracy on an entirely different sample of data (2012).
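To put a number on the "handful of guys" in the question above, here is a back-of-the-envelope sketch. It takes the 387 sample size from the comment and assumes the BABIPs are independent and normally distributed, which is the questioner's own simplification and not something the study claims.

```python
import math


def normal_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


n_pitchers = 387  # sample size quoted in the comment

# Expected number of pitchers more than 2 SD BELOW the mean by chance alone
expected_low = n_pitchers * normal_tail(2.0)

# Expected number more than 2 SD from the mean in either direction
expected_either = 2 * expected_low

print(f"beyond 2 SD on one side: {expected_low:.1f}")
print(f"beyond 2 SD on either side: {expected_either:.1f}")
```

So some outliers are expected from luck alone. As the reply notes, what pure chance would not produce is a link to the previous seasons' Z-Contact% and popup rate that holds up out of sample.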

I used IFFB% to predict pitcher BABIP a few years back, but I found that it correlated poorly from year to year. Last year I switched to a method that simply incorporated FB%, K%, team defense, and park factors into a pitcher’s projected BABIP. Fly balls in general (besides the IFFB variety) produce a lower BABIP and correlate much better from year to year for a pitcher, which is how I ended up using them. I got the idea of using K% from SIERA; it sounds like simply using Z-Contact% might have been better.

My tests showed that this predicted pitcher BABIP was much better than what the projection systems were doing. Sadly, I didn’t reproduce the equation for this year’s park and defense factors. The correlation was .396 and the RMSE .029.
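For readers who want to run the same kind of check on their own projections, this is a minimal sketch of the two measures quoted above (Pearson correlation and RMSE between projected and actual BABIP). The sample values are invented and have nothing to do with the commenter's data or results.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rmse(predicted, actual):
    """Root mean squared error of the predictions."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Invented projected and actual BABIPs, for illustration only.
projected = [0.290, 0.300, 0.285, 0.310]
actual = [0.280, 0.315, 0.295, 0.305]

print(f"r = {pearson_r(projected, actual):.3f}")
print(f"RMSE = {rmse(projected, actual):.3f}")
```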

Sounds like you were using IFFB/FB, not IFFB per batted ball. The latter is about as consistent as walks from year to year.
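The two popup rates being contrasted differ only in the denominator. A quick sketch, assuming the fly ball count includes infield flies and that batted balls are simply GB + FB + LD (bunts ignored); the function names and counts are made up.

```python
def iffb_per_fly_ball(iffb, fb):
    """Infield flies as a share of fly balls (the IFFB/FB version)."""
    return iffb / fb


def iffb_per_batted_ball(iffb, gb, fb, ld):
    """Infield flies as a share of all batted balls.

    Assumes fb already includes infield flies, so batted balls = gb + fb + ld.
    """
    return iffb / (gb + fb + ld)


# Invented season counts, for illustration only.
iffb, gb, fb, ld = 20, 250, 200, 100

print(f"IFFB/FB: {iffb_per_fly_ball(iffb, fb):.3f}")
print(f"IFFB per batted ball: {iffb_per_batted_ball(iffb, gb, fb, ld):.3f}")
```

The per-batted-ball version folds in the pitcher's fly ball tendency, which may be part of why it behaves differently from IFFB/FB year to year.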

yup, that’s indeed what I was doing.

Good work. It seems to me that more work needs to be done for pitcher projection purposes on the strand rate issue. Pitchers have different strand rates because of ability to control the stolen base, wildness (leading to increased number of wild pitches and passed balls) and effectiveness out of the stretch. You can look at A.J. Burnett’s early career as a negative example of this. It may be that ground-ball tendencies also affect this. Pitcher fielding (suggested above as a factor) would likely affect effectiveness with the bases empty and with runners on pretty much equally.

Is there any reason, analytical or otherwise, that Z-contact was used rather than overall contact rates?

One would imagine that ‘swing and miss stuff’ that deflates BABIP shouldn’t be restricted to only pitches in the zone.

Yes, I used Z-Contact% rather than overall Contact% for the simple reason that Z-Contact% is a much better predictor of BABIP. Why that’s the case is a topic for further discussion.

The thing that has always perplexed me about comparing batted ball rates to stats like BABIP is the inverse correlation with LD%, which seems counter-intuitive. I am not completely convinced about the accuracy of these classifications, which naturally creates doubt about analysis based upon them (the same holds true for defensive metrics).

Until FieldFX is fully implemented and the data accessible, I don’t think we can properly quantify defensive and defense-independent pitching. Having said that, the work presented above is interesting, especially because it coincides with the correlation between league-wide rates and BABIP.

Inverse correlation?? Higher LD%, higher BABIP. The problem with using LD% for pitchers is that it doesn’t correlate well with itself from year to year.

Since 2002, there is a meaningful inverse correlation between LD% and BABIP for the entire season.

Should read entire league, not season.

The entire league? Why not look at individual pitchers, who have a .348 correlation between BABIP and LD% (qualified pitchers since 2002). I came up with the formula xBABIP = 0.4*LD% – 0.6*FB%*IFFB% + 0.235 to explain pitcher BABIPs, and it has a 0.628 correlation to their 2002-2012 overall numbers. http://www.fangraphs.com/community/index.php/proejcting-babip-using-batted-ball-data/
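The formula quoted above is easy to plug numbers into. In this sketch the rates are assumed to be fractions (0.20 for 20%), and IFFB% is assumed to mean infield flies per fly ball, in which case FB% times IFFB% is just infield flies per batted ball. The function name and the input rates are made up for illustration.

```python
def xbabip(ld_rate, fb_rate, iffb_rate):
    """xBABIP = 0.4*LD% - 0.6*FB%*IFFB% + 0.235, as quoted in the comment.

    All three rates are fractions between 0 and 1 (an assumption here).
    iffb_rate is taken to be infield flies per fly ball.
    """
    return 0.4 * ld_rate - 0.6 * fb_rate * iffb_rate + 0.235


# Invented rates, for illustration only.
estimate = xbabip(ld_rate=0.20, fb_rate=0.35, iffb_rate=0.10)
print(f"{estimate:.3f}")
```

Because it uses LD% from the same seasons it is explaining, this is a descriptive formula rather than a forecast, which is the distinction drawn earlier in the thread.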

A lot of false correlations can show up in a sample size of only 11 (seasons). And that could just be evidence of some systematic change in LD% recording.

Are you saying that as the league LD% has risen, BABIP has fallen, or vice versa? That has everything to do with changes in how LD% is recorded. You always have to normalize those batted-ball stats.

I think it would be interesting to look at the pull/up-the-middle/opposite-field distribution of batted balls against different pitchers. Pitch location and pitch type (two things that a pitcher controls) have been shown to correlate with these batted ball tendencies. Balls hit to the opposite field are around 4 times as likely to be pop-ups as balls that are not. If the ability to make hitters hit the ball to the opposite field is a repeatable skill, it might be one that helps a pitcher outperform his FIP.