## Projecting BABIP Using Batted Ball Data

Hi everybody, this is my first post here. Today, I’ll be sharing some of my BABIP research with you. There will probably be several more in the near future.

Now, I don’t know about you, but Voros McCracken’s famous thesis stating that pitchers have practically no control over their batting average on balls in play (BABIP) always seemed counterintuitive to me, ever since I heard it about 10 years ago. Basically, my thought this whole time was that if an Average Joe were pitching to an MLB lineup, the hitters would rarely be fooled by the pitches, and would be crushing most of them, making it very tough on the fielders. Think Home Run Derby (only with a lot more walks). Now, the worst MLB pitcher is a lot closer in ability to the best pitcher than he is to an Average Joe, but there still must be a spectrum amongst MLB pitchers relating to their BABIP, I figured. After crunching some numbers, I have to say that intuition hasn’t completely failed me.

This is going to be a long article, so if you want the main point right here, right now, it’s this: in the long run, about 40% or more of the difference in pitchers’ BABIPs can be explained by two factors that are independent of their team’s defense: how often batters hit infield fly balls and line drives off of them. On a single-season basis, prediction is harder: there, those same two factors predict a little over 22% of the difference. Line drive rates are fairly inconsistent from year to year, but infield fly rates are among the more predictable pitching stats (about as predictable as K/BB). I’ll explain the formula at the very end of the article.
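For readers who want to see what "explained by two factors" could look like in practice, here is a minimal sketch. It assumes the percentages above are the R² of an ordinary least-squares fit of BABIP on line-drive rate and infield-fly rate, which this section does not confirm. The function name `fit_babip_model` is hypothetical, and the numbers fed into it are synthetic placeholders chosen only so the script runs. They are not real pitcher data and not the article's coefficients.

```python
import numpy as np


def fit_babip_model(ld_rate, iffb_rate, babip):
    """Least-squares fit of BABIP on line-drive and infield-fly rates.

    Returns (coefs, r_squared), where coefs = [intercept, ld_coef, iffb_coef].
    """
    ld_rate = np.asarray(ld_rate, dtype=float)
    iffb_rate = np.asarray(iffb_rate, dtype=float)
    babip = np.asarray(babip, dtype=float)

    # Design matrix: a column of ones for the intercept, then the two predictors.
    X = np.column_stack([np.ones(len(babip)), ld_rate, iffb_rate])
    coefs, *_ = np.linalg.lstsq(X, babip, rcond=None)

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    fitted = X @ coefs
    ss_res = np.sum((babip - fitted) ** 2)
    ss_tot = np.sum((babip - babip.mean()) ** 2)
    return coefs, 1.0 - ss_res / ss_tot


# Synthetic placeholder data (NOT real pitchers, NOT the article's formula).
rng = np.random.default_rng(0)
n = 200
ld = rng.normal(0.20, 0.02, n)    # made-up line-drive rates
iffb = rng.normal(0.10, 0.03, n)  # made-up infield-fly rates
babip = 0.25 + 0.4 * ld - 0.3 * iffb + rng.normal(0.0, 0.01, n)

coefs, r2 = fit_babip_model(ld, iffb, babip)
print("intercept, LD coef, IFFB coef:", coefs)
print("share of BABIP variance explained (R^2):", r2)
```

With real data you would pass in each pitcher's rates and BABIP over the sample of interest. The long-run and single-season figures quoted above would then correspond to fitting on career totals versus individual seasons.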

## Part II: Curveball Velocity, Location, or Movement: What is more important?

Stated as simply as possible, the goal of pitching is to get batters out without allowing runs to score. There are three ways any given pitch can get a batter out. A pitch can be swung at and missed, taken for a called strike, or put in play in such a way that the batter does not reach base. Batted balls involve the defence and are therefore less directly related to the pitch’s effectiveness at getting outs. That leaves us with swinging strikes and called strikes as the two best ways to measure a pitch’s effectiveness.

In Part I of my research on curveballs, I looked at what makes a curveball effective from a swinging strike perspective. I used an outcome variable that I like to call the ratio of effectiveness, which is simply the ratio of swinging strikes to home runs hit. In Part II of my research, I will look at the effectiveness of curveballs from a called strike perspective. This work will aim to answer two basic questions: 1) Are curveballs taken for strikes more often than fastballs? 2) What are the characteristics of the curveballs most often taken for strikes?
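As a quick illustration of the ratio of effectiveness, here is a minimal sketch. It assumes the ratio is swinging strikes divided by home runs (the text does not spell out which is the numerator). The function name and the handling of zero home runs are my own assumptions, not something defined in Part I.

```python
def ratio_of_effectiveness(swinging_strikes: int, home_runs: int) -> float:
    """Swinging strikes per home run allowed on a given pitch type.

    Assumption: with zero home runs the ratio is undefined, so this sketch
    returns infinity rather than raising an error.
    """
    if home_runs == 0:
        return float("inf")
    return swinging_strikes / home_runs


# Hypothetical counts, for illustration only.
print(ratio_of_effectiveness(30, 2))  # 15.0
```

A higher value means more swings and misses for every home run given up.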

### Are curveballs taken for strikes more often than fastballs?

## Infield Fly Proposal

117 years ago, in response to an epidemic of infielders intentionally dropping popups to attempt double plays instead, the National League adopted the infield fly rule, and with some minor adjustments, the rule has survived to the present. Like many remedies from the 1800s, the intent (protecting the offense from chicanery) was good, but the implementation (calling the batter automatically out) was fraught with problems.

First, and most obviously in light of recent events, even when the defense can’t make the play, the rule intended to protect the offense punishes it by giving the defense the out anyway. Second, the offense should be protected any time a fly ball can be intentionally dropped for a good shot at a double play, but because invoking the rule means calling the batter automatically out, the rule as written can’t be invoked liberally. Third, and related to the second, the umpires have to make a judgment call based on the trajectory of the ball, the position of the fielder, environmental factors, and anything else they consider relevant to determining “ordinary effort.” That leads to late calls and inconsistent application.