Are The Umpires At It Again?

Pitchers have been throwing to different parts of the plate than they did last year. (via Brian Mills)

Editor’s Note: This piece was initially given as a presentation at the marvelous 2016 Saberseminar.

Back in 2014, it became clear that a large portion of the drop in run scoring that began in the 2000s (as much as 40 percent) could be attributed to umpires expanding their strike zone downward by about three inches. This was confirmed by three independent researchers: Jon Roegele here at THT, Ben Lindbergh at Grantland, and yours truly. The trend toward a larger strike zone had started around 2009 and continued through 2014. Prior to the 2015 season, Jeff Passan reported it was something the league would look into, particularly if the low-scoring games were less interesting to fans. But, as Jon showed, the strike zone expansion continued as the 2015 season began.

Beginning last August, however, home runs started to increase and offense made a small comeback. In 2016 that trend has continued, and data from MLB’s new Statcast system show exit velocity is to blame: The ball is coming off the bat harder than it was last year. This has been well documented by others.

The increase in exit velocity and home runs has led to various theories about the return of steroids or “juiced” baseballs. These are pretty serious accusations. The former implies impropriety among players. The latter has a precedent I’m sure Rob Manfred would prefer to avoid: It cost Ryozo Kato, the commissioner of Nippon Professional Baseball in Japan, his job.

Most recently, Rob Arthur and Ben Lindbergh put forth some interesting evidence regarding the juiced-ball theory over at FiveThirtyEight. But Alan Nathan presented some evidence here at THT that was inconsistent with the juiced-ball theory. The investigation into the juiced ball, at least to some extent, seems fueled by Jon Roegele’s findings after the 2015 season, which concluded that the strike zone wasn’t changing much.

Contrary to Jon’s conclusion, my own data hinted at something different. When I looked at the average height of called strikes–an admittedly surface-level look–it seemed to be ticking upward in the latter part of the 2015 season.

Pairing that with Alan’s more recent presentation, I was rather skeptical that the juiced ball was a good explanation of the home run and scoring increases. But I wanted a larger sample than the last two months of the 2015 season to follow up on Jon’s findings with the strike zone.

So I decided to investigate for my Saberseminar talk. I figured I could show the evidence that, as before, the umpires are a large culprit in all of this. I returned to the data last month, adding games through July 2016. The trend in called-strike height seemed to continue.

LOESS of Daily Average Called Strike Height
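
For readers unfamiliar with the smoother behind this figure, here is a minimal Python sketch of the idea: average the called-strike height for each day, then fit a degree-1 LOESS (local linear regression with tricube weights). It is not the code behind the figure. The function names, the `(day_index, height_ft)` input layout, and the `frac=0.5` span are all assumptions made for illustration, and the sketch omits the robustness iterations that some LOESS implementations add.

```python
from collections import defaultdict


def daily_mean_height(pitches):
    """Average called-strike height per day.

    pitches: iterable of (day_index, height_ft) pairs (hypothetical layout).
    Returns (sorted days, mean height for each day).
    """
    sums = defaultdict(lambda: [0.0, 0])  # day -> [sum of heights, count]
    for day, height in pitches:
        sums[day][0] += height
        sums[day][1] += 1
    days = sorted(sums)
    return days, [sums[d][0] / sums[d][1] for d in days]


def loess(xs, ys, frac=0.5):
    """Degree-1 LOESS: a tricube-weighted linear fit around each x."""
    n = len(xs)
    k = min(n, max(2, int(round(frac * n))))  # neighbours used per local fit
    fitted = []
    for x0 in xs:
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted
```

A larger `frac` gives a smoother curve and a smaller one tracks day-to-day noise more closely, so any apparent trend should be checked against more than one span.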

It would seem reasonable to expect much of this change would be caused by umpires cutting down on strikes at the knees. But it also could be a selection bias issue or simply an expansion of the zone up high. So let’s take a closer look. The obvious question from all of this is: What does the strike zone look like now, and has this impacted run scoring since the 2015 All-Star Game?

I’ve been modeling the strike zone with PITCHf/x data since about 2010, always with the same non-parametric method: a generalized additive model (GAM). GAMs are relatively flexible and allow us to make pretty pictures of the zone as well as control for various factors related to changes in the strike zone (like the ball-strike count). I’ll start with a simple GAM of the strike zone to get you acquainted with the visuals, and I’ll tell the rest of my story mostly in pictures. (Note: You can see enlarged versions of any of the following visuals if you open them in a new tab.)

Strike Zone in the Pre-All Star Game 2015 Era

Notice that the darker the red, the more likely a pitch will be called a strike. The darker the blue, the less likely it will be called a strike in that location. There’s nothing particularly surprising here. Pitches down the middle are almost always called strikes. And pitches five feet off the ground and two feet inside are never called strikes.

But comparing these figures across time periods is a bit difficult, especially with small changes. It’s much easier to plot the changes in the strike zone for data prior to the 2015 All-Star Game (the “PreASG15” era, starting at the beginning of 2015) and data from after the 2015 All-Star Game (the “PostASG15” era continuing through July 23, 2016).

In other words, I subtract the probability that a pitch, given its location, is called a strike in the PreASG15 era from the probability that the same pitch is expected to be called a strike in the PostASG15 era. The result is below.

Change in Called Strike Probability After 2015 All-Star Game

Notice the scale is similar to the previous plot. Where the probability of a strike decreases in the PostASG15 era relative to the PreASG15 era, the plot is darker blue. Where the probability of a strike call increases in the PostASG15 era, it is darker red. White or light grey is neutral, meaning there is no change in strike probability from the PreASG15 to the PostASG15 era.
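
The differencing step itself is simple enough to sketch in Python. The version below is a much cruder stand-in for the GAM: it swaps the smooth model for raw called-strike rates in six-inch cells, then subtracts the earlier era from the later one. The `(px, pz, is_strike)` input layout, the cell size, and the function names are assumptions made for illustration, not the analysis code behind the plot.

```python
from collections import defaultdict


def strike_rate_by_cell(pitches, cell_ft=0.5):
    """Called-strike rate per square cell.

    pitches: iterable of (px, pz, is_strike) for taken pitches, where px and
    pz are the horizontal and vertical location in feet (hypothetical layout).
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [strikes, total pitches]
    for px, pz, is_strike in pitches:
        cell = (int(px // cell_ft), int(pz // cell_ft))
        counts[cell][0] += int(is_strike)
        counts[cell][1] += 1
    return {cell: s / n for cell, (s, n) in counts.items()}


def strike_rate_change(pre, post, cell_ft=0.5):
    """Post-era minus pre-era strike rate, for cells observed in both eras."""
    pre_rate = strike_rate_by_cell(pre, cell_ft)
    post_rate = strike_rate_by_cell(post, cell_ft)
    return {c: post_rate[c] - pre_rate[c] for c in pre_rate if c in post_rate}
```

Negative values correspond to the blue regions (fewer strike calls in the later era) and positive values to the red ones. Unlike a GAM, raw cell rates are noisy where few pitches are thrown and cannot control for the ball-strike count.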

The story is relatively clear. For both right-handed and left-handed batters, the probability of a strike call on low pitches has decreased substantially. (The base rate of called strikes for these pitches ranges from 20 to 60 percent.) In some cases, the probability of a low outside strike has decreased by as much as 12 percentage points. Much of this difference is low and outside, pitches we know to be more difficult to hit with high exit velocities or for home runs.

Interestingly, there are also more strikes being called up in the zone. While this would mitigate some of the decrease in the total size of the strike zone, it could increase the rate at which balls are hit in the air, possibly leading to more home runs.

As an academic interested in economics and incentives, my first thought was that the players must have recognized this change and adjusted their behavior strategically. For example, we might expect these changes to induce pitchers to throw up in the zone and over the plate more often. With more pitches over the plate, it’s possible batters are squaring the ball up more consistently since they don’t have to worry about those low, outside pitches as much. The behavioral change would be easy enough to confirm in the data.

To identify changes in pitch location, I switched to using kernel density estimation, which simply evaluates the proportion of pitches in a given location, rather than the probability of calling those pitches strikes (or some other event), given location. I use the same differencing method as with the strike zone, calculating the change in the rate that pitchers throw to certain areas of the strike zone in the PostASG15 era. The result is visualized below.

Change in Pitch Location Density After 2015 All-Star Game
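
As a rough illustration of the technique (again, not the code behind the figure), a two-dimensional Gaussian kernel density estimate and the era-to-era difference can be sketched in plain Python. The fixed 0.25-foot bandwidth, the `(px, pz)` input layout, and the function names are assumptions; a real analysis would pick the bandwidth from the data.

```python
import math


def kde2d_at(points, x, z, bw=0.25):
    """Gaussian product-kernel density estimate at location (x, z).

    points: list of (px, pz) pitch locations in feet (hypothetical layout).
    bw: kernel bandwidth in feet, the same in both directions (an assumption).
    """
    norm = 1.0 / (2.0 * math.pi * bw * bw * len(points))
    total = 0.0
    for px, pz in points:
        u = (x - px) / bw
        v = (z - pz) / bw
        total += math.exp(-0.5 * (u * u + v * v))
    return norm * total


def density_change(pre, post, grid, bw=0.25):
    """Post-era minus pre-era pitch density at each (x, z) grid location."""
    return {
        (x, z): kde2d_at(post, x, z, bw) - kde2d_at(pre, x, z, bw)
        for x, z in grid
    }
```

Because each era's density integrates to one, the difference surface shows where the share of pitches moved, not how many pitches were thrown.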

The changes here are again rather clear. The reduction in called strikes on the low-outside corner has induced pitchers to throw to that location less often. That has moved pitches inward toward the plate, and the inner half is being targeted more often than it was in the PreASG15 era. These changes are pretty subtle at an extra pitch or two per game.

But given the increase in pitches on the inner half of the plate, we should see more swings and contact there as well. Again using kernel density visuals, this is evident in the data.

Change in Swing Location Density After 2015 All-Star Game

Change in Contact Location Density After 2015 All-Star Game

The net effect here is around one additional contacted pitch on the inside half per game, and an average contact point between one tenth and one quarter of an inch closer to the center of the plate. Should we expect these small changes to result in home run increases?

As I did with the probability of a strike call, I use a GAM to estimate the probability of hitting a home run when the batter swings, conditional on the location of the pitch. And, as I suspected, the most common location for home runs aligns almost perfectly with the locational increases in pitches, swings, and contact in the PostASG15 era.

Probability of Hitting a Home Run on a Swing (GAM Predictions)

Given all the evidence here, my next step was to apply this in the context of Simpson’s Paradox. The changes in the rate of pitches in locations that result in higher exit velocities and home run rates, when averaged in aggregate, could result in what looks like an increase in how hard the ball is coming off the bat globally. In other words, if we reallocate the proportion of contact locations with the same associated exit velocities, can we explain the increased average aggregated exit velocity?

To do this, I broke the zone down into 36 separate six-inch by six-inch boxes as you see in the grid below. I then averaged the exit velocity of batted balls in each zone in the PreASG15 era and calculated a weighted average exit velocity using the PostASG15 proportions of contact in each zone. If the overall average exit velocity computed from the PostASG15 era proportions and the PreASG15 era exit velocities came close to the observed PostASG15 average, then we could conclude the increase is largely due to changes external to a juiced ball.

Generic Grid of Discrete Zone Breakdown

From this grid, we take the average exit velocity for Zone 1 (top left) in the PreASG15 era and multiply it by the proportion of contacted pitches in Zone 1 in the PostASG15 era. We do the same for Zone 2, Zone 3, and so on. These estimates in each zone are just a discrete version of the density plots (contact proportion) and GAMs (home run rate, or in this case, exit velocity).
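
The re-weighting can be sketched in a few lines of Python. The three zones and every number below are hypothetical, chosen only to show the mechanics; the actual calculation used 36 zones and the real PreASG15 and PostASG15 data.

```python
def reweighted_mean(ev_by_zone, share_by_zone):
    """Mean exit velocity implied by per-zone velocities and contact shares."""
    total_share = sum(share_by_zone.values())
    weighted = sum(ev_by_zone[zone] * share for zone, share in share_by_zone.items())
    return weighted / total_share


# Hypothetical three-zone example (NOT the article's data).
pre_ev = {"inside": 90.0, "middle": 92.0, "outside": 86.0}      # mph, earlier era
pre_share = {"inside": 0.30, "middle": 0.40, "outside": 0.30}   # contact mix, earlier era
post_share = {"inside": 0.32, "middle": 0.40, "outside": 0.28}  # contact mix, later era

baseline = reweighted_mean(pre_ev, pre_share)         # earlier-era average
counterfactual = reweighted_mean(pre_ev, post_share)  # old velocities, new mix
location_effect = counterfactual - baseline           # change due to location alone
```

In this made-up example, shifting two percentage points of contact from the outside zone to the inside zone raises the average by about 0.08 mph even though no zone's exit velocity changed. That is the Simpson's Paradox mechanism the exercise is meant to test.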

And doing this did result in an increased exit velocity estimate. However, the change was only about 0.055 mph, or 5.5 percent of the actual change of one mph between the PreASG15 and PostASG15 eras. But I wasn’t satisfied with this. There could be other considerations. (I won’t go through the mathematical gymnastics here, but the change in inside rate is about 0.1 to 0.3 percentage points in a given zone area. I’m happy to share the specific numbers with anyone interested.)

It’s also possible the ball-strike count has become more favorable to batters, and in turn, batters are swinging at pitches more often in 3-1 or 3-0 counts. So while the locational differences didn’t result in much, perhaps batters simply are more prepared to sit on pitches in these counts and, in turn, hit the ball harder on average. Going through the same re-weighting exercise, I accounted for another 0.025 mph. That brings the total increase explained to only eight percent.
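
The bookkeeping behind that eight percent figure is just the two re-weighting effects added together and divided by the observed change. A small Python check, using only the numbers quoted above (the variable names are illustrative):

```python
observed_change_mph = 1.0  # PreASG15 to PostASG15 change quoted in the text
explained_mph = {
    "contact location": 0.055,   # from the zone re-weighting
    "ball-strike count": 0.025,  # from the count re-weighting
}

total_explained = sum(explained_mph.values())
share_explained = total_explained / observed_change_mph
print(f"{total_explained:.3f} mph explained, {share_explained:.0%} of the change")
```

Adding the two effects treats them as separate pieces of the change. If location and count shifts overlap, the combined share could be somewhat smaller.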

I remained stumped and a bit more open to the idea that the manufacturing of the ball may have changed slightly. But there are a few additional considerations that could be contributors.

It’s clear batters have been hitting balls at more favorable launch angles for home runs. The figure below is from a GAM estimation that evaluates the change in probability a batter hits the ball in the “sweet spot” angle for home runs. Clearly, some things have changed here, too.

Change in Probability in HR Sweet Spot Launch Angle After 2015 All-Star Game

There are two important takeaways from this plot. First, hitting the ball at these angles should increase the probability of home runs and, in turn, explain some of the increase in scoring. But Alan Nathan shows us that it’s not enough on its own to explain the increase in home runs. Second, if balls are being hit at more favorable angles more often, it may also be that batters are squaring them up better. There is some evidence this may be the case, as the variability in exit velocity has been reduced overall in the PostASG15 era (though by only about two percent) as measured by the coefficient of variation.

The skewness of the exit velocity distribution has also changed (again, very slightly) in a way that implies more balls are being hit harder, but the hardest hit balls are not necessarily being hit harder. Again, this seems to indicate some change toward more consistent squared contact across the league. But it’s very small, and as Rob Arthur recently showed, some of this could be due to systematic missing data in the tracking system in the PreASG15 era.
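
For reference, both summary statistics are easy to compute. The Python sketch below uses population formulas, which is an assumption: the text does not say whether population or sample versions were used, and the function names are illustrative.

```python
import statistics


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(values) / statistics.fmean(values)


def skewness(values):
    """Population skewness: the third standardized moment."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)
```

A lower coefficient of variation means exit velocities are bunched more tightly around their mean. A more negative skewness means more of the distribution sits near the hard-hit end, with a longer tail of weak contact.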

With more detailed, reliable data, one might be able to “reverse engineer” contact quality beyond just exit velocity and launch angle. For example, the batted-ball spin rate and direction may tell us more about the swing plane and how squarely balls are hit. Unfortunately, those data are not publicly available, and I’m told the results are still not particularly reliable.

Ultimately, even in the reverse-engineering scenario, it’s not clear we could account for the entire change. And it’s also not clear why the entire league so suddenly would change the way it approaches hitting. My hope is Alan Nathan can enlighten us all in the coming months regarding the exit-velocity puzzle.

So, after nearly 3,000 lines of R code and sifting through hundreds of thousands of observations, I explained very little. But sometimes that’s the fun of scientific inquiry. If we always made clear discoveries that could explain what was happening, we would be deprived of the challenge that makes the inquiry fun in the first place. It will be interesting to see how both offensive output and research on exit velocity continue to unfold.

What I do find fascinating, however, is that there are apparently some changes consistent with pitchers responding to incentives. The same behavior took place when the strike zone worked its way downward. If umpires continue to change the pitches they call strikes in different ways, it should be fun to keep tabs on how this induces strategic changes among pitchers.

Brian Mills is an Assistant Professor of Sport Management at the University of Florida, focusing on the economics of sports. He also teaches sports analytics courses at Data Camp, such as "Exploring Pitch Data with R". Follow him on Twitter @bmmillsy and/or email him here.
dominik

Good article, however regarding high strikes causing more FBs:

Nathan states that FB rate is not up but the increase is due to harder contact.

League FB rate:
2014: 34.4
2015: 33.8
2016: 34.5

FBs are clearly not increased even though everyone talks about hitting fly balls (on twitter the Motto of the Internet hitting Gurus is now “elevate and celebrate”:)).

Brian Mills

Hi dominik,

There has been a pretty clear increase in the average exit angle, though this doesn’t necessarily mean there are more fly balls as categorized within the FB rate. It could mean that there are more line drives and less GB, or all the change could be happening within the already defined designations (going from 29 to 30 degree launch angle, etc.). So perhaps my statement wasn’t as precise as it should have been. As I noted in the article, the change doesn’t seem to explain the HR change. So I’m with you on that.

Jimmy Sweetman

Holy research paper, this article was fantastic!

I also noticed you have the same name as Liam Neeson in Taken. I am certain I’m not the first to comment on that.

Brian Mills

I used my very special set of R skills for this inquiry.

I will find you.

And I will thank you for the kind words.

phoenix2042

“The skewness of the exit velocity distribution has also changed (again, very slightly) in a way that implies more balls are being hit harder, but the hardest hit balls are necessarily being hit harder. Again, this seems to indicate some change toward more consistent squared contact across the league.”

Did you mean “but the hardest hit balls are [NOT] necessarily being hit harder”? I just wanted to clarify. This was amazing work! I learned a lot (and yet, so very little, but as you say, that’s the fun of it).

Brian Mills

Yes! That’s a typo. There should be a “NOT” there.

MGL
Excellent stuff Brian! So it sounds like the average exit velocity in each section of the strike zone IS up on the average which is important information. IOW, I want to know if average exit velocity is up using the delta method per section of the strike zone. So, up and in, avg. exit speed before is 80 mph, after is 81, # pitches in that zone is 1000. Up and middle, avg. exit speed is 85 before and 87 after, 500 pitches. So far, we have (81-80) * 1000 + (87-85) * 500 / (1000+500). That’s what I mean…
Brian Mills
Hi MGL, Thanks. Yes, it is up in most areas of the strike zone. Though the differences vary pretty considerably by zone. All I did here was assume the same velocity as before, but with the new proportions (naive calculation). I do it precisely as you lay out here, but using the prior velocity rather than the change in velocity. Using the change should put us right at the aggregate velocity increase (I think…right?). I think there is a lot of room for a further breakdown though (especially in thinking about fastballs, etc.). I didn’t get too deep into it…
Guy
Really nice work, Brian. And a great blow against publication bias, since in the end you are (mostly) reporting a non-finding. Two questions, and a theory: Question: Does the reported 1 mph increase in exit velocity account for the big drop in sac bunts? If not, that might account for another 10-15% of the increase. Question: Has anyone looked at hitters who have especially increased their HR% in past 1.5 seasons, to see if they share any particular characteristics, or have made any common change in hitting approach? It might be worth looking to see if they are more heavily…
Brian Mills
Thanks Guy. Good questions. On #1, I remove all bunts from my data (foul balls, too). I was worried that might affect results. It doesn’t seem to do much for the overall exit velocity (though maybe I should double check, since my data shows a 1 mph increase, while others have reported a 1.5 mph increase…so maybe some of it is my removal of bunts? Depends on how others treated them. I think I get a smaller velo change than Alan because I use data from months with higher exit velos in general (summer months) in 2015 and compare to…
Steve

Amazing work, great visuals

Don Smythe
Really nice work, Brian. But one other possibility (I believe Bill James proposed this theory in his second Abstract). Is one reason for the uptick in home runs because it’s gotten more difficult to put together a sequential offense (i.e. it’s easier to swing for the fences instead of stringing together three singles)? IIRC, James (writing during one of the great hitters’ eras), wrote a few graphs suggesting that if baseball raised the mound to dampen hitting, the result would be lower batting averages and more hitters swinging for the fences. A bigger strike zone would likely have the same…
Brian Mills

I honestly have no idea if that’s the case, but that’s a great behavioral economics idea! Seems worth looking into. I’d love to do so at some point.

dominik
I do believe that there now is more emphasis on getting the ball in the air, even among small middle IF types. they used to only “allow” big guys to try to hit homers but speedy guys were told to hit the ground. but now even smaller guys are allowed to trade a few hits for more power, for example trea turner said he wants to hit flies even though his coaches said to hit it on the ground and run. also the twins were slammed for telling buxton to hit it on the ground. there is a changing mentality…
Tim

Very informative article. What R package(s) did you use for the GAMs and kernel density estimates?

Brian Mills

I estimate GAMs with the mgcv package with bam() and the parallel package, and two dimensional kernel density with kde2d. For the visuals, I do some combinations of filled.contour with RColorBrewer.

Some info on my old blog here on the estimation of GAM models for strike zones: http://princeofslides.blogspot.com/2013/07/advanced-sab-r-metrics-parallelization.html

Brian Mills

I should note that my code for visualization is adapted very much from earlier work by Dave Allen.

Eric
Lovely analysis, and great visuals. Thank you! But are we certain this isn’t due to chance? You could simulate data by bootstrapping from the population of all the pitches, and see how often we see a change like this one, in called balls and strikes, between the simulated pre-ASB2015 pitches and the simulated post-ASB2015 pitches. First, you could eyeball the heatmaps of differences, to see if the random simulated ones often give rise to what look like meaningful patterns; and maybe a crude calculation, what proportion of the time do we see this extreme a change in proportion of called…
Brian Mills
Thanks Eric. That would certainly be possible, and these are small changes. Since the original inquiry was to evaluate whether the locational change (whether random or systematic) would be enough to explain the exit velocity changes. Doesn’t seem that they are. The robustness of the changes for both RHB and LHB give me a bit more confidence (not just a shift in one direction due to a measurement change in the data) in the data moving in a direction consistent with the apparent zone changes. But it’s worth taking a more rigorous probabilistic view on the location changes themselves, for…
pft
LOL. The hoops people jump through to avoid the juiced ball theory. Sometimes the simplest explanation is the best explanation. Occam’s Razor. The HR/G rate in the AL is the highest ever recorded in the AL in history including the height of the steroid era (also believed by some to be juiced ball era as well) . HR’s are up 15% over last year and 30% over 2014. B ABIP is up as is exit velocity, so its not all HR’s. Even Nathans data showed exit velocity was up for all batted ball types. His one comment that cast doubt…
Brian Mills

“Sure the strike zone may be up a bit, but explain how guys and teams who have always lived up in the zone have been affected the most by a jump in HR rates where they are getting more calls…”

I can’t explain much of the change in HR at all. That’s the entire point of the article.

Here, I’ll CliffsNotes it for you:

“So, after nearly 3,000 lines of R code and sifting through hundreds of thousands of observations, I explained very little. “

evo34

I’m not claiming to be able to explain the apparent suddenness of change, but can anyone remember a time in the AL with so few young stud starting pitchers? The top 3 under-26 AL SPs are probably Aaron Sanchez, Michael Fulmer and Marcus Stroman. Off-hand, that trio seems historically weak — even adjusted for the higher offensive environment (if there is one). And the AL minor league pitching talent is practically non-existent.

I.e., I could see AL run scoring increase even more next year.
