# Going Deep on Goin’ Deep

Over the years I have been studying the physics of baseball, I have been totally fascinated with baseball aerodynamics. In a simple Physics 101 world, where the effects of the atmosphere are neglected, baseball trajectories are pretty boring. But we don’t live in such a simple world, and the atmospheric effects of drag and lift play a crucial role in the flight of a baseball.

When PITCHf/x data first became public in 2007, that system produced a veritable bonanza of information that has helped considerably with our quantitative understanding of the effects of drag and lift on a pitched baseball. But after a while those kinds of trajectories get pretty boring too, since they mostly follow a straight line, with just a little bit of deviation due to the combined effects of gravity and spin.

Far more varied, and therefore more interesting, are the trajectories of batted baseballs, which run the gamut from line drives (which are sort of like pitches) to fly balls to pop-ups. If the goal is to understand the atmospheric effects with quantitative precision, it is necessary to investigate all these varied kinds of trajectories. With the advent of Statcast, we now have the opportunity to do just that.

In this article, I will use Statcast data to take a first look at batted-ball trajectories, with the goal of developing an aerodynamics model, including the effects of drag and lift, based on the variation of flyball distances as a function of exit speed, launch angle, and air density. The data used in this study consisted of exit speeds, launch angles, and distances of approximately 80,000 batted balls from the 2015 season.

Although the spray angle was also part of the data set, it was not used in the analysis I present here. Since the home field and game-time temperature for each batted ball were also known, it was straightforward to calculate the air density. I did not have access to the actual pressure or relative humidity, so I assumed standard atmospheric pressure and 50 percent relative humidity.
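For readers who want to reproduce an air density estimate of this kind, here is a minimal Python sketch. It is not necessarily the calculation used for this analysis. It assumes the ideal-gas law for moist air, a Magnus-type approximation for saturation vapor pressure, and standard-atmosphere pressure at the ballpark's elevation. The function names, the Celsius and meter inputs, and the constants are all choices made for this illustration.

```python
import math

R_DRY = 287.058    # specific gas constant of dry air, J/(kg*K)
R_VAPOR = 461.495  # specific gas constant of water vapor, J/(kg*K)


def saturation_vapor_pressure_pa(temp_c):
    """Saturation vapor pressure over water in Pa (Magnus-type approximation)."""
    return 610.78 * 10 ** (7.5 * temp_c / (temp_c + 237.3))


def standard_pressure_pa(elevation_m):
    """Standard-atmosphere pressure in Pa at a given elevation (assumption)."""
    return 101325.0 * (1.0 - 2.25577e-5 * elevation_m) ** 5.25588


def air_density(temp_c, elevation_m=0.0, rel_humidity=0.5):
    """Density of moist air in kg/m^3; rel_humidity is a fraction (0.5 = 50%)."""
    temp_k = temp_c + 273.15
    pressure = standard_pressure_pa(elevation_m)
    p_vapor = rel_humidity * saturation_vapor_pressure_pa(temp_c)
    p_dry = pressure - p_vapor
    # Sum of the dry-air and water-vapor partial densities.
    return p_dry / (R_DRY * temp_k) + p_vapor / (R_VAPOR * temp_k)
```

Because water vapor is lighter than dry air, raising the humidity lowers the density slightly, while raising the temperature or the elevation lowers it much more.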

Let me start with the overview shown in Figure 1.

For this analysis, I considered only flyball hits for which the air density was close to the major league average. In particular, extreme elevations (e.g. Denver) and temperatures were excluded. In effect, I am trying to get an overview without the added complication of extreme atmospheric conditions. The plot shows average flyball distances and their standard error as a function of launch angle for various values of the exit velocity.
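The averaging behind a plot like this can be sketched in a few lines of Python. This is only an illustration: the tuple layout, the function name, and the bin widths (5 mph and 2 degrees) are assumptions, not the binning actually used for the figure.

```python
import math
from collections import defaultdict


def binned_mean_sem(balls, speed_width=5.0, angle_width=2.0):
    """Summarize (exit_speed_mph, launch_angle_deg, distance_ft) tuples by bin.

    Returns {(speed_bin_start, angle_bin_start): (mean, standard_error, count)}.
    """
    groups = defaultdict(list)
    for speed, angle, distance in balls:
        key = (math.floor(speed / speed_width) * speed_width,
               math.floor(angle / angle_width) * angle_width)
        groups[key].append(distance)

    summary = {}
    for key, distances in groups.items():
        n = len(distances)
        mean = sum(distances) / n
        if n > 1:
            # Sample variance (n - 1 denominator), then standard error of the mean.
            variance = sum((d - mean) ** 2 for d in distances) / (n - 1)
            sem = math.sqrt(variance / n)
        else:
            sem = float("nan")  # a single ball gives no spread estimate
        summary[key] = (mean, sem, n)
    return summary
```

The standard error shrinks as the number of balls in a bin grows, which is why sparsely populated bins carry larger error bars.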

These data show quantitatively what we probably already knew, at least qualitatively. Namely, flyball distance reaches a maximum at launch angles in the vicinity of 25-30 degrees, with the angle decreasing slightly as the exit speed increases. Moreover, distances increase with exit speed at the rate of about five feet for each one mph increase in exit speed.

I have done this kind of analysis previously using HITf/x data, combined with independently measured home run distances, and found that an exit speed/launch angle of 100 mph/26 degrees leads to a mean distance of 405 feet, over 20 feet greater than found in the present analysis. The problem is a well-known issue with HITf/x exit speeds, which are measured at a distance somewhat removed from the ball-bat impact point, resulting in their being systematically underestimated. A 20-foot discrepancy corresponds to an underestimation of exit speed by about four mph.

These data are extremely valuable in developing and fine-tuning an aerodynamics model for the flight of the baseball. The important components of such a model are drag (i.e., air resistance) and lift (which results from the backspin). I use a model with five parameters that can be adjusted to best fit the data shown in the plot. Three of these parameters relate to drag and how it depends on the speed and spin of the baseball. The other two are used to specify the rate of backspin as a function of exit speed and launch angle.

The resulting model is shown by the dashed curves, which faithfully reproduce many of the features of the data. In particular, the model accounts for the slight shift in the peak of the distributions to smaller launch angle as the exit speed increases, a consequence of the increase of drag with speed. A notable exception to the good agreement is at the highest exit speed and angles below about 22 degrees, where the data fall distinctly below the curve and even appear to be discontinuous. Given that most things in nature behave smoothly, the data look suspect to me, but any stronger conclusion will have to await more data.
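To make the roles of drag and lift concrete, here is a deliberately simplified two-dimensional trajectory sketch in Python. It is not the five-parameter model described above. It assumes a constant drag coefficient, a made-up saturating lift law in the spin factor, a user-supplied backspin rate, a 1-meter launch height, and simple Euler time stepping. Every numerical value in it is an illustrative assumption.

```python
import math

MPH_TO_MS = 0.44704
M_TO_FT = 3.28084
G = 9.80665      # gravitational acceleration, m/s^2
MASS = 0.145     # baseball mass, kg (approximate)
RADIUS = 0.0366  # baseball radius, m (approximate)


def fly_distance_ft(exit_mph, launch_deg, backspin_rpm=2000.0,
                    rho=1.2, cd=0.35, y0=1.0, dt=0.001):
    """Integrate a 2-D flight with drag and backspin lift; return range in feet."""
    k = 0.5 * rho * math.pi * RADIUS ** 2 / MASS  # aerodynamic factor, 1/m
    omega = backspin_rpm * 2.0 * math.pi / 60.0   # spin rate, rad/s
    theta = math.radians(launch_deg)
    v0 = exit_mph * MPH_TO_MS
    x, y = 0.0, y0
    vx, vy = v0 * math.cos(theta), v0 * math.sin(theta)
    while True:
        v = math.hypot(vx, vy)
        s = RADIUS * omega / v          # dimensionless spin factor
        cl = s / (0.4 + 2.32 * s)       # illustrative lift law (assumption)
        # Drag opposes the velocity; backspin lift is perpendicular to it.
        ax = k * v * (-cd * vx - cl * vy)
        ay = k * v * (-cd * vy + cl * vx) - G
        x_new, y_new = x + vx * dt, y + vy * dt
        vx, vy = vx + ax * dt, vy + ay * dt
        if y_new < 0.0:
            # Interpolate linearly to the point where the ball reaches the ground.
            frac = y / (y - y_new)
            return (x + frac * (x_new - x)) * M_TO_FT
        x, y = x_new, y_new
```

Setting `rho=0.0` recovers the vacuum trajectory, and lowering `rho` from its default lengthens the flight, which is the qualitative behavior discussed in the rest of this article. Fitting parameters such as `cd` to the measured distances is the harder part and is not attempted here.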

Figures 2 and 3 show mean distances for fly balls hit with an exit speed in the range 101-105 mph and with launch angle in the range 25-30 degrees.

Figure 2 plots the mean distance versus air density along with a dashed line showing the model calculation. Interestingly, the two points on the left of Figure 2 are both from Denver, where the air density still varies with temperature. Figure 3 plots the mean distance for each major league stadium, with Denver the clear winner at 430 feet, compared with 401 feet for the average of the other stadiums, indicated by the red dashed line. The Denver effect is huge!

Since the model is an excellent representation of the data, we can use it to draw some interesting conclusions about how flyball distance depends on the various atmospheric effects. Some of these effects are shown in the table below, all calculated relative to 401 ft, which is the major league average distance (Denver excluded) for exit speed 101-105 mph and launch angle 25-30 degrees.

I next want to examine the Denver effect in more detail. To that end, Figures 4 and 5 compare distances in Denver with those at sea level, where the latter actually refer to air densities in the range 1.15-1.20 kg per cubic meter (kg/m³).

| Atmospheric Effect | Change in Distance |
| --- | --- |
| 10-degree increase in temperature | 3.3 ft |
| 1000 ft increase in elevation | 5.9 ft |
| 50% increase in relative humidity at 750 ft | 0.9 ft |
| 5.0 mph out-blowing wind | 18.8 ft |
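As a rough way to use the table, the sensitivities can be scaled linearly and added together. The Python sketch below does that. The linear scaling and the additivity are assumptions of this illustration rather than results of the model, and the function and parameter names are hypothetical.

```python
# Sensitivities read from the table (feet of carry per unit change).
FT_PER_DEGREE = 3.3 / 10.0           # per degree of temperature
FT_PER_FT_ELEVATION = 5.9 / 1000.0   # per foot of elevation
FT_PER_RH_POINT = 0.9 / 50.0         # per percentage point of relative humidity
FT_PER_MPH_TAILWIND = 18.8 / 5.0     # per mph of out-blowing wind


def distance_change_ft(d_temp=0.0, d_elevation_ft=0.0,
                       d_rh_points=0.0, tailwind_mph=0.0):
    """Estimated change in flyball distance, assuming linear, additive effects."""
    return (FT_PER_DEGREE * d_temp
            + FT_PER_FT_ELEVATION * d_elevation_ft
            + FT_PER_RH_POINT * d_rh_points
            + FT_PER_MPH_TAILWIND * tailwind_mph)


# Denver sits roughly 5,200 ft above sea level (approximate input, not from the data).
print(round(distance_change_ft(d_elevation_ft=5200), 1))
```

For an elevation gain of about 5,200 feet this gives roughly 31 feet, in line with the 29-foot difference between Denver (430 feet) and the average of the other stadiums (401 feet).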

Figure 4 shows distance versus exit speed for launch angles in the 25-30 degree range, while Figure 5 shows distance versus launch angle for exit speeds in the range 101-105 mph. As before, the lines are the model calculation.

From Figure 4 we learn that the slope of distance versus exit speed is larger for Denver than at sea level, so that the Denver effect increases from about 19 feet at 91 mph to about 32 feet at 110 mph. From Figure 5 we learn that the distance peaks at a slightly larger launch angle in Denver than it does at sea level. These results make sense physically, as reducing the air density at higher elevations pushes the trajectories closer to those expected in a vacuum, where distance increases much more rapidly with exit speed and peaks at 45 degrees. The aerodynamics calculation nicely accounts for both of these features.
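The vacuum limit mentioned here is the textbook projectile result: the range is the exit speed squared times the sine of twice the launch angle, divided by the gravitational acceleration. A short Python sketch, ignoring the height of the ball at contact and using a hypothetical function name, makes the comparison easy.

```python
import math


def vacuum_range_ft(exit_mph, launch_deg, g=9.80665):
    """Range in feet of a projectile launched from ground level with no air."""
    v = exit_mph * 0.44704  # mph to m/s
    range_m = v ** 2 * math.sin(math.radians(2.0 * launch_deg)) / g
    return range_m * 3.28084  # m to ft


# The range peaks at 45 degrees and grows with the square of the exit speed.
best_angle = max(range(0, 91), key=lambda a: vacuum_range_ft(103, a))
print(best_angle, round(vacuum_range_ft(103, 45)))
```

At 103 mph and 45 degrees the vacuum range is a little over 700 feet, far beyond the 401-foot average quoted above, which shows how strongly the atmosphere shapes real fly balls.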

Another interesting comparison is Arizona and San Francisco, shown in Figure 6.

Arizona is about 1,000 feet higher in elevation than San Francisco and has an average temperature about 17 degrees warmer, both of which contribute to a lower air density and therefore a longer distance, just as shown in the plot. Once again, the calculation agrees with the general trend of the data.

But not everything is as well understood. As an example, consider Figure 7, which compares Tropicana Field with Wrigley Field.

These two venues have mean air densities that are nearly identical, yet the data show the ball carrying measurably better at the Trop, by an average of over 10 feet. Perhaps we are seeing the net effect of an in-blowing wind at Wrigley, noting that no wind is expected at the covered Trop.

Finally, I want to take advantage of the fact that we have an aerodynamic model that accounts for most of the features of the data to investigate how flyball distance depends on the amount of backspin, here for a fixed exit speed of 103 mph and launch angle of 27 degrees. The results are given in the table below. They show that distance increases rapidly as the backspin increases from zero but eventually saturates, with very little gain in distance for spin rates exceeding about 1,500 rpm. The saturation occurs partly because air drag increases with increasing spin, essentially canceling the increase in lift.

| Backspin Rate (rpm) | Distance (ft) |
| --- | --- |
| 0 | 336 |
| 500 | 368 |
| 1,000 | 386 |
| 1,500 | 395 |
| 2,000 | 400 |
| 2,500 | 403 |
| 3,000 | 403 |
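The saturation is easiest to see by looking at the gain in distance for each additional 500 rpm. The short Python sketch below computes those differences directly from the table values. The variable names are just for illustration.

```python
# Values copied from the table above.
backspin_rpm = [0, 500, 1000, 1500, 2000, 2500, 3000]
distance_ft = [336, 368, 386, 395, 400, 403, 403]

# Gain in distance for each successive 500 rpm step.
gains = [d1 - d0 for d0, d1 in zip(distance_ft, distance_ft[1:])]

for start, end, gain in zip(backspin_rpm, backspin_rpm[1:], gains):
    print(f"{start:>5} -> {end:>5} rpm: +{gain} ft")
```

The first 500 rpm is worth 32 feet, while everything beyond 1,500 rpm adds only 8 feet in total.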

Before concluding, it is useful to remind the reader that the analysis considers only average distances for given values of exit speed and launch angle and that actual distance may vary. One reason for variation might be wind. Another might be variation in the drag properties of individual baseballs, which is a topic I addressed in a previous article and which can lead to a significant variation in distance.

I very much look forward to continuing my analysis to fine-tune the aerodynamics model. The work presented here was “two-dimensional,” in that the spray angle was ignored. Including the spray angle, both at impact and at the landing point, allows for the determination of the rate of side spin on the batted ball. Moreover, using the spin measured directly from the Trackman device — an integral part of Statcast — as well as the hang time, should allow better determination of the lift properties of the trajectory. There is still lots to do and, hopefully, lots of data to help do it.

Do the exit velocities themselves vary by ballpark, all other factors being equal? I remember seeing slightly higher exit velocities at Coors from the publicly available data on Baseball Savant from the 2015 season, but wasn’t sure if this was related to the limitations of that data set.

Great stuff as usual!

Fascinating read, as usual. Have a couple questions regarding “Figure 3”.

1) Do spin rates have an effect on this? I.e. do pitching staffs in those ballparks have backspin-suppressing abilities?

2) Is it likely that a lot of the variation could just be quirks with the methodology that will get fixed over time? I.e. I assume it calculates based on distances from each camera, which are not uniform ballpark to ballpark and even small variations in the distance estimates can produce consistently biased results.

Alan – http://www.ncdc.noaa.gov/qclcd/QCLCD this is a link to monthly data from Denver for August 2006. They also provide the same data in comma separated value format for easy loading into spreadsheets.

Well if the media gets ahold of this, the Rockies will never have another MVP. I mean Arenado hit .287 with 42 HR (22 on the road mind you) and 130 RBI and I believe he barely cracked the top 10. Yes Harper definitely deserved it, not saying that. Too bad notgraphs isn’t around anymore otherwise one could calculate what it would take for a Rockies player to win MVP lol

Alan, would you be more specific on how you calculated the numbers from the last table (distance vs backspin), please?

You mentioned that spin value is modelled via exit speed and launch angle (it’s where you talk about five parameters). Given that I supposed that 103 mph / 27 degrees balls might have had the same rpms. But in the table backspin varies.

Why is it so hard to hit at Safeco Field? Is there anything in your data set that begins to explain that?