Comparing FIPS and xFIPS Using Batted Ball Distance

In one of the World Series chats I hosted, someone stated that Matt Cain gives up weak fly balls, and that this is why his xFIPs (2010 = 4.19, lifetime = 4.43) are higher than his FIPs (2010 = 3.65, lifetime = 3.84). After finally getting all the wrinkles worked out, I am now able to get the average distance of the fly balls a pitcher gives up. So, does the distance of the fly balls a pitcher allows help explain the difference between his xFIP and FIP?

I took just the pitchers that threw over 60 innings in 2010 and subtracted their FIPs from their xFIPs. Then I got the average distance of all the fly balls for these pitchers and here are the top five leaders and laggards:

This past season, the five pitchers whose FIPs were furthest below their xFIPs allowed fly balls that traveled 26 feet less, on average, than those of the five pitchers whose FIPs were furthest above their xFIPs. I expected to find some difference, but 26 ft per fly ball between these two groups was pretty substantial in my opinion, so I dug a little more.
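The comparison above can be sketched roughly as follows. This is only an illustration: the record layout, the function names, and every number in `pitchers` are made up, and the group size is cut to two so the toy data stays small.

```python
from statistics import mean

# Hypothetical records: (name, innings, fip, xfip, avg_fly_ball_distance_ft).
# These values are invented for illustration and are not the 2010 data.
pitchers = [
    ("A", 200.0, 3.10, 3.80, 270.0),
    ("B", 180.0, 3.50, 4.00, 275.0),
    ("C", 65.0, 4.20, 4.10, 285.0),
    ("D", 150.0, 4.80, 4.20, 295.0),
    ("E", 45.0, 2.90, 4.50, 260.0),  # under 60 IP, so filtered out
    ("F", 90.0, 5.00, 4.30, 300.0),
]

def gap_groups(rows, min_ip=60.0, n=5):
    """Rank pitchers by xFIP - FIP and return the n largest and n smallest."""
    kept = [r for r in rows if r[1] > min_ip]
    ranked = sorted(kept, key=lambda r: r[3] - r[2], reverse=True)
    # First group: FIP furthest below xFIP. Second group: FIP furthest above.
    return ranked[:n], ranked[-n:]

def avg_distance(rows):
    """Average of the per-pitcher fly ball distances in a group."""
    return mean(r[4] for r in rows)

low_fip, high_fip = gap_groups(pitchers, n=2)
distance_gap = avg_distance(high_fip) - avg_distance(low_fip)
```

With real data, `n=5` and `min_ip=60.0` would match the cutoffs used here.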

One main problem I noticed immediately looking through all the data was that the Colorado pitchers had fairly high distances given their xFIP-FIP rates. I went ahead and did a small adjustment for elevation and temperature for each home park using data from Robert Adair’s “The Physics of Baseball”.

I assumed half of each player’s games were at home and all their road games were at a league-average level. With this adjustment, the Colorado pitchers came more in line. Because I adjusted for the park numbers, I removed all pitchers that swapped teams mid-season to make the analysis easier. With the new set of data, I compared the data again and got the following results.
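One way to picture this kind of adjustment is the sketch below. It assumes the park effect is a simple additive offset in feet, which is my simplification for illustration; the function name and the 20 ft figure are hypothetical and are not values taken from Adair's book.

```python
def park_adjusted_distance(observed_ft, home_park_effect_ft, home_share=0.5):
    """Strip the home park's share out of a season-average fly ball distance.

    Assumptions (for illustration only):
    - the park adds a fixed number of feet to every fly ball hit there,
    - home_share of the pitcher's fly balls came at home,
    - road parks are league-average, so they contribute no offset.
    """
    return observed_ft - home_share * home_park_effect_ft

# Hypothetical example: a park that adds 20 ft, pitcher averaging 295 ft.
adjusted = park_adjusted_distance(295.0, 20.0)
```

Under these assumptions only half of the park's offset is removed, because only half of the games are played there.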

1. The average distance of fly balls allowed by pitchers with a negative xFIP-FIP is 289.3 ft, while the average for those with a positive xFIP-FIP is 281.6 ft, a difference of almost eight feet.

2. Here is the list of five pitchers that threw > 160 innings (I wanted to show the top starters this time) and had the highest and lowest xFIP-FIP values:

3. The r-squared value for the second set of data is 0.23 (Tom – the R is 0.479).
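For anyone checking the two numbers in item 3, R-squared is just the square of the correlation coefficient R. Below is a small sketch of a Pearson correlation in plain Python; the function name is my own, and the sign of R depends on which way the xFIP-FIP difference is taken.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Squaring the R quoted above gives the R-squared quoted above (about 0.23).
r_squared_from_post = 0.479 ** 2
```

In practice `xs` would be the per-pitcher fly ball distances and `ys` the matching xFIP-FIP values.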

There seems to be a decent correlation between how far a pitcher's fly balls travel and his xFIP-FIP difference. In other words, pitchers whose fly balls are not hit as far don't allow as many home runs (duh), so their FIP is lower than their xFIP.

The correlation is not great, but there is enough evidence for me to continue looking. The two main points I need to explore further are:

1. I need to get the park factors nailed down for the batted ball distance. I need to adjust every fly ball’s distance.
2. Once that is done, I need to see if there is any correlation from one season to the next for a pitcher’s fly ball distance.

So many questions, so little time.







Jeff writes for FanGraphs, The Hardball Times and Royals Review, as well as his own website, Baseball Heat Maps with his brother Darrell. In tandem with Bill Petti, he won the 2013 SABR Analytics Research Award for Contemporary Analysis. Follow him on Twitter @jeffwzimmerman.


38 Responses to “Comparing FIPS and xFIPS Using Batted Ball Distance”

  1. Steven Ellingson says:

    I’m excited for #2. That’s something I’ve been wondering for a while.


  2. Mike says:

    So. Uh. How did Cain fare?


  3. Bob says:

    Flyball pitchers slightly correlate to fewer HR/FB and groundball pitchers tend to have higher HR/FB which accordingly affects the xFIPs. It would be interesting to see the HR/FB %s and batted ball distance split out by high and low groundball %.


    • drew says:

      i always thought this might be because if a groundball pitcher gives up a home run, at least on a fastball it was probably a hanger, whereas if a flyball pitcher gives up a homerun it was not necessarily a bad pitch, just a thing that will happen sometimes as a result of giving up a lot of flyballs off his fastball


  4. Ken says:

    Jeff, this is an incredibly interesting study. I hope you continue along these lines. Thanks for doing this.


  5. Trebecois says:

    Thank you this was an awesome piece of work, very interesting stuff


  6. Hank says:

    Does the average FB distance include HR’s? Or is it only for balls in play?

    The thought on looking at multiple seasons is key, as that will help assess whether it is a repeatable skill (at least for some pitchers).

    Nice work.


    • Jeff Zimmerman says:

      It includes home runs


    • phoenix says:

      i would be interested to see what pitch types lead to more or fewer home runs. as in why/how can some pitcher induce weak fly balls as opposed to hard hit balls in the air. maybe location would help as well. i would be interested in the same thing for groundballs: what pitches/location leads to weak ground balls and which ones lead to harder hit balls. i know this is probably an impossible thing to calculate given the data available, but maybe in the future we can be that specific.


  7. bender says:

    Where do you get batted ball distance data from?


    • Jeff Zimmerman says:

      MLB gameday data scraped from their website. It is included in the Pitch F/X dataset (.sql) I use from wantlinux.com. It is a little difficult to extract, but if you want it, let me know. email me at wydiyd ~ hotmail ~ com


  8. Peter Jensen says:

    Jeff – You do know that Gameday hit locations are where the ball is fielded and not where it lands, don’t you? And you do know that they have to be adjusted for each field because the distortion of the outfields is not uniform in the input diagrams used by the Gameday stringers? Are you doing this, or is your source, and how is it being done?


    • Jeff Zimmerman says:

      Peter – here is the background article I did for the data. I averaged the home plate value and had to recreate the distance value as stated in the article, and I am using a distance value around 1.8.


  9. Nathaniel Dawson says:

    I’m not sure if this is really telling us anything. If a pitcher gave up fewer HR/FB than the league average, his FIP should be lower than his xFIP. If he gave up fewer HR/FB, it’s likely that his flyballs didn’t get hit as far. This seems to follow logically, but what it doesn’t tell us is if this was a result of the pitcher’s skill rather than random occurrence. Which has always been the big debate about a guy like Matt Cain. Does he actually have a skill that limits the number of HR/FB that he’s given up, or has he just gotten lucky?

    Did you include infield flies as fly balls (as Fangraphs does), or did you exclude them from HR/FB?

    Also, why does your second table list pitchers with over 120 IP for the negative xFIP-FIP group, while over 160 IP for the positive group?


    • Jeff Zimmerman says:

      Infield flies were not included.

      It was 120 for both; I miswrote it in the article. Sorry for the confusion.


    • fred says:

      I agree with Nathaniel’s comment


    • filihok says:

      I agree with Nathaniel.


      • filihok says:

        So…[enter] when outside the text box=Submit Comment. Good to know.

        I agree with Nathaniel.

        Shouldn’t you somehow control for the home runs. Otherwise you’re just showing that pitchers with lower HR/FB rates have better FIPs than xFIPs


    • Austin says:

      Yeah, I thought the same when I read the post. It seems sort of tautological, if that’s the right word here – of course the pitchers that allowed the shortest fly balls were largely the same group as those that allowed the fewest home runs, because… they allowed shorter fly balls. What does that actually tell us? Looking at year-to-year correlation for fly ball distance definitely seems like the better angle, and I’m looking forward to Jeff’s posting that eventually.


    • Pat says:

      Cain has had five straight seasons of 190+ IP, 22 or less HR allowed per season (which, arguably, is low for a fly-ball pitcher). His HR rates aren’t much higher than those of Jake Peavy, a sinkerballer pitching in a cavernous park. Surely Cain’s sample size is large enough to discount luck. Not all fly balls are created equally, and not all ground balls are either. The better pitchers fool hitters more consistently, and induce weak tappers and weak popups, not just strikeouts. I hope Jeff’s opened the door to documenting this.


  10. Tanbarkie says:

    Quick disclaimer: I’m pretty new to sabermetrics, but I work in science In Real Life, so I have a decent handle on data analysis. But I have no idea how to compile baseball data, so if my idea below is obviously wrong-headed, feel free to point it out. :)

    Anyway, I was wondering if you can take for granted that a given pitcher’s flyball distances would, if plotted as a histogram, approximate a bell curve. I could imagine a scenario in which most or all flyball pitchers show a bimodal distribution – that is, they’re good at giving up weaker flies when their pitches are going where they want ‘em (peak 1), but when they make mistakes, the balls tend to fly significantly farther (peak 2). The relative heights of the two peaks (that is, how often a given pitcher “makes mistakes”) could both be informative in ways that just taking the average flyball distance might not.

    For example, if Matt Cain’s performance comes from his ability to induce weak contact, one could imagine that he’d have one peak pretty close in (his “successes”), and one peak farther out (his “mistakes”). If we calculate the ratio of balls-in-peak1/balls-in-peak2, we could also imagine that this ratio might be unusually high for him versus less successful flyball pitchers. This wouldn’t necessarily show up as a significant difference just looking at average distance (or might come out as a surprisingly small difference, as Jeff saw).

    Just some idle speculation for y’all from a newcomer to the saber world.


    • Jeff Zimmerman says:

      Thanks for the thoughts, let me see what i can find.


    • GTStD says:

      My intuition as an engineer is that the data will be a bit noisy for this to be really noticeable. It seems to me that the number of variables becomes a bit large for the “success” and “mistake” peaks to be visible, not to mention relevant. I suppose if he was always pitching against a “perfect” batter, who would always hit his mistakes, and if he always made mistakes up in the zone, it might split the distribution. It would also depend on it being true that a flyball pitcher has some control over the distance that the balls go.

      Flyball pitchers, I’d imagine are successful because even though they pitch up in the zone, hitters can’t get a good read because of changing speeds or deception in the motion, and the hitters are kept off-balance. This would have very little to do with “mistake” and “success” of specific pitches, and more “good” and “bad” days, so even if the bimodal distribution were to come out, I’m not sure it would indicate what you mentioned.

      As a first step in that analysis though, it would be very interesting to see if pitchers do, in fact, have any control whatsoever over flyball distance. Fortunately, the #2 further study mentioned should give some idea about that. I would also be interested in seeing (for a few pitchers anyways) a plot of flyball distance as a function of time during a season. That would start to provide a little information as to whether any control a pitcher may have is based on individual pitches or luck (roughly flat), or if it is more of an inherent mechanics or comfort thing (varying over the season), or if it is something else entirely.


  11. frug says:

    I’m looking forward to seeing where this goes. If there is decent year to year correlation would you consider trying to fit fly ball distance into xFIP calculations? Especially since the single biggest reason that xFIP so severely underrates Matt Cain is because he generally posts such good HR/FB ratios. (About 29% better than league average for his career according to baseball-reference).


    • frug says:

      Oh and I guess I should include this since I’m sure someone else will point it out, but Cain also gets underestimated by xFIP (and DIPS measures in general) because he consistently posts low BABIP (about 11% lower than league average according to baseball-reference).


  12. ATrain says:

    Gotta agree with Nat on this one. Seems like luck would enter into how hard a pitcher’s fly balls are hit, and it then just follows that their FIPs are lower than their xFIPs and vice versa. It’s just a measurement of how much luck was involved in their balls in play. I’m not going to rule out what you theorize, but the key, imo, would first be proving that pitchers control how hard their fly balls are hit.


  13. pounded clown says:

    I think a better measurement would be the speed that the ball leaves the bat relative to the speed of the ball when it arrives at the plate. This can help eliminate atmospheric effects, at least on the batted ball, which I’d imagine to be greater than on the pitched ball. This could in turn be measured against pitch location to possibly give a more accurate way of quantifying the quality of the batted ball. The idea being the harder hit ball will be made when the ball is hit more squarely by a bat travelling with greater speed.


  14. […] right on time Comparing FIPS and xFIPS Using Batted Ball Distance | FanGraphs Baseball by Jeff Zimmerman – November 22, 2010 […]


  15. slash12 says:

    I wonder how much of this is reflected in IFFB% (a stat that’s readily available via fangraphs for each pitcher), already. I’d wager you’ll see the same kind of correlation between IFFB% and the difference between xFIP and FIP. It would make sense that pitchers with a lower average distance are also inducing more IFFB’s.


  16. William says:

    You might want to cut out RPs when/if you look at the “luck” angle, since it’s pretty clear that data from non-starters kind of blow up the theory that diversion from the typical HR/FB is a sign of luck. Most starters, on the other hand, will regress much more predictably, so you might want to give that angle a try…


  17. cs3 says:

    why are infield not included?
    it might be obvious but i can’t come up with a reason for not including IFFBs when it seems logical that a pitcher who induces lots of popups would have a lower FIP than xFIP


  18. cs3 says:

    *infield flies*


  19. Charlie Manuel says:

    I wouldn’t waste time with these “rocket science” studies. I just look at home runs and wins and average for my players. That to me is worth the $$$$.

    I’m why Omar Infante was an all star and Joey Votto had to get voted in by the fans at the last minute.

    I had picked the real MVP.
    OMAR INFANTE

    I’ve been around a long time.
    All these new stats are killing the game.
    KISS (Keep it simple stupid)

    Signed:
    Charlie Manuel


