Infield Fly Balls and xFIP

Today I saw a couple gripes around the Internets about xFIP and how infield fly balls are not taken into account. On FanGraphs, overall fly-ball percentage is used to calculate a pitcher’s “normalized” home run rate.

This got me thinking about David Gassko’s Batted Ball DIPS article from five years ago where he writes the following about infield fly balls:

Infield flies per ball in play actually have a slight negative correlation with outfield flies per ball in play. Inducing infield flies is a skill, and while it correlates somewhat weakly year-to-year (Lichtman found an “r” of .140), a small subset of pitchers exhibits clear control over the percentage of their fly balls that are infield pop ups. I would encourage studies looking into who those pitchers are—one thing I have noticed is that extreme ground ball pitchers allow fewer than expected infield fly balls.

What I believe is actually going on here is that fly-ball pitchers in general have higher infield fly-ball rates as measured by Baseball Info Solutions. The repeatability of infield fly balls is basically just a side effect of a pitcher’s total fly-ball rate. Looking at all pitchers from 2006-2009, here’s what you get when you bucket FB% in increments of 5%:

FB% Bucket     IFFB%    HR/FB%   HR/OFFB%
< 25%          7.1%     11.1%     11.9%
25% - 29%      7.8%     10.9%     11.7%
30% - 34%      8.9%     10.2%     11.2%
35% - 39%      9.7%     10.2%     11.3%
40% - 44%     10.5%     10.0%     11.2%
45% - 49%     11.6%      9.8%     11.0%
>= 50%        12.2%     10.0%     11.4%

So, while it’s pretty clear that overall FB% is impacting IFFB%, I’m not sure things are quite so obvious with home runs. It seems to me that home-runs-per-total-fly-ball plateaus at about 10% starting in the 30%-plus range. And for home-runs-per-outfield-fly-ball, things look pretty similar, except everything is about one percentage point higher because of the removed IFFBs.
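
The two home-run columns in the table are tied together by simple arithmetic. Assuming IFFB% is measured as infield flies per total fly ball (an assumption about how the columns are defined, not something stated above), HR/OFFB should equal HR/FB divided by (1 − IFFB%). A rough check in Python against the bucketed figures:

```python
# Rows copied from the table above: (FB% bucket, IFFB%, HR/FB%, HR/OFFB%).
ROWS = [
    ("< 25%", 7.1, 11.1, 11.9),
    ("25% - 29%", 7.8, 10.9, 11.7),
    ("30% - 34%", 8.9, 10.2, 11.2),
    ("35% - 39%", 9.7, 10.2, 11.3),
    ("40% - 44%", 10.5, 10.0, 11.2),
    ("45% - 49%", 11.6, 9.8, 11.0),
    (">= 50%", 12.2, 10.0, 11.4),
]


def hr_per_offb(iffb_pct, hr_fb_pct):
    """HR per outfield fly, ASSUMING IFFB% is a share of all fly balls."""
    return hr_fb_pct / (1.0 - iffb_pct / 100.0)


def max_gap(rows=ROWS):
    """Largest difference between the implied and the listed HR/OFFB%."""
    return max(abs(hr_per_offb(iffb, hr_fb) - listed)
               for _, iffb, hr_fb, listed in rows)


for bucket, iffb, hr_fb, listed in ROWS:
    implied = hr_per_offb(iffb, hr_fb)
    print(f"{bucket:>10}  implied {implied:5.2f}  listed {listed:5.1f}")
```

If the assumption holds, the implied and listed values should agree to within rounding, which is another way of seeing why the two home-run columns move together.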

So getting back to xFIP, does it really matter whether or not you exclude popups? The answer is, not really. You’re going to get almost the same results because HR/OFFB on average exhibits more or less the same issue as HR/FB. In fact, the correlation between using OFFB vs total FBs in xFIP is .996. The two, in practice, are virtually identical.
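
For anyone who wants to run this comparison on their own data, here is a minimal sketch. It assumes the commonly cited form of xFIP, (13 × expected HR + 3 × (BB + HBP) − 2 × K) / IP plus a league constant, where expected HR is fly balls times the league home-run-per-fly rate. The constant of 3.10, the field names, and the three sample pitchers are made up for illustration; none of this is FanGraphs' actual code or data.

```python
from math import sqrt

LEAGUE_CONSTANT = 3.10  # placeholder; the real constant changes by season


def xfip(fly_balls, bb, hbp, k, ip, lg_hr_per_fly, constant=LEAGUE_CONSTANT):
    """xFIP in its commonly cited form (an assumption, see the text above)."""
    expected_hr = fly_balls * lg_hr_per_fly
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def xfip_both_ways(pitchers):
    """Return (xFIP using all flies, xFIP using outfield flies only)."""
    total_hr = sum(p["hr"] for p in pitchers)
    total_fb = sum(p["fb"] for p in pitchers)
    total_offb = sum(p["fb"] - p["iffb"] for p in pitchers)
    all_flies = [
        xfip(p["fb"], p["bb"], p["hbp"], p["k"], p["ip"], total_hr / total_fb)
        for p in pitchers
    ]
    outfield_only = [
        xfip(p["fb"] - p["iffb"], p["bb"], p["hbp"], p["k"], p["ip"],
             total_hr / total_offb)
        for p in pitchers
    ]
    return all_flies, outfield_only


# Hypothetical pitchers, not real stat lines.
SAMPLE = [
    {"fb": 250, "iffb": 30, "hr": 24, "bb": 60, "hbp": 5, "k": 170, "ip": 200.0},
    {"fb": 150, "iffb": 8, "hr": 18, "bb": 45, "hbp": 7, "k": 120, "ip": 190.0},
    {"fb": 210, "iffb": 21, "hr": 20, "bb": 70, "hbp": 4, "k": 200, "ip": 210.0},
]

all_flies, outfield_only = xfip_both_ways(SAMPLE)
print(round(pearson_r(all_flies, outfield_only), 3))
```

One thing the sketch makes visible: because the league rate is recomputed on the same denominator, the two versions give the same answer for any pitcher whose infield-fly share matches the league's. They only diverge for pitchers who pop up noticeably more or fewer of their flies than average.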

However, when you bucket the data like this, it seems that there is one thing made clear: When an extreme groundball pitcher induces a fly ball, there’s a slightly greater chance it will end up a home run. I think it would be particularly interesting to look at the run values of different batted-ball types for different buckets of fly-ball pitchers, but I’ll have to leave that for another time.


David Appelman is the creator of FanGraphs.

41 Responses to “Infield Fly Balls and xFIP”

  1. jeffrey Gross says:

    HR/FB is a useless baseline for xFIP because everyone knows that an IFFB can’t be a HR, but you are including it in the data from which to derive predictive material. It’s making xFIP less correlative for no justifiable purpose other than “we’re too lazy to change it.” Until you run the numbers and demonstrate that the adjusted R-squared of a pitcher’s xFIP based on HR/OFFB is statistically indistinguishable from that of xFIP based on HR/FB, I will continue to use HR/OFFB. Please do not assert that “you’re going to get almost the same results.”

    Fangraphs is the place where baseless claims go to die, not be perpetuated. Vive le HR/OFFB.


    • jeffrey Gross says:

      Side Note:
      maybe this is a glitch, but my comments are not registering on the site when my blog is listed under “website.” anyone else finding this problem, or have I vexated the wrong persons?


    • jeffrey Gross says:

      Oh and apologies if it seems like that last sentence of my org. statement was overly critical and “kinda douchey.” My internal brain filter does not properly function at 1:30 a.m. I obviously love the site and material herein.


    • FanGraphs Supporting Member

      Did you even read the article? I did exactly what you asked. The correlation between xFIP with OFFB vs total FB is .996 (an R-squared of .992).


    • philosofool says:
      FanGraphs Supporting Member

      IFFB aren’t a repeatable skill (r = .14, r^2 = .02). The point of xFIP is to remove luck. If you don’t want to remove luck, use RA/IP.


      • Nick Steiner says:

        IFFB aren’t a repeatable skill? I highly disagree with you on that, but regardless, why would you use them in HR/FB at all? xFIP is a descriptive stat (meaning it uses raw, unregressed data), so IFFB should definitely be excluded from the calculation, even if there will hardly be a difference with and without them.


    • neuter_your_dogma says:

      “an IFFB can’t be a HR.” Same is true for any FB that is not a HR. Why would FBs hit 10 feet past the infield count and IFFBs would not?


  2. Matt says:

    If inducing IFFB isn’t a repeatable skill, it can explain how some pitchers’ performances are outpacing their FIP/xFIP. It seems that a useful model of pitching success might be based upon a weighted continuum of PA: KO > IFFB > GB > FB > LD > BB > HRA.


  3. Matt says:

    edit: If inducing IFFB isn’t a repeatable skill, it can at least explain how some pitchers’ performances are outpacing their FIP/xFIP. It seems that a useful model of pitching success might be based upon a weighted continuum of PA outcomes: (best) KO > IFFB > GB > OFFB > LD > BB > HRA (worst).


  4. johnf says:

    I’m not arguing against the numbers since this was the only guy I could find it for, just found it interesting. Mariano Rivera has infield fly numbers comparable to Ted Lilly (picked him randomly as an extreme flyball pitcher), yet has a career groundball rate about 20% higher. He’s a groundball pitcher (and high strikeout, low walk robot) who also induces a lot of pop-ups. Is it fair to say this is how he’s maintained a pretty low career HR/FB rate?


  5. JRoth says:

    So David Gassko says that “a small subset” of pitchers can control their IFFB rate, and David Appelman replies with data that, if you look at all pitchers, pitchers don’t appear to be able to control their IFFB rate. It’s unresponsive.

    What DA has shown is that, when you look at all 511 ML pitchers at once, then xFIP works as intended. What he has not shown is that it’s accurate/predictive/valuable for the “small subset” of pitchers who can control their IFFB rate.

    I don’t know how you would usefully jigger xFIP to automatically take into account the “small subset;” perhaps more useful would be a way to identify members of the small subset, for whom you would then discount xFIP appropriately.

    More interesting to me is that DA has shown that xFIP is wrong for extreme (<30%FB) GB pitchers. Seems noteworthy to me. A 30% FB pitcher like Zach Duke still gives up a couple hundred FBs/year, which means that he's giving up a couple extra HRs on FBs relative to most other pitchers. 2 HRs isn't much over a season, but it's not nothing. [Duke himself is actually a touch below 10% on his career, likely an artifact of unsustainably low rates his rookie half-year and second season, 4.5% and 8.3% – he's been around 10.5% since then].
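
The "couple extra HRs" figure can be reproduced with back-of-the-envelope arithmetic. The sketch below takes roughly 11% HR/FB for the sub-30% FB buckets and roughly 10% for everyone else, as in the article's table. The 200-inning workload and the 13-per-home-run weight (borrowed from the commonly cited FIP/xFIP formula) are assumptions added here for illustration, not figures from the comment.

```python
def extra_home_runs(fly_balls, actual_hr_fb, assumed_hr_fb):
    """Home runs a pitcher allows beyond what the assumed rate predicts."""
    return fly_balls * (actual_hr_fb - assumed_hr_fb)


def fip_scale_impact(extra_hr, innings, hr_weight=13):
    """Shift on the FIP/xFIP scale, ASSUMING the usual weight of 13 per HR."""
    return hr_weight * extra_hr / innings


extra = extra_home_runs(fly_balls=200, actual_hr_fb=0.11, assumed_hr_fb=0.10)
print(round(extra, 2))                                   # about 2 home runs
print(round(fip_scale_impact(extra, innings=200), 2))    # hypothetical 200 IP
```

Under those assumptions the gap is small but, as the comment says, not nothing.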


    • FanGraphs Supporting Member

      I don’t disagree with anything you’ve said here. My main point was that for the sake of xFIP, it does not make any difference at all whether you include IFFBs.

      xFIP is certainly not perfect, it’s a good starting point for a discussion or makes for a decent quick glance look. I’d say it works in the majority of cases, but there are always going to be exceptions, like with all stats and baseball players.


      • Matt says:

        “xFIP is certainly not perfect, it’s a good starting point for a discussion or makes for a decent quick glance look. I’d say it works in the majority of cases, but there are always going to be exceptions, like with all stats and baseball players.”

        This is why I respect you a lot more than the average person. You seem to be one of the few that realizes we have a lot of great tools that are way better than the tools of the past, but these new tools, like pretty much everything, are still more or less works in progress.

        I get tired of people who act like xFIP, or any stat, was carved into the back of the 10 commandments, given to us by god as an infallible way to judge everything and a road map to pirate treasure.


      • wobatus says:

        Interesting. I tried to argue this with Brandon League (18% career rate) but a lot of folks think it’s just random. Small sample of flyballs, of course, even over his career, but 18% is a high rate. Only about 5 such full seasons by a starter since 2002, every one by a groundballer (over 50%).


  6. X says:

    Does xFIP account for subjective “line drives?” How about how hard of contact a pitcher allows (whether it be grounders, fly-balls, “liners”, etc. etc.)? Batted ball velocity and trajectory? Does it account for any of that? Some pitchers get hit harder than others, and not all batted balls are created equally. Is it not fair to assume that the harder a pitcher gets hit, the more XBHs and runs a pitcher is expected to allow? Until this data becomes available (and ultimately incorporated), xFIP is an incomplete attempt at trying to remove luck out of the equation. It’s simply incomplete and should not be treated as the be-all and end-all of pitching metrics.

    Not only that, but why should we consider a pitcher “lucky” if his career HR/FB rate is below the league average? Granted, in a small sample size, I agree that we can’t just dismiss the possibility of luck. But I’m talking about the pitchers with a large sample size…say, 5+ full seasons. Shouldn’t adjustments be made to not punish those kinds of pitchers? Pitchers who have shown time and time again that they can maintain a low HR/FB rate? Shouldn’t we adjust for that by giving them THEIR career HR/FB rates, instead of a league average one? Or are we to just dismiss all of that as simply “luck?” And just completely ignore the data that is staring straight at us? That’d be unfair and just plain lazy and irresponsible.

    I’m talking about someone like Cliff Lee (career 8.1% HR/FB)
    2005: 7.9% 2006: 8.8% 2007: 10.5% 2008: 5.1% 2009: 6.5% 2010: 2.3%

    Or Tim Lincecum (career 6.3% HR/FB) 2007: 8.2% 2008: 5.6% 2009: 5.5% 2010: 6.4%

    Or Jered Weaver (career 8.0%) 2006: 8.4% 2007: 7.0% 2008: 8.3% 2009: 8.5% 2010: 8.5%

    Or Verlander (career 8.0%) 2005: 7.7% 2006: 10.3% 2007: 8.5% 2008: 7.0% 2009: 7.4% 2010: 6.3%

    Or Ubaldo Jimenez (career 7.6%)
    Or Adam Wainwright (career 7.6%)
    Or Mike Pelfrey (career 7.6%)
    Or Roy Oswalt (career 8.9%)
    Or Chad Billingsley (8.5%)
    Or John Lackey (9.1%)
    Etc. etc.

    Have all of these pitchers been lucky after all these years? Or do they have some control over it, which enables them to prevent HRs? I’m betting it’s the latter.

    Also, what doesn’t get brought up is that pitchers (unless they’re established pitchers where such an anomaly in a smaller sample size can be dismissed) who do have a ridiculously high HR/FB rate don’t last long enough in the majors…which prevents us from having a big enough sample size that may, just may show us that certain pitchers do indeed allow a lot of HRs for whatever reason. That’s the thing…a pitcher who would consistently put up an extremely high HR/FB rate wouldn’t last in the majors, therefore eliminating the possibility of us ever having a large enough sample size to work with…which basically then gives anyone the opportunity to dismiss all of it as luck. A pitcher probably has to allow an acceptable amount of HRs per FB to survive and last in the majors. Pitchers who don’t do that don’t make it, therefore we never get that “extreme” data to work with. And if we do, it’s in a smaller sample size where it’s usually dismissed as bad luck.


    • JRoth says:

      Charlie Morton.

      “Unlucky,” as long as you define luck to include things like feeding belt-high fastballs over the middle of the plate. Batting practice pitchers don’t regress to 10% HR/FB either.


    • Toffer Peak says:

      That was a really long-winded comment but I thought you should know that even the creator of FIP says that for larger samples you should evaluate a pitcher based on their RA rather than FIP (or xFIP). You can search his posts here:

      My guess is that for fewer than 300 innings xFIP is the most accurate indicator of skill. For 300 to 600 innings FIP is probably the most accurate (though you should still account for park factors) and probably somewhere over 600 innings ERA begins to become the most accurate indicator of skill.


  7. Bobby Boden says:

    While we’re on the subject of xFIP…how about discussing another flaw. xFIP is deflated by having a higher BABIP. Which is to say, having a higher BABIP actually helps your xFIP. xFIP is measured on a K/IP scale, which is to say, the more strikeouts you have per inning, the better your xFIP is. Well, the higher your BABIP is, the more hits you give up, therefore the more batters you face in an inning, and the more chances you have to strike out batters. This became apparent to me when watching Rich Harden in a few starts last year. His final line looked spectacular in a lot of games, striking out like 10 guys in 4 innings. Well, he gave up so many hits in the game, it was like he had pitched a 5 or 6 inning game. I realize that your walks also increase at the exact same rate due to BABIP, but walks negatively impact your xFIP less than your strikeouts positively do.

    On the subject at hand, the following article seems to conflict with your results, as it implies that a groundball pitcher gives up slightly fewer home runs per outfield fly ball: Perhaps the difference is that they broke out line drive home runs (whereas FanGraphs just assumes all home runs are fly balls when calculating your HR/FB rate).

    In summary, in an ideal world, what I’d love to see at FanGraphs: HR/OFFB (and excluding line drive home runs). K% and BB% for pitchers, in addition to K/9 and BB/9 (K/9 and BB/9 are flawed, as they are affected by BABIP; K% and BB% are not. Furthermore, it’s just a nice scale to use, as it’s the same scale we use for batters, so you can see that XXX pitcher strikes out so many batters, he turns everyone into Alfonso Soriano). A new xFIP that isn’t affected by BABIP (I’ve wanted to come up with this myself, I just haven’t had the time). I was thinking something along the lines of normalizing IP based on a batter’s BABIP, or basing xFIP on TBF instead of IP. Maybe this is a new version of xFIP.
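
The K/9 versus K% point can be illustrated with a toy model. The sketch below assumes every plate appearance ends in a strikeout, a walk, or a ball in play, and that a ball in play becomes a hit with probability equal to BABIP and an out otherwise. Home runs, double plays, errors, and baserunning outs are ignored, so this is a simplification for illustration rather than how any site computes the stat.

```python
def k_per_9(k_pct, bb_pct, babip):
    """Strikeouts per 27 outs under the simplified model described above."""
    in_play = 1.0 - k_pct - bb_pct
    outs_per_pa = k_pct + in_play * (1.0 - babip)
    return 27.0 * k_pct / outs_per_pa


# Same hypothetical pitcher (25% K, 8% BB) with a low and a high BABIP.
low_babip = k_per_9(k_pct=0.25, bb_pct=0.08, babip=0.27)
high_babip = k_per_9(k_pct=0.25, bb_pct=0.08, babip=0.33)
print(round(low_babip, 2), round(high_babip, 2))
```

In this model the strikeout rate per batter never changes, yet K/9 rises with BABIP because each out takes more batters to record. That is the inflation being described, and it is why a per-batter denominator such as TBF sidesteps it.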


  8. Jack says:

    With all of these great points made by X and Bobby Boden, why do some people on this website still continue to treat xFIP as the gospel? It’s incredibly flawed.


    • philosofool says:
      FanGraphs Supporting Member

      Your use of the word “incredibly” is incredibly flawed. xFIP is slightly flawed. If you want to talk about the flaws in xFIP, the first one to look at is probably its linearity. (I don’t mean to imply that extant non-linear metrics are superior.) None of the proposals being talked about in this thread even remotely hint at removing any of the actual error in xFIP.


  9. Bobby Boden says:

    I suspect that some of the complaints about IFFB% not being included in xFIP have to do with the fact that IFFBs tend to lower your BABIP as well, particularly for fly ball pitchers, who are throwing a ton of IFFBs (like Ted Lilly).

    A similar issue lies with some groundball pitchers who do a really good job of disallowing line drives (and thus, post lower BABIP’s).

    These are seemingly repeatable pitcher skills (not defense) that aren’t included in xFIP. They are included in tRA, but tRA isn’t park/league adjusted (I don’t believe), making tRA a less useful fantasy baseball tool (and a less reliable ERA predictor). Maybe a park/league/division adjusted tRA would be the most reliable predictor. Furthermore, if you could somehow adjust based on your team’s defense, that would be the ultimate actual ERA predictor and fantasy tool.

    In summary, FIP, xFIP, and tRA all have their flaws, and are all doing slightly different things. None of these are perfect; tRA might be the best in terms of rating a pitcher’s true skill. In terms of fantasy baseball (or comparing to actual ERA), it gets a little sketchy, because there are a lot of variables that may not be a result of the pitcher’s skill (park/league/division factors, your team’s defense) but will still predictably affect the end result (ERA). Still, ERA itself is even more flawed, and very difficult to predict, so using these other metrics is still valuable. I think from a fantasy baseball perspective xFIP is probably the best, while not perfect.


  10. wobatus says:

    I do think that some groundball pitchers are homer prone on flyballs, since the flyballs are more likely to come on mistake pitches.

    Brandon League, for example, has a career 62.4% gb rate, and an 18.6% hr/fb. And every year he gives up more than 11% hr/fb: 30.8, 16.7, 12.5, 14.3 15.1, 20.8. That doesn’t seem like a fluke.

    Rafael Betancourt is an extreme example in the other direction. 28.8% career gb rate. Career hr/fb rate is 7.3%, and he is never above 11%: 8.5, 8.5, 6.3, 3.8, 10.3, 5.1, 8.3.

    League’s career IFFB%: 3.4%.

    Betancourt’s: 14.2%.

    League’s career e.r.a. is a third of a run higher than his xFIP. Betancourt’s career e.r.a. is half a run lower than his xFIP.

    That League has a low IFFB% rate makes sense, since he has a low overall fb rate. And that Betancourt has a high rate also makes sense, as he has a high fb rate. But Betancourt’s hr rate is so low for overall flyballs, it seems he does have an ability to generate pop-ups, whereas League does seem to, when he gives up a flyball, give up long ones.

    The numbers are too stark and repeated essentially year after year for it to be just luck, I think, but I don’t know much about statistics or probability.


    • opisgod says:

      The whole context of pitching in relief skews the numbers; oftentimes the reliever doesn’t pitch enough innings for proper regression to occur. Such deviation is rarely seen among starters. In the case of League, so many of those home runs come off really well placed pitches that barely manage to clear the fence. Add to that, if he had better luck on weak groundballs and the umpires called a proper zone for him (they never do), the innings could very well be over before the home run occurred. The best example I can find is the grand slam against the Orioles this year.

      1. Rob Johnson, truly one of the worst catchers ever, allowed a strikeout to reach first base on a routine ball in the dirt.
      2. The umpire called a big strike zone all night, but once League came in nothing on the corners got called.
      3. A weak groundball that could have been a double play only resulted in a forceout.
      4. The home run pitch was a sinking fastball half a foot off the plate and outside to the lefty hitter.
      5. The pitch didn’t even reach the second row of seats, and could have been caught if not for fan interference.

      This sequence of events was almost duplicated two weeks later against the Angels, this time with Chone Figgins blowing a possible inning-ending double play right before the home run.


    • joser says:

      I don’t think we can discount the possibility that some pitchers have some control over infield flies or home runs. But we might also explain a case like Betancourt with selection bias. Obviously, flyball pitchers who have a high percentage of their flyballs leave the park don’t make it to the big leagues, or don’t stick around very long. It’s possible that some of the ones who do keep their flyballs in the park have a skill, but it’s possible that some of them have just been lucky. If you start flipping a large number of fair coins and keep throwing out the ones that come up tails you’ll end up with a small number of coins that seem to have a “talent” for always coming up heads, though any one of them still has an even chance of coming up tails in the next flip.
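
The coin-flip analogy is easy to simulate. The sketch below is only an illustration of the selection effect: the number of coins, the number of rounds, and the seed are arbitrary choices, and nothing in it is fitted to real pitcher data.

```python
import random


def surviving_coins(n_coins, n_rounds, rng):
    """Flip every remaining fair coin each round; discard any that land tails."""
    survivors = n_coins
    for _ in range(n_rounds):
        survivors = sum(1 for _ in range(survivors) if rng.random() < 0.5)
    return survivors


def next_flip_heads_rate(n_survivors, rng):
    """Share of the surviving coins that land heads on one more flip."""
    if n_survivors == 0:
        return None
    heads = sum(1 for _ in range(n_survivors) if rng.random() < 0.5)
    return heads / n_survivors


rng = random.Random(0)  # fixed seed so the run is repeatable
left = surviving_coins(n_coins=100_000, n_rounds=5, rng=rng)
print(left, next_flip_heads_rate(left, rng))
```

Every survivor has a perfect record of five straight heads, yet the extra flip should come out close to even, because the streak came from the filtering and not from the coins.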

      Again, I’m not saying Betancourt definitely has just been lucky; it’s possible he (and other pitchers with similar IFFB and HR/FB rates) has some talent that should be studied. Certainly the high rate of infield flies suggests something about the profile of many balls hit off him, that perhaps hitters are getting under his pitches more than they intend so that what might otherwise have been cannon shots over the fences instead become more-vertical flares that die in outfielders’ gloves. We know there definitely are groundball pitchers and flyball pitchers; perhaps Betancourt is such an extreme flyball pitcher that many of his flyballs are IF flies and many of his erstwhile homeruns are merely long outs.

      But as stark and repeatable as his numbers seem, they could be nothing more than the coin coming up heads more often than seems naturally possible. Betancourt is a reliever with just 459 total IP across 432 games in 7+ seasons; over that span he’s given up 634 fly balls, of which just 90 were infield flies (14%) and 46 were home runs (7%). To reach an 11% HR/FB rate he would’ve had to surrender 69 HR. That’s 23 extra HR, which of course seems like a lot (it’s a 50% increase!); but it’s only one more HR every 18 or so appearances or roughly 3 extra home runs per season. Is it really so impossible that he might have gotten lucky with three balls caught at the wall or blown foul or captured by the deepest fences of the park? Yes, as year mounts upon year the preponderance of evidence shifts towards something other than luck, but you can never discount it entirely. We really don’t have that much evidence. And sometimes the fair coin really does come up heads many times in a row.
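
One way to put a rough number on "how lucky" is a binomial tail. The sketch below treats each of the 634 fly balls as an independent trial with a fixed 11% chance of leaving the park, then sums the probability of 46 or fewer home runs. Independence and a constant rate are strong simplifying assumptions (parks, hitters, and pitch quality all vary), so treat the result as an order-of-magnitude guide at best.

```python
from math import comb


def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Figures from the comment above: 634 fly balls, 46 home runs, 11% baseline.
single_pitcher = prob_at_most(k=46, n=634, p=0.11)
print(single_pitcher)
```

Whatever comes out applies to one pitcher picked in advance. The selection-bias argument is about the whole pool: multiplying that probability by the number of relievers with comparable workloads (a count not given here) gives a better sense of how many such outliers pure chance would be expected to produce.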


  11. Matt says:

    I also disapprove of the league average HR/FB factoring into xFIP, which placed Josh Beckett second only to Tim Lincecum from 2007-2009. My own observations of Beckett are that he is exceptionally good at surrendering HR and that this ability is tied to his stubborn personality and pitch selection more so than simply bad luck. I realize that xFIP correlates with ERA slightly better than FIP across all pitchers, but, in this particular case, xFIP is misleading.


    • Brimhack says:

      Matt, if that observation was correct, Beckett would have a higher HR/FB rate than the rest of the league. He doesn’t. In fact, it’s quite average. So is his HR/9.

      That’s why his career FIP is almost the same as his career xFIP.


  12. studes says:

    I didn’t realize (or more likely forgot) that you are including infield flies in your calculation of xFIP. As the guy who created xFIP, I feel bad about that. I think they should be excluded.

    The correlation between just OF and all flies may be high, but I don’t think that’s necessarily a meaningful analysis. Nor is whether or not infield flies are a repeatable skill (they are, even more for batters than pitchers). The point is that *some* pitchers, at *some* point in time, give up more or less infield flies than average, and that will impact xFIP for those specific pitchers, perhaps significantly.

    Infield flies are never home runs (and are virtually always outs) and so shouldn’t be in the calculation. You can make a better argument that line drives ought to be included instead of infield flies, because they are more likely to be home runs.

    The measure of an infield fly is arbitrary, to be sure, but it is a useful subset of batted balls. Why throw them into the mix just because they happened to be labeled “flies?”


    • Not David says:

      I’m sold.


    • FanGraphs Supporting Member

      I wasn’t aware that the original version of xFIP did exclude infield flies. For some reason I always thought it used overall fly ball percentage. In any case, I actually put in the code change to switch it last night because I thought it might be a good idea to exclude infield flies, but then I took it out again after I did the above research.

      On the whole, it seems to me that most of the year-to-year correlation in infield fly balls is a side effect of fly ball percentage in general. There may certainly be pitchers who do year after year induce more popups than league average for every fly ball, but they seem to be the exception, just like pitchers who post below league BABIPs, or below league HR/FB.

      In other words, most pitchers that induce a particularly high number of infield fly balls in one year won’t be doing it in the next, and those fly balls could very well end up being home runs, which is at least my argument for including them.

      It just seems to me that xFIP shouldn’t cater to the fringe cases, which is what I think removing infield fly balls does in some sense.

      But, to be perfectly honest, you’re looking at something like a maximum of a .25 difference or so in xFIP on the edges, whether you include IFFB’s or not. I think in the vast majority of cases, it makes almost no difference if you exclude them or include them.


      • studes says:

        You know, xFIP has been captured, I think, by the need to come up with a “better” predictive stat, to rationalize it from that point of view. To me, it was as much about being descriptive as predictive. And, to me, it was never about “making a difference or not.” It was about what made the most sense, descriptively and predictively.

        Really, I called it “xFIP” because the “x” stood for experimental. I just did it for the article and never thought it would last this long. So it doesn’t really matter to me. But if you look at the definition of xFIP at THT, it’s very clear that only outfield flies are included.


  13. studes says:

    I should check my own glossary more often. I don’t specifically state there that I include outfield flies only, so naturally there is some confusion about the subject.
