Leaving aside the issue of the quality of batted ball classifications, SIERA regresses HR rate, so WAR would then measure how valuable a pitcher would have been had his HR rate been normalized towards an expected HR rate. It would move from being a measure of things that definitively did happen to a theoretical about a pitcher’s value if things had happened differently. That’s not really the point of WAR.
Would you be in favor of removing HRs allowed from WAR, then? They have about as much y-t-y noise as IFFB rate, and aren’t a great true talent measure either.
It sounds like your preference is for WAR to be based on something like a projection system, such as Steamer or ZiPS, rather than on past performance data. My sense is that you’re probably in the minority on that perspective.
Someone needs to go through each of the above pitchers’ pitch history and collate exactly what pitch is inducing the pop-up in each specific at-bat. I suspect what we’ll see is that they all throw fastballs out of the same arm slot and window they throw their primary off-speed pitch from. It’s that small deception which is throwing some hitters off.
Comment by Phantom Stranger — March 6, 2013 @ 5:06 pm
I most certainly would like to see it added. The data seemed fascinating.
I know IFFBs don’t have a high y-t-y correlation, but have there been any pitchers who have consistently had an above-average IFFB%, and have they beaten their FIP by a good margin? I wonder.
Comment by Ruki Motomiya — March 6, 2013 @ 5:09 pm
I think you make a good case for it. If FIP is descriptive, so is WAR, so the low year-to-year correlation doesn’t really enter into the conversation.
Comment by Sparkles Peterson — March 6, 2013 @ 5:12 pm
Wouldn’t adding IFFB give some slight advantage to fly ball pitchers in IFFIP compared to sinker ballers or other pitchers who induce weak contact on the ground?
The latter is more defense-dependent, but including them would require only an extension of the same leaps to defensive-interchangeability as IFFIP.
Pitchers who generate IFFBs tend to be fly ball pitchers, yes. But, by the same token, fly ball pitchers tend to give up more HRs, so you could make a case that they’re being penalized for their batted ball profile in FIP already. If higher infield fly rates are a byproduct of pitching up in the zone, just as higher HR rates are, then FIP is penalizing pitchers for their approach by docking them for HRs without giving them credit for the good contact that pitching up in the zone can also lead to.
I think it’s probably worth noting that fly ball pitchers tend to outperform their FIP more often than groundball pitchers. So, perhaps this advantage is correcting a previous bias against FB pitchers. This is all still speculative, but I think there might be something to the idea that this adjustment balances the scales more than it tilts them towards FB pitchers.
I kind of like this idea, but what I’d be interested in finding is how much IFFB% is hitter-dependent, rather than fielding-independent. I mean, it’s in much the same realm as homers, I suppose, but isn’t there room to discover how much the IFFB is related to the talent/ability of the hitter versus the talent/ability of the pitcher?
Comment by Bryan Grosnick — March 6, 2013 @ 5:28 pm
Great points. I buy into re-calibrating the scale back towards effective FB pitchers, but will point out that some number (many? few? most?) of FB pitchers are already benefiting in FIP because of the inclusion of strikeouts that may offset the HR issue.
I would be curious if extreme groundballers also outperform their FIP because there is no groundball equivalent of IFFB.
Uhm … the new (and interesting) part of Rosenheck’s analysis was all about the significance of z-contact%. Writing that off as not “really a new finding” not only sounds rather petulant, but entirely misses the point. Given that z-contact% allowed would appear to be an actual skill (although I don’t yet know about yearly variance), that’s a pretty momentous addition to the overall concept of BABIP.
But at the same time, FIP already essentially normalizes BABIP and LOB% by assuming pitchers have no (or minimal) control over either. I don’t see how that’s much different than SIERA regressing HR rate.
FIP is meant to be a defense-independent descriptor of a pitcher’s performance. Do the fielders have anything to do with an out being recorded on an infield pop-up? Obviously the answer is yes, they do have to catch it, but if you assume that every infield fly ball is going to be caught by a major league infielder (which isn’t a bad assumption), then the fielder really has nothing to do with it. A.K.A. fielding independent.
I think IFFB% is giving people the wrong idea about the repeatability of pitchers inducing infield popups because it’s a misleading stat. IFFB% on FanGraphs is the percentage of fly balls that are popped up, not the percentage of all batted balls, as I think most people would expect.
Therefore, FB%*IFFB% is the percentage of all batted balls that are popped up. There’s about a 0.63 year-to-year correlation for this stat, which is quite a bit better than the 0.42 Bill Petti found for HR/9 IP.
Comment by Steve Staude. — March 6, 2013 @ 5:57 pm
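The conversion described in the comment above can be sketched in a few lines. This is only an illustration: the function name and the two pitchers are hypothetical, and the inputs are made-up rates rather than real FanGraphs data.

```python
def popup_rate_per_batted_ball(fb_pct: float, iffb_pct: float) -> float:
    """Share of all batted balls that are infield popups.

    fb_pct   -- fly balls / batted balls (FB%), as a fraction
    iffb_pct -- infield flies / fly balls (IFFB%), as a fraction
    """
    return fb_pct * iffb_pct


# Two hypothetical pitchers with the same IFFB% but different fly ball rates.
flyball_pitcher = popup_rate_per_batted_ball(fb_pct=0.45, iffb_pct=0.10)
groundball_pitcher = popup_rate_per_batted_ball(fb_pct=0.25, iffb_pct=0.10)

print(f"fly ball pitcher:    {flyball_pitcher:.3f} popups per batted ball")
print(f"ground ball pitcher: {groundball_pitcher:.3f} popups per batted ball")
```

Both pitchers show the same IFFB%, yet the fly ball pitcher gets popups on nearly twice as many of his batted balls, which is why the two versions of the stat can behave so differently.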
Yeah, Dave should have said something like “Incorporating z-contact% into FIP and WAR introduces many complications, so for now I’ll focus on how IFFB% might be used to improve both statistics.”
ZC% is Z-Contact%, which comes from the batted ball stats on FanGraphs (specifically, I used the non-PitchF/X version, as there were more years available), and represents the contact rate on swings at pitches that were in the zone. You may recall that I pointed this out in my first article as one of the strongest correlates of infield popups and therefore BABIP.
Comment by Steve Staude. — March 6, 2013 @ 6:06 pm
GIDP? I realize that it is extremely defense dependent (and baserunner dependent), but it is the value-added counterpart for ground balls. Maybe you could distill it into a rate-of-opportunity stat? Or use the batted ball buckets and timer that go into DRS to find balls that reasonably could be expected to be DPs?
Striking out means you didn’t hit the ball (almost every time; some are foul tips, of course). Infield flies, some way up into the sky, mean you did hit the ball, just not exactly centered. I get the math, but I would value a player that can hit the ball over a player that can’t every time.
Comment by Hurtlockertwo — March 6, 2013 @ 6:16 pm
the low yty correlation of IFFB% would lead me to say no, though that’s just the same problem that HR/FB% has…
it would at least be interesting to see a win value exist for IFFIP, but how much different is it than the other non-FIP win values (FDP- and BIP- wins) that we already have?
Question: You bring up the out percentage on IFFBs… what’s the out percentage of strikeouts? Obviously higher than the 99.1% there is for IFFBs, I suppose, but there’s still a bit of wiggle room there too, yeah?
Not that it changes anything, I’m just interested.
I would definitely be interested in the “new” FIP, but I don’t think I’d like to replace the old one altogether. I’d like to be able to look at both, see how they differed, and analyze those differences a little more.
I would like to see IFFB% included, and the most recent Community article (Greenlee, 3/4) seems to make a strong case for rSB and rPM to be included as well. Incorporating all three would stay true to DIPS theory without overlooking as much of a pitcher’s skill-set as FIP.
Does Cameron even read what he writes? One of the major points of contention with WAR for pitchers is that it does exactly that: it measures things that didn’t actually happen. As you pointed out, FIP normalizes some of its components to take “luck” out of the equation.
If we want to use WAR to “measure what actually happened” with pitchers, a good place to start would be to put luck (notice the lack of quotes this time) back into it.
What you seem to be implying is that all of the really high IFFB% hitters are accounting for most or all of the IFFB out success of some pitchers. I can just say intuitively that the correlation would not be remotely as high as Steve reports. For there to even be a small correlation, those hitters would have to be popping up against those pitchers at an astronomical rate.
“for players selected at the Major League level, there is no real differentiation in their ability to catch a pop fly.” I’ve heard a radio announcer call an infield fly (while in the air) as not catchable by the 1B. After quite a while in the air it landed about 20 feet back from the bag, but the inexperienced 1B didn’t account for the different late break of a LHB and wasn’t near where it fell.
Nevertheless, with just 1% falling in for non-outs, even the worst cannot be that bad.
I think what you are saying is that it is likely that high IFFB% hitters are better hitters statistically than high SO% hitters. If you look at that leaderboard, I believe that is correct. So there is really no need for anybody to dispute. I do appreciate the rather aggressive baiting technique, though.
As madvillain pointed out, Fangraphs WAR certainly doesn’t measure what definitively did happen. If you were doing that, you would use R/9 as the metric in your WAR calculations. You choose to use a metric that strips out some things that are assumed to be out of the pitcher’s control, (in particular, but not limited to, sequencing) to get a better read on his actual performance. If you’re doing that, you have already moved to a metric that’s theoretical, not actual. At that point, it just becomes a matter of choice between which run estimator you feel best represents how a pitcher performed.
Comment by Nathaniel Dawson — March 6, 2013 @ 8:11 pm
I don’t understand FG’s baby-steps in pitcher-WAR either. Why is a pitcher’s defense and steal-prevention irrelevant again? Real, significant, repeatable, support-independent skills should be considered. Period.
Is there a way to incorporate it so that the pitcher doesn’t get full credit for the IFFB? Considering the correlation is lower, could they get a % of the credit? I agree with the idea of it being included but also agree with some of the comments about how the hitter is still making contact. It could be in large part pitcher skill that the batter is unable to square it up, but “full credit”, equating it to a K, seems flawed.
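One way to picture the partial-credit idea is the sketch below. It is an illustration, not FanGraphs’ method: the 13/3/2 weights are the usual FIP coefficients, but `popup_credit`, the 3.10 constant, and the season line are all hypothetical, HBP is left out for brevity, and neither the weights nor the constant are re-fit after adding popups, which a real version would need to test.

```python
def fip_with_popups(hr, bb, k, iffb, ip, popup_credit=1.0, constant=3.10):
    """FIP-style estimate that counts infield flies as partial strikeouts.

    popup_credit -- hypothetical share of a strikeout's value given to each
                    infield fly (1.0 = same as a K, 0.0 = plain FIP).
    constant     -- placeholder league constant; not re-fit here.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * bb - 2 * (k + popup_credit * iffb)) / ip + constant


# Made-up season line, for illustration only.
line = dict(hr=20, bb=50, k=180, iffb=30, ip=200.0)

plain = fip_with_popups(**line, popup_credit=0.0)  # popups ignored
half = fip_with_popups(**line, popup_credit=0.5)   # half a K per popup
full = fip_with_popups(**line, popup_credit=1.0)   # popup treated as a K

print(f"plain: {plain:.2f}  half credit: {half:.2f}  full credit: {full:.2f}")
```

With these made-up numbers, each extra half-K of credit per popup lowers the estimate by 0.15 runs, so the choice of credit is not a trivial one for a pitcher who gets a lot of popups.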
I feel the opposite. I’d much prefer WAR to be about actual results, whether that be for a year or for a career. If I want an estimate of his actual value, then I can look at the different pitching metrics, the parks and defense, and try to form an opinion about his actual talent. Give me the raw numbers first, then I can use that as a baseline if I want to look into it further.
Comment by Nathaniel Dawson — March 6, 2013 @ 8:28 pm
You could also say the same about strikeouts and walks. Both the hitter and the pitcher influence all those outcomes. Which is the essence of FIP, that it’s measuring the interaction between hitter and pitcher, taking only those measures which seem to be truly within the realm of those two.
Comment by Nathaniel Dawson — March 6, 2013 @ 8:44 pm
Why would it necessarily be higher? Two different things, I’m not sure why it would logically follow that SO’s would have a higher out% than IFFB’s.
Comment by Nathaniel Dawson — March 6, 2013 @ 8:46 pm
Steal prevention is support-independent? So catchers have nothing to do with it?
Comment by Nathaniel Dawson — March 6, 2013 @ 8:48 pm
The allure of rSB is that it separates a pitcher’s contributions in steal prevention from the catcher’s (or at least that is its objective).
I’m sounding like a broken record here, but the 0.37 year-to-year correlation for IFFB% is extremely misleading, because IFFB%=IFFB/FB.
If we’re talking about IFFB/Batted Balls (similar to how LD%=LD/Batted Balls, or FB%=FB/Batted Balls), the YTY correlation is 0.63. Compare that to 0.69 for BB%, 0.79 for K%, and only 0.42 for HR/9.
Comment by Steve Staude. — March 6, 2013 @ 8:57 pm
Well, there are far more strikeouts than infield fly balls, so the percentage reached on infield flies would have to be far higher than that for strikeouts for the two to be equal.
Comment by Nathaniel Dawson — March 6, 2013 @ 8:57 pm
Thank you — I hadn’t actually taken a look at it, so I’m glad you pointed that out.
Comment by Nathaniel Dawson — March 6, 2013 @ 9:01 pm
Yes, but home runs are 4.33 times as weighty in the FIP equation as a strike out, and for that reason, one tolerates a lot more noise.
My second point is more important, in my opinion: you can’t just introduce an importantly interacting variable, use the same coefficients, and not test the results to see that they are better than the original model.
No, it does not. It doesn’t normalize them, it ignores them. There is a difference between what SIERA does, in giving weights to batted balls, and just ignoring them. FIP just looks at plate appearances that have to do with walks, strikeouts, and home runs. All of the other plate appearances are simply ignored, because teasing out how much credit for the resulting hits and outs should be given to the pitcher and how much should be given to the fielder is something that we can’t do very well.
This is the thing that people get wrong about pitcher WAR the most. FIP-based WAR doesn’t measure anything that did not actually happen. It is not comprehensive in its measurements of things that did happen, but it also does not claim to be comprehensive. It is an incomplete measure of pitcher performance, but it exclusively measures events that demonstrably occurred.
Using an RA based WAR and then making adjustments for assumed defensive contribution — as B-R does — makes that pitching WAR construct guilty of the thing that FIP-based WAR is most often incorrectly accused of doing. When you start making guesses about how much a pitcher or a fielder contributed to the results of their balls in play, then you are no longer measuring what actually happened.
It is completely fair to criticize FIP-based WAR for not measuring everything that happens, much like it is fair to criticize current catcher defense ratings for not measuring everything that catchers do that impact run prevention. It is incorrect to state that FIP measures things that did not happen or represents a hypothetical. FIP constrains itself to only measuring things that we know were almost entirely dependent on solely the pitcher/hitter match-up. Starting at RA9 and working your way backwards from there introduces the hypotheticals, not the other way around.
This is explaining part of FDP (in particular, the BIP-wins), not separate from that. Essentially, this calculation would be crediting part of the BIP-wins to a pitcher’s FIP instead of his FDP. Which, I think, follows logically from the definition of fielding dependent or fielding independent wins.
There seems to be a lot of confusion about why the variables for FIP were chosen. They were not chosen because they have high year-to-year correlation. The weights were not chosen to maximize year-to-year correlation. When talking about a metric that measures past performance, you don’t actually care about y-t-y correlation very much.
The question is simply which outcomes we believe are the responsibility of the pitcher himself and which ones had some interaction with his defenders. BB/K/HR have little to no interaction with the fielders, while most balls in play have a very large degree of interaction. IFFBs are a subset of BIPs that have very little interaction, so on that basis alone, they seem to be more like BB/K/HR than GB, LD, or OFFBs.
This should already be integrated in that hitter’s greater ball-in-play rate. (Assuming there is one, but I don’t see why you’d give extra value to a player who hit more popups while still having the same rate of other balls-in-play.)
Fairly cryptic answer. But, no, FIP is not measuring what actually happened, else it would stop at HR’s allowed, SO’s, and BB’s. We of course already have that information, so why any need for FIP? From there, it attempts to estimate how many runs a pitcher should have allowed based on those three components. So it is indeed hypothetical, or theoretical, or whatever. For better or worse, it’s an estimate of runs allowed, not a measure of them.
Comment by Nathaniel Dawson — March 6, 2013 @ 11:12 pm
“runners do not advance on infield flies”
Comment by Jarrod Dyson — March 6, 2013 @ 11:44 pm
Makes sense to me. Does this mean that the next step in calculating a FIP-based WAR would be analyzing which balls in play would have been turned into outs by all major league players? For example, pitcher X induces 100 soft ground balls right to the shortstop over the course of the season. Assume that batted balls of a similar velocity and trajectory are turned into outs 98% of the time. Any out made on such a batted ball appears to be independent of who is playing defense.
Of course, I haven’t seen this sort of data publicly available. When it is, then FG may even consider adding to the categories of defense-independent outcomes (weak ground balls, lazy flies, etc.). In the meantime, I think this is a good step.
Can’t believe it took this long to find a Castillo mention.
Comment by Frank Campagnola — March 7, 2013 @ 3:31 am
For the most part, they don’t.
Comment by Frank Campagnola — March 7, 2013 @ 3:33 am
The discussion seems to be centered around IFFBs and FIP, but the question is asking about IFFBs and WAR. To me, this is the real question: What correlates better to team wins, an IFFIP-based WAR or the normal FIP-based fWAR?* The basic logic of including IFFBs in FIP appears sound, so if IFFIP-based WAR correlates better, I would vote for including IFFBs. After all, the goal of FIP and WAR should be to provide a metric that most accurately describes how many games a pitcher won or lost for their team.
*Hopefully someone with a better stats background can chime in. Would calculating the correlation between different FIP-based WARs and team wins rely on the accuracy of the hitting, baserunning, and fielding metrics? I.e., would the correlation between various FIP-based WARs and team wins be skewed if the other metrics were either more or less accurate?
FIP doesn’t really “ignore” batted balls in general; it ignores any difference between batted balls (e.g., grounder vs. fly vs. line drive). Every batted ball a pitcher allows makes the pitcher’s FIP grow nearer to 3.1 (or whatever the baseline is these days).
Comment by suicide squeeze — March 7, 2013 @ 8:31 am
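The pull toward the baseline can be seen in a small sketch, at least for batted balls that become outs, since those add innings without adding HR, BB, or K. The 13/3/2 weights are the usual FIP coefficients; the 3.10 constant and both pitchers are assumptions for illustration, and HBP is omitted.

```python
def fip(hr, bb, k, ip, constant=3.10):
    """Basic FIP; the constant is a placeholder, not the actual league value."""
    return (13 * hr + 3 * bb - 2 * k) / ip + constant


# Hypothetical strong pitcher: FIP starts below the constant.
good_before = fip(hr=5, bb=10, k=60, ip=50.0)
# Add 10 innings of outs on balls in play: no new HR, BB, or K.
good_after = fip(hr=5, bb=10, k=60, ip=60.0)

# Hypothetical weak pitcher: FIP starts above the constant.
bad_before = fip(hr=12, bb=30, k=30, ip=50.0)
bad_after = fip(hr=12, bb=30, k=30, ip=60.0)

print(f"good pitcher: {good_before:.2f} -> {good_after:.2f}")
print(f"bad pitcher:  {bad_before:.2f} -> {bad_after:.2f}")
```

In both cases the extra balls-in-play outs move the number toward the constant, from below for the strong pitcher and from above for the weak one.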
They can advance on a K + WP, which is probably more common than a dropped IFFB.
I will simply agree with what many others said. FIP-based WAR is completely useless to me. It gets stuck in between. It tries to adjust for LOB% and BABIP, which are largely out of a pitcher’s control, and doesn’t adjust for HR/FB. I’d rather you base WAR on simple ERA, which is basically what you do with hitters. We don’t punish a hitter’s WAR because his BABIP was high. WAR is a measurement of results; it isn’t a measurement of skill.
While we are on the subject I don’t agree with park adjusting WAR for hitters either since all of the park adjustments are completely flawed. Petco doesn’t have the same effect on Venable as a LH pull hitter as it does on Quentin as a RH pull hitter or on Cabrera who is a spray slap hitter.
The WAR you have created doesn’t show what happened or the talent of the player, it is some odd amalgamation that doesn’t seem very useful.
” BB/K/HR have little to no interaction with the fielders”
That is really the root of the problem. HR depend on the hitter, they depend on the park, they depend on the wind, and they do sometimes depend on the fielders, as many get robbed each year. They depend on tons of things out of the pitcher’s control and things we can’t accurately measure. You are putting all kinds of bias into WAR when you try to correct for things like this.
I agree with this. FIP assumes that pitchers all face a reasonably similar cross-section of hitters, and can thus be compared. In order for this assumption to fail, you would have to see very selective pinch-hitting approaches across the big leagues, which doesn’t happen often. An argument could be made that there are lesser hitters at the plate for pitchers late in the game because starters have been removed through substitution, but in order for any meaningful difference to appear there, the numbers would need to be staggeringly different, due to the sample sizes.
Actually, Cameron points out that its y-t-y correlation is similar to HR rate. So you would think in xFIP you would normalize it just like you do HR rate.
Although, Steve Staude claimed that IFFB actually has a .63 y-t-y correlation not the .37 that Cameron put in the article. If that is the case, maybe FanGraphs would decide that is high enough to not normalize in xFIP.
I don’t understand what you don’t get about this. FIP measures some of the things that happened, makes no estimate or claim to the things it doesn’t measure, and does not claim to touch anything it doesn’t measure. If you add regressed metrics you are now measuring something theoretical. FIP currently just ignores things that it would have to treat as theoretical, and then expressly tells you that it is only measuring the things that McCracken found to be stable year over year.
Sure, let’s include it. I’m not sure how much of IFFB% is due to park and how much to the pitcher, but park has an effect on HR too. If it gets you closer to what the pitcher’s actual performance was in terms of factors they control then it should be in there.
Comment by Ivan Grushenko — March 7, 2013 @ 12:36 pm
FIP does measure what happened that is under the pitcher’s control. When studies are published that show more things are under a pitcher’s control, such as just happened with infield flies, then those factors should be added to it.
Someday, we may see a much-improved FIP that looks very different from today’s.
In the vast majority of instances, an infield popup is a non-advancing out no matter which major league infield is behind the pitcher. That’s more or less the definition of “fielding independent”. If the goal of FIP is to count fielding independent events generated by this pitcher then IFFB should be included.
“Sure, maybe you or I wouldn’t turn every IFFB into an out, but for players selected at the Major League level, there is no real differentiation in their ability to catch a pop fly.”
Are you sure about this, Dave Cameron? Have you allowed for the well-documented Mark Reynolds factor? http://www.baseballprospectus.com/article.php?articleid=19409
Before you decide, you should check the year-to-year correlation of K+IFFB%. While you have the correlations for K% and for IFFB%, you don’t necessarily know what the combined correlation would be if K’s and IFFB’s are themselves correlated with each other.
Nothing inherently makes it higher, I guess; my statement was more based on my personal observations, which could obviously be wrong. If it doesn’t end up being much higher, then the case for IFFBs == Ks ends up being stronger, though my guess would be that the rate at which Ks end up with the runner on base is closer to an order of magnitude lower than for IFFBs (1 in 1000 as opposed to 1 in 100, maybe).
Hence why I wouldn’t mind seeing the numbers but have no idea where to start.
I don’t think there is any dispute as to the fact that HR, BB and K are things that DID HAPPEN. It’s important to note, though, that converting those three things into FIP by a linear formula is just a product of multiple linear regression, using ERA (or maybe R/9) as the response variable.
So the estimated effects on ERA of a HR, BB, or K are regressed back to the league mean effects. The regression has attempted to explain as much of the variance in ERA as possible using only those three stats.
But K, BB, and HR also correlate linearly to BABIP and LOB%. By leaving BABIP and LOB% out of the model–among other things–the three stats left in the regression model are trying mightily to explain what BABIP and LOB% should be explaining, in addition to some of the things they can explain independently.
Any environment like baseball is a complex environment with lots of interaction and shared explanation between variables. Because K/9 and LOB%, for instance, are linearly correlated themselves, K/9 ends up explaining some of what LOB% could have explained.
I’m not sure exactly to whom I’m speaking. I don’t think I’m arguing with anyone. Only trying to point out what the linear formula is doing, and what it is not doing.
That’s not exactly how the weights work. The linear multipliers are scaled to account for how we’d expect a player to perform in those stats.
0.111 is average-ish for HR/IP, while 0.777 is average-ish for K/IP. That the K’s multiplier (slope) is about 1/6** that of the HR’s means very little. In fact, since those rates are already per inning, the contribution to the model for an average-ish player would be 0.777*2 = 1.554 for K’s and 0.111*13 = 1.443 for HR’s. From that perspective, I guess you could say that K’s actually have the higher weight.
Though, in reality, a regression model is as much an analysis of variances as it is of means. So all the weights discussed mean nothing unless we understand the underlying variance of the variables so that we can scale the weights appropriately.
**If I’m not mistaken, the coefficient for HR is 13/IP and for K is 2/IP, for an exact ratio of 2/13.
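The arithmetic in this comment can be checked with a short sketch. The per-inning rates are the commenter’s “average-ish” figures, and the variable names are only illustrative. Note that the K term enters FIP with a negative sign, so what is compared here is magnitude.

```python
HR_WEIGHT = 13  # FIP coefficient on home runs
K_WEIGHT = 2    # FIP coefficient on strikeouts (subtracted in the formula)

avg_hr_per_ip = 0.111  # "average-ish" HR/IP quoted in the comment
avg_k_per_ip = 0.777   # "average-ish" K/IP quoted in the comment

# Size of each term in the FIP numerator, per inning, for an average pitcher.
hr_contribution = HR_WEIGHT * avg_hr_per_ip
k_contribution = K_WEIGHT * avg_k_per_ip

# Ratio of the raw coefficients, the 2/13 mentioned in the footnote.
weight_ratio = K_WEIGHT / HR_WEIGHT

print(f"HR term: {hr_contribution:.3f}")
print(f"K term:  {k_contribution:.3f}")
print(f"K weight / HR weight: {weight_ratio:.3f}")
```

So although a single strikeout carries roughly one-sixth the weight of a single home run, strikeouts are about seven times as frequent, and the two terms end up similar in size for an average pitcher.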
Alright, IFFB/IP has a 0.60 YTY correlation amongst qualified pitchers. If you want to stay consistent with the denominators of BB% and K%, then IFFB/TBF has a 0.61 (I prefer this to an IP basis). HR/TBF is only 0.40.
Comment by Steve Staude. — March 7, 2013 @ 3:28 pm
“FIP measures some of the things that happened, makes no estimate or claim to the things it doesn’t measure, and does not claim to touch anything it doesn’t measure.”
Um, what? “makes no estimate or claim to the things it doesn’t measure”? Then how come it is scaled to ERA, and is designed to estimate how many runs a pitcher should have given up? It’s not measuring actual runs allowed, it’s trying to provide an estimate of them. There can be no argument that FIP is estimating something that it isn’t directly measuring.
Comment by Nathaniel Dawson — March 7, 2013 @ 4:40 pm
And yet, no R/9 in any of the tables that I can see. Or maybe I just haven’t looked in the appropriate place, so maybe I’ve missed it.
ERA is there, but what do I care about ERA? If I want to look at a metric that’s a direct measure of the runs a pitcher has allowed (and I would think that would be obligatory), I want R/9, not something like ERA.
Comment by Nathaniel Dawson — March 7, 2013 @ 4:56 pm
Is it just me, or is all of this conversation about pitcher WAR going to be entirely unnecessary once we have reliable batted ball data that can give a run value to every ball put in play?
The coefficients are just linear transformations of linear weights, dude. I have no idea what you mean, nor do you, I suspect, when you write “The linear multipliers are scaled to account for how we’d expect a player to perform in those stats.”