A Discussion About Evaluating Pitchers

Eric Seidman and I had a conversation about pitchers, pitching metrics, and the end of season awards last night. The fruits of that conversation are below.

Dave: So, I took a sneak peek at the FanGraphs author awards ballot, and you’re kind of a traitor. You can make a strong case for Roy Halladay or Cliff Lee, but instead, you pick Clayton Kershaw, even though he has a WAR of 6.8 compared to Halladay’s 8.0. You’re from Philadelphia, you write for FanGraphs, and you pick the pitcher with a lower WAR who doesn’t play for the Phillies? Don’t you know that you’re supposed to be a slave to the stats, and our most recognizable stat says Halladay has been better? You’ve got some explaining to do.

Eric: I’m a loner, Dottie, a rebel. At least when it comes to the Cy Young Award it seems. But I don’t think it’s crazy to support Kershaw for the NL’s best pitcher of the year award even though Halladay has a 1 WAR advantage. Bear in mind that by voting for Kershaw I’m not dissing Doc or his tremendous work this season. It wasn’t like Kershaw clearly stood out above the rest. I wrestled with the decision but ultimately decided that it would be nice to see him take home some hardware. The difference between him and Halladay is actually an interesting proxy for discussing why WAR isn’t the be-all, end-all when it comes to evaluating pitchers. First, some numbers:

Kershaw: 218 2/3 IP, 9.7 K/9, 2.1 BB/9, 43% GBs, .272 BABIP, 2.30 ERA, 2.37 FIP, 2.63 SIERA
Halladay: 227 2/3 IP, 8.6 K/9, 1.3 BB/9, 51% GBs, .300 BABIP, 2.41 ERA, 2.18 FIP, 2.61 SIERA

Each pitcher has excellent peripherals, and Halladay doesn’t have a dominant innings advantage like he did last season. Their ERAs and adjusted estimators (xFIP and SIERA) are very similar as well. The major difference for me, and why I don’t truly believe Halladay holds as big a lead in WAR as it seems, is Kershaw’s batting average on balls in play. While .272 seems absurdly low, his mark was .275 last season and .269 in 2009. Dodger Stadium likely plays a role in that, but FIP doesn’t factor his BABIP prevention into the equation. Given the lackluster infield defense and the revolving door of middle infielders he’s pitched in front of, it seems safe to assume that most of the prevention comes from Kershaw’s skill-set, repertoire and sequencing. Were we to retroactively credit his WAR given that it seems he is going to be one of those consistent BABIP-preventers, suddenly that 6.8 WAR might be 7.5.

At that point, we’re really splitting hairs. There isn’t a statistically significant difference between 7.5 and 8 WAR. Plus, even though I have Halladay as my computer and phone background, and root for him as a Phillies fan, I would just like to see Kershaw get some more national recognition.

Dave: By choosing a FIP-based pitcher WAR, I agree that the measure leaves out some things that pitchers do deserve credit for, but are you really ready to give a pitcher 100% credit for his BABIP? Yes, Kershaw has shown some ability to post lower than average marks before, but almost all of that is due to his performance at home, where his career BABIP is just .271 (and .248 this year). On the road, he’s been basically league average at preventing hits on balls in play. I’m not saying we should limit ourselves to split season data and draw conclusions only from smaller samples, but I’d be more convinced that the variable affecting Kershaw’s BABIP was actually something he’s doing if he was able to do it anywhere besides Los Angeles.

Certainly, when doing retrospective value analysis, I believe that pitchers should get some credit (or blame) for their BABIP. FIP gives them no credit, ERA gives them total credit, but the truth is somewhere in the middle. We just don’t know where. So, should we split the difference and hope we’re close? I don’t know that there’s a right answer here. Separating what is pitching and what is outside factors is just hard.

Eric: Right, I wouldn’t want to give Kershaw complete credit for the BABIP-prevention but I actually think this year could be the outlier on the road. In 2010 his BABIP split was virtually identical at .271 home/.278 away. The year before, .275 home/.261 away. Now Dodger Stadium surely factors in, but his 2009-10 road BABIP gives me some reason to believe he has the prevention skills. Again, he shouldn’t be credited for the complete difference between his and the league’s BABIP, but I don’t think ignoring it entirely does him justice. The inverse is also true, especially for a Javier Vazquez-type, who has shown over 10+ years that his estimators will always best his actual run-prevention. Fortunately, these guys are few and far between, so an FIP-based metric does the job 90 percent of the time.

It’s very difficult to separate pitching from the other external factors, but perhaps that’s the next great wave of analysis. At the very least, it would be interesting to have a readily available calculator where a user could determine a pitcher’s WAR based on his feelings about BABIP. That would provide a range of possibilities. For someone like Kershaw it might suggest he’s worth between 6.5-8 wins… for Halladay it might be a smaller range, like 8-8.5, but we could foster the conversation about what the BABIP-prevention (or the lack thereof) can do to a pitcher’s value.
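
A calculator along those lines could start from something as simple as a sliding credit weight between FIP and ERA. What follows is a minimal sketch and not how FanGraphs computes WAR: the name `blended_rate` and the `credit` parameter are invented for illustration, the inputs are the ERA and FIP figures quoted above, and the output is a runs-per-nine rate rather than a win total.

```python
# Hypothetical "BABIP credit" slider: 0.0 = pure FIP, 1.0 = pure ERA.
# Illustration only; this is not how any published WAR is computed.

def blended_rate(fip: float, era: float, credit: float) -> float:
    """Linear blend of FIP and ERA for a credit weight in [0, 1]."""
    if not 0.0 <= credit <= 1.0:
        raise ValueError("credit must be between 0 and 1")
    return fip + credit * (era - fip)

# ERA and FIP as quoted in the discussion above.
PITCHERS = {
    "Kershaw": {"fip": 2.37, "era": 2.30},
    "Halladay": {"fip": 2.18, "era": 2.41},
}

def credit_table(weights=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Return {weight: {name: blended rate}} for each credit weight."""
    return {
        w: {name: round(blended_rate(p["fip"], p["era"], w), 3)
            for name, p in PITCHERS.items()}
        for w in weights
    }

if __name__ == "__main__":
    for w, row in credit_table().items():
        cells = "  ".join(f"{name}: {rate:.3f}" for name, rate in row.items())
        print(f"credit={w:.2f}  {cells}")
```

On these inputs Halladay has the lower blended rate at zero credit and Kershaw at full credit, with the two crossing at a weight of roughly 0.63. The blend ignores innings, park and quality of opposition, so it only shows how much the answer moves with the weight.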

Dave: If this year is the outlier for Kershaw, what was 2008, when his road BABIP was .341? I don’t think we can just ignore large home/road BABIP splits in two of the four years he’s been in the big leagues. I’m not saying it’s all Dodger Stadium, but a career road BABIP of .290 makes it tougher to argue that Kershaw is one of the exceptions.

The tough thing about estimating BABIP’s impact on ERA is that it can be skewed situationally. Last year, for instance, Cliff Lee’s BABIP was .256 with the bases empty and .344 with men on base, so he posted a ridiculously low LOB%, which drove his ERA up significantly. His overall BABIP didn’t look out of whack, but the distribution of when those hits came had a remarkable impact on the number of runs he allowed. We’ve seen a bit of the opposite this year with Kershaw (though not to the same degree), as his BABIP with men on base is just .254, and only .260 with men in scoring position, so he’s got one of the highest strand rates in baseball.

This is part of why I’m in disagreement with the “ERA measures what happened” crowd. For all we know, Dee Gordon, Rafael Furcal, and Jamey Carroll each made an outstanding play with runners at second and third this year, saving Kershaw six runs in the process. A good or bad defensive play can have a substantial impact on runs allowed, and I don’t think we can just assume that those plays are distributed normally throughout all situations.

Should Kershaw get some credit for outperforming what we’d expect based on his walk rate, strikeout rate, and home run rate? Absolutely. Should he get enough to make up for the fact that Roy Halladay has just out-pitched him against better competition and in a better hitter’s park? It seems like you would have to give him almost total credit for his hit prevention – and the timing of that hit prevention – in order to swing Kershaw’s way.

Eric: Right — I think many have a skewed idea of what “happened” and don’t truly realize how much 4-6 earned runs mean over the course of the small sample that is 200 innings. For instance, tack on another five earned runs and, despite throwing 218 2/3 innings, Kershaw’s ERA jumps to 2.51. It doesn’t take much for that number to be manipulated. Whether it’s defensive players making a tremendous stop, or a two-out error followed by five runs that don’t count against the pitcher, ERA clearly doesn’t tell us everything we want to know. Not only with respect to run prevention, but also in terms of what actually took place on the field.
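
That arithmetic is easy to check. A small sketch, assuming Kershaw’s earned-run total is 56 (back-calculated from the 2.30 ERA and 218 2/3 innings quoted above, not taken from a box score):

```python
def era(earned_runs: float, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings

INNINGS = 218 + 2 / 3   # 218 2/3 IP, as quoted above
EARNED_RUNS = 56        # assumption: implied by a 2.30 ERA over those innings

baseline = era(EARNED_RUNS, INNINGS)
plus_five = era(EARNED_RUNS + 5, INNINGS)

print(f"{baseline:.2f} -> {plus_five:.2f}")  # prints 2.30 -> 2.51
```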

Dave: Over in the AL, Justin Verlander is almost certainly going to win the Cy Young, but he might just win the MVP award too. Not to turn this into another article on what the word valuable means, but how would you judge Verlander against premium position players like Jose Bautista, Jacoby Ellsbury, and Curtis Granderson this season? WAR has them in the same general range, and Verlander is probably under-credited for his season by a system based on FIP, so giving him partial credit for his low BABIP closes the gap even further. You didn’t vote for Verlander in the staff MVP awards, but given your arguments about Kershaw, it’s hard to believe that you don’t see him as being among the most valuable players in the AL this year.

Eric: As for Verlander, it seems like virtually none of the balls batters put into play off of him fall in for hits. While I wouldn’t expect him to consistently hold batters to a .235 average on balls in play, he has been absolutely tremendous this season, and awards are based on what happened, not what might happen in the future. Verlander has clearly been less hittable than, well, anyone, and he is a major reason the Tigers won their division. You’re absolutely right that he’s one of the most valuable players in the league this season, and I’d honestly vote for him #2 on my ballot behind Bautista. If Verlander is credited for his BABIP prevention, as I believe he should be, the gap between him and Joey Bats does close, but I don’t think it would close enough for me to give him the award.

Bautista, in a down offensive season, is hitting like batters did in the mid-90s and is putting the finishing touches on one of the best offensive seasons we’ve seen. Just like Pujols missed out on MVPs because Bonds was otherworldly, I can’t justify a Verlander-for-MVP campaign when Bautista is tearing the league up.

That being said, I do believe pitchers should be considered for the MVP. As we discussed in my recent article about Verlander’s MVP credentials, the extreme impact he has in a smaller concentration of games can be argued to have been more valuable to the Tigers than marginal improvements in their odds of winning from everyday players in the games he didn’t start. Some view the 32-35 starts as a detriment to a pitcher’s campaign while I think of it in the opposite manner. If the Tigers’ odds of winning are 65 percent when he pitches and 51 percent when he doesn’t (made up to illustrate the point), then he can certainly have the same level, if not a greater level, of impact as a position player.
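
To put a rough number on that illustration: if a team wins at one rate when a pitcher starts and at another when he doesn’t, the expected wins added over his starts is the gap in win probability times the number of starts. The sketch below uses the made-up 65 and 51 percent figures and the 32-35 start range from the paragraph above; `wins_added` is a name invented here.

```python
def wins_added(p_with: float, p_without: float, starts: int) -> float:
    """Expected extra wins across `starts` games from the gap in win probability."""
    return (p_with - p_without) * starts

# Made-up odds from the discussion: 65% when he starts, 51% otherwise.
for starts in (32, 35):
    print(starts, round(wins_added(0.65, 0.51, starts), 2))
```

With those inputs the gap works out to about 4.5 to 4.9 wins across a season of starts, which is only as meaningful as the invented probabilities behind it.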

Dave: Not to carry the Kershaw/Halladay discussion over to this argument too much, but I do wonder if there’s some confirmation bias going on here. Justin Verlander is excellent, and throws really freaking hard, so when he posts a low BABIP, it’s not that tough to draw the conclusion that he’s just throwing pitches that are tough to hit. Except, Justin Verlander has been really good and thrown really hard for years, and his BABIP last year was .286, and the year before that it was .319. In fact, most studies that have found a trend in guys who can consistently beat the league average in BABIP tend to show that it’s soft-tossing lefties – the Barry Zitos of the world, not the Justin Verlanders.

Has Verlander actually been less hittable than anyone else in the majors, or do we just talk ourselves into that conclusion when a great pitcher also gets a lot of balls hit right at his defenders? I’ll repeat the point I made about Kershaw – I’m comfortable giving him some credit for his BABIP, but all of it? I don’t think so. We’ve come too far in understanding the impact of non-pitcher variables on the outcomes of balls in play to then ignore that progress when we start looking backwards instead of forwards.

Eric: Getting a lot of balls hit right at defenders isn’t necessarily something Verlander or any other pitcher can control, but it’s entirely possible that, this season more than any other, batters are making weaker contact, which makes the jobs of his fielders much easier. If HITf/x were available (and had been around for 5+ years, so we had some context to incorporate), we might see that the speed off the bat when he pitches is much lower than it is against other hurlers. Right now we only have his balls in play distribution and his BABIP, so it’s impossible to make that determination.

But I do think that getting balls hit at fielders isn’t necessarily something that should take away from the pitcher. There’s a big difference between lined one-hoppers right at a bailing first baseman and weak grounders hit at a spot where even an average shortstop could range. I’m not saying with 100 percent confidence that’s the case, and I will certainly grant that confirmation bias could be at work, but until we have more detailed data, the natural reaction is to revert to what we know with the most certainty. Right now that would suggest that Verlander has some prevention skill, but that this is mostly a tremendous pitcher having a flukily lucky season. I don’t know if I’m okay accepting that without at least testing what his WAR would be if we considered his BABIP true talent level to be X, Y, or Z.

Dave: Yeah, I’ll retract the “balls hit at defenders” comment, since we’re trying to isolate a pitcher’s performance but not strip out all aspects of luck. I’d agree that it’s likely that if we had HITf/x, we’d probably find out that Verlander’s been getting more weak contact than usual, but we don’t have that, and we’re the kind of people that like evidence, and I don’t know that we have much to support the idea that Verlander has actually induced weak contact. It seems like the kind of thing we’d expect to find if we could prove it, but then again, we probably would have expected to find that all good pitchers produced weak contact before Voros McCracken came along.

Like Kershaw with the Cy Young, Verlander’s case for MVP rests mostly on a full acceptance that his low BABIP is almost entirely something he caused. He’s had a good enough year that he’s certainly a worthy candidate, and it won’t be any kind of poor choice if he ends up winning the award, but I’d agree that I just don’t quite see enough pure dominance to make up for the fact that there are some position players having truly spectacular years as well – and their greatness comes with fewer caveats.

Eric: The idea of weak contact does seem like more of a backtracking statement and a possibility drawn up after seeing results, and not the process, but I used that more to illustrate that there is an awful lot of unknown out there. Way too much unknown for people to express opinions with any meaningful level of certainty. But I do think Verlander’s candidacy extends beyond just the BABIP prevention and whether it’s his or not. Given the vague definition of the award and the nebulous term ‘value’, and how aspects ancillary to the individual — like teamwide success or the number of other good players on the team — are often factored in, I’d wager that Verlander would be a candidate even if he had, say, a .260-.265 BABIP. Bautista plays for a non-playoff team, and voters are hard-pressed to select a player from the Yanks and Red Sox when there are numerous very valuable players. This is somewhat silly, but it’s how many people think when they’re given a ton of leeway in interpreting the intentions of an award.







Dave is a co-founder of USSMariner.com and contributes to the Wall Street Journal.


140 Responses to “A Discussion About Evaluating Pitchers”

  1. Hurtlocker says:

    I’d vote for Kershaw too, not just because I’m a Giants fan and he made my boys look sick last night, really…


  2. Beau says:

    Does offensive WAR use BA or some sort of BABIP adjusted rate stat? It seems that we have accepted that just like some pitchers are responsible for sustainably lower than expected BABIPs that some offensive players are just as responsible for BABIPs that are sustainable outside of the standard deviation from the mean. If not perhaps offensive WAR needs to be adjusted?


    • Dave Cameron says:

      We’re not trying to measure sustainability or luck – we’re trying to isolate individual player performance. There are far fewer outside variables that need to be corrected for with hitter BABIP than with pitcher BABIP.


      • Ivdown says:

        Like Matt Cain’s ability to overperform his FIP just about every year, when does not allowing hits become a skill, instead of just blind luck apparently? Kershaw’s done it his entire career save his rookie season of 2008 (which you mentioned should still count, which it should, but he’s clearly nowhere near that same pitcher now). After a while how can that not be considered a skill?


      • delv says:

        Why not incorporate SIERA into WAR, Dave? Serious question.


      • Dave Cameron says:

        We prefer that WAR is based on actual outcomes. Introducing an estimator that includes regression – be it SIERA, xFIP, or what have you – takes it from measuring past production into more of a projection of future performance.


      • RC says:

        “We prefer that WAR is based on actual outcomes.”

        Then why are you using FIP, which ignores ~70% of the actual outcomes?


      • Notrotographs says:

        You mean 70% of the outcomes that end up falling almost entirely under the control of the defense, and not the pitcher?


      • suicide squeeze says:

        @ RC

        FIP doesn’t ignore BIP. Balls in play are the baseline for the stat. The coefficients for HR, K and BB are their values relative to a ball in play. If all a pitcher ever did was allow BIP, then we would expect an ERA of 3.2, which is what FIP relays.
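
For reference, the usual form of FIP makes this point concrete. The sketch below uses the commonly published weights (13 for home runs, 3 for walks and hit batters, 2 for strikeouts). The constant is recalculated each season so that league FIP lines up with league ERA, so the 3.2 default here is simply the figure quoted in this comment, not an official value.

```python
def fip(hr: int, bb: int, hbp: int, k: int, innings: float,
        constant: float = 3.2) -> float:
    """Fielding Independent Pitching with the standard 13/3/2 weights.

    `constant` defaults to the 3.2 quoted in the comment above; the real
    value changes from season to season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant

# A pitcher who allows nothing but balls in play gets exactly the constant.
print(fip(hr=0, bb=0, hbp=0, k=0, innings=200.0))  # 3.2
```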


      • CircleChange11 says:

        You mean 70% of the outcomes that end up falling almost entirely under the control of the defense, and not the pitcher?

        I think what the data shows is that the longer the ball is in the air, or the more bounces it takes in the infield (up to a point) determine whether the defense generally turns the BIP into an out.

        So, who is responsible for how hard the ball is hit? And what direction it is hit?

        The defense? Not likely.
        The pitcher? Probably.
        The batter? Probably.

        IMO, it’s more 50/50 for pitcher/batter than it is reliant on the defense (100/0).

        We know that ~70% of line drives go for a hit. What % of the outs are [1] great plays, [2] fortunate/unfortunate positioning, [3] etc?

        Certainly we know that certain pitches/locations are easier to hit well than others, it’s why we put so much emphasis on plate discipline and selectiveness. It’s why we look so much at LD% and hope that it is recorded accurately.

        Some BIP results are more luck than others.

        Looking at a spray chart for where hits are allowed could also be a quality data source.

        The defense has ZERO influence on [1] how hard the ball is hit, and [2] where it’s hit.

        I think it fair to say that the pitcher and batter have influence over that. We just are unable to nail it down to a % for a variety of reasons.


    • Hejuk says:

      Offensive WAR uses linear weights, which does give credit for high BABIP. There’s no problem with offensive WAR as a value stat.


      • delv says:

        Question: are the linear weights (for each type of hit, or SB) recalculated for each individual season (or era)? That is, is the possible enhanced value of a SB or 3B or HR in 1920 relative to its value in 2010 accounted for, or does wRC+ keep the same historically averaged coefficients, and then, at the end, adjust for league-wide offense?


      • Dave Cameron says:

        Yes, the weights are recalculated each season to account for the run environment of the time.


      • delv says:

        thanks for the answers, chief. get well soon


      • Ryan says:

        Why give a pitcher credit for his fielders? The park adjusted FIP is the best way to isolate a pitcher’s contribution.


      • RC says:

        Because BABIP numbers have very little to do with the fielders.

        Does pitcher BABIP correlate to UZR or any of the defensive metrics at all?


  3. Mitch says:

    frankly, an insane conversation. xFIP!?! babip!?! peripherals!?!? OMG.

    tell me one thing. what makes 200 Ks better than 200 IF popups?

    I love FG but can’t take it seriously on the pitching side until they relent and provide opposing batting numbers. You could more or less look at OPSa and see that Kershaw and Verlander are out in front in both leagues and shut off your sets there. That basically confirms what we’ve seen from both all year – sheer dominance. After all, when did awards like the Cy Young become anything besides a performance/results acknowledgment?


    • Beau says:

      In a way I can’t help but agree. The Cy Young is an award used to acknowledge the pitcher who had the best performance of the season. Perhaps we shouldn’t try and discern which pitcher SHOULD have had the best season for any purpose other than to predict future performance. It’s not like anyone is campaigning to give Nolasco any votes or anything.

      I agree that peripherals suggest that Halladay may be the better pitcher based on pure skills and will likely have a better next year (regardless of aging factors) but the statistics show that Kershaw’s true “performance” was better. At least I think.


      • Steven says:

        Peripherals, and “luck adjustment” DO measure how well a pitcher performed this season. The idea behind it is that we are basically trying to isolate a pitcher’s performance. To do so we need to remove the factors that affect pitching stats that the pitcher cannot control. Ball park, fielders behind them, strength of schedule etc. This way we can see who PITCHED the best. In the same way that looking at pitcher wins does not tell us how well the pitcher performed this season because it ignores the fact that a pitcher cannot control how many wins he has, looking at ERA ignores the fact that the pitcher is not in complete control of his ERA.

        Peripherals certainly tell us how well a pitcher should have done and will do. But you need to understand that looking at them in the right way also tells us how well the pitcher actually did pitch this season.


      • Mitch says:

        but what does the babip comparison of Kershaw and Halladay tell? that Kershaw was “luckier” than Halladay? On what basis could one conclude that? There is no standard by which we could or should be able to tell what constitutes favorable or unfavorable luck or random occurrence.

        Isn’t a measure like babip best used with small IP samples to even out extreme results like the kind Bobby Parnell was getting? Over the course of 200 IP in a season, the out inducing rates of Kershaw, Halladay, and Verlander speak for themselves.


      • bill says:

        The BABIP difference might just say that Kershaw plays in a much better pitcher’s park (which he does). That’s not “luck.”


    • Norm says:

      It would take 11 seasons to get to 200 IF popups, not even a full season to get to 200 k’s.


    • Dave Cameron says:

      Defense. It exists. You have to account for it. If you just use OPS against or ERA, you’re ignoring a huge factor in run prevention. I’m sorry, but that’s just not good enough.


      • Mitch says:

        Not saying that I buy into this argument (not fully sure what babip has to do with this), but if this is your argument, then come up with a defensive adjustment factor for each team to apply to OPSa in order to get a better comparative figure.


      • Mitch says:

        moreover, what’s the point in using OPS to judge hitting if so much of offensive results have to do with fielding?


      • Ed, Ed, and Eddy says:

        Kershaw doesn’t exactly have elite defense behind him. Aaron Miles has started half of his games at 3rd for goodness sake. Dee Gordon isn’t really a + defender right now as well. Hell, I think Juan Rivera was in right field for a few of Kershaw starts.

        JV at least has Austin Jackson tracking down everything in sight and an adequate infield defense.


      • Mitch says:

        and btw, I didn’t suggest to look at runs allowed. Runs occur for a variety of reasons not all attributable to the pitcher. We agree on that. But if the bottom line objective of a batter is to “not make an out” then the converse must be the most basic objective for pitchers – to induce outs. And with inducing outs comes fewer baserunners and run scoring opportunities, the likelihood of success (limiting opposing runs) increases.


      • Dave Cameron says:

        A hitter does not face the same defensive group every time he comes to bat. Over the course of a season, the variation in quality of opponent’s defense is going to be very small between hitters. That is not true of pitchers, who carry basically the same defense behind them all season, and whose results are much more impacted by the quality of defense behind them.


      • Mitch says:

        as I said, then come up with team defensive metrics and use that to come up with adjusted figures for opposing batting stats to make them more comparable between pitchers…


      • RC says:

        Dave, this is a clear example of “We can’t get this perfect, so we’re going to just ignore it”, which is a poor way to go.

        Does pitcher BABIP correlate to UZR? Any of the other metrics? Why do different pitchers on the same staff have drastically different BABIPs?

        Because a very large portion of BABIP is skill. Ignoring that makes your stats worse, not better.


      • Mitch says:

        thank you RC. and an enormous component of defensive results are positioning and alignment. Just look at the Ryan Howard splits with empty vs. RO. Most teams put the shift on with bases empty against him and his babip in those situations is .241… with RO it goes to .347… is that just a random fluctuation?


      • vivaelpujols says:

        A lot of that isn’t defense though, it’s just luck (in both ways – the over-performance type of luck and the plain old bloop hit type of luck).

        If the Cy Young award should go to the guy who pitched best irrespective of his defense, then I think you’re going to be closer to RA than to FIP.


      • vivaelpujols says:

        Baseball Reference WAR is pitcher RA adjusted for team defense (based on Total Zone rating). That’s going to be about as close as we’re gonna get to factoring out defense without factoring out luck.

        OTOH, there’s a legitimate argument that the Cy Young should go the pitcher who’s the most dominant, not just the guy who happened to luck into the best numbers.


      • Sean O'Neill says:

        Mitch, the basic reason people still use OPS, even here at Fangraphs, is because it is a generally accepted and understood metric. It is not a perfect judge of hitting talent for a number of reasons (OBP and SLG aren’t weighted properly, it fails to account for luck or run environment, etc), and I wouldn’t recommend it as such; as a communicative tool however, it is still effective.


      • Mitch says:

        Sean, clearly over SSSs, OPS isn’t able to explain (or predict future) performance, but given enough of a sample size, it does a very good job at measuring performance.


      • The Nicker says:

        VEP: Which way is it?

        It’s either primarily the defense, which means we should be making some sort of adjustment to ERA for all pitchers on the same staff and using that to factor into WAR, or it’s mostly based on luck, in which case we should be recalibrating every hitter’s triple slash into an xBABIP calculator before we calculate wOBA.

        One or the other should be done. Even if batters control BABIP outcomes more (they do), we can still get closer to reality with one of these strategies.


      • joe says:

        I’m all for trying to isolate out defensive impact, but simply ignoring 60-70% of a pitcher’s outcomes might not be the best way to do it. The problem is people are simply lumping defense and luck (or variation if you prefer) together; we don’t do that for hitting and fielding.

        As an example: look at the BABIP’s of every other Detroit starter…. is there a different defensive team in Verlander starts? The team BABIP against is .291, Porcello .319, Scherzer .309, Penny .310, Fister .254 (9 starts), Coke .307 (14 starts). Verlander is at .235. That seems more like either skill or variation to me (as opposed to defense)… stuff we don’t correct for when looking at hitting #’s.

        I’ve heard the batters face a more balanced defensive spectrum meme for some time now (which is only partially true when you consider unbalanced schedules), but we seem perfectly fine with a “luck” component being left in to that value equation (when a BABIP is far from a career avg without a demonstrable change in batted ball profiles). So why wouldn’t we be OK with leaving in the ‘luck’ component in for pitchers? Without really assessing how much of BABIP is defense vs how much is luck, is the attempt to strip out defensive impact actually stripping out other more significant effects?

        I think if you want to isolate out defense you want to look at a pitcher’s BABIP vs the staff BABIP (and normalize/regress that against a league BABIP)…. so you can actually isolate defense? This would not seem to be too hard to do if the goal is to really strip out a pitcher gaining/losing from defensive contributions.
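
A first pass at that comparison takes only a few lines. The sketch below uses the Tigers BABIP figures quoted earlier in this comment and subtracts the team mark from each starter’s. The `shrink` factor that pulls each gap toward zero stands in for a proper regression, and its 0.5 default is invented for illustration. Note that the team figure includes each pitcher’s own balls in play.

```python
# BABIP allowed, as quoted in the comment above for Detroit's staff.
TEAM_BABIP = 0.291
STARTERS = {
    "Verlander": 0.235,
    "Porcello": 0.319,
    "Scherzer": 0.309,
    "Penny": 0.310,
    "Fister": 0.254,
    "Coke": 0.307,
}

def gap_vs_team(pitcher_babip: float, team_babip: float,
                shrink: float = 0.5) -> float:
    """Pitcher BABIP minus team BABIP, scaled by a hypothetical shrink factor.

    shrink=1.0 keeps the raw gap; smaller values regress it toward zero.
    """
    return shrink * (pitcher_babip - team_babip)

if __name__ == "__main__":
    for name, babip in sorted(STARTERS.items(), key=lambda kv: kv[1]):
        raw = gap_vs_team(babip, TEAM_BABIP, shrink=1.0)
        shrunk = gap_vs_team(babip, TEAM_BABIP)
        print(f"{name:<10} raw gap {raw:+.3f}   shrunk {shrunk:+.3f}")
```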


      • CircleChange11 says:

        Just for discussion, as of today brWAR shows …

        Verlander 8.5
        Bautista 8.4
        Ellsbury 7.1

        That seems to jibe with what many feel, that Bautista and Verlander are having the best hitting and pitching seasons in the AL.

        Matt Kemp at 9 brWAR is stunning to me.

        fWAR having Verlander and Sabathia almost identical is tough to wrap the mind around … as is Ellsbury being a 1/2 win better than Bautista.

        Averaging the “WARs”, again, seems to fit what many feel … Bautista and Ellsbury are close … Verlander and Halladay are favorites, with Lee and Kershaw deadlocked for 2nd.

        Factoring in WPA just solidifies the case for Bautista and Verlander.

        All that said, it might be possible that CC Sabathia is actually under-rated. 5 straight 230+ IP seasons, being worth between 5-8 WAR each year. His BABIP is 30 points higher than career average, but like I mentioned earlier, a guy at Tango’s blog went through and watched all of CC’s hits allowed, and the vast majority were hard hit balls that the defense should not be expected to turn into outs.


      • vivaelpujols says:

        Yeah that’s the thing Nicker, I don’t know what the Cy Young should be really.

        The two extremes would be relying completely on RA – IE, saying luck doesn’t matter, or relying completely on a pitch attributes based metric:

        http://www.hardballtimes.com/main/article/working-title/

        I used to think that defense was a different sort of luck than hitter luck (where the batter hits a home run off of a low outside curveball or something), but are they really different?

    • adohaj says:

      “tell me one thing. what makes 200 Ks better than 200 IF popups?”

      Luis Castillo is playing second base

      • jorgath says:

        Well, that assumes that your catcher isn’t…um…someone as bad at catching as Luis Castillo is at playing 2nd. Basically, if your catcher allows as many people to reach on dropped 3rd strikes…

  4. Tangotiger says:

    Fantastic. Exactly the kind of discussion that should take place, and hits almost all the touchpoints.

    This is the kind of stuff I’d want on sports radio.

  5. Ryan says:

    Frankly, there are worse ideas than just splitting the difference between ERA and FIP and using that as the WAR input.

    God, I can’t wait for HITf/x. This was a good discussion and follow-up to the comment wars going on in the “Official Position…” series. Thanks.
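
The split-the-difference idea above is simple enough to write down. The sketch below uses the ERA and FIP figures quoted for Kershaw and Halladay in the article. The function name and the `era_weight` parameter are hypothetical, and a real WAR calculation involves further adjustments that are not shown here.

```python
def blended_runs_allowed(era, fip, era_weight=0.5):
    """Weighted blend of ERA and FIP; era_weight=0.5 splits the difference."""
    return era_weight * era + (1 - era_weight) * fip


# ERA and FIP as quoted in the article.
kershaw = blended_runs_allowed(era=2.30, fip=2.37)
halladay = blended_runs_allowed(era=2.41, fip=2.18)

print(f"Kershaw blended:  {kershaw:.3f}")
print(f"Halladay blended: {halladay:.3f}")
```

Moving `era_weight` toward 1.0 credits the pitcher with more of what happened on balls in play, and moving it toward 0.0 credits him with less.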

  6. BobbyGrich says:

    You guys actually get paid to write this stuff? Sign me up.

    On a serious note, it seems to me that the more advanced statistics are fine-tuned, the more the gap between “on paper” (statistical) value and the “on the field” (real) value is defined. Note that I say “defined” and not “widened.” I think the gap is getting narrower, but stats like WAR make it even more distinct and point out that, like Zeno’s arrow, they’ll never actually get there–that is, a One Stat To Rule them All, and in the Dugout Bind Them. A single stat, or combination of stats, that will adequately define the full capabilities of a player in relation to other players. I just don’t think it will ever happen (e.g. can we look at only WAR without looking at wOBA or good old-fashioned OPS? Can we say that a 7 WAR shortstop with a .370 wOBA is truly more valuable to his team than a 6 WAR first baseman with a .420 wOBA? We have to look at the whole context, the entire picture, and so far there is no TPS, or Total Picture Stat).

    To put it another way, we’re just going to have to accept the fact that statistics only tell us so much about a player. OPS might give us 60-70% of a position player’s total value, and WAR maybe 75-85% (I’m just making these numbers up), but there’s always going to be a significant gap, and it is only becoming more clear the more advanced and sophisticated the metrics become. I would even like to posit a 10%+ Indeterminacy Factor–that is, no matter how sophisticated the metric, it can never account for more than 90% of a player’s value.

  7. CircleChange11 says:

    I’m a loner, Dottie, a rebel.

    I’m somewhat ashamed to know the quote.

    I can still picture that idiot running out of the pet shop screaming, holding two handfuls of snakes.

    ———————————

    fWAR
    ——–
    KC: 6.8
    RH: 8.0

    brWAR
    ——–
    KC: 6.7
    RH: 7.2

    Average WAR
    ————–
    KC: 6.8
    RH: 7.6

    IMO, .8 is a “lead”. If we continually put imaginary error bars around WAR and suppose a player’s on-field value is within the 6-8 WAR range, then WAR really doesn’t tell us what we’d like it to.

    Kershaw leads in [1] ERA, [2] K’s, and [3] Wins … and that seems to be in order of importance for the BBWAA. It’s possible that KC wins the NLCY.

    Kershaw and Lee seem like an exact tie for 2nd place in the CYA voting.
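
The averaging in the tables above can be reproduced directly. This is a minimal sketch using the fWAR and brWAR figures listed in this comment. The dictionary layout and function name are illustrative only.

```python
# fWAR and brWAR as listed in the comment above.
war = {
    "Kershaw": {"fWAR": 6.8, "brWAR": 6.7},
    "Halladay": {"fWAR": 8.0, "brWAR": 7.2},
}


def average_war(values):
    """Plain mean of the WAR figures supplied for one player."""
    return sum(values.values()) / len(values)


averages = {name: average_war(values) for name, values in war.items()}
gap = averages["Halladay"] - averages["Kershaw"]

for name, value in averages.items():
    print(f"{name}: {value:.2f}")
print(f"Gap: {gap:.2f}")
```

Before rounding, Kershaw averages 6.75 and the gap is 0.85, slightly more than the .8 obtained by subtracting the rounded averages.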

    • CircleChange11 says:

      An equally interesting discussion could be had for the AL CY.

      fWAR
      ——–
      CC: 7.1
      JV: 7.0

      brWAR
      ——–
      CC: 6.6
      JV: 8.6

      avrWAR
      ——–
      CC: 6.9
      JV: 7.8

      Rarely does a situation illustrate the differences in WAR components/calculations better than this one. FG has CC and Verlander being equal in value, BR shows Verlander ahead by a good margin … and I think the latter fits public perception and observation better. brWAR also has Verlander as the overall WAR leader (essentially tied with Bautista, ahead of Ellsbury). There’s a reason why Verlander is being discussed as a legit MVP candidate.

      At Tango’s blog, someone watched every hit allowed by both Verlander and Sabathia (I often recommend this, most recently in the case of Kotchman) and concluded that Sabathia’s hits allowed were generally hard hit balls. It wasn’t just the differences in defense between NYY and DET. Long story short, the conclusion may end up being that pitchers generally are to be given a decent amount of “blame” for the hits they allow.

      With the nature of the recent discussions, it seems WAR is changing in the regard of how we use it, and how precise we value it. IMO, it’s a decent example of how precision and accuracy are not the same thing.

      • Paul says:

        I agree that this is where we’re headed. But how is this different than BBWAA writers saying they voted for a guy “because I watched him pitch 35 times,” and ignore various stats based on what they see?

        The thing is, we’re at a point where intellectual honesty demands that when judging the very best of the best, eye on the prize is all that works. Thus, statistical analysis is only good for describing the general population. We may never have the instruments necessary to measure the extreme outliers with enough precision to overcome this.

      • Andrew says:

        Let’s say Kershaw delivers a perfectly located fastball on the outside corner, but the hitter smacks a line drive. Dee Gordon makes a diving catch for an out, though. When we adjust for defense, we call CK lucky here, and detract from his worth/performance.

        Now let’s say Halladay throws a hanging curve, but the hitter only manages a weak groundball. Easy play for Rollins that any SS could make.

        Of course I took many words to illustrate a very simple concept, but how do we delineate quality of contact against a pitcher, vs. quality of the pitch? Because while they’re correlated, it’s not 100%. We dub CK lucky because his defense made a great play, but he still did his job by making a good pitch. Meanwhile Halladay got what he earned (an out), even though one could argue he ‘got away’ with one.

        This gets into ‘why did the batter hit a LD against a well-located pitch?’ and talks to sequencing and quality of various pitches. I don’t have a theory/stance on this, just asking the question.

      • Paul says:

        Andrew: This is why bars and beer were invented. We are at the same point anti-stats knuckledraggers have always been. And they’re laughing because drinking beer and arguing is much more fun than doing it in computerland over my morning coffee.

      • CircleChange11 says:

        Andrew, that’s why sample size matters.

        It’s also why I say pitchers need *some* credit for single season BABIP, because it seems so statistically improbable that a lone pitcher would experience such good/bad luck over a 200 IP span.

        There’s a lot of “noise” in certain statistics. Some of that noise may be randomness that we never really narrow down. Some of it may be items that we can detect with advances in technology and thinking of things in new ways.

        Here’s how I look at it …

        If I, as a pitcher, threw 100 pitches in a cage/simulator, trying to hit various spots and make certain pitches move certain ways, the main aspects affecting the outcome are …

        [1] Location – Seriously, if Ted Freakin’ Williams can hit only .220 on low and away pitches, then it’s unrealistic to expect average hitters to do better than that. But, that’s BA and not just limited to BABIP. [Note: We are having a discussion of BIP at the MLB level, so I am assuming that the pitch velocity is in the range of 85-95 mph.]

        [2] Quality of Hitter – It’s that important. The field is setup in favor of the defense. If you draw a circle around the 9 defenders to represent “average range”, it’s a wonder hitters get any hits. They do because the better hitters hit it hard, and at the MLB level, almost all of the hitters will hit pitches in the center of the zone. One of the keys here is that the difference in Runs/Game between the best and worst offenses at the MLB level is not “huge”. In college, where the talent is more concentrated, it might be #1.

        [3] Movement & Changing Speeds – there are very few ways to get a hitter to contact the ball (or miss it) away from the barrel of the bat. Having the ball end up in a spot just off from where they thought it was going to be and/or having the ball arrive slightly earlier/later than what they thought are the ways.

        I think what BABIP shows is [1] how hard it is to get hits simply by the design of the field/defense, and [2] how hard it is for pitchers to continually hit their spots, the margin for error is slim.

        Set up a pitching machine to hit the low and away spot. Tell the batter where the ball is going to be. Note how well (velocity and distance) they hit that pitch for 100 pitches.

        Now, repeat the process changing the velocity of the pitch each time.

        Now, repeat the process adding movement of pitch along with the changing of velocities.

      • vivaelpujols says:

        There is a difference between hitter luck and defensive luck.

        The thought is that defensive luck is a bias and thus should be adjusted for, but if hitter luck is uneven at the end of the year shouldn’t that make it a bias as well?

        FanGraphs should have an article explaining the different ways to look at the Cy Young race.

      • Andrew says:

        Thanks for the response, CircleChange. The answer is always sample size.

      • CircleChange11 says:

        Sample size will ALWAYS be THE issue in these situations because we’re most often discussing single season events that use metrics that need larger amounts of data than a single season can supply in order to be reliable.

        So, the discussion comes up every year. *grin* … much like the term “valuable”.
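
To put a rough size on the single-season noise discussed above, each ball in play can be treated as an independent coin flip. That is a simplification, and the inputs below are assumptions: a true rate of .300 and 600 balls in play, used here as a round stand-in for a 200 IP season. The .272 figure is Kershaw’s BABIP as quoted in the article.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Binomial standard error of an observed BABIP over balls_in_play trials."""
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


def z_score(observed_babip, true_babip, balls_in_play):
    """How many standard errors the observed BABIP sits from the assumed true rate."""
    return (observed_babip - true_babip) / babip_standard_error(true_babip, balls_in_play)


# Assumed: .300 true rate and 600 balls in play in a season.
se = babip_standard_error(0.300, 600)
z = z_score(0.272, 0.300, 600)

print(f"Standard error: {se:.4f}")
print(f"z-score for a .272 season: {z:.2f}")
```

Under these assumptions the standard error is roughly 19 points of BABIP, so a .272 season sits about one and a half standard errors below .300. One season cannot settle whether that is skill or noise, but several seasons at the same level carry much more weight.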

      • CircleChange11 says:

        Same thing with single Season UZR and how it affects WAR, when we use WAR as a major component of MVP discussions and the like.

        Not that my opinion carries anywhere near the amount of weight as Tango’s, but I agree that single season UZR could/should be regressed in the fielding runs calculation. We’ve had that scenario in the Granderson-Ellsbury-Bautista-Pedroia MVP discussion.

        I also find Tango’s suggestion to combine positional adjustment with UZR in the “defensive component” to be very helpful … so we don’t view a +5 SS and a +5 LF as being “equal fielders”.

        This is where I reach my “knowledge limit” in regards to knowing exactly what to do, why to do it, and evaluate whether it was done correctly.

      • Say Hey says:

        jeez, why not just drop the other players entirely out of the equation and examine quality of pitches?! is this what the uber-sabres are moving toward? Then just look at maybe outside zone swing rates… who can induce the most swings at bad pitches is doing a pretty darn good job. (shrugs)

        If most of defense is about positioning and given proper positioning, the defensive quality variation is relatively immaterial… that is to say that most ML defenders make nearly all routine plays, then over the course of a full season’s worth of data the balls hit on the nose right at a fielder and the weak GBs should even out such that it isn’t meaningful to make any adjustment for differences in babip allowed.

      • CircleChange11 says:

        jeez, why not just drop the other players entirely out of the equation and examine quality of pitches?!

        I know this comment is said in jest, but “what else is there?” In terms of what a pitcher can control (completely), the quality of pitches is it.

        The tough part there is that each pitcher will have a different % importance in [1] velocity, [2] movement, [3] location, [4] change in speeds, [5] sequencing.

        I wouldn’t even know how to begin to assign the % on that. Pitch quality could even be count, batter, situation dependent.

        As a pitching coach, I don’t talk to pitchers about hits allowed or defensive plays, we talk about the quality of the pitch (velocity, location, movement, etc). That’s all a pitcher can control.

        That said, where and how a pitcher throws a pitch drastically skews the odds in his favor one way or another. A hard sinker down and in is almost all in favor of the pitcher, a hanging changeup is just the opposite.

        Swing and misses have a lot to do with the batter as well, but I think in decent sample sizes, it would be very revealing in terms of pitch quality to see who is among the leaders in “swing and misses induced” (along with taken strikes as well).

      • Mitch says:

        CC11… but how else can “pitch quality” be judged than by the opposing batting results? % of pitches that paint a corner?!? any other way to judge pitch quality is purely subjective… what’s more, what’s a quality pitch to one hitter may be a preferred pitch to another.

        This is more or less where I came into the discussion. I said that over a large enough sample (30 starts for a SP is more than enough) we should be looking at opposing batting results and that gives as accurate a picture as any other could possibly yield. Any differences in defense playing behind pitchers are, let’s face it, not likely to be material.

      • CircleChange11 says:

        I completely agree. Especially on the part that a quality pitch to one batter may not be to another.

        I also agree that at the ML level, 200 IP worth of stats is likely enough.

        I am one of the few that are saying that pitchers have influence over BABIP, and deserve as much as 50% of the credit/blame.

        At the level I work with, I ignore results quite a bit due to the small samples and vast disparity in hitter quality. At the ML level, looking at batter results is probably very good (for what we’re doing).

        If I were a major league pitching coach, my charts would be overlays of where we were supposed to pitch a specific batter and where we did, and I would keep track of things of that nature (especially sequencing).

  8. BobbyGrich says:

    For instance, statistics can never really define reputation and the psychological impact a player has, both through “clubhouse presence” but also impact on the opposition. When Derek Jeter steps up to the plate, there is a feeling that he’s a clutch hitter, that he’s not an easy out, even if he’s no better than Marco Scutaro. Torii Hunter may not be a true star, but he’s a great guy to have around the clubhouse. In other words, players like Hunter and Jeter end up feeling better than the sum of their parts, and there’s no way to statistically define what is between the sum of the parts and their actual value. That’s the Indeterminacy Factor.

    • delv says:

      How is it a “factor” if it has no effect on what actually happens in the game (say, Jeter and Scutaro produce the same offensive numbers)?

      • BobbyGrich says:

        First of all, we don’t know how a given player’s presence influences other players. We could extend this to managers, and question whether or not they bring out the best in their players (see: Scioscia, Mike and Napoli, Mike).

        But psychology isn’t objective. Even if Scutaro and Jeter produce the same statistics, Jeter’s presence in the box has a different feeling to it, it impacts both teams in a different way, perhaps even influences the defense. We might even say that Scutaro might have better skills at this point but Jeter has a psychological edge (sort of like how you can never beat up your big brother, even if you are stronger).

        My point is not to say WHAT that factor is, but to point out that there IS a factor, and that it cannot be defined statistically. To quote Lyall Watson, “If the brain were so simple that we could understand it, we’d be so simple that we couldn’t.” There’s always going to be a gap between statistical metrics and actuality; my point is to consciously include that “X Factor” in the discussion. Not to rate it or give a number of any kind, but to be able to say “There’s something about so-and-so that makes him better than the numbers show.”

      • BobbyGrich says:

        In addition to the X Factor, also the “overall picture” of a player. If we’re going to take your approach and look on “What actually happens in the game” then we should throw out FIP and stick with ERA, because that’s what actually happened in the game (or even throw out the “E” in ERA altogether).

        So I’m talking about including two things, really: One, the X Factor; two, the overall picture, or gestalt of statistics. If we take two players:

        Player A: .810 OPS, .362 wOBA, 6.6 WAR
        Player B: .995 OPS, .421 wOBA, 6.1 WAR

        I’m not comfortable with the idea that Player A is better because he has a higher WAR. He’s a better fielder at a more difficult position, and a better baserunner to boot, but he’s not nearly as good of a hitter and WAR on its own doesn’t show that.

        Which player is better? Again, I don’t think WAR is enough. And even if you have plenty of other statistics, how to weigh them all together? If we can all agree that scoring runs is 50% of the game and preventing them is 50%, how much of preventing them is defense? And how much of scoring them is baserunning?

        And which player would I rather have on my ballclub? That depends upon which other players I have already. WAR says that all things being equal, Player A’s team wins half a game more over the course of a year; is that true, though? If anything, we may just have to accept a +/- factor with WAR, that a WAR within one run on either side is essentially of equal value. We need to go at least 1 WAR away to really get a different caliber of performance.

        Just ruminating, here…

    • BlackOps says:

      Right, and we can’t quantify it or, more importantly, even prove it exists, so why do people still talk about it? It’s hard to say that it’s even a part of an evaluation process when it can’t be evaluated.

  9. Paul says:

    The balls in play argument could go on forever. Because there are so many variables involved that are not currently measured, very specific outcomes data from Pitch/FX data, this is always going to come down to the two camps staked out here: 1) Comfortable with an admittedly flawed stat like WAR because I believe luck is more of a factor than skill in random variation; 2) Not comfortable with the notion that some freakishly godlike humans have the ability to throw a ball 1/4 of an inch closer to a RH batter’s hands with late break and 2 mph more velocity than 99% of peers, thereby making a grounder through the right side impossible even when Casey Blake is over there.

  10. Andy says:

    One point for giving credit to Verlander for his low BABIP: He is the only Tigers starter (and one of the few Tigers Pitchers) whose ERA is lower than FIP… Tigers not known for their defense outside of Austin Jackson.

    • Ryan says:

      Yup. I posted this in the now-deleted UZR thread yesterday, but JV’s ERA has outperformed his FIP, 2.29 to 2.91. The REST of the Tigers pitchers have underperformed their FIP, 4.43 to 4.23.

    • joe says:

      Posted this above… Verlander’s BABIP is .235; every other starter except Fister (9 starts) is over .300

      Are we really chalking that delta up to a generic combination of defense and luck and calling it a day when we leave the luck component in the other stats when measuring value?

  11. Andrew says:

    Verlander has been incredible this season not only because he is a great pitcher, but because he’s also been tremendously lucky (.235 BABIP vs. .286 for his career). Anyone can see that luck has been a big factor in his success. CC Sabathia has a lower FIP and xFIP than Verlander but a much higher ERA because he’s actually been somewhat unlucky this season (.320 BABIP vs. a career avg of .291).

    Am I prepared to say that Sabathia deserves the AL Cy Young ahead of Verlander? Absolutely not. CC is a fantastic pitcher, and as a Yankees fan I’m very grateful for that. He’s just as good a pitcher as Verlander, and if there were no such thing as luck, he may have been slightly better this year. But there’s no doubt in my mind that Verlander deserves the Cy this year.

    An essential part of what makes up a great season is luck. Is Joe DiMaggio the greatest hitter of all time? No, but he was a pretty damn good hitter who happened to go on a very lucky stretch, and that cemented his place in baseball history. Was Don Larsen anything other than a completely average pitcher? Probably not, but his single-game performance will be remembered forever. The point is that you may be able to separate luck and skill if you account for every single variable in the equation, and that’s very important for determining a player’s true talent level, but luck and skill are equally essential parts of an incredible season. I might be able to take the luck out of Verlander’s season, but I don’t want to. That’s why he’s the clear Cy Young award winner this year, and, in my opinion, the MVP as well. FIP is a great tool for predicting future performance, but when evaluating past performance, I much prefer B-R’s results-based WAR, which had Verlander and Bautista tied at 8.5 the last time I checked.

    • Dave Cameron says:

      This is the thing that always gets me – B-R’s WAR is absolutely not results based. The defensive adjustment in their calculation is based on an estimate of defensive value extrapolated from their overall Total Zone (critics of the usefulness of single season defensive data, start yelling… now), with an assumption that every pitcher on the staff got the same level of defensive performance behind them.

      Because the Tigers have a team TZ total of -14, Verlander actually gets treated as if his defense has harmed him, and his adjusted runs allowed total is even better than “what actually happened”. Not only does doing pitcher WAR that way ensure that Verlander gets full credit for his .235 BABIP, it actually assumes that his BABIP should have been even lower than that if not for the poor defensive support he’s been handed.

      Yeah, I’m sorry, that’s just not “what actually happened”. A FIP-based WAR only measures tangible outcomes that actually occurred in real life. An RA-based WAR that assumes all pitchers on the same team get even defensive support measures an assumption of what might have happened if you buy into Total Zone completely, and if you believe that the distribution of runs saved/lost by the defense was spread evenly around the pitching staff.

      Because it squares up better with traditional valuation metrics, it’s more easily acceptable, but the idea that their WAR measures “what happened” and ours doesn’t is just wrong.

      • Paul says:

        Why would fielders suddenly play differently for different pitchers? Is this something like “clutch” fielding?

      • delv says:

        Paul: most of it will probably be random and hard to account for, e.g.: 1) one pitcher played more home/away games than another pitcher and the team’s fielders do better in certain parks than others; 2) maybe the manager plays worse fielders behind certain pitchers (maybe for their offense, or maybe because he thinks they’re good defensively, or maybe the OFs don’t matter as much for a groundballer or something); 3) catcher match-ups; 4) maybe this pitcher is often played in day games (e.g. Jeff Niemann) and the fielders go drinking often late at night…. In other words, a lot of random stuff.

        A little of it might have to do with the pitcher’s pace and how “in the game” the fielders are.

      • Dave Cameron says:

        They aren’t robots – they don’t perform the same every day. An easy illustration of this is offensive run support for pitchers on the same team – a line-up that averages 4.5 runs per game doesn’t score 4.5 runs per game for every pitcher on the staff over the course of the season. Some guys get 3.0 runs per game, some guys 6.0 runs per game. Why? Random variation.
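
The run-support illustration above can be simulated. In this sketch every game has the same expected 4.5 runs, yet each starter ends up with a different average. Modeling runs per game as Poisson is an assumption made for simplicity (real run scoring is more spread out, so the gaps here are if anything understated), and the five starters, 32 starts each, and the seed are arbitrary choices.

```python
import math
import random


def poisson_sample(rng, mean):
    """Draw one Poisson-distributed integer using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return count
        count += 1


def simulate_run_support(team_mean=4.5, starters=5, starts_each=32, seed=2011):
    """Average run support per starter when every game has the same expected runs."""
    rng = random.Random(seed)
    support = []
    for _ in range(starters):
        runs = [poisson_sample(rng, team_mean) for _ in range(starts_each)]
        support.append(sum(runs) / starts_each)
    return support


support = simulate_run_support()
for number, average in enumerate(support, start=1):
    print(f"Starter {number}: {average:.2f} runs per start")
print(f"Spread: {max(support) - min(support):.2f} runs")
```

Nothing in the simulation gives any starter a better lineup than another. Whatever spread appears comes from random variation alone, which is the same argument being made for defensive support.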

      • Paul says:

        delv: That all sounds so similar to “clutchiness.” Why not just call it luck and stand on that?

      • Andrew says:

        There’s no doubt in my mind that Verlander does not possess some magical ability to limit hitters to an extremely low BABIP in spite of pitching in front of a defense that is essentially average or slightly below average (a quick glance at the 2011 Tigers team stats here on FanGraphs suggests that UZR has them as slightly below average as well, without adding up all of the values). But you’re missing the point of my original post, which is that I don’t *care* whether Verlander’s extraordinarily low BABIP is caused by skill or random variation. For the purposes of evaluating his performance this season, I’m willing to give him credit for being quite lucky in addition to quite good. That’s why I look at rWAR, because I *want* him to be fully credited for things he doesn’t necessarily have complete control over, but things which he managed to accomplish nonetheless in the small sample of a single season.

        Yes, FIP and FIP-based WAR measure what happened, but they do so by attempting to remove random variation from the equation. And as I said, that’s an incredibly useful tool and there’s nothing more valuable for telling you how a pitcher will perform next season or if a pitcher’s true talent level is actually as good or bad as their results might suggest. But I have a philosophical difference in my approach to awards voting – which is that I don’t want to remove random variation from the equation, because you can’t take away what Verlander’s done in 2011.

      • delv says:

        Paul: my understanding of “clutch” is something like… “player X performs better in situation Y because he is intentionally trying to, but trying less(?) in other situations” where “situation Y” tends to be a higher-leverage situation.

        I don’t think anything I said was similar to that. Players DO perform better or worse in certain situations, and I think a lot of baseball research is aimed at trying to find out what situations those are and whether they are repeatable, or random, or a result of a complex combination of deterministic but individually tough to parse factors.

      • Dave Cameron says:

        We’re not attempting to take out random variation either – that should be clear in how we handle hitter WAR. We’re trying to isolate individual performance from the context in which it was produced. Yes, FIP strips out random variation while also stripping out defensive support, which is why Eric and I talked about a FIP-based pitcher WAR not telling the entire story.

        However, a metric that tells a large part of the story accurately is still very useful, as we can then make adjustments for what we know it leaves out from there.

        If you’re going to rely on an RA-based version of WAR, you’re essentially saying that defense doesn’t matter, and that you’re okay rewarding a pitcher for what very well could have been the performance of his teammates. If we’re not willing to accept that logic for Win-Loss record, why are we willing to accept it for RA?

      • Paul says:

        Dave: I think what you’ve done here is decide that random variation applied to “human-ness” will not be so easily called into question as “luck” when we get into large samples. You know very well that over the course of a full season complete random variation as an explanation for that much of a BABIP differential is astronomically unlikely.

        Of course, now if we’re going to justify FIP-WAR on the basis of players magically playing better for Verlander, don’t you also have to call into question home-road splits? We know that players generally play worse on the road. In your home-road splits argument regarding Kershaw, you are assuming that he is a robot who pitches exactly the same whether he’s at home or on the road, and only adjusting for the offensive environs of the structure of the ballpark itself.

      • Paul says:

        Delv: But in this context we are literally concerned with why Detroit Tigers fielders played better for Verlander instead of his opponents over the course of an entire season.

        There are two choices in defending this position: 1) It’s very unlikely, but he was freaking lucky. Get over it; 2) Hire a private investigator to follow the team and see if there is a possibility that Jhonny Peralta has a longstanding ritual of partying with “models” the night before whoever the ace of the staff is is pitching.

        For number 2 we’re arguing random variation when that’s clearly not what we’re talking about. Situational performance is situational. Do you realize how much information you would need to have to sort this out?

      • RC says:

        “Why would fielders suddenly play differently for different pitchers? Is this something like “clutch” fielding?”

        Because pitchers have different batted ball distributions?

        If you have an outfield of Gardner, Ellsbury, and Ichiro, and an infield of 4 Yuniesky Betancourts, the defensive adjustment is going to be very different for Halladay than it’s going to be for Cain (and even more so for Wakefield, or any other extreme flyball pitcher).
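
The batted-ball point above can be made concrete by weighting infield and outfield defense by each pitcher’s groundball share. Every number in this sketch is hypothetical: the per-ball run values for the infield and outfield, the 600 balls in play, and the two groundball rates are invented for illustration, and line drives and popups are ignored.

```python
def defensive_runs_behind(gb_share, infield_runs_per_bip, outfield_runs_per_bip,
                          balls_in_play):
    """Runs saved (+) or cost (-) by the defense, weighted by where the balls go.

    gb_share is the fraction of balls in play hit on the ground. Everything
    else is treated as an outfield chance, which is a simplification.
    """
    fb_share = 1 - gb_share
    per_ball = gb_share * infield_runs_per_bip + fb_share * outfield_runs_per_bip
    return balls_in_play * per_ball


# Hypothetical team: poor infield (-0.03 runs per ball), strong outfield (+0.03).
groundballer = defensive_runs_behind(0.55, -0.03, 0.03, balls_in_play=600)
flyballer = defensive_runs_behind(0.35, -0.03, 0.03, balls_in_play=600)

print(f"Groundball pitcher: {groundballer:+.1f} runs from defense")
print(f"Flyball pitcher:    {flyballer:+.1f} runs from defense")
```

With the same eight fielders, the groundball pitcher is hurt by his defense and the flyball pitcher is helped, so a single team-wide defensive adjustment would misjudge both.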

      • Andrew says:

        Dave: There is a defensive component to RA-based pitcher WAR, as you well know, having referenced it above. I understand that you are trying to limit the components of fWAR to only those components which you feel can be directly attributed to the pitcher and his skillset – to tell as much of the story as can be told with a great degree of accuracy – but when evaluating a player’s single-season performance I believe that it’s much more important to include luck than to remove it for the sole reason that Austin Jackson may have caught a fly ball for Verlander and let a similar one fall in for Max Scherzer a couple days later. Both versions of WAR are inherently flawed, and I am merely trying to convey my belief that rWAR is more useful for evaluating past performance – for instance, the fact that Zack Greinke has not been as stellar over the past two years as he was in 2009 – whereas fWAR is more useful in predicting future performance – i.e., by telling us that, based upon his peripherals, Zack Greinke has actually been quite good the past two years and might easily return to his 2009 form in 2012.

      • RC says:

        “There are two choices in defending this position: 1) It’s very unlikely, but he was freaking lucky. Get over it; 2) Hire a private investigator to follow the team and see if there is a possibility that Jhonny Peralta has a longstanding ritual of partying with “models” the night before whoever the ace of the staff is is pitching”

        There’s option 3:

        He’s allowing tons of weak contact. The fact that something is UNSUSTAINABLE does not make it LUCK. THEY ARE NOT THE SAME THING.

        Human beings are not random number generators.

      • Paul says:

        RC: Should have added to that sentence, “for different pitchers… on the same team?”

        I believe it was before last season that an article on FG was lauding Zack Greinke for accounting for the value of defense by specifically pitching to contact that would result in a fly ball to the best LFer ever, David DeJesus.

        Maybe you can help me understand how we use that as evidence for the importance of defensive metrics, but not acknowledge it as a skill for the pitcher. Today some are additionally arguing, apparently, that David DeJesus played extra good for Greinke (perhaps because Greinke told him he was gonna make hitters hit the ball to him?). In that case, if we segregated DeJesus’s UZR for games where Greinke pitched it must have been over 100.

      • vivaelpujols says:

        Assuming all pitchers on the team get even defensive support is a better assumption to make than assuming that the difference between a player’s BABIP and the league average BABIP is all defense, IMO – most of that is going to be luck unrelated to the quality of the defense.

      • CircleChange11 says:

        According to FG, DET’s Arm rating is above average and the team’s range is horrible (-13).

        Wouldn’t that support the idea that DET’s defense isn’t helping Verlander’s BABIP?

        It doesn’t seem likely that they think “Oh man, Justin’s pitching today, we better step up our game.” I would think, if anything, that when Verlander is on the mound the defense would think the opposite … that Verlander is going to save the defense from any mistakes … versus a pitcher that is more defense dependent.

        In reality, the team is likely trying to play their best defense at all times. There could be a case that randomness factors in … but couldn’t we look at all of the BIP for hits for each pitcher and decide which hits were actually “real hits” and which ones were not and compare that to the actual results?

  12. Cal says:

    Yea I really do want to believe that “clubhouse presence” has some sort of discernable impact on your team or teammates – like Jeter making his team better. But if intangibles mattered so much shouldn’t we be able to see improved performance?

  13. Paul says:

    It is a well-known fact that Jeter’s clubhouse presence added (~) 2.2 to Posada’s career WAR.

  14. Kyle says:

    This was great stuff, guys.

  15. SaberTJ says:

    Love the discussion guys.

  16. Luck is a factor in value. Actual value. Luck plays a part in what actually happened.

    Should it really matter if Verlander “caused” the ball to be hit at a defender? (And who can know that to a certainty?) The ball was hit at a defender, and thus an out was made. Next year, with regression to the mean, he’ll likely be less valuable. But there’s a difference between measuring the value someone has had over the course of a season, and whether they have any real chance to repeat it.

    Isn’t the question then, in terms of value, not luck but the quality of defense behind Verlander and Kershaw and trying to extrapolate their impact on these pitchers’ totals? For that, shouldn’t we have a more than cursory discussion of the defensive quality of the position players behind these pitchers?

    • Colin says:

      Not to mention I didn’t recall too much argument the other way regarding Josh Hamilton’s MVP award and his flukey BIP average.

    • Matt H says:

      I disagree. Luck should not be a factor in value. If I flip 2 coins 100 times, and one of them comes up heads 56 times while the other comes up heads 45 times (and let’s assume we want heads), is the first coin really more valuable than the second? Sure, it technically provided more value, but I wouldn’t want to give it an award for “Most Valuable Coin”, because it didn’t actually control its outcomes; the extra heads were simply a matter of luck. Similarly, players should not be rewarded for lucky outcomes on their plays. We should try, though we may fail, to factor the luck out of a player’s performance and look simply at what he had control over.
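
The coin example can be checked directly. The sketch below is an illustration rather than anything from the thread: it assumes two identical fair coins and computes exactly how often they finish at least 11 heads apart over 100 flips each (the 56-versus-45 gap in the comment). The function names are made up for this example.

```python
from math import comb


def binomial_pmf(n: int, k: int, p: float = 0.5) -> float:
    """Probability of exactly k heads in n flips of a coin with heads-probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_gap_at_least(n: int, gap: int, p: float = 0.5) -> float:
    """Probability that two identical coins, each flipped n times,
    end up at least `gap` heads apart (in either direction)."""
    pmf = [binomial_pmf(n, k, p) for k in range(n + 1)]
    return sum(
        pmf[x] * pmf[y]
        for x in range(n + 1)
        for y in range(n + 1)
        if abs(x - y) >= gap
    )


# The comment's example: 56 heads vs. 45 heads is an 11-head gap over 100 flips.
p_gap = prob_gap_at_least(100, 11)
print(f"Chance two fair coins differ by 11+ heads in 100 flips: {p_gap:.3f}")
```

A gap of that size between two coins with identical “talent” is not especially rare, which is the point of the analogy.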

  17. Isaac Yankem says:

    Fangraphs uses FIP-based WAR. Would it help to calculate an ERA-based WAR (or RA-based WAR, since the whole concept of ER is probably too subjective compared to RA)?

    That way we would have an approximate range of WAR that a pitcher falls into depending on their ability (or lack thereof) to control BABIP.

    The FIP-based WAR would be their score if they have no control over BABIP.

    The RA-based WAR would be their score if they have a very high degree of control over BABIP.

    • Tangotiger says:

      And WAR at FanGraphs does one, while WAR at Baseball-Reference does the other.

      Everyone wins.

      • vivaelpujols says:

        There are other differences in the calculations though.

        bWAR adjusts for team defense, and it also probably uses different park factors and stuff.

        I think it would be nice for FanGraphs to have their own versions of WAR based on FIP, xFIP, tRA and RA.

  18. tim says:

    This makes for fun reading when you don’t know what BABIP stands for.

    • TFINY says:

      The Sabermetrics Library (found linked from the front page) is a great place to start for any abbreviation you don’t recognize.

      BABIP is basically the batting average on balls put into play; no strikeouts, walks, or home runs. Hitters have a fair amount of control (based on things like how fast they are, so they could leg out an infield grounder). Pitchers have less control over their BABIP (though some; the exact amount is debatable); again, this is the batting average on all balls put into play against them.
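
As a concrete illustration of that definition, here is a minimal sketch using the commonly published form of the formula, (H − HR) / (AB − K − HR + SF). The stat line passed in at the bottom is hypothetical, not any real pitcher’s.

```python
def babip(hits: int, home_runs: int, at_bats: int, strikeouts: int, sac_flies: int = 0) -> float:
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF).

    Walks never count as at-bats, so they drop out automatically;
    strikeouts and home runs are removed because no fielder can act on them.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play; BABIP is undefined")
    return (hits - home_runs) / balls_in_play


# Hypothetical pitcher line (made up for illustration).
example = babip(hits=180, home_runs=15, at_bats=800, strikeouts=220, sac_flies=5)
print(f"BABIP allowed: {example:.3f}")
```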

  19. Anon21 says:

    I wish they would just change the name of the award to “Most Valuable Position Player” and end the yammering about whether pitchers should be considered. No, they shouldn’t. Award asymmetry makes no sense. If you’re going to have a high-prestige award that only pitchers are eligible for, you should also have a high-prestige award (so, not the Hank Aaron thingy) that only position players are eligible for. That award is already, de facto, the MVP Award, but because of infelicitous wording people just won’t let the issue rest.

    • TFINY says:

      You mean the wording that specifically allows pitchers to be considered? You may disagree with the definition, but the definition is not infelicitous when it is spelled out in the definition of the award.

  20. SaloF says:

    @Dave
    If we have a defensive metric like UZR, why not use it?

    Give average credit for their BABIP to pitchers playing for average defenses, and give more credit to pitchers who have lower BABIP with below average defenses. As you said, pitchers play with the same defenses the whole year!

    Hope you are doing better

  21. Scott says:

    “Roy Halladay has just out-pitched him against better competition” I don’t know where you’re getting your numbers but by OPS Kershaw has faced tougher competition than Halladay– the difference is minimal but enough that you can’t really say ‘Halladay has faced better competition.’

  22. Hurtlocker says:

    When someone can explain Steve Carlton’s 1972 season rationally, then we will understand pitching. Not to mention his 1973 season, where he was only the third best pitcher on his own team. Baffling.

  23. Colin says:

    How do you factor Detroit’s god awful defense into the equation for JV’s BABIP when considering MVP?

    • Jeff says:

      I was just going to post this. If anything, Verlander having one of the lowest BABIPs in the game is an argument against DIPS being the final answer to pitching evaluation. The Tigers have only two full-time plus defenders (Peralta [whatever that says about single season fielding metrics] and AJax – Santiago has been solid in his utility role) and a whole host of mediocre to awful defenders (Betemit has cost the Tigers as many runs this year in a third of a season as the notoriously bad-at-defense Miguel Cabrera). It wouldn’t surprise me at all if it was a lot harder to get a hit off Verlander than most other pitchers, even when his strikeouts are taken out of the equation.

      Of course, it doesn’t hurt Verlander’s case that his DIPS statistics are also outstanding.

  24. Matt H says:

    There are so many comments that I couldn’t read through all of them, but why can’t we look at LD%, GB%, IFFB%, etc. to determine what a pitcher’s BABIP should be, given an average defense? Whether or not suppressing line drives is a sustainable skill, it surely is something that the pitcher should receive credit for. It is also clear that pitchers do have control over, and the ability to, sustain high (or low) ground ball rates, yet this is not factored into FIP. Why not? Why not say, look, the average pitcher with X GB% has a Y BABIP, so let’s calibrate a pitcher’s FIP using that BABIP instead of league-average? I understand the desire to separate pitching from defense, but surely we can look at batted ball percentages and attribute those to the pitcher as well, right?
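
One way to picture that proposal is as a weighted average of hit rates by batted-ball type. This is only a sketch: the per-type hit rates below are illustrative placeholders, not measured league values, and the function name is invented for the example.

```python
# Hypothetical hit rates on each batted-ball type (placeholders only;
# real values vary by league and season and would need to be looked up).
ILLUSTRATIVE_HIT_RATES = {
    "ground_ball": 0.24,
    "line_drive": 0.70,
    "fly_ball": 0.14,
}


def expected_babip(batted_ball_shares: dict, hit_rates: dict = ILLUSTRATIVE_HIT_RATES) -> float:
    """Expected BABIP for a pitcher given his mix of batted-ball types,
    assuming an average defense converts each type at the given rate."""
    total_share = sum(batted_ball_shares.values())
    if total_share <= 0:
        raise ValueError("batted-ball shares must sum to a positive number")
    weighted = sum(share * hit_rates[kind] for kind, share in batted_ball_shares.items())
    return weighted / total_share


# A made-up ground-ball-heavy profile.
profile = {"ground_ball": 0.50, "line_drive": 0.20, "fly_ball": 0.30}
print(f"Expected BABIP for this mix: {expected_babip(profile):.3f}")
```

The gap between a pitcher’s actual BABIP and this kind of expectation would then be what is left to attribute to defense, luck, or contact quality within each type.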

    • Scott says:

      You can do this, and Kershaw’s BABIP is right where you’d expect it to be; at least it was when I did it a few starts ago.

    • CircleChange11 says:

      It is also clear that pitchers do have control over, and the ability to, sustain high (or low) ground ball rates, yet this is not factored into FIP. Why not?

      This is the factor (GB%) that has the highest year-to-year correlation for pitchers. In other words it’s the factor they influence the most (according to the chapter in BBTN).

      I asked this same question in regards to GB% when looking at Jaime Garcia a couple of years ago.

      IIRC, FIP does a decent job accounting for this in the HR/9 component.

      High GB pitchers give up fewer homers and more singles.

      At some point, if not already, we’re going to be able to look at [1] pitch factors (velocity, speed, movement, location) and [2] batted ball factors (velocity, trajectory, direction, location, etc) and have % probability of it being a hit (based on large sums of data) … and then credit both the batter and pitcher for that portion … and the defense for the remaining portion depending on whether it was an actual hit or recorded out.

      Then shortly after that, The Matrix will happen.

  25. Scott says:

    Also, if one were to look at the called strike zones of Halladay/Lee/Kershaw… Let’s just say that robot umpires would be helping out Halladay and Lee.

  26. Matt says:

    Here’s a factor people leave out. Playing on a bad team. These are Kershaw’s stats when he didn’t get run support from March – June: 2.93 ERA and an 8-3 record. From July on, 1.56 ERA with a 12-2 record. Run support can make a huge difference. Imagine how Kershaw’s season would’ve gone had he gotten that run support from July on, and got it from day one. Chances are his ERA would even be lower. So I don’t think his ERA would be much different in Philly with more run support than his ERA is now playing for the Dodgers.

  27. Eric Cioe says:

    I swear to God there is a vast conspiracy among some writers of this site against Verlander. When Felix had an ERA of around 2.25 and a FIP around 3.00 last season, no one brought BABIP up at all. Verlander actually had a better FIP and yet no one suggested he win the Cy Young because his ERA was a run higher. You have almost exactly that situation this year, but this time it’s Verlander in the lead, and people are suggesting that Sabathia’s minuscule lead in FIP and WAR is enough to make Verlander anything other than a slam dunk, unanimous vote.

    Flyball pitchers don’t fit into some of the FanGraphs writers’ narrative about what makes a good pitcher, so Verlander doesn’t get the same defense that Felix did last year.

    • Notrotographs says:

      Are there laws about murdering strawmen?

    • CircleChange11 says:

      There is a bias against DET, IMO with some guys. It could be because they are not viewed to be a “smart” front office, that Leyland is as old school as it gets or because they play in the same division as the Twins.

      I usually bring it up when Miguel Cabrera is left out of MVP discussions despite his usually top of the league WPA and ultra consistent elite level batting performance and he’s still “young”.

      I agree with you that there are some that seem to be going out of their way to marginalize Verlander’s season, when most see the obvious that he’s far and away the best pitcher in the AL this year.

      The comparison to Pedro’s 1999 was way over the top, and most readers pointed that out. Comparing ANY season to Pedro is akin to comparing batters to a PED-assisted Barry Bonds’ season … which the conclusion could be that no player in any season is good enough to win the MVP or CYA.

      I would think Verlander would be praised over and over for finally putting all of his skills together. He’s durable, hard throwing, great movement, and mixes his pitches well … now he’s not allowing many walks, still striking out a bunch, and generally has no hitter stuff much of the time. For the last few years I’ve watched Verlander and wondered how anyone hits him. He’s got heat, good offspeed, and a wicked deuce. I guess sometimes batters could guess right, but even then …

      To me, he is the prototype for tall power pitcher, the way Felix is the model for a fastball-changeup pitcher.

      It’s as if there’s not enough love to go around for two guys.

      I would expect to see far more praise than what I am seeing.

      • Hurtlocker says:

        I don’t care about Detroit at all, I’m a NL guy and Verlander is clearly the best pitcher in the AL by far. The NL is another story, but if Verlander doesn’t win the Cy Young he got robbed.

    • Colin says:

      Nice point Eric.

  28. Matthew Cornwell says:

    Just a few days ago, somebody said to me that pitchers have “no control over BABIP”. I replied:

    “Of course pitchers have an impact on BABIP – just less than they do over K and BB, and it takes years of sample size to know how much (if any) skill any particular pitcher has at it. In fact, it takes 3,700 batters faced for r=.5 in terms of BABIP. So when Roy Oswalt (who is a worse-than-average BABIP guy for his career) has a great BABIP in 2010, we can assume that he got lucky and mathematically regress for his true performance. When Matt Cain has a .250 BABIP, we shouldn’t assume he was really a .300 BABIP guy since he has a proven track record of reducing BABIP. But we should still regress anyway and he still comes out with a good BABIP, just not .250, which was likely impacted by luck.”
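
The regression described there can be sketched with a standard shrinkage weight of n / (n + k). The k = 3,700 batters faced comes from the comment (the point where r = .5); the shrinkage form itself, the function name, and the sample inputs are assumptions made for illustration, not necessarily the commenter’s exact method.

```python
def regress_babip(observed: float, batters_faced: float, prior_mean: float,
                  half_weight_bf: float = 3700.0) -> float:
    """Shrink an observed BABIP toward a prior mean.

    The observation gets weight n / (n + k), so at n == k (3,700 batters
    faced, per the comment) the estimate sits halfway between the two.
    """
    if batters_faced < 0:
        raise ValueError("batters_faced must be non-negative")
    weight = batters_faced / (batters_faced + half_weight_bf)
    return prior_mean + weight * (observed - prior_mean)


# Hypothetical single season: a .250 BABIP over 900 batters faced,
# regressed toward an assumed .300 league mean.
one_season = regress_babip(observed=0.250, batters_faced=900, prior_mean=0.300)
print(f"Regressed single-season estimate: {one_season:.3f}")
```

For a pitcher with an established track record, the prior mean could be his own career figure rather than the league’s, which is how the Cain example still comes out better than average after regression.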

    I thought these thoughts were givens in the sabermetric community, but see so many people still making the case that pitchers have no or very limited BABIP impact. Or is the discussion just centering around HOW to apply it to WAR?

    In terms of WAR – I prefer to look at Fangraphs when the sample size is too small to determine BABIP skill; so say the first 3-4 years of a pitcher’s career. And it would be my first choice when determining a Cy Young winner. When the BABIP regression point is close to .5 (say years 5-7), I look at both WARs and average them. If a pitcher has played 8 or more seasons, you had better believe I look at Sean Smith’s WAR. Not only do we have a large enough sample size in regards to BABIP, but the sample sizes are large enough that we should be considering pitcher impact on the running game, sequencing, GIDPs, wild pitches and passed balls, and other situational splits. On top of that, Smith’s WAR also includes pitcher offense and an AL vs. NL league quality adjustment as the icing on the cake.

    So the question “which WAR is best” has a different answer depending on what you are evaluating. Single seasons? Cy Young? Short careers? Fangraphs, definitely. Long careers? Baseball-Reference, easily.

    Anybody in agreement?

  29. James M. says:

    Great discussion. But I can’t believe nobody brought up WPA (unless I missed it). If we re-named the CYA as the Most Valuable Pitcher Award, wouldn’t we want to know how much each pitcher contributed towards his team’s wins? FIP and WAR may be better measures of how well they pitched without regard to the game situation. But is that the best measure of value?

    Here are the current Top 5 standings in WPA, starting with #5.

    3.83 Kershaw
    4.19 Hamels
    4.47 Lee
    4.71 Halladay
    5.23 Kennedy

    That’s right, Halladay has contributed nearly a full win more to his team than Kershaw has to his. Hamels and Lee have been more valuable as well.

    The real shocker here though is Ian Kennedy. A half win better than Halladay. 1.4 wins ahead of Kershaw. And he was never even mentioned!

    Now I’m not saying WPA should be the only stat considered. But doesn’t it have to be part of the discussion?

    • Matthew Cornwell says:

      WPA for pitchers (as opposed to WPA/LI) is also heavily influenced by luck and defense. The difference between the two is largely the pitcher’s “Clutch” score. I am definitely okay with including “Clutch” at a career level (maybe regressed some), but I do not know how credible it is in-season, since defense and luck play a big part.

      If all things were equal, I’d consider looking at a pitcher’s “Clutch” score to break a Cy tie. Of course, I (in the minority) would look at pitcher offense too as a tie breaker. Lee and Kershaw get a solid boost over Halladay in this regard. Not sure about Kennedy as a hitter.

      • Puzzled says:

        Why would you look at pitcher offense when the award is for who’s best at pitching? Pitcher offense should not come into play for Cy Young at all.

      • Matthew Cornwell says:

        Because a run created is worth a run saved. Any way that a pitcher helps his team win should be counted in his resume. If there were a clause that said ONLY pitching should be considered, I would concede, but as far as I know – there isn’t. If people can use fluffy, non-quantifiable factors to support their favorite, I can use something real and tangible like offense.

  30. Hector says:

    WAR, what is it good for? Absolutely nothin’, say it again, y’all!

    After last night’s victory against the Giants, Kershaw had to seal the deal for the Cy Young, especially if he holds for the ERA/win/K triple crown. I’m a Dodgers fan, but in all honesty, I think Kershaw wins by a nose.

  31. Matthew Cornwell says:

    “WAR, what is it good for? Absolutely nothin’, say it again, y’all!”

    That would be pretty clever…if you weren’t about the 10th person I have seen do that.

  33. Matt says:

    Question: Why can’t UZR be made into a “UZR behind pitcher” type stat? This wouldn’t nearly answer everything (especially considering SSS), but it would be useful.

  34. jorgesca says:

    Is there a stat that includes GB%/FB%, LD%, and HR rate (not sure how to include it since BABIP doesn’t) and could somehow be incorporated into BABIP? I think it could give a better idea if it was luck or not.

    • Matthew Cornwell says:

      Except for the fact that those types of stats make huge assumptions too. For example, something called PZR understands that GB pitchers tend to have worse BABIP than FB pitchers, but extreme GBers tend to suppress hits on GB at a far greater rate than PZR says. So PZR says that Greg Maddux “should” have given up hundreds of runs more than he did – since he was a GB pitcher who also suppressed BABIP. This is true for a lot of extreme GBers.

      The best way to see if a guy has career BABIP skill (or how much) is to look at his BABIP compared to mates and regress the gap based on # of balls in play. If the sample size is large enough, why do we need to know the GB/FB breakdown?

  35. jorgesca says:

    I just saw my last post was already somehow answered, but the replier didn’t say how, I’d really like to know this, thanks.

  37. Matt C says:

    I didn’t read all the comments so I don’t know if this has been brought up, but I just want to say that I think BABIP can be highly overrated when gauging a high-K, low-BB guy like Verlander. The reason is that he simply doesn’t allow that many balls to be put in play (especially with a higher HR rate like he has this year), so the difference in BABIP isn’t that big of a deal. If I calculated it right, the difference between his BABIP this year and last year is 30 hits. Now that may sound like a lot, but over the course of 33 starts that’s less than 1 hit per start. So depending on what type of hits they are, that may equate to what, 7-10 runs at the most over the course of the season (given his current LOB%)? Even if you tack on 8 more runs (which would be more than his current LOB) his ERA would be 2.58, which would still be really, really good.

    Now I know there are more variables than that and it wouldn’t be that simple but it gives you a general idea. I think some people just assume his ERA would be way higher if his BABIP wasn’t so low but I don’t think that would be the case. And if you’re going to penalize him for a low BABIP maybe you should knock some of his HRs off since this is the highest HR rate he’s had in 5 years.
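
The back-of-envelope arithmetic above can be sketched as follows. The 30 extra hits, 33 starts, and 8 extra runs come from the comment; the earned-run and innings totals are hypothetical placeholders, not Verlander’s actual line, so the printed ERAs will not match the 2.58 quoted.

```python
def era(earned_runs: float, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9.0 * earned_runs / innings


# Figures taken from the comment.
extra_hits = 30
starts = 33
extra_runs = 8

# Hypothetical season totals, chosen only to make the arithmetic concrete.
earned_runs = 60
innings = 240.0

hits_per_start = extra_hits / starts
baseline_era = era(earned_runs, innings)
adjusted_era = era(earned_runs + extra_runs, innings)

print(f"Extra hits per start: {hits_per_start:.2f}")
print(f"ERA before: {baseline_era:.2f}, after adding {extra_runs} runs: {adjusted_era:.2f}")
```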

  38. noseeum says:

    The thing that frustrates me about fWAR is the completely contradictory philosophies. It’s like Fangraphs found people on two different planets and assigned one of them the task of developing position player WAR and one of them the task of developing pitcher WAR. They both happily went about their tasks and never once had a conversation.

    Why all this hemming and hawing about what a pitcher was and wasn’t responsible for when there is absolutely none of the same on the offensive side?

    Hitters face varying qualities of pitchers and varying qualities of defenses as well. Yet it matters none to fWAR.

    A pitcher may have less control over what happens when a ball is put in play, but guess what? He threw a pitch and the hitter made contact. If you don’t want to get burned by your defense, strike everyone out.

    I’ve said this countless times in these threads, but a FIP based WAR has the same weaknesses as an ERA based WAR. Dave always responds that “a FIP based WAR only measures tangible outcomes that actually happened,” but this is not true for a couple of reasons:
    -ignoring 70% of what happened
    -the method of calculating FIP. FIP is not just a simple sum of actual events. It’s an extrapolation of those events with a very particular point of view. It’s attempting to approximate what you would expect someone’s ERA to be given the number of home runs, walks and strikeouts per inning he’s got. It’s not reality. There’s way too much judgement built into that stat to say it’s reality.

    See the definition at BP here:
    http://www.baseballprospectus.com/glossary/index.php?search=FIP
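
For readers who have not seen it written out, the commonly published form of FIP is (13×HR + 3×(BB + HBP) − 2×K) / IP plus a league constant. The sketch below uses that form; the 3.10 constant is a placeholder (the real value is set per league and season so that league FIP matches league ERA), and the stat line is hypothetical.

```python
def innings_to_float(whole: int, thirds: int = 0) -> float:
    """Convert a box-score innings total such as '218 2/3' to a number."""
    if thirds not in (0, 1, 2):
        raise ValueError("thirds must be 0, 1, or 2")
    return whole + thirds / 3.0


def fip(home_runs: int, walks: int, hit_by_pitch: int, strikeouts: int,
        innings: float, constant: float = 3.10) -> float:
    """Fielding Independent Pitching in its commonly published form.

    Only home runs, walks, hit batters, and strikeouts enter the formula;
    everything that happens on balls in play is left out by design.
    The default constant is a placeholder, not an official value.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    return (13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts) / innings + constant


# Hypothetical pitcher line (made up for illustration).
example_fip = fip(home_runs=15, walks=50, hit_by_pitch=5, strikeouts=200,
                  innings=innings_to_float(200))
print(f"FIP: {example_fip:.2f}")
```

The fixed weights (13, 3, 2) and the scaling constant are the modeling choices the comment is pointing at when it says FIP is more than a simple sum of events.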

    The only judgement in ERA is the errors given out by the scorekeeper. The rest is a simple calculation of actual events. ERA is much more reality than FIP.

    This is why I think a FIP based WAR shouldn’t really be presented without also presenting an ERA based WAR. They are two sides of the same coin.

    I don’t like bWAR for the reasons Dave stated above, but calculating WAR by replacing FIP with ERA and presenting both would be a very informative way to present pitcher performance for a given year.

    • Matt C says:

      The problem I have with a FIP based WAR is why don’t you do the same thing for hitters? Now I know that hitters have more control over their BABIP than pitchers do, but plenty of times hitters get lucky and unlucky due to flukey BABIPs. That doesn’t factor into their WAR, yet it does for pitchers. Why is that? It seems like if you’re going to do it for one you should do it for the other.

  39. Matthew Cornwell says:

    The following two articles should be required reading for anyone interested in the topic. They will answer a good chunk of the questions and confusions about BABIP. Fangraphs should make sure all of their readers have access to them:

    http://www.insidethebook.com/ee/index.php/site/comments/misunderstanding_dips/

    http://www.insidethebook.com/ee/index.php/site/comments/career_dips_numbers/

  40. Holiday says:

    This definitely makes perfect sense to anyone
