FanGraphs Baseball

Comments

  1. tERA?

    Comment by Kevin S. — August 26, 2010 @ 12:12 pm

  2. SIERA?

    Comment by Mark — August 26, 2010 @ 12:24 pm

  3. Not sure if this has been asked before, but on the leaderboard why can’t you sort by tERA? There’s no column for this stat there.

    Comment by Matt — August 26, 2010 @ 12:26 pm

  4. Dave…THANK YOU!!!

    I really tire of the “old school vs new school” junk that is tossed around by some with such derogatory implications on this and every site that discusses stats in any way, shape or form.

    The truth is, there just isn’t one individual metric that we can point to and say, “well, X pitcher leads the league in THIS, so clearly he is the best pitcher and the most deserving of the Cy Young Award.” It would serve everyone best if they just try to learn everything they can about stats before stating absolutes. I actually wanted to reply to the Keith Law/Buster Olney tweets with something along those lines that would encourage discussion last night, but 140 words can be very limiting at times.

    Comment by randallball — August 26, 2010 @ 12:27 pm

  5. (140 characters)

    Comment by randallball — August 26, 2010 @ 12:33 pm

  6. I am not sure how Olney can say Trevor Cahill is the leader for the Cy Young, even using traditional metrics. Clay Buchholz has a lower ERA, more wins, and a higher K/9 rate. He also hasn’t allowed a run in his last 23.1 innings pitched.

    I am not saying Clay should be the leader, I am just saying he should be before Cahill.

    Comment by Mike — August 26, 2010 @ 12:37 pm

  7. might the defense have something to do with that low babip

    Comment by frank pepe — August 26, 2010 @ 12:38 pm

  8. Ground ball rate?

    Comment by Aaron — August 26, 2010 @ 12:39 pm

  9. How about WPA?

    Halladay and Sabathia are in front now

    Comment by DT — August 26, 2010 @ 12:51 pm

  10. Yeah, that sinker and his intense ground ball rate have nothing to do with that low BABIP.

    Comment by Travis — August 26, 2010 @ 12:51 pm

  11. Pretty sure there’s a lot more to pitching than those five metrics, including all the batted ball types…

    Comment by Andy S — August 26, 2010 @ 12:56 pm

  12. I really liked the approach SG over at RLYW took when looking at the AL Cy candidates last week. “my preference is to calculate the runs saved above a replacement level pitcher using RA, FIP and CERA and then take the average of them.”

    http://www.rlyw.net/index.php/RLYW/comments/who_are_the_best_2010_al_cy_young_candidates_through_games_of_august_20

    Comment by keithr — August 26, 2010 @ 12:58 pm

  13. There SHOULD be a single number that represents how well a pitcher pitched in a given season. If sabermetricians haven’t produced it: Shame on them! Here’s how you should produce such a stat if you want to remedy the situation: (1) come up with measurable indicia of pitcher performance, (2) give them appropriate weighting, and (3) add them together.

    Comment by Lucas — August 26, 2010 @ 1:00 pm

  14. I’m curious what a list of top pitchers looks like using these weights on z-scores. How does Cahill do with these metrics/weights?

    Comment by Annonymous — August 26, 2010 @ 1:03 pm
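
The weighted z-score idea raised in comments 13 and 14 can be sketched roughly as follows. This is only an illustration: the article's actual weights are not reproduced in this thread, so the metric names, weights, and pitcher lines below are hypothetical placeholders, and the function names are invented for this sketch.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize a list of numbers to mean 0 and (population) std dev 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every pitcher has the same value, so nobody stands out.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def composite(pitchers, weights, lower_is_better):
    """Sum weight * z-score per pitcher, flipping the sign for metrics
    where a lower raw number is better (walk rate, for example)."""
    names = list(pitchers)
    totals = {name: 0.0 for name in names}
    for metric, weight in weights.items():
        zs = z_scores([pitchers[name][metric] for name in names])
        sign = -1.0 if metric in lower_is_better else 1.0
        for name, z in zip(names, zs):
            totals[name] += sign * weight * z
    return totals


# Hypothetical inputs: NOT real 2010 stat lines and NOT the article's weights.
sample = {
    "Pitcher A": {"k_pct": 10.0, "bb_pct": 3.0},
    "Pitcher B": {"k_pct": 20.0, "bb_pct": 2.0},
    "Pitcher C": {"k_pct": 30.0, "bb_pct": 1.0},
}
scores = composite(sample, {"k_pct": 0.6, "bb_pct": 0.4}, {"bb_pct"})
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:+.2f}")
```

Answering "how does Cahill do" would require plugging in real league-wide stat lines and the article's own weights, neither of which appears in this thread.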

  15. While I don’t generally support giving out awards for being lucky, I think, in the end, that I would also rather be lucky than good.

    Comment by Bill Sweet — August 26, 2010 @ 1:03 pm

  16. Grounders have a higher BABIP than flies…

    Comment by Alexander — August 26, 2010 @ 1:05 pm

  17. Right. Throw out BABIP altogether and replace it with GB%.

    Comment by The Duder — August 26, 2010 @ 1:07 pm

  18. I’d rather be lucky AND good.

    Comment by neuter_your_dogma — August 26, 2010 @ 1:08 pm

  19. That’s a pretty cool method.

    Dave, you say neither ERA nor FIP tells the full story, yet FIP is the one stat used to calculate pitcher WAR on here (correct me if I’m wrong on that). If you truly think the former is the case, then why does Fangraphs continue to have a WAR leaderboard of pitchers on its front page based solely on FIP? And I also prefer RA to ERA, which is why I really like B-R’s WAR calcs for pitchers.

    Comment by CajoleJuice — August 26, 2010 @ 1:09 pm

  20. There are some pitchers out there that are extreme control specialists (Jim Palmer, Greg Maddux), but they are very rare and we won’t know for years if Cahill is one of them or just an average to good pitcher having a good year. This was a great article and expressed some of my dislikes of people that use FIP. Stats are only one tool at the disposal of baseball fans, and like any tool they can be misused.

    Comment by David Huzzard — August 26, 2010 @ 1:09 pm

  21. Great post, Dave. But I have one nit to pick and perhaps a not insignificant one. I agree with just about everything you’ve said, but I think you’ve jumped to a more ideal solution without sufficient attention paid to the important intermediate step. Perhaps that’s a function of the audience for this article, but I would suggest there’s an opportunity here to expand that audience.

    We can and should come up with better metrics to measure pitcher contribution to run prevention, such as the weighted one you’ve outlined here (or BP’s SIERA). But in the interim, we should be trying to get Cy Young voters to use better inputs into their decision making process that are available today and easy to understand. I think getting them to use FIP instead of ERA, or in addition to ERA, while not a perfect solution by any means, would be a reasonable first step. Even an average of the two would be a big step up, and probably have pretty good ROI compared to adopting a completely new metric.

    The sabermetric community often takes too many steps at once in their efforts to bring them along, leaving the “traditional” folks in the dust. A concentrated discussion on what ERA measures vs. what FIP measures (clearly and honestly highlighting their respective weaknesses) and then positing a simple alternative approach (just FIP, average of ERA & FIP, or what have you) would be plenty for most fans and traditionalists in the media to swallow.

    As you’ve clearly outlined here, FIP is still plenty flawed for this discussion, ignoring the issues of batted ball influence and timing. However, while we can continue to create even better ways to measure actual pitcher contribution to run prevention, let’s do more to try and bring the mainstream traditionalist crowd along for the ride.

    Comment by Rick — August 26, 2010 @ 1:09 pm

  22. Isn’t groundball rate important as well, since home run rates are more a product of the batter rather than the pitcher?

    Comment by Lucas Apostoleris — August 26, 2010 @ 1:10 pm

  23. As others have mentioned, award voting isn’t the only thing impacted by a better metric. Moving WAR from a predictive metric to a performance measurement one would do much to address its critique by skeptics.

    Comment by Rick — August 26, 2010 @ 1:11 pm

  24. The Cy Young is still awarded by a vote. Those voters that cling to old metrics will never agree to a formula to determine the best pitcher. The old line will still say they don’t need more than W-L, ERA and maybe IP or K to determine the best. If they choose to ignore BABIP, FIP or WAR they are unlikely to embrace anything else new. People don’t like to have to learn new things and they’d rather insist that their wrong way of thinking is right than admit that Zack Greinke is still having a good year even though he is 8 – 13.

    Comment by MikeS — August 26, 2010 @ 1:11 pm

  25. GB pitchers have a lower BABIP on GBs than flyball pitchers do.

    And while GBs have a higher BABIP than FBs, they’re also much less likely to result in extra base hits.

    Comment by Sam — August 26, 2010 @ 1:12 pm

  26. FIP isn’t predicting the future (although it’s a better predictor than ERA), it is what the pitcher actually did in the things completely in his control. It IS a backward looking metric.

    I think if we don’t “penalize” Josh Hamilton for his high BABIP, we shouldn’t “penalize” Cahill for his low BABIP. It’s a question of whether players should get credit for being lucky, and there is no right answer.

    Comment by Alexander — August 26, 2010 @ 1:12 pm

  27. Trav- agreed, why is DC pussyfooting around the driving factors behind the low BABIP when it is clearly the result of an elite sinker?

    Moreover, this is prob. naive but when it comes to sinkerballers and, er, cutterballers (think Mariano), we’re talking about guys who have little to no interest in striking you out, and whose prime objective is to not allow you to square the ball up on the bat.

    Depending on a pitcher’s approach and the quality of his offerings, is it fair to see an oddly low BABIP and assume his luck will run out? I feel that when this particular stat is bandied about, we are too often citing luck as a factor instead of a pitcher’s ability to miss the fat part of the bat.

    Comment by Scott — August 26, 2010 @ 1:13 pm

  28. I think the award should go to the highest NERD rating among pitchers. It’s what the fans would want. It combines predictive statistics with results statistics, and throws in the glamor of the game as well!

    Comment by Bradley — August 26, 2010 @ 1:15 pm

  29. Line drives have by far the highest BABIP and he’s really limiting those, likely in large part due to his sinker. Most of the good outliers in LD rate seem to have come from pitchers that rely on the sinker.

    Comment by Nitram Odarp — August 26, 2010 @ 1:17 pm

  30. Maddux’s career BABIP against was .289 and during his career the league average was between .285 and .307. His BABIP’s weren’t crazy like Palmer’s.

    Comment by Alexander — August 26, 2010 @ 1:21 pm

  31. WAR is meant as a predictive metric? If that’s true, then I’ve been looking at it completely wrong. And wouldn’t it be based on something like xFIP instead of FIP if that were the case?

    Comment by CajoleJuice — August 26, 2010 @ 1:21 pm

  32. FIP does two things that prevent it from being a “value” stat.

    First, as this post points out, it assumes that all balls in play have an equal run value. The pitcher gets credited for the same value whether it’s a double off the wall or an infield popup. This is supposed to “correct” for both defense and luck. But we don’t want to correct for luck when talking about value, as an event has the same value whether it was lucky or not. All that matters, in terms of value, is how easily fieldable the BIP Cahill allows are. Chances are, as Cameron points out, Cahill has induced very easily fieldable BIP this year.

    Second, FIP ignores the timing of events. Three walks before a HR counts the same as three walks after a HR. But the timing of events does matter when it comes to value, which is something we have hesitantly acknowledged in the MVP race for batters. Some events are higher leverage than others. Even Fangraphs WAR acknowledges this with its partial weighting of LI for relievers. Cahill has been extremely stingy with runners on base and with RISP, which is part of his value this season. That’s part of why he has the second best WPA/LI in the AL this year.

    I applaud Cameron for this post, though I’d say his weightings at the bottom are far from a “solution.” The weights are arbitrary. But someone from the saber side needs to point out that Law is making himself look stupid by determining the CYA by FIP (just as he did last year).

    Comment by Sam — August 26, 2010 @ 1:25 pm

  33. Your stats are all rates, but some counting stats have a lot of value too. For instance, I’d take 250 IP from CC instead of 180 IP from a slightly better pitcher (by rate stats).

    Weighting strikeout rates so heavily when they also decrease the number of innings you can pitch (takes more pitches to strike somebody out) seems dubious, at best.

    Comment by Paul — August 26, 2010 @ 1:25 pm

  34. Grounders have a higher BABIP than flies. And you must admit that a .217 BABIP is insane; Mariano has a career .274 BABIP and you can’t assume that Cahill is a Rivera type after 334 career innings. Cahill seems to have some BABIP “skill,” but not this absurd amount.

    Comment by Alexander — August 26, 2010 @ 1:25 pm
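
For reference, the BABIP figures traded back and forth above follow the usual definition: hits other than home runs, divided by balls put in play. A minimal sketch, assuming that standard formula; the function name and the sample numbers are illustrative only and are not Cahill's actual line.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


# Hypothetical pitcher line, for illustration only.
print(round(babip(h=150, hr=15, ab=700, k=120, sf=5), 3))
```

Because home runs and strikeouts are removed from both sides, the stat isolates exactly the events that defense, park, and luck can influence, which is why the thread keeps arguing about how much of it belongs to the pitcher.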

  35. Law’s response from his chat:

    Dave did a really strong job on that piece. I wasn’t arguing that FIP or its ilk should be the only factors, but that so much of Cahill’s performance this year is due to luck/the park/defense behind him that he’s not close to a top five candidate. How could you vote for him over, say, Cliff Lee or Felix Hernandez?

    That’s kind of what I was thinking he would say. He didn’t have room to lay out his full case in 140 characters, so he just used FIP.

    Comment by suicide squeeze — August 26, 2010 @ 1:26 pm

  36. WAR tells what happened. It is not a predictive stat.

    Comment by Alexander — August 26, 2010 @ 1:27 pm

  37. @Sam
    GB might be less likely to be extra base hits, but the extra bases don’t affect BABIP.

    Comment by Alexander — August 26, 2010 @ 1:28 pm

  38. I agree, Dave. For just the same reasons, I think if I were to base the Cy Young on one single stat, just among those I’m aware of and have come to understand, I’d probably pick WAR based on tRA.

    Comment by philkid3 — August 26, 2010 @ 1:32 pm

  39. Yeah I looked that up after I posted. Memory should always be confirmed by stats. I broke my own rule of using all tools available, but sometimes in an effort to save time we don’t always do the smartest thing.

    Comment by David Huzzard — August 26, 2010 @ 1:32 pm

  40. Man, Buster Olney is such an idiot. It annoys me how people are so blind to new ideas and ways of thinking. His reasoning is, “Arg, player A have more RBI than player B. He more clutch. I vote for him.”

    Comment by no — August 26, 2010 @ 1:32 pm

  41. Correct me if I’m wrong, but WAR for pitchers is based on FIP and FIP, as Dave described it above, is predictive. My language was a bit imprecise however. For hitters, it is a measure of performance.

    Comment by Rick — August 26, 2010 @ 1:33 pm

  42. @Nitram Odarp
    I would think his limited LD% is also due in (large) part to luck/random variation.

    Comment by Alexander — August 26, 2010 @ 1:33 pm

  43. Skill affects batters’ BABIP much more than pitchers’, so Hamilton *should* get more credit for his high BABIP than Cahill for his low one.

    Comment by edb11235813 — August 26, 2010 @ 1:34 pm

  44. @Rick
    I’m confused as to how FIP is predictive. Doesn’t it only factor what the pitcher did in the form of Ks, BBs, and HRs? That is what the pitcher actually did.

    Comment by Alexander — August 26, 2010 @ 1:36 pm
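
The inputs named in this exchange can be made concrete with a minimal sketch of the commonly cited FIP formula: 13 times home runs, plus 3 times walks, minus 2 times strikeouts, all divided by innings pitched, plus a league constant. The function name, the optional `hbp` argument (some versions count hit batters alongside walks), and the default constant of 3.10 are assumptions for illustration; the real constant is recalculated for each season so that league FIP matches league ERA.

```python
def fip(hr, bb, k, ip, hbp=0, constant=3.10):
    """Fielding Independent Pitching from home runs, walks, and strikeouts.

    ip must be true decimal innings: box-score notation such as 23.1
    means 23 and 1/3 innings and should be passed as 23 + 1/3.
    constant is an assumed placeholder; it varies by season and league.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical season line, for illustration only.
print(round(fip(hr=10, bb=50, k=200, ip=200.0), 2))
```

Nothing in the calculation looks forward; every input is something that already happened. The disagreement in the thread is about what the formula leaves out (balls in play and sequencing), not about the arithmetic.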

  45. FIP may be backward looking to an extent, but it strips out things like strand rate that may not be sustainable. For example, Cahill has a strand rate of 77%–we can say that we expect that to regress, but it completely writes off the fact that Cahill has performed extremely well with runners on base this season. It probably isn’t sustainable, but I think it is completely unfair to ignore the fact that to be able to strand 77% of base runners, one has to pitch very well with runners on.

    On a side note, I find it interesting that there are a couple of pitchers that are Cy candidates that the sabermetrics community downplays because of FIP: Cahill and Tim Hudson

    Both have large variances between their FIPs and ERAs, both have extremely low BABIPs, high strand rates, and both are extreme ground ball pitchers. Obviously you can’t draw conclusions on the effectiveness of FIP based on 2 guys that seem to be having outlier type seasons, but is at least possible that for pitchers with a certain skill set (extreme GB), FIP is not the best way to evaluate them.

    Comment by Dan — August 26, 2010 @ 1:38 pm

  46. That’s nonsense.

    Law made the exact same simplistic mistake last year in defending his Vazquez CYA vote, and he can’t blame that one on Twitter.

    Comment by Sam — August 26, 2010 @ 1:38 pm

  47. Over the last ten years, I’ve found that WAR has the biggest correlation to the Cy Young winner, which is interesting since it is computed via FIP.

    That would give Buchholz a legitimate shot at winning

    Comment by Rufio Magillicutty — August 26, 2010 @ 1:38 pm

  48. I think it’s important that we have a broader discussion about the “purity” of performance metrics. I would posit that there are two basic categories: Those which truly measure events and those which attempt to place value on those events.

    To grossly simplify things, traditionalists favor the former while sabermetricians favor the latter.

    For example, OBP is a true performance measure: how many times did the guy get on base divided by how many times he came up to bat. It doesn’t attempt to tell you how many runs that’s worth, it’s just a record of events. However, wOBA converts those events to a value using estimated run values and then manipulates the figure returned such that it looks like OBP. This actually can confuse us because we forget that wOBA is a run estimator.

    Sabermetricians prefer value metrics because they allow us to consider the complex system of run production/prevention more scientifically.

    The degree to which a given metric is predictive of itself or other metrics is a separate, albeit important issue.

    Comment by Rick — August 26, 2010 @ 1:42 pm
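
The OBP description in comment 48 is a simplification of the standard formula, which counts hits, walks, and hit-by-pitches in the numerator and at-bats, walks, hit-by-pitches, and sacrifice flies in the denominator. A minimal sketch of that event-counting calculation, with a function name and sample line chosen purely for illustration:

```python
def obp(h, bb, hbp, ab, sf):
    """On-base percentage: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    opportunities = ab + bb + hbp + sf
    if opportunities <= 0:
        raise ValueError("no plate appearances counted")
    return (h + bb + hbp) / opportunities


# Hypothetical batting line, for illustration only.
print(round(obp(h=150, bb=60, hbp=5, ab=500, sf=5), 3))
```

No run values appear anywhere in the calculation, which is the distinction the comment draws: OBP tallies events, while wOBA weights each event by an estimated run value before rescaling.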

  49. Perhaps because FIP is a better measure of skill / predictor of future ERA than is ERA itself and a player’s reputation (history of success) plays some role in Cy Young voting.

    Comment by Rick — August 26, 2010 @ 1:43 pm

  50. “But FIP was not designed to give us a better insight into what actually happened, but, instead, what is likely to happen in the future.”

    I’m taking it at face value: FIP is a DIPS implementation, and it measures what it measures. I also haven’t seen Tango state that predictive value was his design intent, but even if he did I don’t see why intent would matter; I could design a car with the intent of flight, but it would still drive (and not fly).

    While I prefer xFIP to FIP and grow increasingly interested in t(E)RA, I don’t see why anything in the DIPS class couldn’t or shouldn’t be used to help determine how well a pitcher actually pitched (as opposed to the multi-influenced outcomes they happened to receive while pitching).

    As for the Cy Young part, I think it comes down to the weighting between best-effort measurements of skill (e.g. tERA) and actual outcomes (e.g. ERA) – and that there’s a fair enough argument to be made for either leaning.

    And kinda speaking of: how about leaderboard sorts for tERA and wRC+?

    Comment by astrostl — August 26, 2010 @ 1:45 pm

  51. Alex,

    1) Extra base hits matter because they have more value than singles.

    2) When discussing value, who cares whether his LD% is a product of luck or skill? Inducing a GB has more value than inducing a LD.

    Comment by Sam — August 26, 2010 @ 1:48 pm

  52. My response to Law would be that just because a particular performance/outcome is not sustainable doesn’t make it all luck. As DC points out Cahill may be influencing the .217 BABIP to a certain extent–we just can’t quantify how much. But just because we can’t quantify it right now, doesn’t mean that the amount of influence he has over BABIP couldn’t be quantified with different tools/approaches or the like. As technology moves forward and is applied to the game in different ways, it is likely that some “conventionally” currently held sabermetric ideas will be proved incorrect to a degree.

    This is why I think awards like the CY will tend to lean towards what actually happens (ERA), rather than what would have happened independent of the defense/park/etc (FIP).

    Comment by Dan — August 26, 2010 @ 1:52 pm

  53. Yeah I was wondering the same thing.

    Comment by Ivdown — August 26, 2010 @ 1:52 pm

  54. FIP only does as much as it does. It is not considering LOB% or BABIP, but it is not trying to. It is completely telling what happened, but only in Ks, BBs, and HR.

    Hudson has a career .41 difference between his FIP and ERA, so something is going on there. What is his “true” FIP-ERA? Probably somewhere between .41 and 0. I don’t think any real saberist or whatever it’s called is saying that pitchers should be judged solely on FIP.

    Comment by Alexander — August 26, 2010 @ 1:53 pm

  55. Where did Law say that FIP is the only thing you should look at? He certainly didn’t say it in that tweet, and I distinctly remember him saying that on his vote last year he looked at a variety of advanced statistics.

    Comment by Jake — August 26, 2010 @ 1:58 pm

  56. “Second, FIP ignores the timing of events. Three walks before a HR counts the same as three walks after a HR. But the timing of events does matter when it comes to value, which is something we have hesitantly acknowledged in the MVP race for batters. ”

    We have? Are you saying we should go back to looking at RBI?

    Comment by Jake — August 26, 2010 @ 2:01 pm

  57. @Alexander FIP uses the three true outcomes (BB,K,HR). These have been demonstrated to be the outcomes with the highest correlations for a given pitcher, year to year. This means that FIP is better at predicting future ERA than current ERA.

    What it doesn’t do is to tell you how a pitcher performed in the past — because, luck or not, those hits fell in (or didn’t).

    So while it does include SOME of what the pitcher did, it doesn’t include all of it. It doesn’t accurately reflect how a guy performed; it’s just better at smoothing out luck and predicting how they will perform in the future.

    I agree with the comments here that wonder why we use FIP as a basis for WAR, when WAR is used retrospectively.

    Comment by Travis L — August 26, 2010 @ 2:03 pm

  58. why don’t we just reward them for what they did, whether or not it was sustainable…if someone throws 250 innings at 2.50 ERA/3.50 FIP and another guy throws 250 innings of 3.50 ERA/2.50 FIP, just give it to the guy with the 2.50 ERA instead, I don’t care if FIP suggests he can’t sustain those numbers next year, the award isn’t being given for next year.

    Comment by Brendan — August 26, 2010 @ 2:04 pm

  59. If you’re looking for a stat that measures what actually happened, why would you use ERA rather than just runs allowed?

    Comment by Travis L — August 26, 2010 @ 2:06 pm

  60. Law said his vote was based on FIP, xFIP (FIP w/ HR rate normalized), WAR (which is just FIP and IP), and VORP.

    Vazquez was 7th in the NL in VORP last year, and Law put him 2nd on his CYA ballot. His vote was quite obviously determined by Vazquez finishing 2nd in the NL in FIP and WAR.

    Comment by Sam — August 26, 2010 @ 2:07 pm

  61. You should be using runs allowed, not earned runs allowed. A small difference, but it helps bolster the underlying philosophy of your argument (reward them for what actually happened).

    ERA is not a very good stat — it’s kind of a bastardization between RA and FIP. RA just states what happened; FIP tries to isolate luck. ERA tries to remove SOME of the luck component, provided that the bad luck was caused by the fielders.

    Comment by Travis L — August 26, 2010 @ 2:08 pm
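
The RA versus ERA distinction drawn in comments 59 and 61 comes down to which runs are counted before scaling to nine innings. A minimal sketch, assuming the standard per-nine-innings definitions; the function names and the sample totals are illustrative only.

```python
def per_nine(runs, ip):
    """Scale a run total to a nine-inning rate."""
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return 9.0 * runs / ip


def ra9(r, ip):
    """Runs allowed per nine innings: counts every run, earned or not."""
    return per_nine(r, ip)


def era(er, ip):
    """Earned run average: drops runs the scorer charged to fielding errors."""
    return per_nine(er, ip)


# Hypothetical season: 60 total runs, 50 of them earned, over 200 innings.
print(round(ra9(60, 200.0), 2), round(era(50, 200.0), 2))
```

The gap between the two numbers is entirely the official scorer's judgment about errors, which is the "partial luck removal" the comment objects to.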

  62. WHIP is in all honesty probably the best pitching stat available. Since it removes runs from the equation, it’s generally not used as a base stat to measure Run Value, but if you just multiply the WHIP by pi you get about the league average ERA. Which is understandable if you think of how a field is configured (a diamond) and how an approximation of pi can be simulated (inscribe a circle in a square, and the ratio of random points placed inside the circle to the entire square ~ pi/4).

    It’s a delicate science. Do some pitchers genuinely get absurdly lucky with lower BABIP averages and low HR/FB rates? We’ve already seen Cliff Lee undergo a drastic regression to the mean during his few weeks in Texas.

    Soon the force and trajectory of a ball off a bat will be used as a predictor of expected pitching performance, with all the Pitch/FX data available now. In about five years we should have a pretty good idea of expected rates of xBH, HR, groundouts, etc… based on Pitch/FX. Then some linear weights can be applied, park adjustments, and the new formulation will usurp its predecessors.

    Comment by Rufio Magillicutty — August 26, 2010 @ 2:08 pm

  63. I’m sure that this observation has been made before, but heavily sinker dependent pitchers like Cahill seem to be situated at the interstices of what FIP and WAR measure.

    In recent years Chien-Ming Wang, Vin Mazzaro, Jair Jurrjens, Trevor Cahill, and Tim Hudson appear to consistently beat their FIPs (Hudson by .4, Mazzaro by .35, Jurrjens by .38, Cahill by an absurd 1.09, Wang by .25). At the other end of the curve, the most putatively ‘unlucky’ sinkerballers, like Derek Lowe and Aaron Cook, have FIPs that beat their ERAs by .3 and .05 respectively. Maybe some of the individual outliers can be linked to statistical noise, but the combined number of innings these guys have pitched forms a sample that is large enough – and divergent enough from our expectations – to call into question some aspects of the model.

    1) Because they have more balls hit in play they are more dependent on a strong defense than fireballers. I think that FIP generally treats this as a liability, but that’s not really an accurate paradigm. Making the hitting event into strictly a batter/pitcher relationship simplifies the formula for analytical purposes but it elides certain details, like the fact that it’s implicitly volatile, but not implicitly bad, to rely heavily on fielders. By extension:

    2) I know that the values assigned to GB %, K %, BB % and HR % are based off of some sort of ANOVA, and so they are accurate. Is it possible, though, that sinkers somehow produce a qualitatively different type of ground ball than other pitches? Has this sort of study been done (I assume it has)? If not I would do it, but I lack the requisite statistical access/savvy.

    Comment by MortimerKhan — August 26, 2010 @ 2:10 pm

  64. No. I’m saying that a pitcher who comes into the game and strikes out the side in the bottom of the ninth with the bases loaded, none out, and a one run lead has provided more value than a pitcher who strikes out the side in the 9th inning of a 10-0 game.

    Similarly, the pitcher who allows the HR before the walks has provided more value to his team than the one who allows the grand slam.

    Comment by Sam — August 26, 2010 @ 2:11 pm

  65. WAR is a combination of performance and predictive.

    It is mostly performance except for the .217 BABIP for Cahill. That’s likely ‘luck’ in that batters have simply hit more balls in range of fielders than normal. Using FIP gives all of that luck to the fielders, and none of that luck to the pitcher. In reality, that luck should probably be allocated somehow between the pitcher and the defense.

    Comment by KJOK — August 26, 2010 @ 2:15 pm

  66. Because those stats that have been traditionally used are STILL pretty damn good indicators of good pitching. Seriously, look at the past award winners, and 95% of the time, it’s a darn good pick. Here’s why ….

    Guys that strikeout a lot of batters generally have great stuff and are hard to hit. Likewise, they often have really good ERA’s and rack up a lot of wins. Those correlations are rather consistent.

    We need to stop acting like the media awards the CYA to the Jeff Suppans and Woody Williams of the world, provided they have 22 wins. They most often give the award to the K leader …. who very often has a great ERA and pretty good W-L record.

    Some of the advanced metrics give us better insight into a particular player or performance, and should be given attention. But the basic stats still tell us A LOT of good, basic information.

    Advanced metrics also seek to isolate the individual player’s performance, whereas some of the traditional counting stats involve team or context aspects.

    What bothers me is how some completely disregard counting stats as if they don’t matter. Like Ryan Howard driving in 150 runs, and someone commenting RBIs (as a stat) are useless. Useless? They are probably THE most important stat for a guy hitting 3rd or 4th. If he doesn’t have a lot of em, your team has a problem (I would love to see SEA have Branyan for the entire season. That team could have been very different). That doesn’t mean that RBIs tell us everything we need to know about a player, but when someone has 130+ of em, that tells us something important.

    IMO, SABR sometimes goes overboard with the “isolation” of an individual’s performance stats. That may tell us something about a player’s “true talent”, but not necessarily “best season” type stuff.

    Bob Gibson went through the 68 season with a ridiculous BABIP. I don’t see where he should be punished for that … not until we can quantify with a VERY high level of confidence the pitcher’s contribution to BABIP and what % is “luck” or randomness.

    We treat FIP as if it were THE thing in pitching. It’s THE metric that is superior to all the rest. You know what FIP tells us? The pitchers that strike a lot of guys out, don’t walk many, and don’t give up HRs should be expected to be among the best pitchers and not give up many runs. Wow! What a revelation. Thank God for FIP or we might confuse Roy Halladay with Jake Westbrook. *grin*

    There is both stat snobbery (SABR) and unnecessary resistance (old-school just for the sake of being old school) involved in the metrics.

    The part that I place *some* value on that gets left out in the metrics is performance in a pennant chase. It matters, at least to me. I give Adam Wainwright a lot of credit for being dominant in big games down the stretch run (like allowing 2 runs or less in 17 of 18 starts last year). That’s important to me, because of the additional pressure, and winning being the ultimate goal. Sure, the player cannot control his team’s situation, but that’s how it is sometimes.

    Comment by CircleChange11 — August 26, 2010 @ 2:19 pm

  67. In my opinion the Cy Young award should not try to evaluate a pitcher’s baseline abilities or “true” talent. But rather it should reward performance even if that performance would be considered lucky by modern metrics. If Cy Young voters were to consider strikeout rate and walk rate as the top two metrics to consider, that would favor strikeout pitchers with good control.

    Admittedly that is the epitome of what most see as a good pitcher. The reward though, isn’t really about a pitcher’s ability per se but how well the pitcher performed in a season. Let’s say pitcher A was a knuckleballer while B a strikeout artist. And both had great traditional stats, ERA, wins etc. Should Cy Young voters knock the knuckleballer (or say groundball artist like Derek Lowe or Fausto Carmona in his prime) for not having the best strikeout rates? Or not vote for Tom Glavine in ’91 and ’98 for having a .268 and .278 BABIP?

    I see the value in trying to separate luck from ability or defense from the pitcher’s ability. But you’re going to need a finer grained definition of “best” pitcher to use metrics like the ones suggested. BABIP is great but I’ll bet most of the Cy Young award winners will have unusually low BABIPs. The award is about performance not ability. I think it would be cool if a junkballing lefty had a season for the ages and won the award (cough Moyer cough).

    Comment by Klatz — August 26, 2010 @ 2:20 pm

  68. Yes – Dave says that explicitly:

    “Yes, that number is driven down by a combination of outside factors, including his home park, his defense, and some bad hitting by his opponents.”

    Comment by Jason B — August 26, 2010 @ 2:21 pm

  69. AND have a huge wang…

    Comment by That guy — August 26, 2010 @ 2:27 pm

  70. agree. GB rate should be in dave’s list of criteria.

    Comment by brendan — August 26, 2010 @ 2:29 pm

  71. “Those voters that cling to old metrics will never agree to a formula to determine the best pitcher.”

    I like the newfangled metrics as much as anyone, but I would rather see the Cy decided by a vote, flawed though the vote (and voters) may be, than by a formula. Blah. Yawn.

    (Not that it would ever happen – who would ever agree what inputs to put into the formula, and how to weight them?)

    Comment by Jason B — August 26, 2010 @ 2:30 pm

  72. He also mentioned this morning on SportsCenter that an Advanced Scout whom he spoke with said that this scout would pitch around Joey Votto to face Albert Pujols – if the two of them were on the same team, batting in sequence.

    I nearly choked on my golden grahams when I heard that.

    Comment by Erik — August 26, 2010 @ 2:34 pm

  73. Disagree. Law has been using FIP as a be-all-end-all dispute decider in his tweets for quite a while. It’s been driving me nuts all season. He did it with All-Star game selections as well. For example, I pointed out to him that based on FIP Randy Wells should have been an NL All-Star, even though his results have been horrible this year. If he is going to send out tweets using advanced stats, he needs to do a better job of providing context.

    FIP is obviously a great stat. But it is susceptible to misuse when used to evaluate results. A mainstream media guy like Law needs to be more discerning when throwing out advanced stats to the public. Among the loud baseball voices at ESPN, Law is just about as close to a sabermetrician as there is. In my opinion he does the community a disservice when he throws out tweets referencing advanced stats like he does without providing context. Just my two cents on the matter.

    Comment by Jonathan — August 26, 2010 @ 2:36 pm

  74. Hear, hear. I think it should be based on what the player did that year, lucky or not, sustainable or not. That said, to each voter his own methodology, and I wouldn’t begrudge someone for taking a different approach, as long as they (a) tried to be transparent about how they voted, and (b) tried to be consistent in applying their methodology from one year to the next.

    Having different mindsets and different voting philosophies is what makes the process interesting, to me, even if we don’t always get what we feel is the “right” or “best” result. As we saw with the NL Cy race last year, there was plenty of room for civil disagreement both before the vote and after the results were disclosed.

    Comment by Jason B — August 26, 2010 @ 2:38 pm

  75. The thing about pitching is how much do they really control? I always hear people say they control their strikeouts, and walks and everything else is out of their control. This is where I disagree, because in reality they don’t completely control that either because they can’t control what the batter does. They don’t control whether a batter swings through a hanging breaking ball for strike 3, or if they hit a 95mph fastball low and away and on the black out of the ballpark. The one could be a terrible pitch but they got away with it, the other could be a great one and they got screwed.

    Comment by Dwight S. — August 26, 2010 @ 2:38 pm

  76. Why is that so outrageous? Votto is having a better season this year.

    Comment by Jonathan — August 26, 2010 @ 2:39 pm

  77. Liriano, Lester anyone?

    Comment by Ty — August 26, 2010 @ 2:51 pm

  78. Even for position players, WAR doesn’t give a clear picture of what happened. WAR uses UZR, which doesn’t demonstrate, nor is it intended to demonstrate, what did happen.

    Comment by lifewontwait — August 26, 2010 @ 2:52 pm

  79. I disagree completely. GB rate is an important piece of information in learning about HOW a pitcher pitches, but not necessarily about the overall success of a pitcher. Pitchers can be FB pitchers and succeed, as long as they have high K rates to go with.

    Comment by Everett — August 26, 2010 @ 3:00 pm

  80. I have never heard or read Buster Olney say anything remotely similar to what you are “quoting” him as saying.

    His tweet said “Looks like by the end of the night,Trevor Cahill will be 14-5, with the second-best ERA in the AL, No. 2 in WHIP -and No. 2 or No. 1 for CY?”

    Where in this quote or any other quote has he said anything like what you are accusing him of? He made an observation about the success Cahill has had this year. Despite luck or skill that we’ve never seen before in baseball shown in his incredibly low BABIP, he is having a very good year.

    Four days ago Olney said “I had Lee over Felix Hernandez for the AL Cy Young with six weeks go; as of Sunday, I’d put King Felix in the lead again.”

    How you determined the man’s intelligence level and that he is so blind to new thinking out of 122 characters of words is amazing to me. Based on your 202 characters I would rate your intelligence a lot lower than his.

    Comment by LearnToRead — August 26, 2010 @ 3:01 pm

  81. Maybe the secret scout had a righty on the mound.

    (Really, the wording is dumb. Pitching around, and thus likely putting on base, either in front of the other would be pretty stupid strategy.)

    Comment by Mister Delaware — August 26, 2010 @ 3:01 pm

  82. I disagree, because the overall talent/ability of the batters they’re facing comes out in the pitcher’s overall numbers. A crappy pitch here, crappy decision there that they get away with will be countered by a great pitch that gets overmatched by a talented batter in another scenario.

    Comment by Ty — August 26, 2010 @ 3:03 pm

  83. and pitchers can have low K rates and succeed as long as they have high GB rates to compensate. so would you argue that we should not consider K rate either? that might be a defensible stand, but i don’t think i’d take it.

    Comment by notdissertating — August 26, 2010 @ 3:06 pm

  84. Strikeout Rate: 40 percent
    Walk Rate: 30 percent
    Home Run Rate: 15 percent
    Batting Average On Balls In Play: 10 Percent
    Left-On-Base Rate: 5 percent

    Dave, I sketched out a possible “spreadsheet” using various metrics that I may use in a weighted fashion to examine who I think is the Award Winner. It’s more of an exercise to see how I weight/value the various metrics.

    Comment by CircleChange11 — August 26, 2010 @ 3:07 pm
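
    One way to picture the weighting listed in comment 84 is the rough sketch below. The weights are the ones from the list; everything else is an assumption made for illustration: the stat lines are invented, the pitcher names are placeholders, and turning each stat into a z-score before weighting is just one possible way to put the five rates on a common scale. Walk rate, home run rate and BABIP are flipped so that lower counts as better.

```python
from statistics import mean, pstdev

# Weights taken from the list in the comment above.
WEIGHTS = {"k_rate": 0.40, "bb_rate": 0.30, "hr_rate": 0.15,
           "babip": 0.10, "lob_rate": 0.05}

# Assumption: higher is better for strikeouts and strand rate, lower for the rest.
HIGHER_IS_BETTER = {"k_rate": True, "bb_rate": False, "hr_rate": False,
                    "babip": False, "lob_rate": True}


def weighted_scores(pitchers):
    """Return {name: weighted sum of z-scores} for a dict of stat lines."""
    scores = {name: 0.0 for name in pitchers}
    for stat, weight in WEIGHTS.items():
        values = [line[stat] for line in pitchers.values()]
        mu, sigma = mean(values), pstdev(values)
        for name, line in pitchers.items():
            z = 0.0 if sigma == 0 else (line[stat] - mu) / sigma
            if not HIGHER_IS_BETTER[stat]:
                z = -z  # flip so that a positive z always means "better"
            scores[name] += weight * z
    return scores


# Invented stat lines (rates per plate appearance), not real pitchers.
sample = {
    "Pitcher A": {"k_rate": 0.26, "bb_rate": 0.06, "hr_rate": 0.020,
                  "babip": 0.300, "lob_rate": 0.74},
    "Pitcher B": {"k_rate": 0.15, "bb_rate": 0.08, "hr_rate": 0.025,
                  "babip": 0.240, "lob_rate": 0.80},
    "Pitcher C": {"k_rate": 0.20, "bb_rate": 0.07, "hr_rate": 0.022,
                  "babip": 0.290, "lob_rate": 0.72},
}

ranking = sorted(weighted_scores(sample).items(),
                 key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

    With these made-up lines, the low-BABIP, high-strand-rate pitcher (B) finishes last, because those two stats carry only 15 percent of the weight between them.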

  85. RBI the most important stat for a person batting 3rd or 4th? Therefore, if the same person is on a team with a great 1-2 punch, he’s a good hitter, and if the 1-2 hitters have a terrible OBP, he’s a bad hitter? RBI measures the ability of the hitters to get on base in front of a person, combined with the timeliness of hitting to score them. RBI is significantly flawed in that it is a HIGHLY dependent statistic that tells us very little about an individual’s talent level. If you want a stat that does measure clutchness/timely hitting, you want one of the WPA variants.

    Comment by Everett — August 26, 2010 @ 3:07 pm

  86. Yes, but just because it is unsustainable doesn’t mean it’s luck. If we go back and look at all the pitching he’s done and we see that 75% of his pitches are like he’s throwing BP, then yes, clearly it’s luck, but I doubt that’s the case.

    In relatively small sample sizes, like 1 season, you can skillfully achieve unsustainable results. For example, if a pitcher strikes out the side, that’s clearly an unsustainable 27k/9 rate, but it wasn’t dumb luck that he did it.

    Players aren’t robots. They have good stretches and bad stretches, and those stretches aren’t all caused by luck, often times it has a lot to do with correctly repeating mechanics. What we see as the average babip is just that, an average. It includes the days when the pitcher is feeling a bit of a dead arm, when his curveball isn’t breaking, when he doesn’t have sharp command. So it would stand to reason that when a pitcher is able to do what he wants with the ball, and repeat his mechanics properly, his babip will be below the average (in the good way, since the average includes the days when he isn’t throwing well). So if a pitcher is fortunate enough to have command of his pitches for 32 starts in a season, it would stand to reason he’d have unsustainably good stats.

    Comment by lifewontwait — August 26, 2010 @ 3:11 pm

  87. We need to clarify the difference between true “luck” and simply the contributions of others being counted in the evaluation of the pitcher’s accomplishments. Some of what we term “BABIP luck” is balls being hit more weakly and/or to easier spots to field than is sustainable. These are results that the pitcher is getting in his confrontation with the batter. I have no problem giving the pitcher credit for this.

    However, some of it is also the other people on the field playing well. I see no reason to give a pitcher credit for this, just as I don’t like using RBI to judge a hitter because it is so strongly influenced by the performances of his teammates.

    In short, giving a player credit for the things he does = good. Giving a player credit for the things other people do = bad. “Luck” conflates these things and makes these conversations more difficult.

    Comment by Rick — August 26, 2010 @ 3:12 pm

  88. Which is the entire point of the WPA group of stats.

    Comment by Everett — August 26, 2010 @ 3:12 pm

  89. I honestly can’t tell if this is a troll post or not. The first paragraph is utter nonsense, but the last two appear to be real. Help?

    Comment by Everett — August 26, 2010 @ 3:14 pm

  90. Well stated. This really should be the goal of any performance statistic: finding a way to give players credit for what they deserve credit for, and remove credit for what they do not deserve credit for. ERA goes too far in one direction, FIP too far in the other direction.

    Comment by Everett — August 26, 2010 @ 3:17 pm

  91. Not if they have a stretch where they are consistently lucky/unlucky.

    Comment by lifewontwait — August 26, 2010 @ 3:28 pm

  92. Agree with KJOK

    Comment by Nick Steiner — August 26, 2010 @ 3:37 pm

  93. Votto is having a very, very, very slightly better season (175 wRC+ vs. 168 wRC+) and that includes him having a BABIP almost 60 points higher. ZIPS projects Pujols for a .452 wOBA going forward and Votto for a .394 wOBA.

    Besides, even if Votto were better than Pujols you wouldn’t ever walk a guy in front of him, as that increases the run expectancy far more than the difference between Pujols and Votto batting.

    Comment by Nick Steiner — August 26, 2010 @ 3:43 pm

  94. FIP was never designed to be a backward-looking metric designed to tell us what actually did happen.

    I don’t disagree, but why then is fangraphs WAR for pitchers – the ultimate backward-looking value metric – built off of FIP (please correct me if I am wrong about this)?

    Comment by salb918 — August 26, 2010 @ 3:45 pm

  95. Palmer’s BABIP would have been affected by defense of Paul Blair, Mark Belanger, Brooks Robinson, Bobby Grich and many more. You’d have to do a bit of work to isolate Palmer’s BABIP skill from the skill of the defense playing behind him.

    Comment by Detroit Michael — August 26, 2010 @ 3:46 pm

  96. Correct, and that would show in the peripheral stats. Dwight S.’s point was that pitchers don’t control what happens when, for example, they throw a poor pitch and the batter chooses not to swing. Is this luck? Or is this the batter having poor pitch selection skills? Where do we draw the line between talent and luck?

    Comment by Ty — August 26, 2010 @ 3:49 pm

  97. You conveniently ignored the fact that I specifically quoted the “batter MVP” part of your post.

    Using this logic, a batter who homers with the bases loaded has provided more “value” than one who homers with the bases empty. So to you, RBI measures something important.

    Comment by Jake — August 26, 2010 @ 3:51 pm

  98. There are a couple of reasons why one might object to simply using ERA and calling it “performance”.

    One is the obvious influence of defense. A pitcher with a 2.50 ERA and a 3.50 FIP likely received very good defensive support, while the pitcher with a 3.50 ERA and a 2.50 FIP likely received very poor defensive support. I don’t think that defense should have anything to do with a pitcher’s measured performance, so it should be removed (however you see fit to do that).

    The other is, even disregarding the effects of defense, results aren’t synonymous with performance. If a pitcher throws a perfect game, but allows 27 line drive outs right to the center fielder, I certainly wouldn’t argue that he performed very well and I wouldn’t want to award the Cy Young to such a pitcher.

    I’m sure there is a way to tease apart unsustainable performance and flat out luck, but simply using ERA isn’t the answer.

    Comment by Nick Steiner — August 26, 2010 @ 3:52 pm

  99. Using ERA or RA rewards the pitcher for not just what he does but also for what his defense did. FIP gives the pitcher no credit for inducing easy-to-field balls in play, so it errs in the opposite direction.

    Comment by Detroit Michael — August 26, 2010 @ 3:54 pm

  100. Strikeout Rate: 40 percent
    Walk Rate: 30 percent
    Home Run Rate: 15 percent
    Batting Average On Balls In Play: 10 Percent
    Left-On-Base Rate: 5 percent

    so by these stats who are the top 5 contenders in both leagues.

    Comment by Muddy waters — August 26, 2010 @ 3:55 pm

  101. http://www.beyondtheboxscore.com/2010/8/13/1622256/what-is-the-cy-young-award

    They don’t get it right 95% of the time. I thought they got it right about half the time, plus a few very close calls, but not 95% of the time.

    Comment by LeeTro — August 26, 2010 @ 4:00 pm

  102. Agreed, FIP is clearly trying to isolate pitching and defense – it just does so to a very harsh degree.

    Comment by Nick Steiner — August 26, 2010 @ 4:01 pm

  103. So you are saying we should give Bob Gibson full credit for his ridiculously low BABIP unless we can prove that he didn’t deserve it? It should be the opposite IMO. It’s been proven that, for the league as a whole, single season BABIP is disproportionately luck over skill. Why should we just assume Gibson was any different?

    Comment by Nick Steiner — August 26, 2010 @ 4:05 pm

  104. Here’s one other factor that is often overlooked… consistency. I know some will probably brush this aside but would you rather have a guy that gives up 1 run a game for 5 games or a guy who throws 4 shutouts and then one 5 run outing? FIP, WAR, even ERA don’t really care about something like that.

    I know this is old school but giving your team a chance to win is one of the roles of a pitcher (yes it is dependent on defense, luck and run support). I know people don’t believe this but does Halladay or Sabathia or Oswalt really pitch the same way with a runner on 3rd and 1 out with a 1-0 lead vs a 3-0 lead in the 8th inning?

    Also Dave I think you need to give IP some weight in your weighting system.

    Comment by joe — August 26, 2010 @ 4:09 pm

  105. Here’s my article on Cy Young winners: http://www.beyondtheboxscore.com/2010/8/13/1622256/what-is-the-cy-young-award

    In summary, I said that Rally WAR should be the starting point. It uses runs allowed, then adjusts for league, ballpark, and defense, which is exactly what we’re looking for. For someone having an exceptionally lucky year (aka Livan Hernandez), you can give it to someone just behind him in WAR if he has a much better FIP.

    Comment by LeeTro — August 26, 2010 @ 4:11 pm

  106. We’d have to find a good way to analyze correlation between performance and K rate vs. performance and GB/FB rates, to see which provides better correlation. My suspicion is that K rate has a much higher correlation than does GB rate. The problem with this is that we can’t use WAR as the measure because WAR uses FIP, which uses K rate as one of its three measures. That takes us right back to the point of the post, which deals with how we should best evaluate performance statistics vs. predictive statistics.

    My guess as to why GB rate was not originally included is because there are plenty of GB people whose performance isn’t that great, where there aren’t nearly as many high K people whose performance isn’t great, but I can’t speak for Dave’s thought process on this one.

    Comment by Everett — August 26, 2010 @ 4:13 pm

  107. Jake,

    Yes, context matters. A HR in a high leverage situation is more valuable than a HR in a low leverage situation. If you have two guys with the exact same overall batting line, the one with the superior performance in high leverage situations (clutchiness) will help his team win more baseball games, thus being more valuable. Clutch hitting isn’t a skill, but clutch hits have value.

    Presumably, you think a HR is more valuable than a strikeout. The HR gives you at least one RBI, whereas the K gives you none. Does that mean that you think RBI measure something important? No, it just means you think one event is more valuable than another.

    You seem to be infatuated with the “you like RBI so you must be teh stupid” line of reasoning, which probably makes you feel superior to the “average” fan. In reality, it just makes you the opposite side of the same coin.

    Comment by Sam — August 26, 2010 @ 4:16 pm

  108. It’s possible that a low BABIP is luck only in the sense that very few pitchers can repeat a low BABIP performance. It might not be luck in that it might result from something the pitcher does on purpose. (We know this happens sometimes, as with knuckleball pitchers.) If that’s true, couldn’t someone figure out what they’re doing right when they’ve got a low BABIP and learn to repeat the performance? Thinking on my feet here, it sort of looks like this is what Chris Carpenter did. His BABIP in Toronto: 371, 307, 331, 310, 308, 326. And in St. L. (full seasons only): 280, 282, 276, 274, 273. (Dropping the partial seasons from his St.L years only leaves out five games with bad BABIP.) Maybe his expected BABIP needs to be regressed from those numbers some, but five solid years of low BABIP looks like something changed. Could a pitcher LEARN to beat DIPS? Or, another way of putting the question, what did Chris Carpenter learn in 2003?

    Comment by nick — August 26, 2010 @ 4:18 pm

  109. Unless it’s a fat Chien-Ming…

    Comment by swheatle — August 26, 2010 @ 4:23 pm

  110. So you are saying we should give Bob Gibson full credit for his ridiculously low BABIP unless we can prove that he didn’t deserve it?

    No, I’m saying “We don’t know” is a MUCH better answer than “it must be luck”. Not knowing something is better than thinking you know something you don’t.

    Furthermore, when I say “we shouldn’t count it against him”, why is the automatic knee-jerk conclusion/assumption for him to get “full credit”? Isn’t that a little junior highish?

    Why wouldn’t the default assumption be “somewhere between full credit and no credit”? I am a strong advocate of the “average em” in terms of WAR variance, projection systems, etc. So, let’s give Bob 50% credit for the lower BABIP, and call 50% luck … at least we’re half right instead of all wrong.

    Comment by CircleChange11 — August 26, 2010 @ 4:28 pm

  111. I agree with that.

    Comment by Nick Steiner — August 26, 2010 @ 4:30 pm

  112. Well, that’s the advanced scout’s opinion, not Buster’s.

    Comment by swheatle — August 26, 2010 @ 4:41 pm

  113. RBI is significantly flawed in that it is a HIGHLY dependent statistic that tells us very little about an individual’s talent level.

    Firstly, it is not as simplistic as you’re representing it. Secondly, if RBIs are representative of any ONE factor, it may not be the hitting of the 1st and 2nd hitters, but the POWER of the 3rd hitter.

    SEA is a good example. Ichiro and Figgins are getting on base enough for a good 3 hitter to have A LOT of RBIs. Ichiro is on pace to score something like 75 runs and their RBI leader is on pace to be under 80. Throw Ryan Howard in that lineup, and he gets 120 RBIs … not necessarily because of the guys always on base, but because he hits a boatload of home runs. Say he hits 40 HRs, and suppose only half of them come with a runner on, just a single runner … that’s 60 RBIs alone, and only 20 were due to another player. He’ll get a lot more RBIs due to other various hits, including extra base hits, and increased BA with RISP. All of a sudden the worst offense in the league isn’t the worst.

    The big slugging RBI machines in the heart of lineups are not completely reliant on others for generating their RBIs (within reason, don’t get all extreme example on me by supposing a 3 hitter never comes to the plate with men on). Someone like Tommy Herr that gets 100 RBI’s on 8 HRs is dependent on Vince Coleman and Ozzie Smith getting on base and stealing themselves into scoring position.

    It’s not an either or (opportunities v. power), but a combination of both. There are also better metrics to look at, or at least a combination of things to look at.

    RBI’s include context which is what some stats want to avoid. I don’t want to avoid context in all situations, because baseball is not solely an individual sport.

    Comment by CircleChange11 — August 26, 2010 @ 4:42 pm
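
    The home run arithmetic in comment 113 (40 HRs, half of them with exactly one runner on) can be checked with a small sketch. The function name and parameters are made up for this illustration; the only inputs are the numbers from the comment.

```python
def rbi_from_homers(home_runs, share_with_runner, runners_on=1):
    """Split RBI on home runs into the total and the part owed to teammates.

    Every home run scores the batter himself (1 RBI). Home runs hit with
    runners on add one RBI per runner, and those extra RBI are the ones
    that depend on teammates having reached base.
    """
    with_runner = round(home_runs * share_with_runner)
    from_teammates = with_runner * runners_on
    total = home_runs + from_teammates
    return total, from_teammates


total, from_teammates = rbi_from_homers(40, 0.5)
print(total, from_teammates)  # 60 20
```

    So 40 of the 60 RBI come from the hitter driving himself in, which is the point the comment is making.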

  114. FIP is not measuring luck or defense, it is trying to model it. To say FIP narrows things down to true outcomes pitchers can control is a bit naive. Maybe it’s narrowing things down to outcomes pitchers have MORE control over, but that shouldn’t be confused with total control, as Dwight points out. Just like eliminating things that pitchers have only partial control over (a ball in play) shouldn’t be confused with having no control.

    So we understand there is a vast difference in luck and defense between a ball hit off Josh Beckett that clears the Green Monster vs one that goes off of it 30 feet up? He has significant responsibility for one of those ‘true’ outcomes but not the other?

    Comment by joe — August 26, 2010 @ 4:43 pm

  115. They don’t get it right 95% of the time. I thought they got it right about half the time, plus a few very close calls, but not 95% of the time.

    C’mon man. You start out with the a priori assumption that “FIP decides CYA”, then use FIP vs. CYA as evidence, and then conclude that since 50% of the time the FIP leader wins the CYA, the award is only given out correctly half the time.

    That’s ridiculous.

    I should also clarify my statement. The thinking attributed to “traditional voters” is that “wins = CYA”, and when you go back and look, the leader in wins doesn’t always win the CYA, and sometimes the leader has had something like 27 or 28 wins and the CYA winner has 21 or so. The BIG stat in CYA is strikeouts … which (IMO) is a pretty darn good indicator of pitching quality, despite my affection for the high GB% guys.

    So, yes, if you start with the determination that FIP is THE criteria for CYA, and then compare the award winners to FIP leaders, it works out 50% of the time. I suppose one could use that as criteria for “correctness” if one wanted to. I don’t.

    I’m not basing my personal CYA winners on any single metric. Hell, one could just say “strikeout leader” and get it “right” the majority of the time.

    Comment by CircleChange11 — August 26, 2010 @ 4:46 pm

  116. Some would say that you are just rewarding that pitcher for something he could not control (the timing of his appearance), and I say that type of thinking stinks.

    Situation stuff MATTERS.

    Comment by CircleChange11 — August 26, 2010 @ 4:49 pm

  117. The Cy Young is supposed to go to the pitcher who performed the best, and FIP does a great job of showing that. The outcomes are irrelevant.

    This is entirely different from the MVP debate, because there’s no question of “value” with the Cy Young. It’s just who was better, not who ultimately enjoyed better outcomes.

    I’d happily use WPA for the MVP, but I wouldn’t go near it for the Cy Young. Of course, this means that if I ever select a pitcher for MVP, that doesn’t automatically mean I think he deserves the Cy.

    Comment by Llewdor — August 26, 2010 @ 4:53 pm

  118. “In relatively small sample sizes, like 1 season, you can skillfully achieve unsustainable results. For example, if a pitcher strikes out the side, that’s clearly an unsustainable 27k/9 rate, but it wasn’t dumb luck that he did it.”

    Your example for “relatively small sample” goes from a full season to ONE inning??!?

    As far as the example, with an expected SO/PA rate, we can pretty easily calculate the odds of a pitcher striking out the side to determine a rough skill/luck split.

    If the pitcher strikes out 30% of the batters he faces, then striking out three consecutive batters would be 2.7%. So, for a particular inning, about 97% luck and 3% skill. If this pitcher pitched a 200 IP season and had about five 1-2-3 3SO innings, I think you’d have to say that was about right [mostly skill]. If he had one or zero — you’d probably figure that was a bad-luck influenced result… If he had 20 such innings, probably a bit of good luck there.

    Comment by Eric R — August 26, 2010 @ 4:54 pm
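
    The 2.7% figure in comment 118 follows from multiplying the strikeout rate by itself three times, and the “about five” innings follows from applying that to 200 IP. A minimal sketch, assuming each plate appearance is independent and treating every inning as three batters faced (both simplifications; the function names are invented here):

```python
def prob_strike_out_side(k_rate):
    """Chance that three consecutive batters all strike out."""
    return k_rate ** 3


def expected_such_innings(k_rate, innings):
    """Expected number of three-up, three-strikeout innings over a season."""
    return prob_strike_out_side(k_rate) * innings


p = prob_strike_out_side(0.30)
print(f"{p:.1%}")                                  # 2.7%
print(round(expected_such_innings(0.30, 200), 1))  # 5.4
```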

  119. At the end of the article, I say results, meaning Rally WAR, are more important than FIP. There are some years where more than one pitcher are justifiable, but Welch winning the award in 1990, even with his 27 wins, was a joke.

    Comment by LeeTro — August 26, 2010 @ 5:00 pm

  120. @Alexander,

    Sure there is luck involved in it, but if you take a look at the leaders in the category, they have a much higher GB/FB ratio than the league as a whole. Maybe it’s random, but GB pitchers do seem to be more likely to post an especially good LD rate than FB pitchers.

    Comment by Nitram Odarp — August 26, 2010 @ 5:07 pm

  121. CC11, I’m not arguing that RBIs don’t tell us anything, but that they don’t do a very good job of telling us things. You suggest power as an important factor, and I can agree with that. However, for measuring power, I’d rather use ISO or SLG or even HR. You agree that OBP in front is an important factor. The last part you cover is context, which I completely agree with. However, we’ve got a group of stats that do an excellent job of covering context, in my opinion, in the WPA stats. My problem is that RBI speaks to a bunch of interrelations, but doesn’t do a very good job with any of them. There are worse stats out there (productive outs comes to mind), but also many that are better.

    Comment by Everett — August 26, 2010 @ 5:25 pm

  122. Everett took the words right out of my mouth when it comes to RBI.

    RBI are dependent on the guys hitting in front of you. Much like runs scored, it is an opportunity statistic. The greater opportunity one has, the better chance they have at succeeding – and in this case, it comes down to driving in runs.

    Just because Ryan Howard drives in 150 runs, does not mean he is the sole reason for that occurring. Is he a big part of it? Well, sure…that would be ridiculous to think otherwise. But if the guys hitting in front of him don’t get on, he never gets the opportunity to drive those runs in.

    Comment by Erik — August 26, 2010 @ 5:36 pm

  123. It is outrageous because of the points already made in this piece.

    As great of a season as Votto is having, I still would never think about pitching around Joey Votto to face Albert Pujols…and I assume there are very few, if any, managers in major league baseball that would feel the way that scout does.

    Comment by Erik — August 26, 2010 @ 5:40 pm

  124. I would be cautious when using WHIP – since it accounts for hits allowed, which again, is not the direct result of the pitcher…and only the pitcher.

    Comment by Erik — August 26, 2010 @ 5:42 pm

  125. Good question. I would plan on digging into this, but someone may have already done it.

    Comment by Erik — August 26, 2010 @ 5:46 pm

  126. Sam,

    Of course extra-base hits matter, but they don’t matter to BABIP, in the same way they don’t matter to normal batting average.

    Comment by Lance W — August 26, 2010 @ 7:17 pm

  127. “Or, another way of putting the question, what did Chris Carpenter learn in 2003?”

    To hazard a guess, Chris Carpenter learned that he had better fielders behind him in St. Louis than in Toronto?

    Comment by Justin Mosovsky — August 26, 2010 @ 7:31 pm

  128. Joe – you’re right. I’d much rather have the INCONSISTENT pitcher. He would guarantee you wins in four of those games and give you a chance in the fifth one as well. On the other hand the consistent pitcher could possibly lose all five games. Don’t believe me? Here’s a good article on the very subject that shows that less consistent pitchers actually win more games: http://www.hardballtimes.com/main/article/same-old-same-old/.

    Or how about Colby Lewis’ last 7 games, all between one and four Runs per game with a 3.35 RA and yet he went 0-5 and his team 1-7 overall. More shutouts and one blowout would have led to quite a few more Ws even if the ERA ended up the same.

    Comment by Toffer Peak — August 26, 2010 @ 8:31 pm

  129. Well, the American League doesn’t have many really qualified guys to win the Cy Young Award this season. Since Cliff Lee basically dropped out of the race after going to the Rangers, Lester got shelled his last outing, and Felix Hernandez with only 10 wins… who should win? FIP is obviously a great tool to use; however, results should matter at least to an extent.

    The National League has 7 or 8 guys who would easily take the Award if playing in the A.L… but if Buchholz doesn’t win it, and I hate the Yankees, but Sabathia has 17 wins, has been on fire and pitches in a sandbox under media scrutiny. So unless King Felix wins 5 or 6 of his last games, he shouldn’t win. He pitches in a pitchers’ ballpark, and for a horrible team who has no pressure on them.

    Obviously, Carpenter learned how to pitch in 2003 after being injured… he threw hard, but decided it’s best to get a quick out on the ground instead of throwing straight 95-97 mph fastballs that major league hitters feast on. He and Halladay were good friends… Doc learned that, as well.

    Comment by Kyle — August 26, 2010 @ 8:33 pm

  130. Fellas, when I say something positive about RBI, or suggest that it may not be a completely useless stat, I am not inherently going to the other extreme and saying it’s a great stat … only that it tells us something.

    Generally I am saying that the good 3 and 4 hitters supply big RBI numbers primarily because they are power hitters (and not necessarily because they have 2 runners on base every time they come up), and power hitters drive in runs better than someone else that may have a similar wOBA, but achieves it with walks and singles, rather than doubles and homers.

    Walks are valuable, but with men in scoring position, you don’t always want your best hitters to walk, especially with 2 outs.

    I don’t intentionally play devil’s advocate on everything, but I have a dislike for absolutes … such as “RBIs are useless” or “RBIs don’t tell us anything”. They tell us something, maybe not as thorough as we would like, but they do tell us something, and it can be valuable.

    IMHO, no stat tells us what we need to know in isolation.

    Comment by CircleChange11 — August 26, 2010 @ 9:25 pm

  131. RBI are dependent on the guys hitting in front of you. Much like runs scored, it is an opportunity statistic.

    In part, but great leadoff men and great power hitters increase their own odds of scoring runs and driving them in.

    If we assume a league average lineup around them, Rickey Henderson is going to score more runs than other leadoff men because he’ll be on base more, get into scoring position more often (SB’s), and hit for more extra bases (including HRs) as compared to others. Likewise Ryan Howard will drive in more runs than a league average 3 or 4 hitter because he’ll hit a ton more HRs than they do, and *could* get perhaps 75 RBI off 45 HRs alone. That type of thing will produce RS and RBI in almost any lineup.

    The more a leadoff hitter with speed gets on base, the more he influences his own runs scored. The more homers a middle of the order guy hits, the more he influences the RBI totals … and they take that influence to whatever lineup they’re in.

    Certainly, the better the lineup the more RS and RBI they’ll get, but those types of guys can get a large amount of RS and RBI even in a non-elite offense, just because of how they increase their own odds.

    Comment by CircleChange11 — August 26, 2010 @ 9:39 pm

  132. For me, all of the extra IP are what puts Halladay ahead (possibly for good) of Wainwright. IP matter, and they really help the team in terms of saving the bullpen.

    Comment by CircleChange11 — August 26, 2010 @ 9:42 pm

  133. A pitcher with a 2.50 ERA and a 3.50 FIP likely received very good defensive support, while the pitcher with a 3.50 ERA and a 2.50 FIP likely received very poor defensive support.

    For me, I would need to know what accounts for the E-F.

    A guy could have a lower ERA because he’s a groundball pitcher that would rather issue a walk than groove fastballs (i.e., more hits). Or he could have a high rate of HR/FB, but not a high rate of “other runs”.

    Likewise, a guy with a high ERA and low FIP might just not give up many walks and homers, but give up a ton of other hits, and K a decent amount of batters.

    Hits allowed, IMO, is not just all “good defense” or “bad defense”. It’s very possible that pitchers in a given season get lucky on HR/FB, just like some guys get good/bad luck on BABIP. FIP credits the pitcher for giving up a double off the wall (or FB to the track) instead of allowing a ball that travelled 20 feet further for a dinger.

    Rather than just look at ERA or FIP, I like to look at them together, and if there is a discrepancy, then I like to look at other metrics and try to find out why, rather than just saying good/bad luck or good/bad defense.

    Generally, pitchers and defense work together. Rarely will a great defense make an average pitcher appear excellent, but it could happen that a great defense makes a B or B+ look like an A-. High GB% pitchers are the most dependent on defense, but they also produce a type of BIP that doesn’t go for hits, and it’s darn near impossible for a GB to go for a HR. Some would say that’s how GB% fits into FIP (the HR component). If I could have more confidence in that aspect, I would support FIP to a greater degree.

    There are a handful of important metrics that should be viewed “together” to get as complete a picture as you can. There’s really no reason to look at just one metric and be self-limiting.

    Comment by CircleChange11 — August 26, 2010 @ 9:55 pm
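
    For anyone following the E-F discussion in comment 133, here is a rough sketch of how FIP and the ERA-minus-FIP gap are usually computed. The weights (13, 3, 2) are the standard FIP weights; the constant of 3.10 is only an assumed placeholder, since the real constant is reset every season so that league FIP matches league ERA. The function names and the sample stat line are made up for illustration.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching from raw counting stats.

    ip must be true fractional innings (200 1/3 innings -> 200.333...),
    not the box-score notation 200.1.
    constant is an assumed placeholder; the real value varies by season.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def era_minus_fip(era, fip_value):
    """E-F: positive means ERA was higher than FIP, negative means lower."""
    return era - fip_value


# Hypothetical stat line, not a real pitcher.
sample_fip = fip(hr=20, bb=50, hbp=5, k=200, ip=200.0)
gap = era_minus_fip(2.50, sample_fip)
print(f"FIP {sample_fip:.2f}, E-F {gap:+.2f}")
```

    A big negative gap like this one is exactly the case the comment is asking about: the number alone does not say whether defense, luck, or the pitcher's own batted-ball profile is responsible.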

  134. Walks and K’s are also influenced by the batters and umpire, as we all know, but seemingly ignore.

    [1] Strasburg K’s 12 in his debut, and nay-sayers point out “it was the Pirates”. Why?

    [2] The Rays have 200 more team walks than the Astros. Might pitching against the Rays more often affect the number of batters you walk?

    [3] The Blue Jays have hit 120 more homers than the A’s. Might playing the Blue Jays more often affect your HRA?

    Likewise, the DBacks have K’d almost twice as much as the Royals (seriously). Think which team you’re facing affects your number of K’s?

    We need to stop thinking of anything as a “direct result” of the pitcher. It’s just degrees of influence. I would consider a strikeout on a pitch down the middle as being more lucky than giving up a hit on a ball down the middle.

    Comment by CircleChange11 — August 26, 2010 @ 10:04 pm

  135. Dave, who are your CYA picks and why?

    Comment by khup — August 26, 2010 @ 10:11 pm

  136. [1] DH instead of pitchers as batters.

    [2] StL philosophy of pitching to contact and pounding the zone with late-breaking pitches (2-seamers, cutters, sinkers, etc).

    [3] Pitchers and defense work in tandem. If you have high GB%, then you’ll have/need a good SS.

    [4] StL has not had a good fielding 3B since Rolen, and 2B is a converted OF, and David Freese and Troy Glaus were not stellar.

    Yet, despite all that, CC29’s BABIP in StL has remained very consistent. Maybe he is doing something intentional, as part of the pitching philosophy and emphasis.

    Wainwright, Carpenter, and Garcia all have low BABIPs. Is anyone here really stating that StL has a dominant defense, with Lopez, Freese, Schumaker, as 1/2 of the IF?

    If that’s the case, then there should be article after article on how Brendan Ryan is the greatest thing since Ozzie Smith, because he will have single-handedly made Carpenter, Wainwright, and Garcia look outstanding.

    It might also have something to do with the pitching philosophy of St. Louis. It’s not a coincidence that their star pitchers are VERY similar, or that when guys come to StL and have success when they have not elsewhere, they do so by throwing a lot of late-moving (i.e., high velocity) strikes, like 2-seamers, cutters, sinkers, etc.

    It’s a combination of things, and I explained this in another post. StL takes an offensive hit at SS because their pitchers do throw ground balls and defense at SS is a priority. The D is built for the pitcher types & philosophy, and the pitchers pitch to the defense. They work in tandem.

    If StL had a different philosophy and had high K, high FB pitchers, emphasis on SS defense could be lessened. Good SS play and GB pitchers is kind of a tradition in StL since the 80s.

    But are Glaus, Freese, Lopez, and Schumaker really great defenders? No, not at all. Ryan and Pujols are good. Carpenter also pitched when Chris Duncan was a LF, partially balanced by Edmonds in CF.

    It’s not just strictly luck, and it’s not just strictly defense.

    The problem is we don’t know what % of each there is, so we just chalk it up as either 100% or 0%.

    I have no idea why we don’t give the pitcher credit for 50% of the difference in BABIP, instead of ALL or nothing, just because we cannot yet accurately attribute the pitcher’s influence to lower/higher BABIP in a particular time span.

    Did I mention these guys had to pitch with Chris Duncan (a 6’5 lumbering 1B) in LF? Jim Edmonds can only do so much.

    Comment by CircleChange11 — August 26, 2010 @ 10:23 pm

  137. It’s just who was better, not who ultimately enjoyed better outcomes.

    I think people really believe that statement.

    Think my boss will fall for that line: “I was really better than X, Y, and Z … they just had better outcomes.” It’s worth a shot.

    Comment by CircleChange11 — August 26, 2010 @ 10:26 pm

  138. Haha

    Hits allowed, IMO, is not just all “good defense” or “bad defense”.

    Did you miss where I said “likely”? We know that other things can affect BABIP, but by far the most obvious is defense. If a guy has a really low BABIP, he likely received better support. It doesn’t mean the discrepancy was “all” defensive support (I find it funny that you accuse me of reading comprehension problems above), it’s that a player with a lower ERA than FIP more often than not received positive defensive support.

    The rest of your post has nothing to do with what I said. All I did was point out the problems with solely relying on ERA as a “performance metric”, nowhere did I advocate using FIP as such.

    Comment by Nick Steiner — August 26, 2010 @ 10:33 pm

  139. It’s just who was better, not who ultimately enjoyed better outcomes.

    “I think people really believe that statement.”

    I think poker is the best analogy to use. If a player loses going in with pocket aces against pocket kings because of a king hitting on the turn, it isn’t a sign that the player who played kings against aces was smart… Any time going into that hand, you want the aces if you have a choice. The aces are a better hand without a doubt. Do you say good play to the person who caught the king and say that his performance is better because he got better results? No you don’t. The BEST poker players aren’t necessarily going to have better results every time. That doesn’t mean they didn’t play the smartest/best.

    Comment by Justin Mosovsky — August 26, 2010 @ 10:43 pm

  140. great article. this is why FG is a mustread.

    Comment by dank12 — August 26, 2010 @ 11:10 pm

  141. +1. I like tERA more than xFIP. I always use them both, but if I was forced to use one, most of the time I would use tERA. It’s just less convenient on this site.

    Comment by Max — August 26, 2010 @ 11:16 pm

  142. Terrible idea. A stellar performance in a rout should not count for more than a stellar performance in a nailbiter.

    Comment by Anon21 — August 26, 2010 @ 11:29 pm

  143. Should not count for LESS, rather.

    Comment by Anon21 — August 26, 2010 @ 11:30 pm

  144. GB rates have a modest correlation to R/9. The linear correlation coefficient between GB/FB and R/9 (baseball reference stats) among pitchers with at least 100 IP in the last 20 years is -0.16. As GB rate goes up, R/9 goes down. This is due to the even stronger correlations between GB rates and HR/9, double play rates and slugging. The correlations between GB/FB and HR/9, GB/FB and GDP/9, and GB/FB and Slugging are -0.42, 0.46 and -0.28, respectively. (Remember that correlations closer to -1.0 and 1.0 are stronger.)

    For comparison’s sake, the correlation between K/9 and R/9 is -0.35, so quite a bit stronger, but the GB/FB correlations are NOT negligible.

    Comment by Matthias — August 26, 2010 @ 11:36 pm
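
    The coefficients in comment 144 are plain linear (Pearson) correlations. A minimal sketch of that calculation follows, assuming you have one GB/FB value and one R/9 value per pitcher. The five data points are invented purely to show the mechanics and are not the Baseball-Reference sample described above; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and each list's root sum of squares.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / (spread_x * spread_y)


# Invented illustration data: one entry per (hypothetical) pitcher.
gb_fb = [0.8, 1.0, 1.2, 1.5, 2.0]
r_per_9 = [4.9, 4.6, 4.7, 4.2, 4.0]

r = pearson_r(gb_fb, r_per_9)
print(f"GB/FB vs R/9 correlation: {r:.2f}")
```

    A negative result means R/9 tends to fall as GB/FB rises, which is the direction the comment reports; the size of the number on toy data like this says nothing about the real -0.16.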

  145. I don’t agree that FIP alone decides which pitcher is “better” versus the one that had better outcomes (which is my real point). We’re separating “better” from “outcomes” by selecting which “outcomes” we use. Walks, K’s, homers are all outcomes right?

    We often chalk BABIP up to luck or defense, but HRA is not bad/good luck? I’d believe the latter if most HR’s cleared the fence by 50 feet. It’s not luck that a ball travelled 395 feet instead of 385?

    What did the pitcher do differently to influence that situation (HR) that he doesn’t do to influence other batted balls?

    I am perfectly fine with splitting the difference in actual and expected BABIP, so we give the pitcher some credit for it. Chalk it up to 50% influence and 50% luck, instead of 0% and 100% (by ignoring it). Or give it 33% pitcher influence, 33% luck, 33% defense. But, just ignoring it and hand-waving it off as nothing is bogus.

    He’s obviously doing something to influence it, otherwise wouldn’t all of the pitchers on the staff have the same low, low, low BABIP? Surely the Braves great D and the BABIP Fairy don’t just show up when Tim Hudson pitches.

    I am fine with saying “we cannot currently measure with accuracy and confidence, the pitcher’s influence on such a thing”, but I’m not fine with “We don’t know so we’ll ignore it, chalk it up to luck/defense, etc”. Let’s give the P some credit until we know for a fact that he has no influence on it.

    Comment by CircleChange11 — August 26, 2010 @ 11:58 pm
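
    One way to read the "split the difference" idea in comment 145 is as a weighted blend of actual and expected BABIP. The sketch below is only that reading, not an established metric: `credited_babip` and `pitcher_share` are hypothetical names, and the .250 and .300 inputs are made-up numbers.

```python
def credited_babip(actual, expected, pitcher_share=0.5):
    """Blend actual and expected BABIP by an assumed share of pitcher influence.

    pitcher_share=0.0 treats the whole gap as luck/defense (returns expected).
    pitcher_share=1.0 credits the pitcher with the whole gap (returns actual).
    """
    if not 0.0 <= pitcher_share <= 1.0:
        raise ValueError("pitcher_share must be between 0 and 1")
    return expected + pitcher_share * (actual - expected)


# Made-up example: a pitcher running a .250 BABIP against a .300 expectation.
half_credit = credited_babip(0.250, 0.300)
third_credit = credited_babip(0.250, 0.300, pitcher_share=1 / 3)
print(f"50% credit: {half_credit:.3f}, 33% credit: {third_credit:.3f}")
```

    The share itself is the open question in this thread; the code just makes it an explicit knob instead of silently fixing it at 0.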

  146. This is pretty close, though it’s based largely on ’06 research and could be updated slightly: http://www.breakingdownbaseball.com/

    Comment by Ethan — August 27, 2010 @ 12:07 am

  147. Since when has it been proven that pitchers certainly have control over their home run rate? I thought research has shown that they have very minuscule control, if any. If they did have control over it then xFIP would be one useless stat.

    Comment by Ryan — August 27, 2010 @ 1:30 am

  149. Could you also factor in line drive percentage (LD%) when you factor in BABIP, since if a pitcher is getting hard-hit liners, as opposed to dribbling soft grounders, the former are more likely to lead to hits and runs than the latter?

    Comment by BX — August 27, 2010 @ 9:12 am

  150. he never said he was the leader

    Comment by Flipp — August 27, 2010 @ 11:59 am

  151. An out is an out. I don’t see why strikeout rate should be the primary consideration unless it’s weighted some way against unsuccessful outcomes. Sandy Koufax said he became a good pitcher when he stopped trying to make hitters miss and started trying to make them hit his pitches in the spots he wanted.

    But I’m sure you all know more about pitching than Koufax.

    Comment by rockymountainhigh — August 27, 2010 @ 12:38 pm

  152. This thread is so frustrating to read. Circle Change is bringing up some very valid points and it seems nobody agrees with him and I just can’t see why.

    If I go out and pitch 7 innings, allow zero runs but strike out 1 and walk 4 and my team wins… I pitched a great game!!! I don’t care if my FIP stinks, baseball is a team game and a pitcher is part of the defense and my pitches made the hitters “miss”.

    If I do that for two games in a row, then that’s two great games. If I do it for a whole season, well no offense to every other pitcher in the league, but I pitched the best that year. The whole purpose of pitching is run prevention and pitchers use their defense to help with this process. To rely only on stats that ignore this critical function of baseball is just madness.

    Comment by Don Headly — August 27, 2010 @ 12:58 pm

  153. Also, why penalize a pitcher for recognizing that he has a good defense behind him and pitches in an environment favorable to pitchers and pitching to contact?

    Comment by The Original Tommy — August 27, 2010 @ 3:18 pm

  154. It’s not how long the wang is, but how fat it is.

    Comment by Ari Collins — August 27, 2010 @ 4:42 pm

  155. Exactly, Don. Circle Change has made some great points on here about FIP and even about people completely dismissing the value of RBIs as a stat.
    Far too many people on here are misunderstanding what FIP was originally trying to accomplish.

    Comment by bstar — August 28, 2010 @ 2:56 pm

  156. Scherzer is a Cy Young!

    Comment by asl — August 28, 2010 @ 3:03 pm

  157. Hi, I’ve been looking around the blog and can’t find a way to get in touch with you. Could you tell me how, please? Many thanks.

    Comment by Ken Bayn — February 24, 2011 @ 12:31 pm
