New SIERA, Part One (of Five): Pitchers with High Strikeouts Have Low BABIPs

Predicting baseball statistics is a tough job, especially when it comes to pitchers.

For every Roy Halladay pitch machine, there are 10 James Shieldses – guys whose ERAs swing by a run or two every year. Basically, it’s a crapshoot when it comes to figuring out the next ace – or the former ace-in-waiting who’ll lose his job by the All-Star break. Don’t believe me? Consider this: In the past 11 years, four hitters have led the major leagues in WAR; eight pitchers have led the majors in ERA.

So while the league-leading run producers might be predictable, league-leading run preventers change almost every year. But hope isn’t completely lost when it comes to figuring out the next pitching superstar – or dud. Some pitching stats are very predictable, and focusing on those few numbers might lead us to a better system to evaluate talent.

Enter Skill-Interactive ERA, or SIERA, which follows in the footsteps of xFIP by using statistics that don’t change much from year to year: strikeout rate, walk rate and ground-ball rate.

When Eric Seidman and I developed the original SIERA — which appeared at Baseball Prospectus more than a year ago — we didn’t totally appreciate why it worked. Essentially, since it uses regression analysis, it asks the question: What’s the typical ERA for pitchers with similar strikeout, walk and ground-ball rates in recent years?
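The regression question above can be sketched in a few lines of plain Python. This is only an illustration of a least-squares fit, not the SIERA formula: the pitcher-seasons, the coefficients used to generate the toy ERAs, and the helper names `solve`, `fit_ols` and `predict` are all made up for the example, and the plain linear form is a simplification.

```python
def solve(a, b):
    """Solve the square system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Return least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(p)] for i in range(p)]
    xty = [sum(x[i] * y for x, y in zip(design, targets)) for i in range(p)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Estimated ERA for one (K/PA, BB/PA, GB rate) row."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Hypothetical pitcher-seasons: (K/PA, BB/PA, GB rate). Not real data.
rows = [
    (0.15, 0.09, 0.40), (0.18, 0.07, 0.45), (0.22, 0.08, 0.50),
    (0.25, 0.06, 0.38), (0.20, 0.10, 0.55), (0.28, 0.05, 0.42),
]
# Toy ERAs generated from a made-up linear rule so the fit can be checked.
targets = [6.0 - 10.0 * k + 8.0 * bb - 2.0 * gb for k, bb, gb in rows]

coefs = fit_ols(rows, targets)
estimate = predict(coefs, (0.21, 0.07, 0.48))
print([round(c, 3) for c in coefs], round(estimate, 3))
```

Because the toy ERAs follow the made-up rule exactly, the fit simply recovers that rule. With real pitcher-seasons the fitted line would instead describe the typical ERA among pitchers with similar peripherals.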

Our thought was that the main reason SIERA worked so well was that it took into account the interplay between those three statistics. But on further analysis, it turns out that SIERA is successful mainly because it assumes a low BABIP and HR/FB for strikeout pitchers (and for fly ball pitchers, as well).

By including both of these effects, SIERA measures pitcher performance in unique ways. Unlike batters, pitchers generally participate in consecutive plate appearances. Metrics like wOBA can approximate hitter performance using linear weights — effectively assuming that the hitter isn’t responsible for the events immediately before or after his performance. Pitchers also have only a share of the responsibility for the outcome of a plate appearance.

Metrics like FIP and xFIP attempt to isolate the outcomes that pitchers do control — like strikeouts, walks and home runs — and then credit them only with those outcomes. Using linear weights and highlighting the run-preventing and run-creating effects of these defense-independent events, the numbers can approximate a pitcher’s contribution to wins and losses far better than a simple ERA formula.

Take the American League ERA leader, Jered Weaver, as an example. xFIP sees Weaver’s 7.81 K/9, 2.06 BB/9 and 48.3% fly ball rate and gives him a 3.32. SIERA sees the same numbers and puts Weaver at 3.15.

So why is SIERA such a Weaver fan? It’s pretty simple: Among pitchers, the year-to-year correlations in strikeout rate, walk rate and home run rate are significantly higher than the year-to-year correlation in BABIP. That implies pitchers have a greater ability to control those outcomes.
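A year-to-year correlation of this kind is just a Pearson correlation between each pitcher’s value in one season and his value in the next. The sketch below shows the calculation. The function name `pearson` and the five pairs of strikeout rates are hypothetical, chosen only to exercise the code, and they say nothing about real pitchers.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical strikeout rates for five pitchers in consecutive seasons. Not real data.
year1 = [0.26, 0.18, 0.22, 0.15, 0.29]
year2 = [0.25, 0.19, 0.21, 0.16, 0.27]

k_corr = pearson(year1, year2)
print(f"year-to-year correlation: {k_corr:.3f}")
```

Running the same function on paired BABIPs from consecutive seasons would give the comparison described above.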

Understanding this, Tom Tango developed FIP, which credits a pitcher with the effects of strikeouts, walks and home runs, while assuming that the player had a league-average BABIP. Taking that a step further, xFIP also assumes a league-average home-run-per-fly-ball rate. Nate Silver took another approach by creating QERA, which uses regression analysis to estimate ERA. And last year, Eric Seidman and I developed SIERA, which also used regression analysis to predict earned run average but made several tweaks to Silver’s original version.
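For reference, here is a short sketch of FIP and xFIP in their commonly published form (the version with hit batters added to walks). The league constant and the league HR/FB rate change from season to season, so both are passed in as parameters, and the stat line at the bottom is hypothetical.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """FIP: weighted home runs, walks plus hit batters, and strikeouts per inning,
    shifted by a league constant so the result sits on the ERA scale."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, lg_hr_per_fb, bb, hbp, k, ip, constant):
    """xFIP: FIP with actual home runs replaced by fly balls times league HR/FB."""
    expected_hr = fb * lg_hr_per_fb
    return fip(expected_hr, bb, hbp, k, ip, constant)


# Hypothetical season line; the constant and league HR/FB are placeholder values.
line = dict(bb=50, hbp=5, k=200, ip=200.0, constant=3.10)
example_fip = fip(hr=20, **line)
example_xfip = xfip(fb=200, lg_hr_per_fb=0.105, **line)
print(f"FIP {example_fip:.2f}, xFIP {example_xfip:.2f}")
```

Neither function takes hits on balls in play as an input, which is how both metrics end up treating every pitcher as if his BABIP were league average.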

As a result, SIERA gives more credit for a strikeout than FIP and xFIP and less blame for a fly ball. It not only credits pitchers with the run-dampening effect of a strikeout, but also assumes that they allow fewer hits and home runs than FIP and xFIP assume, since high-strikeout pitchers tend to allow fewer of both. While xFIP and FIP both assumed that Weaver’s 2010 BABIP was .297, SIERA assumed Weaver had a BABIP similar to those of other pitchers with high strikeout and fly-ball rates, which was about .278. His actual BABIP was .277.

Weaver isn’t an exception. Last year’s strikeout-rate leader, Jon Lester, had a BABIP of .291. Tim Lincecum led the league in strikeout rate in 2009, and his BABIP was .288. And while Lincecum’s BABIP was .310 in 2008, the 2007 strikeout-rate leader was Erik Bedard, who had a BABIP of .284.

In fact, look at the BABIPs for the league-leaders in strikeout rate during the past nine years:

Year  Pitcher        SO/PA  BABIP
2010  Jon Lester     26.1%  .291
2009  Tim Lincecum   28.8%  .288
2008  Tim Lincecum   28.6%  .310
2007  Erik Bedard    30.2%  .284
2006  Johan Santana  26.5%  .271
2005  Mark Prior     26.8%  .281
2004  Randy Johnson  30.1%  .267
2003  Kerry Wood     30.0%  .275
2002  Randy Johnson  32.3%  .291
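A quick way to summarize the table is to average the nine BABIPs and count how many fall below a league-average mark. The sketch below uses the rows exactly as listed. As a simplification, it applies the .297 league-average figure cited for 2010 to every season, which is an assumption rather than a year-by-year comparison.

```python
# (year, pitcher, SO/PA, BABIP) copied from the table above.
leaders = [
    (2010, "Jon Lester", 0.261, 0.291),
    (2009, "Tim Lincecum", 0.288, 0.288),
    (2008, "Tim Lincecum", 0.286, 0.310),
    (2007, "Erik Bedard", 0.302, 0.284),
    (2006, "Johan Santana", 0.265, 0.271),
    (2005, "Mark Prior", 0.268, 0.281),
    (2004, "Randy Johnson", 0.301, 0.267),
    (2003, "Kerry Wood", 0.300, 0.275),
    (2002, "Randy Johnson", 0.323, 0.291),
]

# Assumption: the 2010 league-average BABIP is used as the yardstick for all nine seasons.
LEAGUE_BABIP = 0.297

mean_babip = sum(row[3] for row in leaders) / len(leaders)
below_league = [row for row in leaders if row[3] < LEAGUE_BABIP]

print(f"mean BABIP of strikeout-rate leaders: {mean_babip:.3f}")
print(f"{len(below_league)} of {len(leaders)} leaders below {LEAGUE_BABIP:.3f}")
```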

Pitchers who allow less contact also tend to see weaker contact from hitters. While the batter, the defense and luck have a larger influence on whether a batted ball lands for a hit, pitchers have a role, too. Ground-ball pitchers allow more hits on balls in play, but fewer extra-base hits.

Looking at all 3,328 pitcher-seasons (with at least 40 innings pitched) between 2002 and 2010, I sorted players into four groups by strikeout rate. BABIP falls with each step up in strikeout rate, and HR/FB falls or holds steady.

STRIKEOUT GROUP  BABIP  HR/FB
HIGH             .286    9.1%
MEDIUM-HIGH      .295   10.2%
MEDIUM-LOW       .298   10.7%
LOW              .301   10.7%
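The grouping behind a table like this can be sketched as follows. Two assumptions go beyond what the article states: the four groups are taken to be equal in size, and each group’s figure is a simple unweighted mean. The function name `group_means` and the eight sample seasons are hypothetical.

```python
def group_means(seasons, field, n_groups=4):
    """Sort seasons by strikeout rate (highest first), split them into
    n_groups near-equal chunks, and return the mean of `field` per chunk."""
    if len(seasons) < n_groups:
        raise ValueError("need at least one season per group")
    ordered = sorted(seasons, key=lambda s: s["k_rate"], reverse=True)
    size, extra = divmod(len(ordered), n_groups)
    means, start = [], 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)  # spread any remainder over the first groups
        chunk = ordered[start:end]
        means.append(sum(s[field] for s in chunk) / len(chunk))
        start = end
    return means


# Hypothetical pitcher-seasons, for illustration only. Not real data.
sample = [
    {"k_rate": 0.30, "babip": 0.280}, {"k_rate": 0.28, "babip": 0.290},
    {"k_rate": 0.24, "babip": 0.292}, {"k_rate": 0.22, "babip": 0.298},
    {"k_rate": 0.18, "babip": 0.296}, {"k_rate": 0.17, "babip": 0.300},
    {"k_rate": 0.13, "babip": 0.305}, {"k_rate": 0.12, "babip": 0.299},
]

labels = ["HIGH", "MEDIUM-HIGH", "MEDIUM-LOW", "LOW"]
babip_by_group = group_means(sample, "babip")
for label, value in zip(labels, babip_by_group):
    print(f"{label:<12} {value:.3f}")
```

Passing a different field name, such as an HR/FB column, would produce the second column of the table in the same way.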

Pitchers do have some control over their BABIPs, but there’s too much noise in their actual numbers to infer their true skill levels. Weaver has had BABIPs as low as .238 and as high as .316 during his career. Still, even small samples of peripherals say more about BABIP skill than BABIP itself can.

On the extreme end, SIERA assumed the lowest BABIPs in 2010 for:

Jered Weaver (.278 predicted, .277 actual)

Ted Lilly (.280 predicted, .254 actual)

Phil Hughes (.284 predicted, .275 actual)

Colby Lewis (.284 predicted, .277 actual)

Matt Cain (.285 predicted, .254 actual)

It assumed the highest BABIPs for:

Jon Garland (.307 predicted, .268 actual)

Paul Maholm (.306 predicted, .332 actual)

Fausto Carmona (.306 predicted, .284 actual)

Rick Porcello (.305 predicted, .308 actual)

Clay Buchholz (.305 predicted, .263 actual)
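One rough check on the two lists above is to compare the average miss of SIERA’s implied BABIPs with the average miss from simply assuming the .297 league-average figure for everyone. The sketch below does that for these ten pitchers only. It is arithmetic on a hand-picked set of extremes, not a test of the metric, and the variable names are made up for the example.

```python
# (pitcher, SIERA-implied BABIP, actual 2010 BABIP) from the two lists above.
pitchers = [
    ("Jered Weaver", 0.278, 0.277),
    ("Ted Lilly", 0.280, 0.254),
    ("Phil Hughes", 0.284, 0.275),
    ("Colby Lewis", 0.284, 0.277),
    ("Matt Cain", 0.285, 0.254),
    ("Jon Garland", 0.307, 0.268),
    ("Paul Maholm", 0.306, 0.332),
    ("Fausto Carmona", 0.306, 0.284),
    ("Rick Porcello", 0.305, 0.308),
    ("Clay Buchholz", 0.305, 0.263),
]

# The league-average BABIP the article says FIP and xFIP assumed for 2010.
LEAGUE_BABIP = 0.297

# Mean absolute error of each approach over the ten pitchers.
mae_siera = sum(abs(pred - actual) for _, pred, actual in pitchers) / len(pitchers)
mae_flat = sum(abs(LEAGUE_BABIP - actual) for _, _, actual in pitchers) / len(pitchers)

print(f"mean miss, SIERA-implied BABIP: {mae_siera:.4f}")
print(f"mean miss, flat league average: {mae_flat:.4f}")
```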

While Weaver’s strikeout-inducing prowess makes him an extreme example, SIERA is on the money more often than not. Time after time, a player’s future ERA is closer to his SIERA than to any similar pitching metric. SIERA does this not by factoring in projection adjustments, such as reversion to the mean and aging, but by answering the most basic question that can be asked about a pitcher: How well did the guy actually pitch?

The next four stories will:

1. Discuss the previous research I’ve done on pitching and how SIERA utilizes it. Discuss the SIERA changes for FanGraphs. Introduce the new formula and explain why it works.

2. Discuss pitchers with large differences in their xFIPs and SIERAs and explain what they teach us about pitching.

3. Test SIERA against different ERA estimators.

4. Discuss attempted SIERA changes that didn’t work.




Matt writes for FanGraphs and The Hardball Times, and models arbitration salaries for MLB Trade Rumors. Follow him on Twitter @Matt_Swa.


86 Responses to “New SIERA, Part One (of Five): Pitchers with High Strikeouts Have Low BABIPs”

  1. eric says:

    can somebody explain to me that point about 4 hitters leading in WAR and 8 pitchers leading in ERA? I’m a bit confused.

    • Ari Collins says:

      WAR is more consistent than ERA, which fluctuates wildly. Thus you have more different pitchers leading in ERA than you have different hitters leading in WAR.

    • jay says:

      “while the league-leading run producers might be predictable, league-leading run preventers change almost every year”

      Basically, it’s a lot more difficult to predict the ERA leaders than the WAR leaders. Personally, I would argue that difference is probably due to the comparative strength of WAR (ERA is a terrible predictive stat).

    • MikeS says:

      In addition to the accurate replies given, it may also have something to do with the fact that Barry Bonds (3 times) and Albert Pujols (4) don’t pitch. Two guys may be skewing this sample quite a bit.

  2. Stormin' Norman says:

    I’m intrigued, but is there any fantasy aspect to SIERA? I mean, FIP and xFIP have been shown to be relatively good predictors of future performance, more so xFIP, but does SIERA have any ties to the fantasy community?

    • Max says:

      It predicts ERA right? I would say yes, this should help in fantasy. The way I am reading it is that while SIERA doesn’t necessarily use aging curves like some other stat predictors, it makes up for that and more by using rates that stay relatively consistent year-to-year, and also have good correlation with BABIP, and then also (ideally) ERA.

    • Phillie697 says:

      Think of SIERA as a better version of FIP and xFIP. Does that answer your question?

      • Temo says:

        There’s value in SIERA, but it’s not necessarily better than FIP. At least it wasn’t really “better” than FIP in its first incarnation, more than a year ago.

        However, I want to read the rest of Matt’s articles before I form a more complete opinion.

      • Matt Swartz says:

        Last year’s version was more predictive than FIP, but not considerably better than xFIP. This year’s version is a little more predictive than both xFIP and old SIERA. Thursday will break this down in more detail.

    • Yirmiyahu says:

      FIP and xFIP are useful to you in fantasy because they are better predictors of future ERA than ERA itself is. SIERA works along the same lines as FIP and xFIP, but apparently does an even better job of predicting future performance.

  3. Dave Cameron says:

    Just to add a quick point to what Matt is showing, pitchers with high strikeout rates tend to be fly-ball guys more often than ground-ball guys. You’ll probably note that most of the examples that Matt gives of low BABIP projections from SIERA are heavy fly ball guys.

    So, while it’s a true statement that high strikeout pitchers can post lower than average BABIPs, it’s probably because of the relationship between strikeouts and fly balls. I would not encourage you to take the headline and assume that a high K/high GB pitcher will post a low BABIP.

    • Yirmiyahu says:

      Dave, have you or Swartz/Seidman done any research in regards to this?

      I mean, there could be a correlation between high strikeouts and low BABIP’s independent of GB/FB tendencies. It makes logical sense, if a guy induces a lot of swing-and-misses, he may also induce a lot of swing-and-not-quite-misses.

      • Dave Cameron says:

        It’s a matter of how he generates those swing-and-misses. A lot of pitchers do it by throwing a lot of high fastballs. If you can get the hitter to chase, it’s a pitch that’s very hard to make contact with, especially if it has some oomph behind it. But, it needs precision command – if you throw it too high the opposing batter won’t swing, and if you throw it too low, it’s batting practice.

        Sure, it could be that some pitchers are able to place their high fastballs in just the right spot where they’ll generate a lot of weak fly balls even when batters make contact (this is basically the Weaver/Cain explanation theory), but trying to separate that out as a repeatable skill is hard. If a pitcher’s command slips at all, that pop up can easily become a rocket. There’s not a lot of margin of error when you’re throwing fastballs at the top of the strike zone.

      • Yirmiyahu says:

        I understand that, and I understand that fastball location explains the correlation between high K rate and flyball tendencies. But I’m just curious if you or anyone has studied the data to try to tease out whether there might be an independent correlation between K rate and BABIP.

        Based on the way that SIERA was developed, I would assume that Swartz & Seidman would know.

      • Garrett says:

        I think Dave just wrote a very verbose “No”.

      • I would guess that some of the multiple regression analysis done to come up with SIERA can also separate some of the independent effects of K rate and flyball rate on BABIP, if BABIP is used as the response variable.

        Dave/Matt, were FB rates and K rates checked simultaneously against babip with interaction effects?

      • Matt Swartz says:

        Yes, they were both significant. Walks approached significance too (more BB lower BABIP) but the effect was minimal compared to K & FB which were both independent, conditional on the other.

      • THANKS!

        Were BB and K rates by batter faced, or by inning used? I feel like there might be a natural correlation between BABIP and BB/9 and K/9. I’m just thinking that if BABIP goes up due to some randomness, it may naturally increase a pitcher’s K/9 and BB/9 just because his chances to do so have been artificially increased.

        Thoughts?

    • Temo says:

      And if you find a high-K/high-GB guy, you’ve got Roy Halladay and you don’t care what his expected BABIP is :p

      • Yirmiyahu says:

        The Red Sox have been actively collecting these guys. Lester, Beckett, Buchholz, Lackey, Miller all fit the profile, at least on paper.

      • juan pierre's mustache says:

        lackey says forget your paper, i’ll do whatever i want

    • Fish says:

      Dave – If it’s the case that high flyball pitchers have lower BABIPs than high groundball pitchers, why does BABIP on your site take HRs into account?

      I’ve wondered this for a while, this seems like the relevant conversation to ask it.

  4. Ari Collins says:

    Sounds like a fantastic series.

  5. Sky Kalkman says:

    Did you park-adjust any of this? It’s plausible that pitchers can attack the zone more in parks that decrease BABIP and HR/FB.

    Also, could you show the GB and/or FB rates for the four SO groups? Other research has shown that FB pitchers post better BABIP and HR/FB rates than GB pitchers. (Actually, that’s part of SIERA, no?)

    • evo34 says:

      I think this is an important point. I would imagine that for parks, a high K-rate is correlated with a low HR/FB rate.

  6. Nate RFB says:

    Has there been any consideration to utilizing SIERA, tERA, or xFIP in WAR calculations instead of FIP? It sure seems like most people would now say that those three are more indicative of a pitcher’s peripherals value/ability than FIP at this point of time. I know I would be very interested in how much pitcher WAR would change with that as a base instead of FIP, just for curiosity’s sake.

    • Ben Hall says:

      I know Dave’s answer to using FIP in the past has been that actual HR/FB rates make sense for WAR, since WAR reflects what happened, not what we would expect to happen. SIERA and xFIP (and tERA, I think, but can’t remember) use expected HR/FB rates. So even if a pitcher gave up more home runs than we would expect (even if it was a couple of “just enoughs”) they were still home runs, and they still hurt the team.

      • Yirmiyahu says:

        Yeah, this is the explanation I’ve seen in the past. But it is inconsistent. Why exclude LOB% luck from WAR, but not exclude HR/FB% luck?

        Btw, tERA doesn’t use expected HR rate; it uses actual HR’s. I used to be a fan of the stat, until I realized that. It does incorporate batted ball data to come up with expected runs on the in-play stuff. Combined with the fact that it is so line-drive dependent (a function of luck/scorer bias, from what I’ve read), I don’t think it’s terribly useful.

      • Nate RFB says:

        I feel like if you’re going to go that route, you might as well just use RA. Why the mixture of peripherals like K’s/BB’s value and luck-based value like HR/FB? I like having a difference between Fangraphs and B-R just for comparison’s sake so having theirs based around RA is fine, but I can’t help but feel that Fangraphs’ version could be improved. And what SIERA is doing looks really nice, since while I do like xFIP quite a bit it ultimately punishes high FB%, whereas SIERA appears to take into account that high-FB% pitchers don’t tend to have high HR/FB%.

        The long and short of it is, I think greater “value” can be currently ascertained from other rate statistics besides FIP at this point of time. Though historical WAR would have to rely on it, of course.

  7. Jeremy says:

    SIERA, FIP, and xFIP still have a huge flaw. That is the assumption that all balls in play are equal from pitcher to pitcher. This assumption originated during the steroid era and was more valid then because every hitter was essentially the same. Now that steroids are diminished in MLB, not every hitter is a power hitter so you cannot just rely on BB, K, and HR to determine a pitcher’s value. SIERA takes it further in assuming that high strikeout pitchers see weaker contact from hitters. They cherry pick their supporting data by only using the best pitchers from each season. When it comes to average pitchers, these metrics are much more flawed.

    • Phils Goodman says:

      Metrics that have evolved from DIPS theory don’t assume that all batted balls in play are actually equal. They just don’t attempt to measure the difference, because they acknowledge the problem of separating pitcher skill from noise. When you’re trying to evaluate pitcher skill by starting all over from scratch, it makes sense to build on the outcomes we know the pitcher has the most influence on. Of course these limitations will produce flawed metrics, but the fact that these metrics are less flawed than ERA indicates that ERA is measuring an awful lot of noise.

    • Yirmiyahu says:

      SIERA does not assume that all balls in play are equal. It assumes that high strikeout pitchers induce weaker contact. It assumes that groundball pitchers induce fewer doubles/triples. It assumes that if a guy allows a lot of men to reach base, that HR’s will hurt him more than a guy who doesn’t allow many baserunners.

      I don’t know where you come up with the idea of cherrypicking data. The regression analysis was based on 3328 pitcher-seasons. Every pitcher with 40+ innings. For 9 seasons. Look at the chart in the article, which reveals that low strikeout pitchers have high BABIP’s and high HR/FB rates. That’s taken into account.

      I think SIERA addresses exactly the kind of doubts you have about FIP and xFIP.

      • Phils Goodman says:

        Well, SIERA would still count any two ground balls roughly the same way (for instance, a slow roller to the first base bag and a chopper over the 3Bs head). So the stats are still “flawed” in the sense that they rely on a certain level of abstraction to account for what they cannot grasp descriptively. But that’s just the “nature of the beast,” as they say.

      • Matt Swartz says:

        SIERA will not count all ground balls the same. It will count all ground balls from pitchers with similar peripherals the same. Guys like Derek Lowe, Tim Hudson, Brandon Webb- these guys will all generate choppers that are easy to field and will be treated as such.

      • Fish says:

        Why the assumptions on hard contact? Isn’t that something we can now confirm with all the available video? I realize it’s somewhat subjective, but there can certainly be a consensus on most balls.

        As unscientific as “consensus” and “most” sound, that’s better than “assumption”, right? We’re at the point that we can KNOW if pitchers induce weak contact or not.

    • Phillie697 says:

      Jeremy,

      While not all hitters are created equal, pitchers don’t face the same hitter in every at-bat either. I think it would be next to impossible to distinguish pitchers by the quality of hitters they face (I mean, people have talked for years about AL East pitchers perhaps needing some adjustment, but no one has ever really done anything along those lines). If you can find a way to do it, you’d be the next big shot in saber circles. Go to it :)

    • joser says:

      During the steroid era every hitter was essentially the same? Really? When exactly was the steroid era, then, because it apparently happened outside of the careers of both Tony Gwynn and Ichiro.

      And I’d say all balls in play are more equal from pitcher to pitcher than they are from batter to batter, at least within the categories described (how different can one pop-up be from the next, regardless of pitcher? Are there weak grounder pitchers vs hard grounder pitchers in some important way that isn’t captured by BABIP?)

      • Matt Swartz says:

        SIERA treats ground balls from ground ball pitchers differently. This is explained in more detail tomorrow, but it is factored in.

  8. JobaWockeeZ says:

    For a one year evaluation would SIERA or tERA be a better bet?

  9. Babip Avengers says:

    Will we see SIERA on the leader board pages? Any chance we can get a synopsis of how it will be available at FG?

    • Phillie697 says:

      It’s already there… Just click on the Advanced tab.

      • Babip Avengers says:

        Indeed it is, thank you very much (SIERA shows up under batted ball for individual players, so I was looking for it under the same tab on the leader boards). Thanks again.

  10. Phils Goodman says:

    I’d love to see a leader-board for pitcher sWAR (it would be the same process as pitcher fWAR, but plug in the SIERA instead of the FIP). I’d do it myself if I knew the formula for converting FIP-over-innings pitched into WAR.

  11. Jonas says:

    Great addition to Fangraphs. But can anyone explain why this same article was posted on Fangraphs a week ago, only to be removed quickly, and now reposted?

  12. Welp says:

    How long has Fangraphs misspelled “independent” in FIP (and now SIERA) without me noticing? Sheesh.

    Anyway, awesome news.

  13. Matt Swartz says:

    Hi, everyone! I’ll try to answer more of these questions later, but the short points to make.

    1) Dave is right about FB%. I didn’t focus on the low BABIP & HR/FB of fly ball pitchers, but that group is factored in too. Pitchers with high K% have low BABIPs & HR/FBs, as do pitchers with high FB%. So the leaderboard in expected BABIP includes pitchers who are good at both, because independent of K%, higher FB% -> lower BABIP; and independent of FB%, higher K% -> lower BABIP.

    2) Yes, it’s tested well at predicting future ERA. This will be discussed in more detail in Part 4 on Thursday.

    3) The ERA used in the regression is park-adjusted. I discuss some of the park-adjustment experiments in Part 5 on Friday.

    4) SIERA leaderboards are under the “advanced” tab in the pitching leaderboards.

    5) For clarity, the examples in the article were cherry-picked but the whole formula is not, nor is the testing.

    • Pierre says:

      I think the poster meant park-adjusting the output of the equation (i.e. the SIERA). Presumably, the same inputs (i.e. “peripherals”) yield different results in Fenway v Petco, etc.

  14. domingoes says:

    Anyone else ever notice over the past year or so that Johan Santana’s name always links to Johan Santa, a Rangers prospect from the Dominican summer league?

  15. Jason says:

    I don’t know if the question is inappropriate for this format, but is there reason to believe, Matt, that BP will adjust their SIERA formula as well?

    • Matt Swartz says:

      I don’t know, but I don’t think so. I developed the version that they have there now and I didn’t do the adjustments until I started at FanGraphs. Anyone will have access to the formula, since it will be in tomorrow’s article.

  16. AustinRHL says:

    I’m very happy to see SIERA finally come to Fangraphs – it’s long overdue, IMO. It instantly becomes my favorite pitching statistic.

    To keep pushing on the relationship between fly balls, strikeouts, and BABIP – did you do a regression analysis to see if either FB% or K% gave no new information once the other was present? It doesn’t seem obvious which one is the primary driver of low BABIP, or if they contribute equally.

    • Matt Swartz says:

      Yes, both were very significant actually after controlling for the other. That’s why the guys listed above at the bottom are all extreme SO/FB pitchers on the low BABIP list and extreme contact/GB pitchers on the high BABIP list.

  17. jim says:

    Wait, is this the sabr community finally, officially admitting that pitchers have a degree of control over the type of contact they induce?

    • Temo says:

      It’s long been accepted that pitchers control GB%, which is a type of contact that is induced. I think the same can be said of infield fly ball %.

      BABIP, however, involves a lot of noise- both random noise and the type of noise that indicates hitter skill rather than pitcher skill. Pitcher skill in BABIP is probably the least influential in determining BABIP, and I don’t think this article disputes that notion.

  18. IvanGrushenko says:

    Why does SIERA only go back to 2002 on Fangraphs and back to 1950 on Baseball Prospectus?

    • Yirmiyahu says:

      That’s interesting. Batted ball data only goes back to 2002. Without that, SIERA would only have K/BB/HBP to look at.

      I guess there’s groundout/airout data to consider, but that’s not reliable and is quite different from GB/FB. I guess you could come up with a formula to convert GO/AO to GB’s/FB’s, and then stick that data into the SIERA formula.

      Curious what the deal is with BP’s pre-2002 SIERA formula.

  19. Pretentious Polyester Poodle says:

    ’99 Pedro – 37.5 K%, .323 BABIP. How’s that for unlucky?

    • Temo says:

      He had a decent defense too.

      Did have the monster to deal with though. Then again, he had a .333 BABIP at home and .314 BABIP on the road. Not awful splits. And his BABIP went all the way down to .236 the next year (.261 home/.218 road).

      So yea, BABIP is still very random.

    • Yirmiyahu says:

      2009 Ricky Nolasco: 5.06 ERA, 3.23 SIERA.
      2010 Clay Buchholz: 2.33 ERA, 4.29 SIERA.

  20. evo34 says:

    Love that Swartz is writing here, and the idea of improving any metric. But I think it should be pointed out that SIERA is not well suited as an in-season fantasy ERA predictor. It fails to account for park effects on flyballs, and so ERA will usually be lower than SIERA for teams like the Padres and Giants, and it will usually be higher than SIERA for teams like the Rockies and D-Backs. And it tries to remove defense, which is non-negligible for fantasy ERA predictions.

    I hope Swartz can find an easy way to factor park effect and fielding back into a version of SIERA for in-season fantasy prediction purposes.

  21. Joel says:

    I always look to Tim Wakefield when evaluating these numbers. Here’s a guy that confounded DIPS from the very beginning, as he’s a flyball pitcher who elicits weak contact. Even more confounding is the fact that his K/9 rates haven’t been very high for quite some time, although he frequently gets batters to swing and miss.

    Long story short, Wakefield’s career FIP is 4.71, which is about 1/3 run higher than his career ERA of 4.39. With over 3000 innings to his credit, this is not a sample effect. Interestingly, xFIP doesn’t fare much better. However, SIERA does, although those numbers only track back to 2002.

    I recall some discussion a long time back that Maddux had a similar ability to suppress runs relative to his peripherals.

    • Yirmiyahu says:

      SIERA is not more accurate than FIP when it comes to forecasting Wakefield’s ERA. Since 2002:

      ERA FIP xFIP tERA SIERA
      4.37 4.57 4.76 4.53 4.59

      Having said that, I think it’s kind of silly to evaluate any DIPS metric by using Wakefield. He’s the one guy who should break the rules. Unlike normal pitchers, Wakefield has exhibited a repeatable skill in the areas of IF/FB and LD%.

  22. FairweatherFan says:

    SIERA hates Barry Zito, putting his career mark at 4.70, almost an entire run higher than his career 3.88 ERA.

    Over 2200+ innings. Interesting.

    • evo34 says:

      Some of that (about 0.30 runs) is due to his environment. A’s and Giants SPs outperformed their SIERAs during the years Zito was with each. This is due to park and defense.

  23. ctt8410 says:

    Has there been any thought given to a SOS adjustment?

    For example, the average hitter that David Price has faced this season is hitting .265/.334/.425 total against all pitchers faced.

    The average hitter that CJ Wilson has faced is hitting .255/.319/.390 total against all pitchers faced.

    It’s not immediately obvious to me how much difference this would make, but it seems like we need to adjust pitcher stats to a league-average hitter to remove all factors that are outside the pitcher’s control.

    • Matt Swartz says:

      It’s an interesting idea. I’ve never quite figured out how to do it, because the difference between good hitters and bad hitters is partly that they do more damage with their GB & FB and hit more LDs, so it’s partly adjusted based on that. But better hitters also hit more FB, strike out less, walk more. So I’d need to separate out those two values.

    • evo34 says:

      Excellent point. I think that’s the next major step in player evaluation. Makes no sense that stats accumulated by an Orioles player should be viewed the same as those by a Twins player. They play very different schedule strengths.

  24. russel58 says:

    I am newer to the site and these concepts, but is it rare for a pitcher to have his actual ERA, xFIP and SIERA all be exactly the same? Roy Halladay currently has a 2.45 for all three. Great info guys!

    • beaster says:

      I think it proves that Halladay is in fact a cyborg – engineered to defeat armies of ash wielding humanoids

      • Matt Swartz says:

        Yes, and apparently the way to beat a cyborg is to overheat it, per Doc’s experience at Wrigley tonight.

      • beaster says:

        There are currently two theories floating around the internet as to why Philadelphia Phillies starter Roy Halladay had to leave last night’s game against the Chicago Cubs without recording a single out in the fifth inning: 1) temperatures approaching 40 degrees Celsius caused last year’s National League Cy Young Award winner to experience a form of heat exhaustion; and 2) the Phillies forgot to bring the proper machinery coolant with them when they traveled to Chicago and didn’t want to further damage the corrosion-resistant tantalum-based alloy that comprises a third of Halladay’s structure.

  25. Nathaniel Dawson says:

    Matt, any chance you would scale SIERA to R/9, rather than ERA? I know, I know, everybody is used to looking at ERA, but I think it’s clear that R/9 is superior to ERA. We get better information from it, and at some point, we really should supplant ERA with R/9. This would be a great opportunity to start moving people in that direction.

  26. Shaun Catron says:

    you think Shields’ ERA/BABIP this season have anything to do with Sam Fuld patrolling the outfield? may sound foolish but if i remember correctly the Rays outfield saves the most runs.

  27. FFFFan says:

    SIERA is to xFIP, as wOBA is to OPS.

    Takes the underlying performance elements (and adds some elements) and weighs them by their impact on runs.

  28. adohaj says:

    It’s nice to see someone say that BABIP does not equal luck
