Hidden statistics are the bread and butter of any good analysis, but most DIPS models rarely go beyond the obvious to find a player’s true value. FIP can take you most of the way by looking at the three true outcomes (home runs, walks and strikeouts), and xFIP adjusts FIP to what it would be with a league-average HR/FB rate. But neither of those systems considers how well pitchers control more volatile statistics, the ones that take up the other 70% of plate appearances. Now, with the new SIERA here at FanGraphs, we’re finally gathering the kernels that’ll help all of us figure out the small things that make good pitchers, well, so good.
A year after its release, SIERA has undergone some important changes, which we’re highlighting this week. I think you’ll like what you see. FanGraphs’ new-and-improved ERA estimation system now uses different proprietary data, takes more interactions and quadratic terms into account when reaching its conclusions, treats starters and relievers differently and adjusts for run environment. In other words, the new SIERA does an even better job analyzing pitching skills.
The run-environment tweak is perhaps the most important part of the SIERA version at FanGraphs. Almost immediately after the initial version was rolled out in 2010, run scoring began to decline in baseball. As a result, the initial SIERA version at Baseball Prospectus now runs about .25 runs higher than the league-average ERA. At FanGraphs, the constant term for SIERA will be adjusted yearly, as it is in xFIP and FIP, so that it better approximates the current run environment and gives a more accurate statistical assessment.
The differences between the coefficients in bpSIERA (Baseball Prospectus SIERA) and fgSIERA (FanGraphs SIERA) are summarized in the table below.
|Variable||bpSIERA coefficient||fgSIERA coefficient|
|Year coefficients (versus 2010)||–||From -0.020 to +0.289|
|% innings as SP||–||0.367|
(where netGB=(GB-FB), and where +/-(netGB/PA)^2 is + when GB>FB and – when GB<FB.)
Add the following numbers to the constant term for each year:
For now, the constant for 2011 is -.210 to mimic the league-average ERA, though this will be updated after the season’s end.
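The yearly adjustment can be pictured as a per-season offset added to an otherwise fixed estimate. Here is a minimal Python sketch, assuming a hypothetical `base_estimate` (the value before any year term is applied) and reading the -.210 figure as the provisional 2011 offset relative to the 2010 baseline. The function and dictionary names are illustrative only, and seasons the text does not give a number for are left out rather than guessed.

```python
# Offsets relative to the 2010 baseline. Only values stated in the article
# are included; other seasons are deliberately omitted.
YEAR_OFFSETS = {
    2010: 0.0,     # reference season ("versus 2010")
    2011: -0.210,  # provisional; assumed here to be the 2011 offset
}


def apply_year_offset(base_estimate: float, season: int) -> float:
    """Add the season's run-environment offset to a pre-adjustment estimate."""
    if season not in YEAR_OFFSETS:
        raise KeyError(f"no offset listed for {season}")
    return base_estimate + YEAR_OFFSETS[season]


# Same hypothetical peripherals (a 3.80 pre-adjustment estimate) in two seasons.
print(round(apply_year_offset(3.80, 2010), 2))
print(round(apply_year_offset(3.80, 2011), 2))
```

Keeping the offsets in a lookup table means the end-of-season update only touches one number.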
While most of the relationships between skills and run prevention were already accounted for in the original SIERA, many of these make more sense than before. During my research, I’ve discovered several important things about how pitchers keep runs off the scoreboard — such as:
• Pitchers with more strikeouts have a lower HR/FB ratio.
Example: Tim Lincecum
Simply put, Lincecum is among the best at preventing fly balls from flying too far. With a career HR/FB of 7.2%, hitters struggle to launch baseballs against him, even when they put them in the air. It’s a pretty simple concept: Pitchers who allow less contact see weaker contact from hitters. Because SIERA simply looks at the ERAs that high-strikeout-rate pitchers like Lincecum actually post, it gives them credit both for the run-prevention effects of the strikeout itself and for the lower BABIPs and HR/FBs they generate.
• When low-walk pitchers give up a walk, it doesn’t hurt them as often.
Example: Tom Glavine
Glavine walked 15% of his opposing batters with first base open — even after netting out intentional walks — but he only walked 6% of hitters the rest of the time. While Glavine might be the most extreme example, most control-wizards’ walks are more often unintentionally intentional. Essentially, better pitchers risk walks when making a mistake is less risky.
• Pitchers with more strikeouts have lower BABIPs.
Example: Jered Weaver
Pitchers who allow less contact also see weaker contact from hitters. Overall, the average BABIP for high-strikeout pitchers is lower than that for low-strikeout pitchers.
Yesterday’s article discussed this in detail.
• Pitchers with more strikeouts get more ground balls in double-play situations.
Example: Cole Hamels
Hamels has a career 48.1% ground-ball rate in double-play situations, but just a 42.6% ground-ball rate the rest of the time. On average, this effect is only slight, but it’s notable. Situational pitching is a skill when you control the strike zone.
• Relief pitchers have lower BABIPs and HR/FB.
Billy Wagner had a career BABIP of .261, a number almost impossible for a starter to match. Wagner got one inning at a time, which meant he could unleash what he had without conserving energy.
But perhaps the best example is Tom Gordon, a pitcher who both started and relieved during his 21-year major-league career. As a starter, Gordon had a .300 BABIP. As a reliever, it was .275.
Because relievers consistently allow lower BABIPs and HR/FBs, SIERA adds 0.37 runs to the estimate for a pitcher who throws all of his innings as a starter, compared with one who throws all of them in relief.
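As a rough illustration of the starter/reliever term, the sketch below scales the 0.367 coefficient from the table by the share of innings thrown as a starter. It assumes the share is expressed as a fraction between 0 and 1 and that innings are given as plain decimals (not the .1/.2 box-score notation). The function name is hypothetical.

```python
SP_INNINGS_COEF = 0.367  # fgSIERA coefficient on "% innings as SP" from the table


def starter_adjustment(ip_as_starter: float, ip_total: float) -> float:
    """Runs added to the estimate for the share of innings thrown as a starter."""
    if ip_total <= 0:
        raise ValueError("ip_total must be positive")
    share = ip_as_starter / ip_total  # assumed to be a 0-1 fraction
    return SP_INNINGS_COEF * share


print(starter_adjustment(200.0, 200.0))  # full-time starter
print(starter_adjustment(0.0, 70.0))     # full-time reliever
print(starter_adjustment(60.0, 120.0))   # swingman, half of his innings as a starter
```

Under these assumptions a swingman lands between the two extremes in proportion to how much he starts.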
• The more base runners a pitcher allows, the higher the percentage of them will score.
Put me on the mound, and I’ll load the bases. And how many of the hits I give up will score runs? That’s pretty obvious.
Now put nine of me at the plate and figure out how often I’d find myself in scoring position. Again, pretty obvious.
Considering these examples of lopsided competition, the pitcher who constantly has runners on base is going to benefit more from a strikeout. The pitcher who rarely has runners on base will suffer less from occasional contact.
The new SIERA includes a coefficient on strikeout-rate squared, which allows a strikeout’s effect to be different for pitchers with higher and lower K rates. Simply put, a pitcher won’t decrease his SIERA as much when his strikeout rate goes from 24% to 25% as he will when his strikeout rate goes from 14% to 15%.
Similarly, a walk with a man on first base puts a runner in scoring position, while a walk with the bases empty does not. SIERA allows a walk to have the snowball effect of a real-life situation. A jump in walk rate from 4% to 5% wouldn’t increase a pitcher’s SIERA score as much as a jump from 14% to 15%. This walk-rate-squared term also is an addition to the new SIERA.
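The effect of the two squared terms can be pictured with a small numeric sketch. The coefficients below are invented purely for illustration and are not SIERA’s actual values. Only their signs are chosen so that the marginal effects behave as described above: each extra point of strikeout rate helps a little less, and each extra point of walk rate hurts a little more.

```python
# Hypothetical coefficients, NOT the real SIERA values.
K_LINEAR, K_SQUARED = -16.0, 8.0    # strikeouts lower the estimate, with diminishing returns
BB_LINEAR, BB_SQUARED = 10.0, 7.0   # walks raise the estimate, with a snowball effect


def quadratic_term(rate: float, linear: float, squared: float) -> float:
    """Contribution of one rate stat with a linear and a squared coefficient."""
    return linear * rate + squared * rate ** 2


def marginal_effect(rate_from: float, rate_to: float,
                    linear: float, squared: float) -> float:
    """Change in the estimate when the rate moves from rate_from to rate_to."""
    return (quadratic_term(rate_to, linear, squared)
            - quadratic_term(rate_from, linear, squared))


# One extra point of strikeout rate, at a low and at a high starting level.
print(marginal_effect(0.14, 0.15, K_LINEAR, K_SQUARED))
print(marginal_effect(0.24, 0.25, K_LINEAR, K_SQUARED))

# One extra point of walk rate, at a low and at a high starting level.
print(marginal_effect(0.04, 0.05, BB_LINEAR, BB_SQUARED))
print(marginal_effect(0.14, 0.15, BB_LINEAR, BB_SQUARED))
```

With a purely linear model, both strikeout lines would print the same number, and so would both walk lines. The squared terms are what let the size of the step depend on where the pitcher starts.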
By incorporating an interaction term, SIERA allows the importance of ground-ball rate to vary with strikeout rate. Pitchers who strike out fewer hitters need ground balls more often to generate double plays, and fly balls hurt them more often.
On the other hand, pitchers who K enough batters to keep the bases empty aren’t hurt as much by an occasional home run. For example, the average home run in 2010 knocked in 0.6 extra runners — but home runs against Johan Santana only scored 0.4 extra runners. This home-run mitigation is what I call the “Johan Santana effect.”
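The arithmetic behind the Santana example is simple enough to sketch. Every home run scores the batter plus whoever is on base, so the runs per home run are one plus the extra runners. The 20-homer total below is a hypothetical round number, not Santana’s actual figure, and the function names are illustrative.

```python
LEAGUE_EXTRA_RUNNERS_2010 = 0.6  # from the article
SANTANA_EXTRA_RUNNERS = 0.4      # from the article


def runs_per_home_run(extra_runners: float) -> float:
    """The batter always scores, plus the average number of runners on base."""
    return 1.0 + extra_runners


def runs_saved(home_runs: int) -> float:
    """Runs saved over `home_runs` homers relative to the 2010 league average."""
    league = runs_per_home_run(LEAGUE_EXTRA_RUNNERS_2010)
    santana = runs_per_home_run(SANTANA_EXTRA_RUNNERS)
    return home_runs * (league - santana)


# 20 home runs is a hypothetical season total used only for illustration.
print(round(runs_saved(20), 1))
```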
• The more walks and singles that a pitcher allows, the more often a ground ball will induce a double play.
In SIERA’s previous version, the coefficient of walk rate multiplied by ground-ball rate was negative. This time, though, it’s slightly positive. On one hand, ground balls after walks can lead to double plays, erasing potential base runners — and pushing the coefficient towards the negative. But walks after ground-ball singles put runners in scoring position — pushing the coefficient towards the positive.
Overall, both effects are in play, so I let their combined effects shine through by including the small, positive coefficient. For most pitchers, this will only affect the hundredths digit on their SIERA.
• Ground balls become hits more often than fly balls.
BABIP on ground balls last year was .233; on fly balls, BABIP was .137. SIERA assumes higher BABIPs for pitchers with more ground balls. This explains why Lincecum had higher BABIPs than some other fireballers who were discussed in yesterday’s article.
• The more ground balls a pitcher allows, the easier they are to field.
Example: Brandon Webb
The infield defense behind Brandon Webb has been very average in his career, but his BABIP on ground balls is not the league-average .233. During his career, it’s been .205. Webb’s sinker is heavy, and it has a tendency to chop off the bottom of the bat. Nearly every pitcher with Webb-like ground-ball skills allows fewer grounders to go for hits. Fausto Carmona, Roy Halladay, Derek Lowe, Tim Hudson, Chien-Ming Wang and Jake Westbrook all rack up ground outs. This is why the coefficient on net ground-ball-rate squared is actually negative.
In other words, a pitcher who increases his ground-ball rate from 45% to 50% will not help his SIERA as much as a pitcher who increases his from 55% to 60%. Both give up fewer home runs by getting more ground balls, but the latter pitcher also gets more outs on those extra ground balls.
This is probably the most important non-linear term included in SIERA. In the article linked above, I showed that all of the pitchers who had consecutive 60% ground-ball-rate seasons had below average BABIPs on ground balls.
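The signed square from the note under the coefficient table can be computed as below. This is a minimal sketch of the input feature only. It does not include the coefficient itself, and the batted-ball counts in the example are made up.

```python
def signed_net_gb_squared(gb: int, fb: int, pa: int) -> float:
    """Return +/-(netGB/PA)^2: positive when GB > FB, negative when GB < FB."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    net = (gb - fb) / pa
    # net * abs(net) squares the magnitude while keeping the sign of net.
    return net * abs(net)


# Hypothetical pitchers, each facing 800 batters.
print(signed_net_gb_squared(250, 200, 800))  # ground-ball pitcher
print(signed_net_gb_squared(200, 250, 800))  # fly-ball pitcher

# Equal 20-grounder steps move the term more at higher net ground-ball levels.
step_low = signed_net_gb_squared(280, 200, 800) - signed_net_gb_squared(260, 200, 800)
step_high = signed_net_gb_squared(300, 200, 800) - signed_net_gb_squared(280, 200, 800)
print(step_low, step_high)
```

Because the coefficient on this term is negative, the larger step at the high end translates into a larger drop in the estimate for extreme ground-ball pitchers.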
• Pitchers who have higher fly ball rates allow fewer home runs per fly ball.
Example: Matt Cain
With a career 45% fly ball rate, Matt Cain is among the best at keeping his fly balls in the yard. He gives up mostly infield flies and shallow outfield flies, which is why his career HR/FB is just 6.8%. SIERA assumes that pitchers who allow more fly balls have below-average HR/FB rates, and that’s exactly what happens.
• Run scoring has dropped in the past few years.
SIERA now has a term to represent the run environment, net of peripheral statistics. A pitcher with the same peripherals in 2006 and 2010 will have an expected ERA that is .29 runs higher in 2006.
For the researcher, SIERA’s discoveries give a blueprint to better analyze pitching. For the average fan, SIERA factors in each of the pitching tendencies highlighted here, and spits out a neutralized version of ERA that accounts for all of them.
Eyeballing a pitcher’s strikeout rate, walk rate and ground-ball rate will give you a pretty good sense of how well a pitcher has pitched. But juggling the interplay between all of the effects hasn’t been possible. In a way, SIERA frees your mind.
Fans now can look at pitchers like Cain, Glavine, Lincecum and Weaver and get the shorthand explanation behind their successes. For the rest of the pitchers, SIERA makes you think about their performances in ways that other DIPS metrics have traditionally ignored.
The next parts of this series will:
1. Discuss pitchers with large differences in their xFIPs and SIERAs and explain what they teach us about pitching.
2. Test SIERA against different ERA estimators.
3. Discuss some attempted changes to SIERA that didn’t work.