SIERA Mailbag Answers

by Matt Swartz

July 28, 2011

1. What is SIERA? Sounds like xFIP to me. Give me one sentence on it.

SIERA is an estimator of what a pitcher’s ERA would be with average luck, defense, and park, by looking at other pitchers with similar strikeouts, walks, and ground ball rates in recent seasons, and goes a step further than similar estimators by accounting for the BABIPs and HR/FBs of similar pitchers.

2a. Can you give some of the details to the analysis showing that, stripping out the effects of fly balls, high strikeout pitchers have lower BABIP?

2b. How much of a low BABIP is caused by each variable?

Holding starter/reliever role and fly balls constant, a 10 percentage point increase in strikeout rate decreases your BABIP by about six points (.006). So if you struck out 15% of hitters, your BABIP might be .300, but if you brought that up to 25%, your BABIP would be about .294.
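As a quick illustration of that arithmetic, here is a minimal Python sketch of the strikeout rule of thumb. The function name and the assumption that the effect is linear are mine, used only for illustration; the six-point figure is the one quoted above, and this is not the SIERA regression itself.

```python
def babip_after_strikeout_change(base_babip, base_k_rate, new_k_rate,
                                 babip_drop_per_10_points=0.006):
    """Apply the rule of thumb: about .006 of BABIP per 10 points of K rate.

    Rates are fractions of batters faced (0.15 means 15%). Linearity is an
    assumption made for illustration only.
    """
    change_in_k = new_k_rate - base_k_rate
    return base_babip - babip_drop_per_10_points * (change_in_k / 0.10)


# The example from the text: 15% strikeouts at .300, raised to 25%.
print(round(babip_after_strikeout_change(0.300, 0.15, 0.25), 3))  # 0.294
```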

Holding starter/reliever role and strikeouts constant, a decrease in net ground ball rate of ten percent (so maybe a 5% increase in fly balls and a 5% decrease in ground balls) will lower BABIP by about 10 points. So if you had a breakdown of 50% ground balls, 30% fly balls, and 20% line drives and a .300 BABIP, that would drop to .290 if you had 40% ground balls, 40% fly balls, and 20% line drives.

3. Could the correlation between high strikeouts and lower babip be that high strikeout pitchers have men on base at a lower percent (overall, not babip) than a pitcher in all ways similar but with a lower strikeout rate? Since the defense shifts to a different alignment when a man is on first that allows them to hold runners on and turn double plays it would make sense that the defense wouldn’t get to the same number of balls in play when runners are on.

Could be some of it. I looked at the quartiles of strikeout rate from highest to lowest for pitchers with at least 40 IP and did BABIP with bases empty only: .285, .296, .296, .298. Contrast that with overall BABIPs of .285, .295, .298, .301. So definitely some evidence of what you’re talking about but not a huge difference.
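To make the comparison concrete, the short Python sketch below computes the gap between the highest and lowest strikeout quartiles in each split, using only the BABIP figures quoted above. The helper name is hypothetical.

```python
# BABIP by strikeout-rate quartile, highest strikeout rate first (from the text).
bases_empty = [0.285, 0.296, 0.296, 0.298]
overall = [0.285, 0.295, 0.298, 0.301]


def spread_in_points(babips):
    """Gap between the lowest and highest strikeout quartiles, in BABIP points."""
    return round((babips[-1] - babips[0]) * 1000)


print(spread_in_points(bases_empty))  # 13
print(spread_in_points(overall))      # 16
```

The gap shrinks from about 16 points overall to about 13 with the bases empty, so the base-runner effect accounts for only a small share of the difference.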

4. Why do we want to predict future ERA? If we’ve decided that ERA is a “bad” metric for evaluating pitchers, why are we anchoring against it? I understand the fantasy applications, but if we’re trying to get a handle on how good a player is, shouldn’t we consider a better measure of player value?

The point is that RA is better than ERA because unearned runs and earned runs are both partly the fault of the defense and partly the fault of the pitcher. But scaling differently only matters if pitchers with similar SO, BB, and GB rates are more or less likely to give up unearned runs. GB pitchers are slightly more likely to give up unearned runs, but the effect is so minor that SIRA and SIERA are very similar.

5a. Matt, Great work, as always. Been a huge fan since BP Idol and your work on SIERA and home field advantage. My question: Regarding the extreme similarity of SIRA vs SIERA rankings, can’t I just gross up SIERA by 9% (very close to the consistent MLB ratio of total runs over earned runs)?

5b. How do you convert from SIERA into SIRA (turn it into runs, rather than earned runs)? Do you just have to estimate, or is there a change you can make in the formula? Thanks!

Thanks. The formula for SIRA is in Part Five. I’ll try to make a Google spreadsheet where people can plug it in too. Follow me @Matt_Swa for this soon. If you just increase SIERA by 8.5% (that is, multiply it by 1.085), you’ll be off by less than 0.05 about 72% of the time for pitchers with over 100 innings, and within 0.10 about 95% of the time.
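For anyone who wants the shortcut in code, here is a minimal Python sketch of the gross-up approximation. The function name is hypothetical, and this is only the 8.5% shortcut described above, not the actual SIRA formula from Part Five.

```python
def approximate_sira(siera, gross_up=0.085):
    """Approximate SIRA by increasing SIERA by a flat percentage.

    The 8.5% default is the shortcut from the text; it is not the real
    SIRA regression.
    """
    return siera * (1.0 + gross_up)


# Example: a pitcher with a 4.00 SIERA.
print(round(approximate_sira(4.00), 2))  # 4.34
```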

6. Can you type out exactly how the formula should look. I know you posted the coefficients but I cannot find an actual copy of the new formula spelled out in any of the articles.

If you go to Part Two, look at the table. Multiply each variable listed in the leftmost column by the number in the fgSIERA column, and then add all those up. I’ll put the calculator in my Google Spreadsheet.
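The "multiply and add" step can be sketched in a few lines of Python. Everything below is a placeholder: the term names and numbers are made up to show the mechanics and are not the fgSIERA coefficients, which are in the Part Two table.

```python
def weighted_sum(coefficients, values):
    """Multiply each variable by its coefficient and add the results."""
    missing = set(coefficients) - set(values)
    if missing:
        raise KeyError(f"no value supplied for: {sorted(missing)}")
    return sum(coefficients[name] * values[name] for name in coefficients)


# Placeholder numbers only; substitute the fgSIERA column from Part Two.
placeholder_coefficients = {"constant": 1.0, "term_a": -2.0, "term_b": 0.5}
# The constant's "variable" is simply 1; the others are the pitcher's inputs.
placeholder_values = {"constant": 1.0, "term_a": 0.20, "term_b": 0.08}

print(round(weighted_sum(placeholder_coefficients, placeholder_values), 2))  # 0.64
```

Squared and interaction terms fit the same pattern: compute each one as its own variable first (for example, a rate multiplied by itself), then include it in the dictionaries like any other term.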

7. Follow-up question: You have posted comparisons of next year predictiveness (?) for SIERA vs. xFIP, FIP, QERA (which as I recall from your initial research was very weak). How does it do vs. PECOTA ERA predictions? Are those simply QERA adjusted for body type and age?

QERA and PECOTA have nothing to do with each other apart from their originator, and I don’t know how much of Nate Silver’s PECOTA relates to the current version. I assume that regressing to the mean will do better than not, so the RMSE for PECOTA must be lower. I can’t say about correlation, but I’m guessing any projection system worth its salt can beat SIERA’s correlation with next year’s ERA, because a projection system is designed to do so and SIERA is not. I’d be curious how it fared against SIERA*, from Part Four.

8a. Since the correlation of your version of SIERA to xFIP is .94, is it really worth all the complications and the annual tweaking to add an unproven statistic?

8b. Which pitchers have the biggest difference between their xFIP and SIERA, and why?

I broke this down in Part Three. The short answer is that they often differ. You could probably come up with something with a .94 correlation with SIERA that was a terrible ERA Estimator if you really tried. SIERA gives more credit to strikeout pitchers and less to ground ball pitchers.

Annual tweaking isn’t really what I’m doing. The methodology shouldn’t be tweaked; I will just run the same program on new data, just as is done with FIP and xFIP to generate the constant.

9. Did you do a Box-Cox test or something similar on the relationship between the independent variables and ERA? Basically how did you determine a linear model was appropriate?

SIERA is not linear, but ERA is relatively normally distributed, so I didn’t need to run a regression on a transformation of ERA.

10. How do you explain pitchers whose ERAs consistently beat their SIERAs year to year (such as Mark Buehrle)? Are they just recipients of very good luck or do they have some skill that SIERA is not accounting for?

What skill do pitchers possess that is not correlated with amount of contact, angle of contact, and control? Those skills exist and SIERA will miss them. I’m sure that many of the pitchers who regularly beat their SIERAs are simply lucky, but guys like Matt Cain, Jair Jurrjens, Tom Glavine, and Mark Buehrle probably have skills that go beyond SIERA.

11. Considering TangoTiger’s work and the incredible case of Matthew Cain (as well as the Giants as a team), is the next version of SIERA going to roll out soon with HR/FB data appropriately regressed?

SIERA is unique in that it does not regress HR/FB all the way. It regresses it to the average among similar pitchers. Matt Cain is a fly ball pitcher with a decent strikeout rate, so SIERA assumes his HR/FB rate is low, just not as low as it actually is. Including HR/FB in the regression would just be pre-supposing runs. A projection tool should weight a pitcher’s historical HR/FB appropriately, but SIERA is not designed to do that.

12a. What happens to the suggested appearance of multicollinearity in SIERA when you apply Principal Components?

12b. Baseball Prospectus recently put up an article essentially questioning the merits of SIERA. Have you read the article, and if so, how would you respond to their criticisms (i.e., the unnecessary complexity and the risk of confounding variables)?

For the mathematically intrepid, hop into my replies at Tom Tango’s blog here. In short, the RMSE test is misleading because no one would or should evaluate pitcher skill over 1,000 IP. SIERA does test better over 100 IP to 400 IP, which is what it should be used for. Additionally, multicollinearity is not an issue with squared and interactive terms in regressions. The argument he makes boils down to saying strikeouts are correlated with themselves, something that is true and irrelevant. More detail can be found in that thread.

13. Will it be used in WAR instead of FIP any time soon?

It depends how you feel about what WAR should include. Should it include luck? Should it include luck on home runs? FIP is a worse estimator of skill than xFIP and SIERA, but it is a better estimator of what happened. I think it’s fine the way it is, and might even consider going a step further and looking at RA directly (adjusting for park and defense).

14. Does SIERA adequately explain Worley’s season to date?

It explains that he is highly unlikely to keep this up. Every year several rookies (and often Phillies rookies) have lucky BABIPs and people try to rationalize post hoc why this pitcher is likely to keep this up. Vance Worley is an average pitcher, which is a highly valuable asset to have for six years.

15. What is Michael Cuddyer’s SIERA?

Michael Cuddyer’s SIERA is 8.53. He trails Mitch Maier’s 6.21, but leads Wilson Valdez, who is at 9.82. But Valdez is 1-0.

Yirmiyahu

12 years ago

Awesome work, Matt. I love the stat, I’ve loved all your posts.

Should we expect you to post here regularly, beyond your SIERA work?

Matt Swartz

Reply to Yirmiyahu

Thanks. Yes, I’ll be writing here beyond SIERA, and I’ll also be writing in a couple of other places. I have a few projects up my sleeve.