## Using xK and xBB With SIERA

Expected ERA metrics like SIERA, FIP and xFIP are among the best sabermetric developments of recent years. No longer do we have to eye the various luck rates and attempt mental calculations of how lucky or unlucky a pitcher has been. Expected ERA metrics do the work for you and spit out a number on the same scale as ERA, so you can quickly identify which pitchers have benefited from good fortune and which have not.

Unfortunately, as great as the expected ERA metrics are, they work best if the inputs have little statistical noise. For example, they all rely heavily on a pitcher’s strikeout and walk percentages. But what if those rates themselves are a bit fluky over smaller sample sizes? You know the saying: garbage in, garbage out. So I decided to take my recently published formulas for expected K% (xK%) and expected BB% (xBB%) and plug them into Matt Swartz’s SIERA formula. The idea is that this might give an even more accurate picture of how well a pitcher has pitched.

So I began by combining FanGraphs and Baseball Reference data once again to get all the relevant metrics on one spreadsheet. I then started with the following formula provided by Matt.

**SIERA = constant – 15.518*(SO/PA) + 9.146*((SO/PA)^2) + 8.648*(BB/PA) + 27.252*((BB/PA)^2) – 2.298*(netGB/PA) –/+4.920*((netGB/PA)^2) – 4.036*(SO/PA)*(BB/PA) + 5.155*(SO/PA)*(netGB/PA) + 4.546*(BB/PA)*(netGB/PA) + .367*(IP as SP)/(IP total)**

*netGB = GB – FB

**The -/+ sign depends on whether GB >= FB. If it is, use the minus sign; if FB > GB, use the plus sign.
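As a rough sketch, the formula above can be written as a small Python function. The function name `siera`, its parameter names, and the choice to take per-PA rates rather than raw counts are assumptions made for illustration. The default constant of 5.65 is only the approximate value backed out in this article, not an official figure.

```python
def siera(k_rate, bb_rate, net_gb_rate, sp_share, constant=5.65):
    """Evaluate the SIERA formula quoted above.

    k_rate      -- SO/PA
    bb_rate     -- BB/PA
    net_gb_rate -- (GB - FB)/PA
    sp_share    -- (IP as SP)/(IP total)
    constant    -- approximate; the article arrives at about 5.65
    """
    # The squared net-GB term is subtracted when GB >= FB and added when FB > GB.
    sq_sign = -1.0 if net_gb_rate >= 0 else 1.0
    return (
        constant
        - 15.518 * k_rate
        + 9.146 * k_rate ** 2
        + 8.648 * bb_rate
        + 27.252 * bb_rate ** 2
        - 2.298 * net_gb_rate
        + sq_sign * 4.920 * net_gb_rate ** 2
        - 4.036 * k_rate * bb_rate
        + 5.155 * k_rate * net_gb_rate
        + 4.546 * bb_rate * net_gb_rate
        + 0.367 * sp_share
    )
```

Because the function takes rates rather than counts, an expected K% or BB% can be passed in place of the actual rate without touching anything else.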

I first had to calculate this myself, rather than rely on the published SIERA marks on the player pages, because I needed to figure out what the constant is. It turned out to be about 5.65, and my calculated SIERA matched the published mark on every player page I spot-checked.

I then calculated each pitcher’s xK% and xBB% and substituted those expected marks for their actual marks in the SIERA formula. However, I used slightly different versions of the formulas than the ones I linked to in the second paragraph. It turns out that using Str% from Baseball Reference improves the adjusted R-squared for xK% by a small but meaningful degree, so I used this new formula instead.

I then noticed a problem with the xBB% calculations. On average, they were a bit on the high side, with the average pitcher owning an xBB% about 0.4 percentage points higher (a little more than 5% in relative terms) than his actual BB%. Since the xBB% formula wasn’t as strong as the xK% formula to begin with, I decided that it should be scaled down just to match the BB% average. So I simply reduced each pitcher’s xBB% by the same rate so that the averages of the xBB% and BB% columns were about even.
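One way to read "reduced by the same rate" is a single multiplicative factor applied to every pitcher. The sketch below takes that reading and uses a simple unweighted average across pitchers. Both choices are assumptions, as are the function name `rescale_to_match_mean` and the sample numbers, which are made up for illustration.

```python
def rescale_to_match_mean(expected, actual):
    """Scale every expected rate by one common factor so the column means agree."""
    mean_expected = sum(expected) / len(expected)
    mean_actual = sum(actual) / len(actual)
    factor = mean_actual / mean_expected  # below 1 when xBB% runs high on average
    return [x * factor for x in expected]


# Made-up rates for three hypothetical pitchers (not real data).
xbb = [0.080, 0.095, 0.070]
bb = [0.075, 0.090, 0.068]
scaled_xbb = rescale_to_match_mean(xbb, bb)
```

A flat subtraction of the average gap would also equalize the means. The multiplicative version keeps each pitcher's xBB% in the same proportion to the others, and the right factor would likely differ in another population of pitchers.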

I still feel we have much more room to improve xBB%, but the approach gives us a new way of looking at SIERA regardless: it’s just a matter of substituting the best available xK% and xBB% marks for the pitcher’s actual SO/PA and BB/PA marks in the current formula.

This new SIERA can then be compared with each pitcher’s actual SIERA, and the differences sorted. Tune in tomorrow to see the results and learn which pitchers’ SIERA marks should actually be higher or lower given their xK% and xBB% marks.
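The comparison step might look something like the sketch below. The `PitcherLine` record, the `x_siera` label for the recomputed mark, and the numbers are all hypothetical placeholders, not results from this study.

```python
from typing import NamedTuple


class PitcherLine(NamedTuple):
    name: str
    siera: float    # SIERA from actual K% and BB%
    x_siera: float  # SIERA recomputed with xK% and xBB% (hypothetical label)


def rank_by_gap(lines):
    """Sort by (x_siera - siera), most negative gap first."""
    return sorted(lines, key=lambda p: p.x_siera - p.siera)


# Placeholder rows only; these are not real pitchers or real marks.
rows = [
    PitcherLine("Pitcher A", 3.50, 3.80),
    PitcherLine("Pitcher B", 4.10, 3.70),
    PitcherLine("Pitcher C", 3.90, 3.90),
]
ranked = rank_by_gap(rows)
```

Pitchers at the top of the list are those whose expected strikeout and walk rates imply a lower SIERA than their actual rates do. Those at the bottom are the reverse.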


Mike Podhorzer is the author of *Projecting X 2.0: How to Forecast Baseball Player Performance*, which teaches you how to project players yourself. His projections helped him win the inaugural 2013 Tout Wars mixed draft league. He also sells beautiful photos through his online gallery, Pod's Pics. Follow Mike on Twitter @MikePodhorzer and contact him via email.

Whoaaa, so this is like an expected expected ERA? That’s meta.

So the new xBB% formula is the same as the previous one, but minus 0.4%?

xMETA?

Haha, ’tis true (to the expected expected ERA). The adjustment I made to xBB% was just for this particular exercise. I wanted to make sure it scaled given the data population, but it might be a bit different in another population. It would be much easier if we could just figure out a really good xBB% though!

Well, the point you brought up in your article on xBB% was that you needed to take into account sequencing. Maybe looking at splits and taking into account a pitcher’s control on 3 ball counts would help? You could add zone% on 3 ball counts, or Str% on 3 ball counts. I’m not sure if that’s a repeatable skill or not, but it might be. It seems plausible to me that there would be some pitchers who have great control but throw more balls than you would expect early in counts trying to get hitters to chase. Then, if their control is really great, they’re able to pound the strike zone on a 3 ball count. Just a thought.