## The Definitive Pitcher Expected K% Formula

There have been many attempts to develop an expected pitcher strikeout percentage (xK%) formula, usually involving one of my favorite metrics, SwStk%, perhaps average fastball velocity, and maybe another statistic or two. All of the regression equations did a fairly decent job, but there were always outliers, and I was beginning to see a commonality between them.

A couple of years ago, StatCorner included a pitcher’s called and foul strike rate. For some reason, those metrics are no longer displayed on the player pages. However, when they were, they seemed to explain a lot of the discrepancies between the old xK% and actual K%. For example, Hiroki Kuroda and Jaime Garcia would consistently appear to be unlucky given their high SwStk% marks, but when digging further, you learn that they had gotten fewer called strikes than the average pitcher.

After this data disappeared, I vowed to find a new source. That source was eventually found, as Baseball-Reference displays the trio of “L/Str” (Looking Strike Rate), “S/Str” (Swinging Strike Rate) and “F/Str” (Foul Strike Rate) under the “More Stats” tab in the “Pitch Summary — Pitching” section. Initially, I was unable to find a leaderboard for these metrics, making it impossible to develop a regression equation. But then I realized that I should go directly to the founder himself. So I emailed Sean Forman and asked if a leaderboard existed for these stats, and sure enough, they do! The data isn’t nearly as easy to manipulate as the exported stats from FanGraphs, but after much effort, my data set was clean and ready for analysis.

But first, why these metrics? Well, to record a strikeout, a pitcher needs to throw at least three pitches that result in a strike. Generating each of the three possible strike types is the underlying skill that directly leads to the ultimate result of the at-bat, the strikeout. While velocity is certainly important when predicting strikeouts, technically it should just lead to a higher swinging strike rate. So instead of including various metrics that merely predict the direct underlying metrics that feed into strikeout rate, it is best to use those underlying metrics themselves.

I initially began my data set with all pitchers (starters and relievers, as there was no simple way to filter for only starters, and besides, I am not sure that would have even been necessary) from 2008-2012. While I would have liked to include as many seasons as possible, the work involved in cleaning each year just made it too time consuming. I felt like five years was good enough. I then narrowed down the data set to only those pitchers who threw at least 50 innings in a season. This left me with 1,629 pitcher seasons to analyze. Before we get to the results, the following table details the correlations between the three metrics and a pitcher’s K%.

L/Str | S/Str | F/Str |
---|---|---|
0.01 | 0.81 | 0.20 |

It shouldn’t surprise anyone that swinging strike percentage has such a high correlation. Heck, we could have used only that metric to predict K% and it would yield pretty decent results. I am, however, surprised that looking strike rate has essentially no correlation. That is especially surprising next to foul strike rate, which does show a modest correlation, because a hitter obviously cannot strike out on a foul ball, but he could on a called strike.
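A table like this can be reproduced with a Pearson correlation, assuming that is the statistic behind the numbers above. The sketch below is a minimal version in plain Python; the five pitcher seasons are made up for illustration and are not taken from the Baseball-Reference data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up pitcher seasons, NOT real data: (S/Str, K%) pairs as fractions.
sample = [(0.12, 0.15), (0.15, 0.18), (0.17, 0.22), (0.20, 0.26), (0.24, 0.31)]
s_str = [row[0] for row in sample]
k_pct = [row[1] for row in sample]

print(round(pearson_r(s_str, k_pct), 3))
```

Running the same function once per strike type, each time against K%, would fill in the three cells of the table.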

Now let’s get to the regression formula and graph.

**xK% = -0.61 + (L/Str * 1.1538) + (S/Str * 1.4696) + (F/Str * 0.9417)**

No, that was not a misprint on the graph. This equation, using the data I described above, produced an R-squared of 0.892. I would suggest that the remainder of the variation is just due to strike sequencing, which is essentially the luck component.

Interestingly, despite an essentially zero correlation, L/Str has a higher coefficient than F/Str in the regression. This makes much more sense as, once again, a pitcher cannot record a strikeout by inducing a foul ball.
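To make the equation concrete, here is a small sketch that plugs rates into it. The coefficients are copied from the formula above; the assumption is that the three rates are entered as fractions (0.27 rather than 27), which is what the size of the intercept suggests. The example pitcher is hypothetical.

```python
# Coefficients copied from the regression equation in this article.
INTERCEPT = -0.61
COEF_L_STR = 1.1538
COEF_S_STR = 1.4696
COEF_F_STR = 0.9417

def expected_k_pct(l_str, s_str, f_str):
    """Return xK% as a fraction, given the three strike-type rates as fractions.

    Assumption: rates are shares of all strikes expressed as fractions,
    e.g. 0.27 for a 27% looking strike rate.
    """
    return (INTERCEPT
            + COEF_L_STR * l_str
            + COEF_S_STR * s_str
            + COEF_F_STR * f_str)

# Hypothetical pitcher: 27% looking, 15% swinging, 27% foul strikes.
xk = expected_k_pct(0.27, 0.15, 0.27)
print(f"{xk:.1%}")  # 17.6%
```

Comparing that output with the pitcher's actual K% gives the gap that the formula is meant to flag.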

In addition to developing an xK% formula, I was also interested in determining how repeatable these skills are. I was fairly confident that inducing swinging strikes was highly repeatable, but wasn’t so sure about the other two strike types. So once again, I went back to my data set and narrowed the group to only those pitchers who had pitched in consecutive seasons, with at least 50 innings in both. That left me with n = 886. The following table represents the year-to-year correlations of each strike type rate.

L/Str | S/Str | F/Str |
---|---|---|
0.64 | 0.73 | 0.57 |
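The pairing step behind a year-to-year correlation can be sketched as follows. This is an illustration under assumptions, not the spreadsheet work actually used: the record layout (`pitcher`, `year`, `ip`, `rate`) is hypothetical, each pitcher is assumed to have one row per season, and the seasons listed are made up.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def consecutive_pairs(seasons, min_ip=50.0):
    """Pair each qualifying season with the same pitcher's next season.

    Both seasons must reach min_ip innings. Returns (rate_N, rate_N+1) tuples.
    """
    by_key = {(s["pitcher"], s["year"]): s for s in seasons if s["ip"] >= min_ip}
    pairs = []
    for pitcher, year in sorted(by_key):
        following = by_key.get((pitcher, year + 1))
        if following is not None:
            pairs.append((by_key[(pitcher, year)]["rate"], following["rate"]))
    return pairs

# Made-up seasons, NOT real data. "rate" stands in for L/Str, S/Str or F/Str.
seasons = [
    {"pitcher": "A", "year": 2008, "ip": 180, "rate": 0.28},
    {"pitcher": "A", "year": 2009, "ip": 175, "rate": 0.27},
    {"pitcher": "A", "year": 2010, "ip": 40, "rate": 0.31},   # under 50 IP, dropped
    {"pitcher": "B", "year": 2008, "ip": 60, "rate": 0.24},
    {"pitcher": "B", "year": 2009, "ip": 65, "rate": 0.25},
    {"pitcher": "C", "year": 2009, "ip": 200, "rate": 0.30},
    {"pitcher": "C", "year": 2011, "ip": 190, "rate": 0.29},  # not consecutive
    {"pitcher": "D", "year": 2010, "ip": 70, "rate": 0.26},
    {"pitcher": "D", "year": 2011, "ip": 72, "rate": 0.27},
]

pairs = consecutive_pairs(seasons)
r = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
print(len(pairs), round(r, 2))
```

A pitcher with three straight qualifying seasons contributes two pairs, which is one reasonable reading of how the consecutive-season sample was counted.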

Again, it’s no surprise that S/Str reigns supreme, but I did not expect the other two to rate so highly. If anything, these numbers suggest that pitchers are relatively consistent from year to year and do possess a high degree of control over these rates. So if you find a pitcher whose L/Str has suddenly spiked over a small sample of innings, it might not necessarily be such a fluke, but perhaps a true skills surge. Clay Buchholz is a perfect example of this scenario. Check out his L/Str rates over his career:

This season is the clear outlier, which may normally be chalked up to a small sample fluke. But given that L/Str rates do have a reasonable degree of consistency, it’s very possible that Buchholz has taken this skill up a notch.

Next week, I’ll put the 2013 data to work and report on the pitchers whose strikeout percentages are due for a surge or decline after consulting their respective xK% marks.


*Projecting X 2.0: How to Forecast Baseball Player Performance*, which teaches you how to project players yourself. His projections helped him win the inaugural 2013 Tout Wars mixed draft league. He also sells beautiful photos through his online gallery, Pod's Pics. Follow Mike on Twitter @MikePodhorzer and contact him via email.

Fantastic stuff Mike. I would have guessed that L/Str would have been less controllable than the numbers show. It is nice to be able to put some stock in an increase in that number.

Thanks for posting the link, too. I immediately sorted by S/Str and was not shocked that Yu is leading but WAS shocked by how much he is leading. 26% and the next in line is 22%.

Next week? How about posting it today?!

Yes – that! Great stuff!

how do you get the coefficients in the xK% equation?

I ran a regression using the Data Analysis add-in in Excel. It spits out a whole bunch of numbers, along with the coefficients for the equation.
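For anyone without Excel, the same kind of fit is an ordinary least-squares regression, and a rough equivalent can be sketched in plain Python. This is not the add-in's internals, just the textbook normal-equations approach; the function names are made up here, and the six data rows are synthetic, generated from the article's own coefficients so the fit has something to recover.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, targets):
    """Least-squares fit. Returns [intercept, coef_1, ..., coef_k]."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Synthetic rows of (L/Str, S/Str, F/Str), NOT real pitcher data.
rows = [
    (0.25, 0.12, 0.28), (0.30, 0.15, 0.25), (0.27, 0.20, 0.26),
    (0.22, 0.18, 0.30), (0.28, 0.10, 0.29), (0.26, 0.16, 0.24),
]
# Targets built from the article's equation, so the fit should return
# approximately the same intercept and coefficients.
targets = [-0.61 + 1.1538 * l + 1.4696 * s + 0.9417 * f for l, s, f in rows]

print([round(c, 4) for c in fit_ols(rows, targets)])
```

With real data the rows would be the cleaned pitcher seasons and the targets each season's actual K%.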

Fantastic work! Can’t wait for the follow up.

To avoid overfitting, you might want to divide the data set in two, use one set to compute the coefficients, then see how well the resulting function predicts the actual results from the second half of the data set.

I concur.
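A split-sample check along those lines could look like the sketch below. To keep it short it fits a single predictor (think S/Str alone) rather than all three rates, which is a simplification; the function names, the 50/50 split and the synthetic data are all assumptions made for illustration.

```python
import random

def fit_line(xs, ys):
    """Closed-form simple linear regression. Returns (intercept, slope)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def r_squared(actual, predicted):
    """Share of variance in `actual` explained by `predicted`."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def holdout_r2(xs, ys, train_fraction=0.5, seed=0):
    """Fit on a random part of the data, then score R-squared on the rest."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train, test = idx[:cut], idx[cut:]
    intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
    preds = [intercept + slope * xs[i] for i in test]
    return r_squared([ys[i] for i in test], preds)

# Synthetic data, NOT real pitchers: a made-up linear relationship plus noise.
rng = random.Random(42)
demo_x = [rng.uniform(0.08, 0.25) for _ in range(200)]
demo_y = [0.02 + 1.2 * x + rng.gauss(0, 0.02) for x in demo_x]

print(round(holdout_r2(demo_x, demo_y), 3))
```

If the out-of-sample R-squared lands close to the in-sample figure, overfitting is unlikely to be a big concern for a model with this few terms.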

The data at baseball reference is for % of all strikes that are swinging, foul or looking. I’d rather know the percentage of all pitches thrown that are swinging, foul or looking, although that should be fairly easy to figure out.
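The conversion is a single multiplication, sketched here. It assumes the overall strike percentage (strikes divided by total pitches) is available for the same pitcher and counts strikes the same way as the per-strike rates; the example numbers are hypothetical.

```python
def per_pitch_rate(rate_per_strike, strike_pct):
    """Convert a share-of-strikes rate into a share-of-all-pitches rate.

    Both inputs are fractions: rate_per_strike is e.g. S/Str, and
    strike_pct is strikes divided by total pitches.
    """
    return rate_per_strike * strike_pct

# Hypothetical pitcher: 15% of strikes are swinging, 64% of pitches are strikes.
print(f"{per_pitch_rate(0.15, 0.64):.1%}")  # 9.6%
```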

Fascinating stuff. I figured Mike Pelfrey would be among the leaders in %F/Str. He has no putaway pitch really. I saw many batters foul off 2 strike pitches from him until they got one they wanted. But surprisingly Verlander is high in that category as well.

It would also be interesting to dig into which pitchers get more or fewer called strikes than their pitch fx data would seem to suggest. How many pitches they throw that are in the zone, not swung at but not called strikes, or outside the zone that are called strikes. Maybe that’s already available somewhere.

Isn’t called strike percentage just 1-Z-Swing%, plus or minus bad umpiring/catcher framing?

It’s probably close to that.

Awesome. Thanks!

I see you, 2012 Craig Kimbrel!

What happens when you add Fastball velocity to this?

Oh, and percentage of pitches thrown for balls, and first pitch strike %.

Cool stuff. I went back to look at it myself, and apart from fiddling with some data errors from B-Ref it was pretty quick. They split up each pitcher’s year by team stints, which was kind of a pain, and they double-listed the totals of certain players who played on three different teams (2008 Julian Tavarez, 2011 Trever Miller, 2012 Chad Qualls), but after it was cleaned up the data looks really good.

I went through and isolated only qualified starting pitchers (found 427 pitcher seasons) between 2008-2012 and got the same R-squared (0.891) as you did. Much smaller sample size, same R-squared. I think this will be a bit better for predicting SP K%. I did the same for 679 qualified RPs and got an R-squared of 0.861.

I looked at the pool of 223 pitchers who qualified as starting pitchers in years N and N+1, found the correlation between xK% for year N and K% in year N+1 and found it to be 0.75 which is pretty good.

For those interested, the coefficients I got were 1.128 for L/Str, 1.464 for S/Str and 0.827 for F/Str, with an intercept of -0.567. The coefficients also don’t show how predictive each component is, because the average value for each component is very different. P-value for S/Str was by far the highest though, followed by L/Str and then F/Str.

Next I’d be interested in combining some other PitchF/x plate discipline stats to get an even better picture of not only K%, but BB% and ERA as well. I might use Edge% and Heart% too, from BaseballHeatMaps. And hell, for fantasy purposes, I’ll probably check K/9 since that’s what we use instead of K%.

Annnnd after checking the correlation between K% in year N and K% in year N+1 it’s 0.79 so it turns out that plain old K% correlates better to next year’s K% than the fancy formula. It’s possible that at smaller sample sizes and for relief pitchers, that xK% will do a better job. I’ll check those next. But if we have a big enough sample size we should just use K%.

Very cool, nice to see you do this because I was curious if it would be predictable, but you saved me the work! And yes, there’s a difference between a metric being an estimate of what happened, or an expected mark based on what has happened, and predictive.

This metric isn’t meant to be predictive, but to show what a pitcher’s K% should be given his L/Str, S/Str and F/Str. Since those three metrics don’t have really high YoY correlations to begin with, it would probably be better to project each of them yourself using historical data and then plug them into the regression.

I initially was going to run all this using K/9, but in smaller sample sizes, really high or low BABIP marks could inflate or deflate a pitcher’s K/9, so I wanted to stick with K%.

OK I keep replying to myself here, but I just added in the PitchF/x plate discipline stats as well as FBv. R-squared went up to 0.924 (!) and year to year correlation for xK% in Year N to K% in year N+1 is now up to 0.964 which is pretty damn amazing and much better than simply using K%. Most of the improvement was from FBv and SwStr% (S/Str% and SwStr% were not redundant). This is all for starting pitchers between 2009-2012 with at least 600 batters faced in both year N and N+1 (a sample of 273 pitchers).

This will be my last post unless anybody has questions about results for BB% or K/9 or ERA, for which I’ll probably run similar tests with this data.

haha, HOLD THE PRESS! Clearly I did that year to year correlation wrong, had the columns screwed up. Still only .754 year to year. Which does not help us project K% in subsequent seasons yet. But it may help in small sample sizes. Should have known something was up with a correlation that high but I think my brain is mushy right now.

Very interesting stuff Mike. Led me to a few questions:

Wouldn’t a year over year correlation in L/Str, S/Str or F/Str suggest that a pitcher performs at a similar level over his career rather than suggesting they have control over such metrics?

Along a similar line – what does the YoY correlation of HR Rate look like?

Doesn’t that mean the same thing? If a pitcher performs at a similar level, that means they have control, right?

A smattering of YoY correlations were done earlier this year by Matt Klaassen. You can find everything here http://www.fangraphs.com/blogs/basic-pitching-metric-correlation-1955-2012-2002-2012/

From looking at the graph and the number of outliers above the regression line at both tails, it appears some sort of curve may fit the data better than a linear regression. Thoughts?

This would sort of make sense. An extra “swinging strike” or “called strike” for a strikeout pitcher is more valuable (in terms of predicting K%) than an extra strike for a non-strikeout pitcher, since it’s more likely that the extra strike will result in a strikeout.

I tried both an exponential and logarithmic curve. The exponential R-squared was just 0.8125, while the logarithmic was pretty close, but still worse at 0.8842.

What about an interaction factor added to the regression (eg L/Str*F/Str*S/Str)? The residuals just seem too odd to me, something must be missing.

I’m not sure what you mean by an interaction factor. And what is odd about the residuals?

http://www3.nd.edu/~rwilliam/stats2/l55.pdf

Now that I think about it though, that would fit the right tail better, but the left tail worse. What concerned me was the fact that in both tails, there are basically no examples of a pitcher below the best fit line, and that the largest xK% has the largest positive residual. No negative residuals in either tail seems odd, but it could just be an anomaly. Just makes me think some sort of 2nd order terms may help, but maybe not.

This is excellent work! I wish there was a way to use this on fangraphs on player sheets or leaderboards. One thing I recommend is using this equation to predict an xK% for a given year (one that’s not included in the regression that generated the equation). Then go back and compare against the previous year’s K%, and see whether (and by how much) it does a better job of predicting K%.

Outside of this, it would be helpful to understand how quickly these statistics stabilize (and if they stabilize more quickly than K% itself). This would help set the threshold where this equation could be more helpful than looking at just K%.

Thanks! I could definitely put together your first suggestion, but calculating how quickly a stat stabilizes is beyond my skills. I’ll ask Steve Staude and see if he could whip something up.