## Evaluating the Eno Sarris Pitcher Analysis Method

If you listen regularly to the Sleeper and the Bust podcast, I do not need to tell you what the Eno Sarris Pitcher Analysis Method is (let’s drop the Eno and leave the Sarris so we can call it SPAM). For those who aren’t familiar, you can see it at work in this article and this one over here. Basically, it is based on the idea that a pitcher can be evaluated by comparing his performance in several key metrics against league averages. We are primarily looking at swinging strike rates and groundball rates by pitch type.

I wanted to see how well this method works, so I grabbed my handy Excel toolkit and pulled down lots of pitching data. Unfortunately, pitch-type PITCHf/x data is not on the FanGraphs leaderboard (come on, Appelman!), so I headed on over to Baseball Prospectus to use their PITCHf/x leaderboards. I pulled the GB/BIP, swing%, whiff/swing, and velocity data for all starters that threw at least 50 of each pitch type in a given season. Is 50 pitches an arbitrary cut-off? Yes, yes it is.

I included four-seam fastballs, two-seam fastballs, cut fastballs, curves, sliders, changeups, and split-finger fastballs. I used all the data that was available, which goes back to 2007. And, because I am impatient and couldn’t wait until the 2014 season was in the books, I didn’t include data from the last two weeks of this season. I calculated the swinging strike % by multiplying the swing % and the whiff/swing values together. After this, I pulled the K%, ERA, and WHIP data from the FanGraphs leaderboards. In all, I analyzed 1,851 pitcher-seasons.
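As a quick illustration of that swinging strike calculation, here is a minimal Python sketch. The function name and the sample inputs are made up for the example; only the multiplication itself comes from the method described above.

```python
def swinging_strike_rate(swing_pct: float, whiff_per_swing: float) -> float:
    """Swinging strikes per pitch = (swings / pitches) * (whiffs / swings)."""
    return swing_pct * whiff_per_swing


# Hypothetical pitch: batters swing at 50% of them and miss on 30% of those swings.
swstr = swinging_strike_rate(0.50, 0.30)
print(f"{swstr:.1%}")  # 15.0%
```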

Note: the swinging strike rates I calculated do differ from those on the player pages at FanGraphs. I’m not sure why there is a discrepancy since they are both based on PITCHf/x data, but there is one. Therefore, I did not use the FanGraphs pitch-type benchmarks in this analysis.

I pulled K%, ERA, and WHIP because I wanted to use these as proxies for pitching outcomes (i.e. my dependent variables). I amended SPAM to include four-seam velocity, because we all know how much of an effect velocity has on run prevention.

Here’s how I did this. I first calculated the league averages for each metric for each season to account for the pitching environment of that season. The table below shows the league average values for each of the metrics for each season.

| Year | FF SwStr% | FT SwStr% | FC SwStr% | CU SwStr% | SL SwStr% | CH SwStr% | FS SwStr% |
|------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
| 2007 | 6.1% | 4.6% | 10.0% | 10.2% | 13.4% | 13.1% | 13.9% |
| 2008 | 5.9% | 4.5% | 9.7% | 9.7% | 14.2% | 13.0% | 14.1% |
| 2009 | 6.1% | 4.7% | 9.7% | 10.1% | 14.1% | 12.5% | 15.2% |
| 2010 | 6.0% | 4.8% | 9.8% | 9.5% | 14.1% | 13.5% | 14.5% |
| 2011 | 6.3% | 4.5% | 9.1% | 9.9% | 14.9% | 12.8% | 14.7% |
| 2012 | 6.6% | 5.0% | 10.3% | 10.9% | 15.6% | 13.1% | 15.5% |
| 2013 | 6.7% | 5.1% | 9.3% | 10.5% | 15.0% | 13.8% | 17.2% |
| 2014 | 6.56% | 5.1% | 9.8% | 10.7% | 15.5% | 14.3% | 17.4% |

| Year | FF GB% | FT GB% | FC GB% | CU GB% | SL GB% | CH GB% | FS GB% | FF Velocity (mph) | BB% |
|------|--------|--------|--------|--------|--------|--------|--------|-------------------|-----|
| 2007 | 33.8% | 49.8% | 44.8% | 47.2% | 42.9% | 48.1% | 52.9% | 91.06 | 8.92% |
| 2008 | 33.2% | 49.9% | 44.1% | 48.7% | 44.1% | 46.8% | 52.0% | 90.87 | 9.17% |
| 2009 | 33.1% | 48.9% | 42.9% | 50.6% | 43.7% | 47.2% | 53.5% | 91.17 | 9.13% |
| 2010 | 35.6% | 48.9% | 43.9% | 50.1% | 44.0% | 47.6% | 52.9% | 91.22 | 8.61% |
| 2011 | 33.8% | 49.9% | 45.2% | 48.9% | 45.8% | 47.3% | 54.7% | 91.57 | 8.23% |
| 2012 | 34.0% | 50.9% | 43.8% | 52.2% | 43.9% | 48.6% | 53.2% | 91.76 | 8.36% |
| 2013 | 34.6% | 51.4% | 45.0% | 50.2% | 45.8% | 47.4% | 54.6% | 92.02 | 8.33% |
| 2014 | 35.8% | 50.6% | 46.1% | 49.9% | 45.3% | 50.3% | 52.7% | 92.24 | 7.84% |
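For anyone who wants to reproduce the per-season league averages outside of Excel, a rough Python sketch follows. It assumes a simple unweighted mean across qualifying pitchers, which may not match exactly how the averages above were computed, and the rows and metric names (`FF_SwStr`, `SL_GB`) are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (season, pitcher, metric name, value).
rows = [
    (2013, "Pitcher A", "FF_SwStr", 0.080),
    (2013, "Pitcher B", "FF_SwStr", 0.050),
    (2013, "Pitcher A", "SL_GB", 0.480),
    (2013, "Pitcher B", "SL_GB", 0.420),
    (2014, "Pitcher A", "FF_SwStr", 0.070),
]


def league_averages(rows):
    """Group values by (season, metric) and take the unweighted mean of each group."""
    grouped = defaultdict(list)
    for season, _pitcher, metric, value in rows:
        grouped[(season, metric)].append(value)
    return {key: mean(values) for key, values in grouped.items()}


averages = league_averages(rows)
for (season, metric), value in sorted(averages.items()):
    print(season, metric, f"{value:.1%}")
```

Keying the result on the (season, metric) pair keeps each season's pitching environment separate, which is the point of computing the averages year by year.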

I then gave each pitcher one point for each metric that was above league average. For example, King Felix this year gets above average whiffs on five pitches, gets above average grounders on four pitches and has above average four-seam velocity, so he gets ten points. I then computed the SPAM score for each pitcher in each season by summing the scores for the individual metrics.
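The scoring step can be sketched in a few lines of Python. The league values below are the 2014 averages from the tables above for three pitch types; the pitcher's numbers and the metric names are hypothetical, and the sketch assumes a metric must be strictly above average to earn its point.

```python
# 2014 league averages from the tables above (three pitch types plus velocity).
LEAGUE_2014 = {
    "FF_SwStr": 0.0656, "SL_SwStr": 0.155, "CH_SwStr": 0.143,
    "FF_GB": 0.358, "SL_GB": 0.453, "CH_GB": 0.503,
    "FF_Velo": 92.24,
}


def spam_score(pitcher, league):
    """One point for each metric where the pitcher beats the league average."""
    return sum(
        1
        for metric, value in pitcher.items()
        if metric in league and value > league[metric]
    )


# Hypothetical pitcher who throws a four-seamer, slider, and changeup.
pitcher = {
    "FF_SwStr": 0.080, "SL_SwStr": 0.170, "CH_SwStr": 0.120,
    "FF_GB": 0.400, "SL_GB": 0.440, "CH_GB": 0.550,
    "FF_Velo": 93.1,
}
score = spam_score(pitcher, LEAGUE_2014)
print(score)  # 5
```

Because only pitches a pitcher actually throws appear in his dictionary, a deeper arsenal gives more chances to score points.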

Here is a table of some randomly-selected pitcher-seasons to give you an idea of the types of SPAM scores I found. This table shows you that there are certainly outliers, guys with good results and bad scores or vice-versa.

| Player | Year | Score | ERA | WHIP |
|--------|------|-------|-----|------|
| Felix Hernandez | 2014 | 10 | 2.07 | 0.91 |
| Zach McAllister | 2014 | 6 | 5.51 | 1.49 |
| Yu Darvish | 2012 | 11 | 3.90 | 1.28 |
| Bronson Arroyo | 2011 | 2 | 5.07 | 1.37 |
| Drew Pomeranz | 2011 | 1 | 5.40 | 1.31 |
| Johan Santana | 2008 | 7 | 2.53 | 1.15 |
| Zack Greinke | 2008 | 8 | 3.47 | 1.28 |
| Edinson Volquez | 2010 | 9 | 4.31 | 1.50 |

Before we dive into the results, I am not a statistician, but I am an engineer, so maybe I’m not completely off the hook. I am looking at these results from a high level and a simple perspective. Maybe I can build off these results and look for deeper connections in the future. First, let’s just look at some averages.

SPAM without BB%: Averages in Each SPAM Bin

| SPAM Score | ERA | WHIP | K% | # of Pitcher Seasons |
|------------|-----|------|----|----------------------|
| 0 | 7.27 | 1.77 | 13.3% | 44 |
| 1 | 5.92 | 1.62 | 14.1% | 120 |
| 2 | 5.64 | 1.56 | 15.4% | 218 |
| 3 | 5.05 | 1.49 | 15.9% | 298 |
| 4 | 4.72 | 1.42 | 16.9% | 297 |
| 5 | 4.52 | 1.40 | 17.2% | 293 |
| 6 | 4.14 | 1.34 | 18.7% | 226 |
| 7 | 4.02 | 1.31 | 19.7% | 182 |
| 8 | 3.79 | 1.30 | 19.9% | 110 |
| 9 | 3.60 | 1.27 | 20.9% | 38 |
| 10 | 3.39 | 1.20 | 22.1% | 17 |
| 11 | 3.42 | 1.21 | 23.0% | 7 |
| 12 | 3.45 | 1.12 | 26.8% | 1 |

The above table shows the average ERA, WHIP, and K% for each SPAM score, along with the number of pitcher-seasons that earned that score.
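A table like this one can be built by grouping pitcher-seasons on their score and averaging within each group. The Python sketch below assumes a plain unweighted mean per bin (not weighted by innings), and the three sample pitcher-seasons are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical pitcher-seasons with a precomputed SPAM score.
seasons = [
    {"score": 3, "era": 5.10, "whip": 1.50, "k_pct": 0.160},
    {"score": 3, "era": 4.90, "whip": 1.46, "k_pct": 0.158},
    {"score": 8, "era": 3.80, "whip": 1.30, "k_pct": 0.200},
]


def bin_averages(seasons):
    """Average ERA, WHIP, and K% for each SPAM score, plus the bin size."""
    bins = defaultdict(list)
    for season in seasons:
        bins[season["score"]].append(season)
    return {
        score: {
            "era": mean(s["era"] for s in group),
            "whip": mean(s["whip"] for s in group),
            "k_pct": mean(s["k_pct"] for s in group),
            "n": len(group),
        }
        for score, group in sorted(bins.items())
    }


table = bin_averages(seasons)
for score, row in table.items():
    print(score, f"{row['era']:.2f}", f"{row['whip']:.2f}", f"{row['k_pct']:.1%}", row["n"])
```

Carrying the bin size along matters, since the bins at the extremes (scores of 0 or 11 and up) hold very few pitcher-seasons.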

Finally, onto the scatter plots! First up, we have the K% vs. SPAM score graph. We expect this one to have a strong positive correlation, since whiff rates and velocity normally correspond to strikeouts (ground balls, not so much). I used a simple linear regression, since it seemed to be the best fit and the easiest to understand.

Here is the WHIP vs. SPAM score graph.

Here is the ERA vs. SPAM score graph.

Obviously, none of these show strong R² values, but the table of averages above and these graphs do show there is a clear trend here, with higher scores mostly leading to lower ERAs and WHIPs, and higher K%.
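For readers who want to check a fit like this themselves, here is a small ordinary least squares sketch in plain Python. The six data points are hypothetical and chosen so the arithmetic is easy to follow; they are not taken from the data set.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, with R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical pitcher-seasons: two ERAs at each of three SPAM scores.
scores = [2, 2, 6, 6, 10, 10]
eras = [6.0, 4.0, 5.0, 3.0, 4.0, 2.0]
slope, intercept, r2 = linear_fit(scores, eras)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.2f}")
```

In this toy data the average ERA at each score falls steadily (5.0, 4.0, 3.0), yet R² is only 0.40 because of the spread within each score. That is the same pattern as here: a clear trend in the averages alongside a modest fit on individual pitcher-seasons.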

None of the above accounts for control directly, so I thought I would try adding BB% as another metric to the SPAM score. I computed the league average walk rate for each season and handed out the points. The addition of BB% changed the values, but didn’t really impact the trends. Below is the averages table for the SPAM scores with BB%. Below that, you will find the three graphs again. The linear trend lines are a little better fit now, but nothing earth-shattering.

SPAM with BB%: Averages in Each SPAM Bin

| SPAM Score | ERA | WHIP | K% | # of Pitcher Seasons |
|------------|-----|------|----|----------------------|
| 0 | 7.70 | 1.88 | 13.1% | 27 |
| 1 | 6.38 | 1.73 | 13.7% | 73 |
| 2 | 6.02 | 1.65 | 15.0% | 160 |
| 3 | 5.24 | 1.52 | 15.7% | 254 |
| 4 | 4.89 | 1.45 | 16.2% | 287 |
| 5 | 4.69 | 1.42 | 17.2% | 289 |
| 6 | 4.34 | 1.37 | 17.4% | 270 |
| 7 | 4.03 | 1.31 | 19.1% | 197 |
| 8 | 3.90 | 1.29 | 20.0% | 160 |
| 9 | 3.71 | 1.27 | 20.1% | 80 |
| 10 | 3.44 | 1.22 | 21.0% | 34 |
| 11 | 3.41 | 1.20 | 22.5% | 14 |
| 12 | 3.36 | 1.20 | 21.5% | 5 |
| 13 | 3.45 | 1.12 | 26.8% | 1 |

So, what does all this tell us? Well, it seems that Eno’s SPAM method does a pretty good job of identifying pitchers that will be successful and is useful for identifying breakout pitchers. The beauty of this method is that it does not require a lot of data. Per-pitch metrics stabilize faster than per plate-appearance ones, so we can start to evaluate pitchers after only a start or two instead of waiting for the 170 PA required for BB% or the 70 PA for K%. I plan on digging deeper into this data over the offseason to see if I can pull any more insights from it. Please let me know in the comments if you think of something worth investigating further. Eno, if you are reading this, I hope I gave your method the treatment it deserves. And, as I do in all of my online ramblings, I will end with Tschüs!


Rob Parker writes occasionally at WeTalkFantasySports.com and now has a Twitter account, @park_ro. He is equal parts engineer, nerd, and baseball fan. He also frequents the fantasy baseball subreddit as WisconsinsWestCoast.

So, why the “Tschüs”? Are you German somehow?

No, but I learned German in high school. My German is not so good now. I forget more every day. I like Germany and the German language, which is why I write “Tschüs”.

this is good stuff.

im wondering if you could weight how much above average each pitch is and what the outcome would be. as i take it, simply being a little above average gets you a point but what if you’re the league leader in swstr% on a pitch? you’re classified the same as a pitcher whos offering is only a bit better than average.

This is a good suggestion and might be a good way to refine the analysis. I could perhaps use the number of standard deviations above average to account for this effect. I would just need to find a way to structure the scoring to incorporate the additional points for how far above average a pitch is.

I went and did this, using zscores added together. It increased the rsquared to .2172 for starters, using the entire pool of pitchers to find averages for the pitches.

To expand on what I did, I took each pitch component (velocity for fastball/sinkers, swstrk% and gb% for all pitches), z-scored them against the rest of the pitchers who threw more than 50 of them, then weighted them by what % that pitcher threw of the pitch, added all those values for each pitch together, so you get a combined Z score. Then I added the z-score for BB%.

There are a few avenues I’ll try to improve it next time I get some time:

– some kind of weighting by % thrown before doing the z-score to stop the element of surprise from having an undue effect on the scores.

– some kind of adjustment for variety of pitches

– figure out what’s an appropriate weight for bb%, currently I made it equal to the pitch quality score.

Also, the
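A rough Python sketch of the usage-weighted z-score idea from the comment above might look like the following. It covers only SwStr% and GB% per pitch and leaves out the velocity and BB% terms. The pool, the usage shares, and the choice of population standard deviation are all assumptions made for the example, not the commenter's actual setup.

```python
from statistics import mean, pstdev

# Hypothetical pool: pitcher -> pitch type -> (usage share, SwStr%, GB%).
pool = {
    "A": {"FF": (0.60, 0.08, 0.36), "SL": (0.40, 0.20, 0.48)},
    "B": {"FF": (0.70, 0.06, 0.34), "SL": (0.30, 0.15, 0.45)},
    "C": {"FF": (0.50, 0.07, 0.35), "SL": (0.50, 0.10, 0.42)},
}


def composite_z(pool):
    """Sum usage-weighted z-scores of SwStr% and GB% across each pitcher's arsenal."""
    totals = {name: 0.0 for name in pool}
    pitch_types = {pitch for arsenal in pool.values() for pitch in arsenal}
    for pitch in pitch_types:
        throwers = [name for name in pool if pitch in pool[name]]
        for idx in (1, 2):  # tuple positions: 1 = SwStr%, 2 = GB%
            values = [pool[name][pitch][idx] for name in throwers]
            mu, sigma = mean(values), pstdev(values)
            if sigma == 0:
                continue  # no spread among throwers, so no information
            for name in throwers:
                usage = pool[name][pitch][0]
                totals[name] += usage * (pool[name][pitch][idx] - mu) / sigma
    return totals


totals = composite_z(pool)
for name, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(name, round(total, 3))
```

Unlike the one-point-per-metric score, this version rewards a pitch for how far above average it is, and a pitch thrown more often counts for more.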

Cool, thanks. Yeah, I wasn’t sure whether to include relievers when calculating averages for the pitches, since the method is built to evaluate starters and identify sleeper starters and relievers can skew the values with their extremes (Britton’s GB% and Chapman’s SwStr%, for example).

That is a noticeable increase in R-squared. I think the per-SPAM score averages table is probably more useful than the graphs, since the data is quantized and there is sufficient spread at each SPAM score to throw off the linear fit.

BP seems to be missing two seamers? Where did you get that data?

oh, duh forgot the sinker/twoseamer thing