## Simplifying Pitcher Valuations: Background

After looking at simplifying the evaluation of fantasy hitters, I have now moved on to pitchers. The key for me with pitchers is that they have very little personal control over some of their stats. Wins require their team to score runs or the bullpen to hold a lead. Bad fielders can lead to more hits (higher WHIP). More hits lead to more runs allowed (higher ERA). More runs allowed lead to fewer Wins.

A pitcher really has only 3 attributes that they can control: strikeouts, walks and ground balls. Some evidence exists showing that they have some control over their BABIP and HR/FB%, but a large sample of data is needed to make this call.

To start, I took all of the starters since 2002 (the first year for which Fangraphs has ground ball data) with over 140 IP. I collected the Wins, Ks, WHIP and AVG for each pitcher, along with their GB, K and BB data. I converted Wins into Wins/GS and Ks into K/TBF. I used several different methods (e.g. using K/9 versus K%, rating each variable by itself versus as a group, etc.) to evaluate the pitchers. With each method, I kept coming up with basically the same answer:

Strikeouts – Walks = Pitcher Fantasy Value

Using just this method, here are the top 5 pitchers from 2011:

| Name | K – BB |
| --- | --- |
| Cliff Lee | 196 |
| Clayton Kershaw | 194 |
| Justin Verlander | 193 |
| Roy Halladay | 185 |
| CC Sabathia | 169 |

While the preceding formula would be the simplest to use, it probably is not the best. Through the various methods, I found that the BB term should carry a factor of less than 1. The factor was between 0.7 and 1.0, depending on the method of analysis I used.
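To see what a BB factor below 1 does in practice, here is a minimal Python sketch. The function name `weighted_k_minus_bb`, the `bb_weight` parameter and the two pitchers' totals are made up for illustration and are not from the data set described above.

```python
def weighted_k_minus_bb(strikeouts, walks, bb_weight=1.0):
    """Return strikeouts minus walks, with walks scaled by bb_weight."""
    return strikeouts - bb_weight * walks


# Hypothetical season totals (assumed values, not real pitchers).
power_arm = {"strikeouts": 200, "walks": 50}
control_arm = {"strikeouts": 180, "walks": 30}

for bb_weight in (1.0, 0.8):
    power = weighted_k_minus_bb(**power_arm, bb_weight=bb_weight)
    control = weighted_k_minus_bb(**control_arm, bb_weight=bb_weight)
    print(f"bb_weight={bb_weight}: power arm {power:.1f}, control arm {control:.1f}")
```

With a weight of 1.0 the two hypothetical pitchers tie. Once walks are discounted, the higher-strikeout pitcher comes out ahead, which is the practical effect of a BB factor below 1.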

Here is the analysis on the equation I plan on using going forward:

I ran a linear regression against the value with BB%, K% and GB% as the variables. Most projections don’t use GB%, so I also ran one with just K% and BB%. I adjusted the numbers so that the K% factor was 1. Here are the results:

K%, BB% and GB%

Pitching Talent = K% – 0.825*BB% + 0.117*GB%

r-squared = 0.65

K% and BB% only

Pitching Talent = K% – 0.864*BB%

r-squared = 0.64

I removed W/GS from the regression equation and ended up with:

K%, BB% and GB%

Pitching Talent = K% – 0.755*BB% + 0.082*GB%

r-squared = 0.73

K% and BB% only

Pitching Talent = K% – 0.792*BB%

r-squared = 0.72

A few items to take away:

1. Adding GB% barely improves the fit: r-squared rises by only 0.01 in both cases.

2. Taking Wins out of the equation makes K’s more important relative to walks (the BB% factor drops from 0.864 to 0.792).

The values make sense with K% being the only component of K’s. BB% is a huge component of WHIP. Both have an effect on a pitcher’s ERA.
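For anyone who wants to try the "regress, then rescale so the K% factor is 1" step, here is a rough Python sketch of the K% and BB% version. It is not the exact procedure used for the numbers above: the `value` target, the function name `fit_k_bb_weights` and the synthetic data are all assumptions, standing in for the real pitcher seasons and fantasy value.

```python
import random


def fit_k_bb_weights(k_pct, bb_pct, value):
    """Least-squares fit of value ~ b0 + b_k*K% + b_bb*BB%.

    Returns the BB% weight after rescaling so the K% weight is 1,
    plus the r-squared of the fit.
    """
    n = len(value)
    mean_k, mean_bb, mean_v = sum(k_pct) / n, sum(bb_pct) / n, sum(value) / n
    dk = [x - mean_k for x in k_pct]
    dbb = [x - mean_bb for x in bb_pct]
    dv = [y - mean_v for y in value]

    # Centered sums of squares and cross-products (the normal equations).
    s_kk = sum(a * a for a in dk)
    s_bb = sum(b * b for b in dbb)
    s_kb = sum(a * b for a, b in zip(dk, dbb))
    s_kv = sum(a * y for a, y in zip(dk, dv))
    s_bv = sum(b * y for b, y in zip(dbb, dv))

    # Solve the 2x2 system for the two slopes with Cramer's rule.
    det = s_kk * s_bb - s_kb ** 2
    b_k = (s_kv * s_bb - s_bv * s_kb) / det
    b_bb = (s_bv * s_kk - s_kv * s_kb) / det

    # r-squared = 1 - residual sum of squares / total sum of squares.
    ss_tot = sum(y * y for y in dv)
    ss_res = sum((y - b_k * a - b_bb * b) ** 2 for a, b, y in zip(dk, dbb, dv))
    return b_bb / b_k, 1.0 - ss_res / ss_tot


# Synthetic data (assumed): value built with a true BB%/K% ratio of -0.8.
rng = random.Random(0)
k_pct = [rng.uniform(0.12, 0.28) for _ in range(200)]
bb_pct = [rng.uniform(0.04, 0.11) for _ in range(200)]
value = [3.0 * k - 2.4 * b + rng.gauss(0.0, 0.02) for k, b in zip(k_pct, bb_pct)]

bb_weight, r2 = fit_k_bb_weights(k_pct, bb_pct, value)
print(f"Pitching Talent = K% {bb_weight:+.3f}*BB%  (r-squared = {r2:.2f})")
```

Dividing both slopes by the K% slope changes the scale of the talent number but not the ranking of pitchers, which is why the rescaled equation can be used in place of the raw regression output.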

With this knowledge, I have decided to use the equation with W/GS and GB% not factored in. First, taking GB% out of the equation makes it easier to calculate. Second, without incorporating Wins, relievers and starters can be on the same scale. The final equation can be adjusted for Wins and Saves later.

Using this equation for player talent, here are the top 5 pitchers from 2011 (>140 IP):

| Name | Pitcher Valuation |
| --- | --- |
| Zack Greinke | 0.231 |
| Clayton Kershaw | 0.225 |
| Brandon Beachy | 0.224 |
| Cliff Lee | 0.223 |
| Justin Verlander | 0.211 |

In observed talent, 3 of these pitchers were no-brainers after their 2011 seasons (Kershaw, Lee and Verlander). The other two, Greinke and Beachy, pitched only partial seasons, so their final Win and K totals were not as impressive.
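As a quick worked example of the rate equation (K% minus 0.792 times BB%, with both rates taken per batter faced), here is a small Python sketch. The function name `pitching_talent` and the input totals are hypothetical and do not belong to any pitcher in the table.

```python
def pitching_talent(strikeouts, walks, batters_faced, bb_weight=0.792):
    """Rate-based talent: K% - bb_weight * BB%, both per batter faced."""
    k_pct = strikeouts / batters_faced
    bb_pct = walks / batters_faced
    return k_pct - bb_weight * bb_pct


# Hypothetical totals (assumed): 200 K and 50 BB over 800 batters faced.
talent = pitching_talent(strikeouts=200, walks=50, batters_faced=800)
print(f"Pitching Talent = {talent:.4f}")
```

Because nothing here depends on Wins or innings, the same function can be applied to a reliever's totals and compared directly with a starter's.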

This equation is the rate of the pitcher’s talent. It now needs to be adjusted for time on the mound. Taking the IP into account, here are the top 5 pitchers from 2011:

| Name | Pitcher Valuation * IP |
| --- | --- |
| Justin Verlander | 53.1 |
| Clayton Kershaw | 52.5 |
| Cliff Lee | 51.7 |
| Roy Halladay | 48.1 |
| James Shields | 44.3 |

Verlander, Kershaw and Lee stayed at the top, with Halladay and Shields joining them. CC Sabathia, who was on the earlier list, ended up 6th.
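The playing-time adjustment is just the rate multiplied by IP, followed by a re-sort. A minimal Python sketch follows; the function name `rank_by_volume` is an assumption, and the pitchers, rates and innings are placeholder values, not the 2011 data.

```python
def rank_by_volume(pitchers):
    """Sort (name, talent_rate, innings) rows by talent_rate * innings, best first."""
    totals = [(name, rate * innings) for name, rate, innings in pitchers]
    return sorted(totals, key=lambda row: row[1], reverse=True)


# Hypothetical pitchers (assumed): A has the best rate but a partial season.
sample = [
    ("Pitcher A", 0.230, 170.0),
    ("Pitcher B", 0.210, 250.0),
    ("Pitcher C", 0.220, 230.0),
]

for name, total in rank_by_volume(sample):
    print(f"{name}: {total:.1f}")
```

In this made-up sample, the pitcher with the best rate falls to last once innings are factored in, the same kind of shift that moved the partial-season pitchers off the list above.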

This evaluation of a pitcher gets to the core of their projectable pitching talent. Some further adjustments can now be made to the Talent Value for factors like HR/FB%, BABIP, run support and whether the pitcher is a closer. I will begin looking at some of these next time.

That is it for today. Let me know if you have any questions or suggestions.


Gotta include HR rates, Jeff. Do that and you’ve re-discovered LIMA!

If you mean Shandler’s work, the equations are a bit different.

First, he uses GB%, not HRs.

Second, they are not simple in any way.

Finally, when I used K/9 and BB/9 versus K% and BB%, the results were worse. The r-squared values were on average 0.10 lower.

Definitely K% is way better than K/9. As noted here, K/9 gets inflated for guys who can’t get batters out using anything but a K (e.g. who are giving up a lot of hits, basically).

With that said, GB% and/or the distance of fly balls would probably also be things that you’d expect to correlate with performance. While HR can be fairly volatile, I would think that measures of hard-hit balls (e.g. carry) would correlate better to long-term HR rates, per park. It’s also well established that IF% is actually better than GB% for performance, because those are basically automatic outs. In a similar way, inducing a lot of shallow OF flies would be a good thing, at least as good as GB.