Simplifying Pitcher Valuations: Background

After looking at simplifying the evaluation of fantasy hitters, I have now moved on to pitchers. The key for me with pitchers is that they have very little personal control over some of their stats. Wins require their team to score runs and the bullpen to hold a lead. Bad fielders can lead to more hits (higher WHIP). More hits lead to more runs allowed (higher ERA). More runs allowed lead to fewer Wins.

A pitcher really has only 3 attributes that they can control: strikeouts, walks and ground balls. Some evidence exists showing that they have some control over their BABIP and HR/FB%, but a large sample of data is needed to make this call.

To start, I took all of the starters since 2002 (the first year for which FanGraphs has ground ball data) with over 140 IP. I collected the Wins, Ks, WHIP and AVG for each pitcher, along with their GB, K and BB data. I converted Wins into Wins/GS and Ks into K/TBF. I used several different methods (e.g. using K/9 versus K%, rating each variable by itself versus as a group, etc.) to evaluate the pitchers. With each method, I kept coming up with basically the same answer:

Strikeouts – Walks = Pitcher Fantasy Value

Using just this method, here are the top 5 pitchers from 2011:

Name, (K – BB)
Cliff Lee, 196
Clayton Kershaw, 194
Justin Verlander, 193
Roy Halladay, 185
CC Sabathia, 169

While using the preceding formula would be the simplest, it probably is not the best. Through the various methods, I found that there should be a factor of less than 1 for the BB variable. The factor was between 0.7 and 1.0 depending on the method of analysis I used.
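To see why the BB factor matters, here is a minimal Python sketch of the weighted counting-stat version, K minus a factor times BB. The function name `k_minus_bb` and the two pitchers are made up for illustration; only the 0.7 to 1.0 range comes from the analysis described here.

```python
def k_minus_bb(strikeouts, walks, bb_weight=1.0):
    """Counting-stat value: K - bb_weight * BB (bb_weight=1.0 is plain K - BB)."""
    return strikeouts - bb_weight * walks


# Hypothetical stat lines (K, BB), not real 2011 data.
pitchers = {"Pitcher A": (200, 60), "Pitcher B": (180, 35)}

for weight in (1.0, 0.85, 0.7):
    ranked = sorted(
        pitchers,
        key=lambda name: k_minus_bb(*pitchers[name], bb_weight=weight),
        reverse=True,
    )
    print(weight, ranked)
```

With these invented numbers the low-walk pitcher ranks first at a factor of 1.0, and the high-strikeout pitcher passes him once the factor drops to 0.7, so the choice of factor can change the order of a list.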

Here is the analysis on the equation I plan on using going forward:

I ran a linear regression against the value with BB%, K% and GB% as the variables. Most projections don’t use GB%, so I also ran one with just K% and BB%. I scaled the coefficients so that the K% factor was 1. Here are the results:

K%, BB% and GB%
Pitching Talent = K% – 0.825*BB% + 0.117*GB%
r-squared = 0.65

K% and BB% only
Pitching Talent = K% – 0.864*BB%
r-squared = 0.64

I removed W/GS from the regression equation and ended up with:

K%, BB% and GB%
Pitching Talent = K% – 0.755*BB% + 0.082*GB%
r-squared = 0.73

K% and BB% only
Pitching Talent = K% – 0.792*BB%
r-squared = 0.72
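
The fit-then-rescale step behind these equations can be sketched roughly as follows. This is not the actual analysis: the data is synthetic, `value` is a stand-in for whatever composite fantasy value is being regressed on, and `fit_normalized` is a hypothetical helper that does a plain two-variable least-squares fit and then divides through by the K% coefficient.

```python
import random


def fit_normalized(k_pct, bb_pct, value):
    """Fit value ~ a*K% + b*BB% + c by ordinary least squares.

    Returns (b / a, r_squared): the BB% factor after rescaling so the
    K% factor is 1, and the r-squared of the fit.
    """
    n = len(value)
    mk, mb, mv = sum(k_pct) / n, sum(bb_pct) / n, sum(value) / n
    # Centered sums of squares and cross-products.
    skk = sum((k - mk) ** 2 for k in k_pct)
    sbb = sum((b - mb) ** 2 for b in bb_pct)
    skb = sum((k - mk) * (b - mb) for k, b in zip(k_pct, bb_pct))
    skv = sum((k - mk) * (v - mv) for k, v in zip(k_pct, value))
    sbv = sum((b - mb) * (v - mv) for b, v in zip(bb_pct, value))
    # Solve the 2x2 normal equations for the two slopes.
    det = skk * sbb - skb ** 2
    a = (skv * sbb - sbv * skb) / det
    b = (sbv * skk - skv * skb) / det
    c = mv - a * mk - b * mb
    ss_res = sum(
        (v - (a * k + b * w + c)) ** 2 for k, w, v in zip(k_pct, bb_pct, value)
    )
    ss_tot = sum((v - mv) ** 2 for v in value)
    return b / a, 1.0 - ss_res / ss_tot


# Synthetic pitchers: the "true" relationship is chosen arbitrarily.
rng = random.Random(0)
k_pct = [rng.uniform(0.12, 0.28) for _ in range(200)]
bb_pct = [rng.uniform(0.04, 0.11) for _ in range(200)]
value = [3.0 * k - 2.4 * b + rng.gauss(0.0, 0.02) for k, b in zip(k_pct, bb_pct)]

bb_factor, r2 = fit_normalized(k_pct, bb_pct, value)
print(f"Pitching Talent = K% {bb_factor:+.3f}*BB%  (r-squared = {r2:.2f})")
```

Dividing both slopes by the K% coefficient changes the scale of the result but not the ranking of pitchers, which is why the equations can be reported with a K% factor of exactly 1.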

A few items to take away:

1. Adding GB% barely improves the fit: the r-squared rises by only 0.01 in both cases.
2. Taking Wins out of the equation makes K’s relatively more important: the BB% factor drops from 0.864 to 0.792.

The values make sense: K% is the only component of K’s, BB% is a huge component of WHIP, and both have an effect on a pitcher’s ERA.

With this knowledge, I have decided to use the equation with W/GS and GB% not factored in. First, taking GB% out of the equation makes it easier to calculate. Second, without incorporating Wins, relievers and starters can be on the same scale. The final equation can be adjusted for Wins and Saves later.
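
As a rough sketch, the chosen equation can be written as a small Python function. The 0.792 factor is the one reported above for the K% and BB% fit without W/GS. Using batters faced as the denominator follows the K/TBF conversion mentioned earlier; applying the same denominator to walks is an assumption, and the function name and sample stat line are invented.

```python
BB_WEIGHT = 0.792  # BB% factor from the K%/BB%-only fit without W/GS


def pitching_talent(strikeouts, walks, batters_faced):
    """Rate-based talent: K% - 0.792 * BB%, with both rates per batter faced."""
    if batters_faced <= 0:
        raise ValueError("batters_faced must be positive")
    k_pct = strikeouts / batters_faced
    bb_pct = walks / batters_faced  # assumption: BB% = BB / TBF
    return k_pct - BB_WEIGHT * bb_pct


# Invented stat line: 200 K and 50 BB over 900 batters faced.
print(round(pitching_talent(200, 50, 900), 3))
```

Because the result is a rate, a reliever and a starter with the same K% and BB% get the same number, which is the point of leaving Wins out at this stage.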

Using this equation for player talent, here are the top 5 pitchers from 2011 (>140 IP):

Name, Pitcher Valuation
Zack Greinke, 0.231
Clayton Kershaw, 0.225
Brandon Beachy, 0.224
Cliff Lee, 0.223
Justin Verlander, 0.211

In observed talent, 3 of these pitchers were no-brainers after their 2011 season (Kershaw, Lee and Verlander). The other two pitchers, Greinke and Beachy, had only partial seasons, so their final Win and K totals were not as impressive.

This equation is the rate of the pitcher’s talent. It now needs to be adjusted for time on the mound. Taking the IP into account, here are the top 5 pitchers from 2011:

Name, Pitcher Valuation * IP
Justin Verlander, 53.1
Clayton Kershaw, 52.5
Cliff Lee, 51.7
Roy Halladay, 48.1
James Shields, 44.3

Verlander, Kershaw and Lee stayed at the top, with Shields and Halladay joining them. CC Sabathia, who was on the earlier list, ended up 6th.
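
The playing-time adjustment is just the rate multiplied by innings, but one practical detail is easy to trip over: innings are conventionally written with .1 and .2 meaning one and two outs (thirds of an inning), not tenths. A minimal sketch, with hypothetical helper names and an invented rate and innings total:

```python
def parse_innings(ip_text):
    """Convert box-score innings such as "210.2" (210 and 2/3) to a number."""
    whole, _, outs_text = ip_text.partition(".")
    outs = int(outs_text) if outs_text else 0
    if outs not in (0, 1, 2):
        raise ValueError(f"invalid innings notation: {ip_text!r}")
    return int(whole) + outs / 3


def season_value(talent_rate, ip_text):
    """Scale a rate-based pitcher valuation by time on the mound."""
    return talent_rate * parse_innings(ip_text)


# Invented example: a 0.200 valuation over 210 and 2/3 innings.
print(round(season_value(0.200, "210.2"), 1))
```

The same multiplication works for relievers, whose lower innings totals pull their season value down even when their rate is high.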

This evaluation of a pitcher gets to the core of their projectable pitching talent. Now some further adjustments can be made to their Talent Value, such as HR/FB%, BABIP, run support and whether the pitcher is a closer. I will begin looking at some of these next time.

That is it for today. Let me know if you have any questions or suggestions.





Jeff writes for FanGraphs, The Hardball Times and Royals Review, as well as his own website, Baseball Heat Maps with his brother Darrell. In tandem with Bill Petti, he won the 2013 SABR Analytics Research Award for Contemporary Analysis. Follow him on Twitter @jeffwzimmerman.


14 Responses to “Simplifying Pitcher Valuations: Background”

  1. Blue says:

    Gotta include HR rates, Jeff. Do that and you’ve re-discovered LIMA!


    • Jeff Zimmerman says:

      If you mean Shandler’s work, the equations are a bit different.

      First he uses GB%, not HRs
      Second, they are not simple in any way.
      Finally, when I used K/9 and BB/9 versus K% and BB%, the results were worse. The r-squared values were on average about 0.10 lower.


      • B N says:

        Definitely K% is way better than K/9. As noted here, K/9 gets inflated for guys who can’t get batters out using anything but a K (e.g. who are giving up a lot of hits, basically).

        With that said, the GB% and/or distance of fly balls would probably also be things that you’d think would potentially correlate to performance. While HR can be fairly volatile, I would think that measures of hard hit balls (e.g. carry) would correlate better to long term HR rates, per park. Because it’s well established that IF% is actually better than GB% for performance, because those are basically automatic outs. In a similar way, inducing a lot of shallow OF flies would be a good thing, at least as good as GB.


  2. slash12 says:

    GB% in terms of fantasy is a tradeoff: higher GB% = higher WHIP, lower ERA; lower GB% = lower WHIP, higher ERA. The main thing missing is accounting for park factors and defense factors (hard to predict). When you want to compare pitchers’ skills, abstracting these things out makes sense, but in terms of fantasy, we don’t care how theoretically good a pitcher is, we care what team/park he’s stuck with. Granted, these are difficult things to qualify.


    • johnnycuff says:

      gotta agree with everything here. do you know of any good attempts to quantify team defense?


      • slash12 says:

        I’ve got this crazy process I’m working on for pitcher BABIP. It calculates a pitcher’s expected BABIP based on K% (helps), GB% (hurts) and park factors, and I did a UZR-based regression that further adjusts based on team defense. The UZR-based modification is the most difficult part, since it’s really difficult to project a team’s UZR for next year, but I’m going to try, closer to next season.


      • johnnycuff says:

        you might consider doing the regression based on UZR numbers for just infielders, considering that outfielders aren’t going to record many put outs related to the GB%.

        the fangraphs fan predictions also include UZR predictions, but i can’t speak for their accuracy.


      • Jeff Zimmerman says:

        I tried after the fact with:

        http://www.fangraphs.com/fantasy/index.php/effects-of-defense-on-era-and-whip/

        I am looking into some prediction of it for 2012. THT runs defensive projections, so I may use them.


      • johnnycuff says:

        good stuff. thanks jeff. looking forward to your future work.


    • Jeff Zimmerman says:

      Slash – It is tough to look at the surrounding variables, but most people start looking at pitchers that way: I want a pitcher on a good team or with a high GB%, and I take the best one of them. Instead, start with the core ability of the pitcher and then expand out from there.


  3. Will H. says:

    Hey Jeff! Great stuff, but a question: you looked at 2011, but wouldn’t it be a better test of its effectiveness as a projection system to find the leading choices based on 2010 data and then compare that to how they did in 2011? (Or, of course, a regression, 3-2-1 or whatever is more sophisticated than that.)


  4. jkb says:

    I’ve used team ERA – team FIP+ to estimate defense + park factors. Then I add that amount to each pitcher’s FIP+ to come up with an ERA projection.


  5. Wade Ortega says:

    So Jeff you’re saying (K-BB)*IP = weighted fantasy value? THAT simple?


    • Jeff Zimmerman says:

      Almost.

      I would prefer (K%-0.9*BB%)*IP. It is a little more accurate. It takes into account some of the pitcher’s BABIP control as mentioned by BN and Slash12 above in the comments.

