FanGraphs Fantasy Baseball



  1. Would have been interesting to incorporate called strikes and called 3rd strikes. Some pitchers that nibble relentlessly and consistently produce high called 3rd strike rates like Gallardo may be askew in this model. Fun piece though.

    Comment by Neal — November 15, 2011 @ 8:54 am

  2. Nice!

    Would also have liked to see 2009-2010 data run, with an xK/9 for 2010, to see what happened to the outliers for 2010. Did those outliers return to expected K/9 rates, or were they a harbinger of good or bad for 2011?

    Interesting info.

    Comment by Dave S — November 15, 2011 @ 10:49 am

  3. yeah, there are a lot of ways to go with this from here. Bringing in 2009 data, lowering the IP threshold for qualified starters, bringing in relievers, etc. Keep up with the suggestions, they’re helpful!

    Comment by Michael Barr — November 15, 2011 @ 11:06 am

  4. Thanks for making me go look up a word :)

    Comment by Sean — November 15, 2011 @ 11:14 am

  5. Noticed Cory Luebke wasn’t on the list since he was a reliever in 2010 and only started most of the 2011 season.

    He projects to an 8.96 xK/9 for 2011 using his reliever K/9 # from 2010. Expect him to potentially post a K/9 around 8.5 in 2012.

    Comment by Scott — November 16, 2011 @ 10:32 am

  6. yeah, Luebke didn’t qualify, but he’s an interesting case. His overall K% was 27.3%, good for third in baseball – sandwiched in between Greinke and Kershaw. Based on his minor league track record, I’m skeptical that it will stay that high, but that’s just gut speaking. Too bad we don’t have 150 IP from 2010 as a backdrop.

    Comment by Michael Barr — November 16, 2011 @ 10:39 am

  7. I think we’d be better off here with a three-year weighted sample of K/9 (or better yet K%) as input to the predictor. Part of the reason we’re interested in a predictor like this is to spot if a guy exceeded his talent level last year. I think having last year’s K/9 as a primary input dilutes the result. Another interesting tweak might be to exclude K’s vs. pitchers batting in the NL, as those can inflate K/9 for some.
    Anyway, interesting article, thanks!

    Comment by brendan — November 16, 2011 @ 1:59 pm
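
The three-year weighted K% suggested in comment 7 could be sketched roughly as below. This is only an illustration: the function name, the 5/4/3 weights, and the season totals are assumptions made for the example and are not taken from the article or its model.

```python
def weighted_k_pct(seasons, weights=(5, 4, 3)):
    """Weighted strikeout rate over up to three seasons.

    seasons: list of (strikeouts, batters_faced) tuples, most recent first.
    weights: illustrative weights, most recent season first (an assumption).
    """
    weighted_k = 0.0
    weighted_bf = 0.0
    # zip stops at the shorter input, so pitchers with fewer than
    # three seasons are handled with the weights that apply.
    for (strikeouts, batters_faced), w in zip(seasons, weights):
        weighted_k += w * strikeouts
        weighted_bf += w * batters_faced
    if weighted_bf == 0:
        raise ValueError("no batters faced in the supplied seasons")
    return weighted_k / weighted_bf


# Hypothetical season lines (K, BF), most recent first.
example = [(200, 800), (150, 750), (120, 700)]
rate = weighted_k_pct(example)
```

Weighting the strikeouts and the batters faced separately, rather than averaging three seasonal rates, keeps a short season from counting as much as a full one.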

  8. all good suggestions – and I’ve actually started on the K% angle. I did have a measure to control for league, but it wasn’t statistically significant – but we know league is at least part of the story with K’s. Having 2010 K/9 as a control variable actually tightened up the model where SwStr% and FBv didn’t tell the whole story. Still a few directions to go with this stuff though, for sure. Thanks for the suggestions.

    Comment by Michael Barr — November 16, 2011 @ 2:12 pm

  9. Just saw this article now…some thoughts: Simply put, there are many many factors at play. Some of it may have to do with pitch arsenal and rates; Change-up guys (DHudson, Hamels, etc) vs cutter throwers. Handedness likely has a fairly large impact as well (particularly handedness in relation to certain pitches…like cutter or 2seam). Called strikes obviously have a LOT to do with it (my perfect example from 2011 would be Bartolo Colon and his 2-seamer). Also have to factor in the obvious ones such as consistency of figures. You are basing this largely on an equation that takes FB velocity and swinging strike rates into account. They vary year to year and at times greatly. Therefore the expected figures would vary as well. I’m assuming all figures pulled for the above are from 2010 (in order to predict 2011). But what ability do we concretely have at predicting the variables in the equation? Other minor things like zone% and/or O-Swing% may be a factor in predicting rates. Also, what about GB/FB rates? And predicting Ks largely falls on secondary pitches which vary greatly in and of themselves.

    Comment by bballer319 — December 11, 2011 @ 7:18 am

  10. We’re also to assume the relationships are linear. I’d imagine there are non-linear relationships with velocity.

    Comment by bballer319 — December 11, 2011 @ 7:20 am

  11. Yet another thought….you probably shouldn’t run the equations and relationships off of the raw data. I’m thinking you should normalize each value and compare the normalization. It would make it more apples to apples in terms of variability within each column of data.

    Comment by bballer319 — December 11, 2011 @ 7:48 am
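
The normalization proposed in comment 11 is usually done with z-scores: subtract the column mean and divide by the column standard deviation. A minimal sketch, assuming each stat (SwStr%, FB velocity, K/9) is held as a plain list of floats; the function name and sample values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return each value as standard deviations from the column mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mu) / sigma for v in values]


# Hypothetical fastball velocities in mph, for illustration only.
fb_velocity = [89.9, 91.0, 92.5, 93.4]
fb_velocity_z = z_scores(fb_velocity)
```

After this step every column has mean 0 and standard deviation 1, so the coefficients on different stats can be compared on one scale. For an ordinary least squares fit with an intercept, standardizing the inputs rescales the coefficients but does not change the predictions.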

  12. Using 2010 data I isolated the extremes in some context of swingstr% vs K/9 (top 10 each side). These are the vague eyeball test items I observed (pos/neg referring to the “diff” that you observed and pitches req at least 2% thrown to compare)

    1) The pos tend to have below avg z-swing% and vice versa (62 vs 66%)
    2) The neg tend to be high % change-up throwers (22.6 vs 11.4%)
    3) Also regarding the change-up…the neg also tended to be “slower” change-ups, while the pos were “faster” (85.0 vs 81.4 mph)
    4) Velocity itself was across the board difference with lower velocities being in the neg group (FB 92.5 vs 89.9 mph), (SL 85.6 vs 83.8), (CU 77.1 vs 73.9) and even (CT 87.7 vs 86.1)
    5) Summing up SL and CU % thrown…the pos group threw more of them (22.7 vs 15.9%)

    Areas that seemed to pose no meaningful difference:
    1) Contact percentages
    2) FB%

    And fwiw, the correlation between normalized K% and K/9 figures was 99% out of qualified SP.

    Comment by bballer319 — December 11, 2011 @ 8:32 am

  13. Heteroscedasticity and collinearity be damned? I think we should consider what actually causes strikeouts instead of these unsurprising correlations. Right?

    Comment by Bryce — December 13, 2011 @ 9:45 pm

  14. not sure I follow. The heteroscedasticity was a comment on Bradley’s interesting piece (if you read it, he admits to the effect, but it was still interesting research and a jumping off point for other questions). And it’s not that the correlations are surprising or unsurprising, it’s their value as a predictive tool to sniff out over and underperformers relative to strikeout rates.

    Comment by Michael Barr — December 13, 2011 @ 10:25 pm

  15. Why isn’t this on fangraphs?

    Comment by Bip — January 1, 2012 @ 3:00 am

  16. The second two graphs appear as if the determinant and response variables are flipped on the x and y axes. What am I missing? FB velocity should influence K rate, not vice versa.

    Comment by Puzzled — January 21, 2012 @ 1:27 pm

  17. I came to chime in about leagues as well. Cliff Lee and Greinke both switched to the National League, and those are two of the three pitchers whose strikeout increases were considered “flukey” by the formula here.

    Comment by PumpkinMcPastry — January 31, 2012 @ 5:45 pm

  18. Pumpkin – will have a new one up on Second Opinion for 2012, which has a bit of narrative about the league changes on certain guys. It also looks at K% instead of K/9.

    Comment by Michael Barr — January 31, 2012 @ 5:47 pm

  19. Looking forward to it, Michael.

    Love seeing this kind of work.

    Comment by PumpkinMcPastry — February 2, 2012 @ 11:36 am

  20. Let me point out that K/9 is usually an inferior metric to K%. Simply put, different types of pitchers face different numbers of batters per inning, even when not considering strikeouts. For example, no one gave out fewer BBs than Josh Tomlin last year, by any metric. He also had a pretty low BABIP, fueled in part by his fly ball tendencies. Put that together, and you have a guy whose K/9 is not going to match his K%, since he’s facing fewer batters than any other pitcher with a comparable strikeout ability (which admittedly is not that great).

    Or, look at it this way, a pitcher’s punishment for increasing his BB/9 is a chance to increase his K/9. Similarly, variations in BABIP can cause minor variations in K/9.

    Comment by Omikron — February 14, 2012 @ 5:59 pm
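
The point in comment 20 can be checked with a little arithmetic. K/9 is strikeouts per nine innings and K% is strikeouts per batter faced, so two pitchers with the same K% get different K/9 figures if one of them faces more batters per inning. The two stat lines below are made up for illustration and do not belong to any real pitcher.

```python
def k_per_9(strikeouts, innings):
    """Strikeouts per nine innings. Innings must be a true decimal
    (200 1/3 innings is 200.333..., not the box-score shorthand 200.1)."""
    return 9.0 * strikeouts / innings


def k_pct(strikeouts, batters_faced):
    """Strikeouts as a fraction of batters faced."""
    return strikeouts / batters_faced


# Hypothetical pitchers, both with 200 innings and a 20% strikeout rate.
# A allows few baserunners: 800 batters faced (4.0 per inning).
# B walks more hitters: 880 batters faced (4.4 per inning).
a_k9 = k_per_9(160, 200)   # 7.2
b_k9 = k_per_9(176, 200)   # 7.92
a_pct = k_pct(160, 800)    # 0.20
b_pct = k_pct(176, 880)    # 0.20
```

Pitcher B gains roughly 0.7 K/9 purely from the extra batters he faces, which is the distortion K% avoids.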

  21. Thanks Omikron – that’s exactly why I pubbed this:

    Available on FG+, it’s a similar project that uses K% instead of K/9. If you’re a subscriber, I hope you like it. Feedback is always welcome.

    Comment by Michael Barr — February 14, 2012 @ 6:05 pm
