Would have been interesting to incorporate called strikes and called 3rd strikes. Pitchers who nibble relentlessly and consistently produce high called 3rd strike rates, like Gallardo, may be skewed in this model. Fun piece though.
Would also have liked to see 2009-2010 data run, with an xK/9 for 2010, to see what happened to the outliers for 2010. Did those outliers return to expected K/9 rates, or were they a harbinger of good or bad for 2011?
Yeah, there are a lot of ways to go with this from here: bringing in 2009 data, lowering the IP threshold for qualified starters, bringing in relievers, etc. Keep the suggestions coming, they’re helpful!
Comment by Michael Barr — November 15, 2011 @ 11:06 am
Yeah, Luebke didn’t qualify, but he’s an interesting case. His overall K% was 27.3%, good for third in baseball, sandwiched between Greinke and Kershaw. Based on his minor league track record, I’m skeptical that it will stay that high, but that’s just my gut speaking. Too bad we don’t have 150 IP from 2010 as a backdrop.
Comment by Michael Barr — November 16, 2011 @ 10:39 am
I think we’d be better off here with a three-year weighted sample of K/9 (or better yet K%) as input to the predictor. Part of the reason we’re interested in a predictor like this is to spot if a guy exceeded his talent level last year. I think having last year’s K/9 as a primary input dilutes the result. Another interesting tweak might be to exclude K’s vs. pitchers batting in the NL, as those can inflate K/9 for some.
Anyway, interesting article, thanks!
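To make the multi-year idea concrete, here is a minimal sketch of a weighted K% over three seasons. The 5/4/3 weights (most recent season first) and the sample stat lines are assumptions chosen for illustration only; the comment does not specify a weighting scheme.

```python
def weighted_k_pct(seasons, weights=(5, 4, 3)):
    """Weighted strikeout rate across seasons.

    seasons: list of (strikeouts, batters_faced) tuples, most recent first.
    weights: hypothetical per-season weights, most recent first.
    """
    # Weight both strikeouts and batters faced, so seasons with more
    # playing time still count for more within their weight.
    num = sum(w * k for w, (k, _tbf) in zip(weights, seasons))
    den = sum(w * tbf for w, (_k, tbf) in zip(weights, seasons))
    return num / den if den else 0.0


# Made-up stat lines: (K, batters faced) for the last three seasons.
example = [(200, 800), (150, 750), (100, 700)]
print(round(weighted_k_pct(example), 4))
```

Comparing last season's actual K% against this weighted figure is one way to flag a pitcher who may have exceeded his established level.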
All good suggestions, and I’ve actually started on the K% angle. I did have a measure to control for league, but it wasn’t statistically significant, though we know league is at least part of the story with K’s. Having 2010 K/9 as a control variable actually tightened up the model where SwStr% and FBv didn’t tell the whole story. Still a few directions to go with this stuff though, for sure. Thanks for the suggestions.
Comment by Michael Barr — November 16, 2011 @ 2:12 pm
Just saw this article now… some thoughts: Simply put, there are many, many factors at play. Some of it may have to do with pitch arsenal and usage rates: change-up guys (D. Hudson, Hamels, etc.) vs. cutter throwers. Handedness likely has a fairly large impact as well (particularly handedness in relation to certain pitches, like the cutter or 2-seam). Called strikes obviously have a LOT to do with it (my perfect example from 2011 would be Bartolo Colon and his 2-seamer). You also have to factor in the obvious ones, such as the consistency of the figures. You are basing this largely on an equation that takes FB velocity and swinging strike rates into account. They vary year to year, at times greatly, so the expected figures would vary as well. I’m assuming all figures pulled for the above are from 2010 (in order to predict 2011). But what ability do we concretely have to predict the variables in the equation? Other minor things like Zone% and/or O-Swing% may be factors in predicting rates. Also, what about GB/FB rates? And predicting Ks largely falls on secondary pitches, which vary greatly in and of themselves.
Comment by bballer319 — December 11, 2011 @ 7:18 am
We’re also to assume the relationships are linear. I’d imagine there are non-linear relationships with velocity.
Comment by bballer319 — December 11, 2011 @ 7:20 am
Yet another thought… you probably shouldn’t run the equations and relationships off of the raw data. I’m thinking you should normalize each value and compare the normalized values. That would make it more apples to apples in terms of variability within each column of data.
Comment by bballer319 — December 11, 2011 @ 7:48 am
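One common reading of "normalize each value" is a z-score: subtract the column mean and divide by the column standard deviation. The sketch below assumes that interpretation, and the velocity and SwStr% numbers are made up for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Return each value as standard deviations from the column mean."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # A constant column carries no variation to standardize.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical columns on very different scales.
fb_velo = [88.0, 90.0, 92.0, 94.0]   # mph
swstr_pct = [6.5, 8.0, 9.5, 11.0]    # percent

print([round(z, 3) for z in zscores(fb_velo)])
print([round(z, 3) for z in zscores(swstr_pct)])
```

In an ordinary least-squares fit with an intercept, standardizing the inputs changes the scale of the coefficients but not the fitted values or the correlations, so the main payoff is that the coefficients become comparable across columns.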
Using 2010 data, I isolated the extremes of swingstr% vs K/9 (top 10 on each side). These are the rough eyeball-test items I observed (pos/neg refers to the “diff” that you observed, and a pitch had to be thrown at least 2% of the time to be compared):
1) The pos tend to have below avg z-swing% and vice versa (62 vs 66%)
2) The neg tend to be high % change-up throwers (22.6 vs 11.4%)
3) Also regarding the change-up…the neg also tended to be “slower” change-ups, while the pos were “faster” (85.0 vs 81.4 mph)
4) Velocity itself differed across the board, with lower velocities in the neg group (FB 92.5 vs 89.9 mph), (SL 85.6 vs 83.8), (CU 77.1 vs 73.9) and even (CT 87.7 vs 86.1)
5) Summing up SL and CU % thrown…the pos group threw more of them (22.7 vs 15.9%)
Areas that seemed to pose no meaningful difference:
1) Contact percentages
And fwiw, the correlation between normalized K% and K/9 figures was 99% among qualified SP.
Comment by bballer319 — December 11, 2011 @ 8:32 am
Heteroscedasticity and collinearity be damned? I think we should consider what actually causes strikeouts instead of these unsurprising correlations. Right?
Not sure I follow. The heteroscedasticity was a comment on Bradley’s interesting piece (if you read it, he admits to the effect, but it was still interesting research and a jumping-off point for other questions). And it’s not about whether the correlations are surprising or unsurprising, it’s about their value as a predictive tool to sniff out over- and underperformers relative to strikeout rates.
Comment by Michael Barr — December 13, 2011 @ 10:25 pm
I came to chime in about leagues as well. Cliff Lee and Greinke both switched to the National League, and those are two of the three pitchers whose strikeout increases were considered “flukey” by the formula here.
Comment by PumpkinMcPastry — January 31, 2012 @ 5:45 pm
Pumpkin – will have a new one up on Second Opinion for 2012, which has a bit of narrative about the league changes on certain guys. It also looks at K% instead of K/9.
Comment by Michael Barr — January 31, 2012 @ 5:47 pm
Looking forward to it, Michael.
Love seeing this kind of work.
Comment by PumpkinMcPastry — February 2, 2012 @ 11:36 am
Let me point out that K/9 is usually an inferior metric to K%. Simply put, different types of pitchers face different numbers of batters per inning, even when not considering strikeouts. For example, no one gave out fewer BBs than Josh Tomlin last year, by any metric. He also had a pretty low BABIP, fueled in part by his fly ball tendencies. Put that together, and you have a guy whose K/9 is not going to match his K%, since he’s facing fewer batters than any other pitcher with a comparable strikeout ability (which admittedly is not that great).
Or, look at it this way: a pitcher’s punishment for increasing his BB/9 is a chance to increase his K/9. Similarly, variations in BABIP can cause minor variations in K/9.
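A small worked example of the point above, using two hypothetical pitchers with made-up stat lines. Both strike out exactly 20% of the batters they face over the same 200 innings, but the second allows more baserunners and so faces more batters.

```python
def k_per_9(strikeouts, innings):
    """Strikeouts per nine innings pitched."""
    return 9 * strikeouts / innings


def k_pct(strikeouts, batters_faced):
    """Strikeouts as a share of batters faced."""
    return strikeouts / batters_faced


# Hypothetical pitchers: (strikeouts, innings, batters faced).
stingy = (160, 200, 800)   # few walks, low BABIP, fewer batters per inning
wild = (176, 200, 880)     # more baserunners, more batters per inning

for name, (k, ip, tbf) in (("stingy", stingy), ("wild", wild)):
    print(name, round(k_pct(k, tbf), 3), round(k_per_9(k, ip), 2))
```

The two K% values are identical, yet the second pitcher's K/9 comes out higher purely because he gets more chances per inning, which is the distortion the comment describes.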