FanGraphs Baseball



  1. The paragraph immediately after the Swydan excerpt needs to be cleaned up a bit. Solid article, though!

    Comment by Yo — October 9, 2012 @ 4:36 pm

  2. Interesting. There has been some limited grumbling in these parts about Mauer’s spike in strikeouts this year, so I pulled his data into your report. According to your model, he should probably have struck out even more and walked less than he did. Estimated: 15.5% K rate and 10.2% walk rate. Actual: 13.7% and 12.5%.

    Comment by payroll — October 9, 2012 @ 4:43 pm

  3. Interesting work, Jeff.

    A couple of months ago I tried to do something similar on the expected BB% side of things. I looked at it slightly differently than you. First I calculated these five things:

    1. Ball%: (1-Zone%) * (1 – O-Swing%)
    2. CallStr%: (Zone%) * (1 – Z-Swing%)
    3. SwStr%: SwStr%
    4. InPlay%: (HBP + GB + FB + LD) / Pitches
    5. Foul%: 1 – Ball% – CallStr% – SwStr% – InPlay%
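    The five rates above can be sketched in code. This is only a minimal illustration, assuming the plate discipline inputs are supplied as fractions between 0 and 1. The function name, the argument names, and the sample numbers are hypothetical and do not come from any real player.

```python
def pitch_outcome_rates(zone, o_swing, z_swing, sw_str, in_play):
    """Per-pitch outcome shares, following the five definitions in the comment.

    All arguments are fractions between 0 and 1 (hypothetical names):
    zone    -- Zone%
    o_swing -- O-Swing%
    z_swing -- Z-Swing%
    sw_str  -- SwStr%
    in_play -- (HBP + GB + FB + LD) / Pitches, computed by the caller
    """
    ball = (1 - zone) * (1 - o_swing)        # pitch outside the zone, taken
    called_strike = zone * (1 - z_swing)     # pitch in the zone, taken
    # Foul% is defined as whatever share is left over.
    foul = 1 - ball - called_strike - sw_str - in_play
    return {
        "ball": ball,
        "called_strike": called_strike,
        "swinging_strike": sw_str,
        "in_play": in_play,
        "foul": foul,
    }


# Made-up inputs, for illustration only.
rates = pitch_outcome_rates(zone=0.45, o_swing=0.30, z_swing=0.65,
                            sw_str=0.09, in_play=0.19)
for name, value in rates.items():
    print(f"{name}: {value:.4f}")
```

    Because Foul% is defined as the remainder, the five shares always sum to 1, and any measurement error in the other four inputs ends up in the foul share.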

    Then I looked at the probability of drawing a walk on 4 pitches, 5, 6, etc.

    Walk4%: (Ball%)^4
    Walk5%: (4 choose 1) * (CallStr% + SwStr% + Foul%) * (Ball%)^4
    Walk6%: (5 choose 2) * (CallStr% + SwStr% + Foul%)^2 * (Ball%)^4
    Walk7%: (5 choose 2) * (CallStr% + SwStr% + Foul%)^2 * (Foul%) * (Ball%)^4
    Walk8%: (5 choose 2) * (CallStr% + SwStr% + Foul%)^2 * (Foul%)^2 * (Ball%)^4
    Walk9%: (5 choose 2) * (CallStr% + SwStr% + Foul%)^2 * (Foul%)^3 * (Ball%)^4

    Contributions from walks longer than 9 pitches were negligible.
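    The walk terms above can be sketched as follows. This is a rough illustration that copies the formulas as written in the comment, not a definitive implementation. The function name and the sample rates are hypothetical. In particular, the 7-, 8-, and 9-pitch terms reuse the (5 choose 2) coefficient exactly as stated, and the sketch makes no attempt to recount the possible pitch orderings.

```python
from math import comb


def walk_probabilities(ball, called_strike, swinging_strike, foul):
    """Probability of a walk on exactly 4..9 pitches, per the comment's formulas.

    Every pitch is treated as an independent draw with the same outcome
    shares, regardless of the count (the simplification used in the comment).
    """
    # Any of these three outcomes counts as a strike before two strikes.
    strike = called_strike + swinging_strike + foul
    probs = {
        4: ball ** 4,
        5: comb(4, 1) * strike * ball ** 4,
        6: comb(5, 2) * strike ** 2 * ball ** 4,
    }
    # With two strikes, only a foul extends the plate appearance.
    for n in (7, 8, 9):
        probs[n] = comb(5, 2) * strike ** 2 * foul ** (n - 6) * ball ** 4
    return probs


# Made-up per-pitch shares, for illustration only.
probs = walk_probabilities(ball=0.385, called_strike=0.1575,
                           swinging_strike=0.09, foul=0.1775)
expected_bb = sum(probs.values())  # xBB%, the sum of the six terms
print(f"xBB%: {expected_bb:.3%}")
```

    Summing the six terms gives the expected walk rate. Each additional pitch multiplies the previous term by Foul%, which is why the longer walks contribute so little.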

    I didn’t go back as far as you did, but when I correlated my xBB% with actual BB% from the same year, I got R-squared values of:

    2012: 0.721
    2011: 0.733
    2010: 0.748
    2009: 0.799

    This was for all players with at least 168 PA per season, since I think that’s where I read BB% stabilizes.

    I also looked at it from a prediction standpoint, but in general it performed about the same as using prior-year BB% as a predictor: some years a little better, some years a little worse.

    Does this make sense to you? I believe your formula is also using other underlying metrics from year X to give an expected BB% from that same year X, correct?

    Comment by Jon Roegele — October 9, 2012 @ 5:12 pm

  4. Why not also include Zone%, which has a very significant effect on BB (and K)?

    Comment by dcs — October 9, 2012 @ 5:56 pm

  5. You left out the part about him being a god that walks among men.

    Comment by rockymountainhigh — October 10, 2012 @ 9:58 am

  6. I like what you did with the walk probabilities.

    I wonder if the foul% changes when it’s a two-strike count. A hitter trying to protect the plate may be more likely to foul a pitch off. So a player’s foul percentage with fewer than 2 strikes would be lower than his foul percentage when he’s fighting off pitches to get to a 9-pitch walk.

    It would be interesting to look at an actual player’s % of 4, 5, 6, …9-pitch walks to see if it lines up with the theoretical probability you calculated.

    Comment by Matthias — October 10, 2012 @ 11:01 am

  7. I also took a stab at this equation a few years back, but ultimately I just didn’t find it to be a very good predictor of future success. People who beat the model (had a better K% or BB% than they should have) seemed to consistently continue to do so, indicating to me that there was something more they were doing that the equation wasn’t accounting for.

    That said… I think this is still useful for cases where you have a very small sample of data (since plate discipline numbers stabilize more quickly than K% and BB%), just not so much for predicting whether a full-season player’s BB% or K% is going to improve or decline next year.

    Comment by slash12 — October 10, 2012 @ 12:19 pm

  8. Very cool stuff. A definitive step in the right direction of greater understanding.

    Comment by Spit Ball — October 10, 2012 @ 12:34 pm

  9. Thanks Matthias,

    I agree that this analysis treats every count as “equal” with respect to the chances of each event occurring, when in real life it wouldn’t be, as you suggest. The only context I had thought about adding was F-Strike%, but I realized this covers all non-ball events, so it doesn’t really work in this model (other than for the four-pitch walk, I suppose, since the first pitch must be a ball in that case).

    Looking at actual walks by count would be an interesting test. I’m still interested in taking this further, just wondering which way to take it.

    Comment by Jon Roegele — October 10, 2012 @ 1:34 pm

  10. Apparently, he is more the sort who does not walk among men (or anyone, really).

    Comment by williams .482 — October 10, 2012 @ 2:03 pm
