Baseball’s Most Selective Hitter

Generally speaking, a decent proxy for a batter’s understanding of the strike zone is his O-Swing% — that is, the percentage of pitches outside of the zone at which he offers. The lower that figure, the less often a player is offering at pitches outside of the zone. The less often a player is offering at pitches outside of the zone, the more likely he is both to draw walks and (one assumes) swing at better pitches inside the zone.

As to the first point, that is borne out by the numbers: O-Swing% and walk rate correlate rather tightly. Consider the following graph, for example, which includes the O-Swing%s (from the PITCHf/x zone) and walk rates for all 143 qualified batters from 2012. (Note: average O-Swing% among this population is 28.9%. Standard deviation is 5.7%.)

As for the second point, however (that a low O-Swing% necessarily indicates a better idea of the strike zone), it recently occurred to the author (who isn’t very sharp) that perhaps these are not the same thing. Anyone who ever saw Mark Bellhorn bat, for example, will know that it’s sometimes possible for a player not only to refrain from swinging outside of the zone, but also to avoid swinging altogether. There is a difference, however, between selectivity, which we’ll define, for the sake of this post, as “ability to discern between balls and strikes,” and a refusal to swing the bat. The former, we reason, is a good thing; the latter, less so.

In fact, this appears to be a justified concern. As this second graph indicates (for those same 143 qualifiers from 2012), batters who swing less outside of the zone are also, frequently, swinging less inside of it. (Note: Z-Swing% represents the percentage of pitches within the zone at which a batter offers.)

If one were to really “measure” something like selectivity, the better plan — instead of looking just at O-Swing% — might be to look at the separation between a batter’s O-Swing% and Z-Swing%. Each batter does, of course, have his own particular preferences so far as hitting is concerned. Perhaps there are pitches outside the strike zone that are, in their way, more hittable than those inside it. Conversely, there are areas within the zone to which a pitcher might throw and still induce weak contact. In lieu of a more granular approach, however, that somehow accounts for each batter’s preferences (an effort of which the present author is incapable), it seems fair to suggest that a batter who demonstrates the greatest difference between his O-Swing and Z-Swing tendencies would be the league’s Most Selective Hitter.

To that end, what I’ve done is calculate, for each of 2012’s qualified batters, z-scores (standard deviations from the mean) for both O-Swing% and Z-Swing% (noted below as Oz and Zz, respectively). In both cases, a positive z-score is better, which is to say the sign is flipped for O-Swing%: a positive Oz means a batter chases fewer pitches outside the zone than the mean, while a positive Zz means he swings at more pitches inside the zone than the mean. I’ve then averaged those two z-scores for an overall selectivity measure (noted below as Sel). Furthermore, just for reference, I’ve made a rough index version of Sel, as well (presented as Sel+), which I’ve placed on more or less the same scale (and with the same range) as wRC+ for this particular group.
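For readers who want to reproduce the arithmetic, here is a minimal Python sketch of the z-score and averaging steps described above. The function names are hypothetical. The O-Swing% mean and standard deviation (28.9% and 5.7%) are the figures noted earlier, while the Z-Swing% figures (roughly 63% and 6%) are the rounded values cited later in this post, so results will differ slightly from the tables. It makes no attempt at Sel+, since the scaling behind that index isn’t spelled out here.

```python
def z_score(value, mean, sd, lower_is_better=False):
    """Standard deviations from the mean, sign-flipped when lower is better."""
    z = (value - mean) / sd
    return -z if lower_is_better else z


def selectivity(o_swing, z_swing, o_mean=28.9, o_sd=5.7, z_mean=63.0, z_sd=6.0):
    """Return (Oz, Zz, Sel) for one batter; all inputs are percentages.

    The defaults are the rounded 2012 population figures quoted in the post.
    The Z-Swing% mean and standard deviation are approximate ("ca.") values,
    so they are an assumption rather than the exact inputs behind the tables.
    """
    # Chasing fewer pitches outside the zone is better, so flip the sign.
    oz = z_score(o_swing, o_mean, o_sd, lower_is_better=True)
    # Swinging at more pitches inside the zone is treated as better.
    zz = z_score(z_swing, z_mean, z_sd)
    return oz, zz, (oz + zz) / 2


# Yonder Alonso's 2012 line: 24.3% O-Swing, 73.5% Z-Swing.
oz, zz, sel = selectivity(24.3, 73.5)
print(f"Oz={oz:.2f} Zz={zz:.2f} Sel={sel:.2f}")
```

With these rounded inputs the sketch lands close to, but not exactly on, the 0.81, 1.76 and 1.29 listed for Alonso below; the small gaps come from the approximate Z-Swing% mean and standard deviation.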

Here are the top-10 qualified batters by this methodology:


Name              Team        PA  O-Swing  Z-Swing     Oz     Zz    Sel  Sel+
Yonder Alonso     Padres     619    24.3%    73.5%   0.81   1.76   1.29   176
Andrew McCutchen  Pirates    673    22.0%    67.6%   1.22   0.76   0.99   161
Dexter Fowler     Rockies    530    20.7%    65.1%   1.45   0.34   0.89   156
Freddie Freeman   Braves     620    31.6%    75.4%  -0.47   2.08   0.81   152
Rickie Weeks      Brewers    677    18.5%    61.3%   1.83  -0.31   0.76   150
B.J. Upton        Rays       633    30.2%    73.0%  -0.22   1.68   0.73   148
Derek Jeter       Yankees    740    28.2%    70.8%   0.13   1.30   0.72   147
Josh Willingham   Twins      615    18.7%    60.4%   1.80  -0.46   0.67   145
Chase Headley     Padres     699    25.5%    67.2%   0.60   0.69   0.65   144
Jay Bruce         Reds       633    28.3%    69.9%   0.11   1.15   0.63   143

And here are the bottom 10:


Name              Team        PA  O-Swing  Z-Swing     Oz     Zz    Sel  Sel+
Martin Prado      Braves     690    26.8%    48.4%   0.37  -2.50  -1.06    58
Ryan Zimmerman    Nationals  641    30.9%    53.6%  -0.35  -1.61  -0.98    62
Ben Revere        Twins      553    27.6%    50.7%   0.23  -2.11  -0.94    65
Dayan Viciedo     White Sox  543    38.9%    63.7%  -1.75   0.10  -0.83    70
J.J. Hardy        Orioles    713    27.4%    52.6%   0.27  -1.78  -0.76    74
Alexei Ramirez    White Sox  621    40.8%    66.5%  -2.09   0.57  -0.76    74
Shane Victorino   - - -      666    31.3%    56.7%  -0.42  -1.09  -0.75    74
Erick Aybar       Angels     554    38.0%    63.7%  -1.59   0.10  -0.75    74
Brennan Boesch    Tigers     503    40.2%    66.7%  -1.98   0.61  -0.69    77
Mark Trumbo       Angels     586    37.6%    64.4%  -1.52   0.22  -0.65    79

Yonder Alonso was, by this method at least, 2012’s most selective hitter; Martin Prado, its least. And, indeed, the presence of Prado among the laggards suggests that this way of measuring selectivity will run at odds with a more established idea of what selectivity is, and also that the method has flaws of its own. The reason for Prado’s low Selectivity rating has everything to do with his incredibly low Z-Swing%: while the average qualified batter offered at ca. 63% of pitches in the PITCHf/x zone (with a standard deviation of ca. 6%), Prado swung at fewer than 49%. His approach obviously worked for him: Prado posted a 116 wRC+ in 690 plate appearances with almost identical walk and strikeout rates (8.4% and 10.0%, respectively). Relative to his O-Swing%, however, which was closer to league average, his Z-Swing% was quite low.





Carson Cistulli occasionally publishes spirited ejaculations at The New Enthusiast.


43 Responses to “Baseball’s Most Selective Hitter”

  1. mettle says:

    Seems correct to me re: Prado. If you look at your second graph the dots on the top to the left-ish are the good-eye people, which are the ones in your list and those outlier-y people on the bottom btw 25-35 oswing% are the bad-eye people.

    This is different than the most “selective”, i.e., most likely to wait for their pitch regardless of zone, which would be the clump of people at the bottom left of your graph or hackers, which is the top right.

  2. Kinanik says:

    Of course, the questions arise: How stable is selectivity? Does one year’s selectivity translate into another year’s? How does this predict other statistics? Can we say that, for instance, a selectivity of .6 translates into a particular K/BB ratio, and from that can we see K/BB under/overperformers for a given year?

    Cool stuff.

    • Phil Livingston says:

      Selectivity is extremely stable, which is why this trait is so important and marketable. See Bobby Abreu’s 2012 season, in which he had a BB% of 14.4% at 39 years old, a touch under his career norm of 14.7%. Bobby Abreu was released because of lack of production (and the fact that Jerry Dipoto can’t help but do his best to kill the Angels franchise with knee-jerk reactions), when in fact he was fairly good at the plate. Now, don’t get me wrong, he isn’t the player he used to be, but it was unfair to him to give his ABs to Vernon Wells.

  3. Kiss my GO NATS says:

    Yonder Alonso may never hit 40 homers, but I think a batting average of over .300 with 15-20 homers will become a norm for him once he develops. The OB% is already there.

    • James says:

      With the fences coming in, those projections (which are nothing new) are looking a lot more likely. In 2012, he lacked the consistency to hit .300 – his hot stretches were too few and far between (a la Jay Bruce). Maybe he can get that with experience.

  4. fresheee says:

    Fun stuff.

    Seems to me there could be a number of other % stats that could bolster this idea of “selectivity.” What about Contact%? Or, better yet, the difference between O-Contact% and Z-Contact%. Maybe the “selective” hitters have much more drastic gaps in O/Z-Contact% than the less “selective” hitters (i.e., are these selective hitters just selective because they have to be?).

    Also, it’s curious how the top 10 pass the eye test pretty well, but the bottom 10 not so much. Prado, Zimmerman and Trumbo all have solid wRC+ and each offers a different hitter type: contact guy, all-around hitter w/ power and low-OBP slugger. In other words, it’s difficult to understand what low “selectivity” should hint at.

    • DCN says:

      It could be that they take strikes early in the count because they’re looking for particular pitches. I.e., if they want to see a fastball, but get a changeup early, they won’t try to hit it, even if it’s in the zone, in the hopes of later getting the fastball (or setting up for the change and getting it while they’re ready for it).

  5. JT says:

    I observe that 7 of the 10 “most selective hitters” are NLers, while 7 of the 10 “least selective hitters” are ALers. Certainly not a statistically significant phenomenon, but it makes me curious about what potential differences might arise due to League difference. No good reason for this comes to mind immediately, but a brief look at the top 20 O-Swing% this season reveals 15 ALers (including 14 of the first 16).

  6. Nivra says:

    Is difference a better metric than ratio? Do the leaderboards change significantly if you use ratio instead? If so, what are the differences and why?

  7. KC Ron says:

    How far off from the least selective hitters was Delmon Young?

  8. TK says:

    Prado many times would just straight up take pitches. This was usually the case when Michael Bourn was on first base. It actually might help him as his walk rate was up this year.

  9. Nivra says:

    Hmm… I wonder if you could change this to first pitch selectivity, 2-strike selectivity, non 2-strike selectivity, and full count selectivity. You’d reduce your sample sizes immensely, but I think it would tell us a lot more.

    Prado probably lets a lot of non 2-strike pitches in the zone go by, waiting for his perfect pitch. That’s the only way I can see him maintaining such a low K-rate.

  10. Al Dimond says:

    Often adding or subtracting rates with different denominators is a sign of trouble, that a metric is probably only meaningful for certain ranges of input (see FIP, OPS). I might suggest a metric whose denominator is pitches seen, and whose numerator is (zswings + otakes – (oswings + ztakes)) — basically, percentage of pitches where the batter “did the right thing”…

    But, of course, doing the right thing isn’t swinging at any pitch in the zone and taking any pitch outside of it. Many good hitters look for very specific pitches in favorable counts and take anything else. Lots of hitters take all the way in certain situations, acknowledging that their selectivity isn’t perfect. So the numerator of that metric is clearly bogus.

    And because hitters that swing a lot are less likely to face deep counts than hitters that don’t, and optimal hitting strategy depends a lot on the count, pitches seen is a terrible denominator. My suggested metric sucks all over. What really might do the trick is a series of different analyses for different counts, reflecting a selective hitter’s balance of goals in each count. That would be kind of hard.

  11. dustygator says:

    Brandon Belt would probably be the leader if you ignored the qualified batters parameter.

    31.0% O-Swing (-0.37 Oz)
    81.4% Z-Swing (~3.1 Zz)

    1.365 Sel

    • Paul says:

      Brandon Belt always amazes me: he has a really good eye but strikes out a lot.

    • ElJimador says:

      Depends on how far you extend beyond qualified batters. Belt’s mark would be best if you include only those who just missed the qualified cutoff, but go a bit further and Carlos Quentin posted a 29.4% O-Swing and 82.0% Z-Swing, albeit in just 340 PAs to Belt’s 472.

      That said, it’s amazing that Belt’s Z-Swing% was that high (3rd highest in baseball for hitters w/at least 300 PAs — behind only Hamilton and Quentin and just ahead of Sandoval and Delmon Young) and the criticism of Belt you hear over and over again from Giants management is that he’s not aggressive enough.

  12. delv says:

    You gotta split switch hitters up into their L and R numbers. Most switch hitters show a big difference in their OBP/iso numbers depending on what side of the plate they’re batting on, often showing an inverse relationship (e.g., good OBP/bad iso as a lefty and bad OBP/good iso as a righty; Nick Swisher comes to mind).

    Also, concerning selectivity… there’s a difference between that and strike zone judgment. A hitter may choose not to swing at pitches that are low but in the zone if that pitch location doesn’t vibe well with that hitter’s swing path. Equally, a hitter may always swing at the pitch that’s down and in, even if it’s a ball, if he tends to mash those pitches.

    Selectivity may best be measured by noting who has the most extreme contact rates per pitch-location. For example, a hitter that only swings (and always swings) at high pitches (in or out of the zone) and rarely ever at low ones is being highly selective—perhaps more so than the hitter that only swings at strikes but swings at them less absolutely.

  13. martyn says:

    This is change I can believe in.

  14. dcs says:

    Instead of averaging the Z scores of OSwing and ZSwing together (thus implying that the higher the ZSwing the better, which is perhaps/likely not true), it might be better to take the ZSwing as a ‘given’ of a batter’s general aggressiveness, figure out an expected OSwing based on that, and give credit for a lower OSwing than this expectation. Actually, since swing rates are largely influenced by contact rates, maybe it’s better to use Zone Contact rate as the starting point…

  15. Steve says:

    Error bars!

  16. DTF_in_DTL says:

    Carson, is it possible to get pitch-by-pitch results and compare rates of swings in zones where a hitter has had success? That would be a measure of identifying the strengths of a player’s swing and how often he uses them to his advantage. There may be a zone, say low and inside, that a hitter destroys and routinely swings at, even though it’s out of the zone, because he has success there.

    • Phantom Stranger says:

      I think this might be a productive avenue of research. I would view the problem from the other direction. What mental approach is each batter taking to the plate and how does he implement it on a pitch-by-pitch basis? Most veteran hitters know their strengths and weaknesses intimately, and change their approach based on that knowledge.

      It would take someone working with a MLB team to have their hitters give an in-depth report on how they approach an at-bat. The plan would likely change depending on handedness of the pitcher and the type of pitcher (sinkerballer, hard breaking stuff, etc).

  17. Choo says:

    +1 for the Mark Bellhorn reference. Just thinking of him and Brian Daubach and Bill Mueller jacking up pitch counts is pleasing to the soul.

  18. Andrew says:

    Enjoyed this. Love plate discipline data, and while I’ve never conducted such calculations as in the article, by the eye test Chris Iannetta has been the gold standard for this the past few years. Thing is, he hardly ever gets enough PAs to qualify. But over his career:

    18.9% O-Swing
    44.7% Overall swing
    70.8% Z-Swing

    When you sort by O-Swing, the other 2 stick out like sore thumbs. Weeks & Willingham are the only guys with similar O-Swings, but Iannetta’s Z-Swing is 10% higher (17% higher on relative basis).

    In 2008 he was 16.2% / 44% / 72.3% in ~400 PAs. So admirable.

  19. payroll says:

    I think you have to consider how a high contact rate affects your Z-Swing rates. For example, Ben Revere has a 92.6% contact rate. By contrast, Yonder has an 80% contact rate. Alonso swings at more pitches in the zone, because he misses more pitches in the zone. I actually wrote a related piece on Revere and how passing on some of these pitches in the zone could be key in helping him raise his OBP.

    http://twinsdaily.com/entry.php?1919-Ben-Revere-Contact-King-and-the-Red-Light-of-OBP

    • payroll says:

      That is to say, Selectivity in this case is not necessarily reflective of selectivity, as Revere would seem either to be the more skilled hitter or, when he swings, to be swinging at better-quality strikes.

  20. Obsessivegiantscompulsive says:

    Very nice analysis and logic.

    One fly in the ointment is that just because a pitch is in the strike zone does not mean that the batter should swing at it. I would suggest reading Ted Williams’s The Science of Hitting, in which he discusses how the strike zone is composed of regions that vary in results for the hitter: in some he would be a .400 hitter, in others a .250 hitter.

    Still, I think that the results can be very interesting as long as its limitations are understood. Good job.

  21. DrBGiantsfan says:

    Just from an eyeball test of the top 10 and bottom 10, I would conclude that selectivity is a good characteristic for a batter to have, but unselectivity is not necessarily bad. The additional info on Brandon Belt would suggest that selectivity is not always good.

    No worries re. Belt, though. I believe he will continue to develop and have a monster breakout, possibly as soon as 2013.

    • ElJimador says:

      Belt wasn’t particularly selective though. Per the comments earlier in this thread he scores high by this measure not because his O-Swing% was unusually low (just 122nd lowest out of 265 hitters w/300+ PAs) but because his Z-Swing% was so surprisingly high (3rd highest in baseball). As someone who watched probably all of Belt’s PAs this year I find these numbers pretty hard to believe (swinging at the same % of pitches in the zone as Sandoval? Really??). However if you do accept these stats at face value they suggest more that he was swinging too much at pitches inside the zone than that he was being too selective.

      I hope Belt will break out but I worry the constant, public harping from Giants management for him to be more aggressive is going to mess him up if it hasn’t already. If they get out of his head and leave him alone then I think he’ll be fine too. I’m just not confident they’re going to do that.

      • DrBGiantsfan says:

        I disagree. He clearly needs to be more aggressive at the plate and limit his 2-strike situations, where he becomes almost helpless.

      • ElJimador says:

        But DrB, that’s the whole point. Is he getting to 2 strikes because he’s being too selective or not selective enough? The stats say he’s swinging at the same % of pitches in the zone as Sandoval, and at nearly the same % of first pitches (40% Belt, 42% Sandoval) and total pitches (83% Belt / 87% Sandoval). That does not sound like a guy whose problem is being too selective, especially when pitchers weren’t even challenging him (his 40.1% Zone% was 14th lowest in baseball out of 265 hitters with 300+ PAs).

        If you look at his splits by pitching count, where Belt really excels (relative to other hitters’ splits in the same situations) is when he gets ahead (sOPS+ of 150 in 2-0 and 142 in 3-1). And I don’t see how being more aggressive than he already is is a recipe for getting ahead in the count more.

  22. JKB says:

    Interesting that B. J. Upton and Freddie Freeman look similar in their relatively poor Oz coupled with relatively good Zz. The difference is that B. J.’s K% is 26.7, same as Dan Uggla’s K%, but Freeman has a 20.8 K%.

    Too bad we don’t have Early Swing % and Late Swing % data, because knowing WHEN to swing is an important component of selectivity too. If you swing early on a changeup thrown for a strike, isn’t that more like an O-swing than to a Z-swing?

  23. LeeTro says:

    I just did the same thing for the BIS data, and Bruce is the most selective hitter, while Prado remained the least selective hitter. Here are the top 10:

    Bruce, Fowler, Freeman, Alonso, Uggla, McCutchen, Headley, Nelson Cruz, Yadi, Swisher

    Bottom 10:

    Prado, Hardy, Trumbo, Victorino, Pujols, Alexei, Viciedo, Zimmerman, Revere, Francoeur

    Who would have ever thought Pujols could rate below Francoeur in a plate discipline stat (no matter its validity)?

  24. JKB says:

    One other thing that would be helpful is to create new O-swing% and Z-swing% indices that have the same denominator. Currently O-swing% is the percentage of outside pitches swung at, and Z-swing% is the percentage of zone pitches swung at, so they don’t add up to 100% or to Swing%. If you use instead O-swings/Total Swings, and Z-swings/Total Swings, then your ratios add up to 100% for every player, and you can control to some degree for high vs. low Swing% (i.e., aggressiveness), especially if you are using qualified players only.

  25. Mr Punch says:

    Mark Bellhorn was a middle infielder with the batting stats of an aging slugger – a 3 outcomes guy – who just wasn’t good at making contact, and knew it. The really selective Bosox player of the past decade was J.D. Drew, who would to all appearances purposely take 0-2 strikes because they weren’t where he wanted the ball. Maddening.

    But that (as some of the comments note) gets at the accepted meaning (isn’t it?) of selectivity – it’s not discriminating between balls and strikes, it’s waiting for your pitch, as recommended by a somewhat better Red Sox hitter.

  26. Spencer says:

    Very interesting stuff, especially about Prado. He is tough to read at the plate. 2010 he was very conservative, along with the whole Braves lineup. 2011 he seemed to be much more aggressive. And back to conservative in 2012. Seems he is taking the Tony Gwynn approach and just waiting to see 2 strikes.

    Speaking of, it would be very interesting to see Tony Gwynn’s numbers on this.

  27. supgreg says:

    I hate being someone that criticizes someone’s work that is presented to me for free, but “selective” is so far and beyond the wrong word, at least for this exercise. I know this is a math-driven website, so I apologize for the English I’m bringing to the table. I think “best/worst strike-zone judgement” would be the proper terminology instead of “most/least selective.”

    I did really like the premise of the article, until I got to the charts for most/least and noticed that there are good and bad hitters in both categories. I was hoping to see a bunch of shitty hitters in the least box.

  28. DCN says:

    This presupposes that you want to swing at a pitch in the strike zone and want to lay off a pitch outside the strike zone. Neither is necessarily true. Some batters will take a strike early in the count to set up for the perfect pitch; they don’t want to swing, because the expected quality of contact would be lower. And some batters have zones outside of the strike zone where they hit well. Zimmerman, for example, hits inside pitches off the plate very strongly, and in fact had two of the top five farthest-from-the-plate inside home runs this year:
    http://www.fangraphs.com/blogs/index.php/the-2012-season-in-inside-home-runs/

    So in the case of some of these hitters, it’s a matter of how strongly whether or not they select a pitch correlates with whether or not that pitch is in the strike zone.

  29. Moonraker says:

    Hey Carson, can you post the entire list of qualified batters and their “Sel” scores? I’d like to see whether I can derive some value from it for fantasy baseball purposes. Thanks.
