When You Should Ignore the Data

When Jim Leyland was setting his lineup for Game 3 of the ALDS, he looked to data for guidance. What he found was that Ramon Santiago was 7-for-24 in his career against CC Sabathia, giving him a .292 average against the Yankees ace. How much that played into his decision to hit Santiago second, we can’t say for sure, but he did mention this fact to reporters before the game and he did hit Santiago second last night. It’s probably safe to assume that Santiago’s history against Sabathia played some role in his placement in the lineup.

When Ken Rosenthal reported this on Twitter, I threw out a response about batter/pitcher match-up data in general, saying “Specific batter vs pitcher data is probably the worst use of statistics in the entire sport.”

A lot of people took umbrage at this comment, and when Ramon Santiago proceeded to go 2-for-3 off Sabathia — including a double that momentarily gave the Tigers the lead — many were happy to point out that Leyland’s move to insert Santiago worked, and thus, his decision to look to batter/pitcher match-up data was justified. There are quite a few problems with this scenario, however.

1. Santiago’s “success” against Sabathia relies on one viewing offensive capability through the lens of batting average. Santiago did enter the game hitting .292 against Sabathia, but he had never drawn a walk against him and had just one extra base hit, so his overall line against Sabathia was .292/.292/.333, good for a .625 OPS. Unless we’re still evaluating hitters like it’s 1884, Santiago’s previous performances against Sabathia should not have convinced anyone that he was likely to do well against him last night.

2. Batter/Pitcher match-up data has been shown to have no predictive value. In The Book, Tango/Lichtman/Dolphin devote an entire chapter (Ch 3, “Mano a Mano”) to looking for evidence that previous results of specific batter/pitcher match-ups would predict future results in those same match-ups. It wasn’t there. Even among the 30 most extreme examples of matched-pairs where the batter had dominated the pitcher over a three-year period, the group was barely better than average in the fourth season against those same pitchers. When looking at the flip side, where pitchers had dominated the hitters, the results were the same. Most interesting is that there was little difference in actual future performance by the 30 hitters who had dominated their rivals versus those who had been dominated by opposing pitchers. Even at the extremes, specific batter/pitcher data showed no real usefulness in projecting future results.
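
The slash line quoted in point 1 can be reproduced with a few lines of arithmetic. This is only a sketch: the article gives 7-for-24 with no walks and one extra-base hit, and the .333 slugging figure implies eight total bases, so the code assumes that hit was a double and leaves hit-by-pitches and sacrifice flies out of the on-base calculation. The function name `slash_line` is made up for illustration.

```python
def slash_line(hits, at_bats, walks, total_bases):
    """Return (AVG, OBP, SLG, OPS), ignoring HBP and sacrifice flies."""
    avg = hits / at_bats
    obp = (hits + walks) / (at_bats + walks)
    slg = total_bases / at_bats
    return avg, obp, slg, obp + slg

# Santiago vs. Sabathia before Game 3: 7-for-24, no walks.
# Assumption: six singles and one double, i.e. 8 total bases.
avg, obp, slg, ops = slash_line(hits=7, at_bats=24, walks=0, total_bases=8)
print(f"{avg:.3f}/{obp:.3f}/{slg:.3f}, OPS {ops:.3f}")
```

With no walks, the on-base percentage collapses to the batting average, which is why the first two numbers in the line are identical.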

In reality, we shouldn’t be overly surprised that this data doesn’t really tell us anything. Even when looking at multiple years, you’re generally ending up with something in the 20-30 plate appearance realm, a ridiculously small number of confrontations from which to be drawing conclusions. But, the problems with batter/pitcher data go even deeper — in order to get a larger sample, you generally have to find players who have been matching up against each other for many years.
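
To put a rough number on how small that sample is, here is a minimal sketch that treats each at-bat as an independent trial with a fixed chance of a hit. That is a simplification, and a normal-style band is crude at 24 at-bats, so read the output as an order of magnitude rather than a precise interval.

```python
import math

def binomial_standard_error(p, n):
    """Standard deviation of an observed rate over n independent trials."""
    return math.sqrt(p * (1 - p) / n)

at_bats = 24           # Santiago's at-bats against Sabathia
observed_avg = 7 / 24  # the .292 cited above

se = binomial_standard_error(observed_avg, at_bats)
low, high = observed_avg - 2 * se, observed_avg + 2 * se
print(f"standard error: {se:.3f}")
print(f"rough 2-SE band: {low:.3f} to {high:.3f}")
```

Under these assumptions the band spans several hundred points of batting average, which is too wide to separate a good hitter from a poor one.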

For instance, 16 of the 26 plate appearances Santiago had against Sabathia before last night came in 2002/2003, back when Sabathia was an inexperienced thrower trying to establish himself with the Indians. He’s a massively better pitcher now than he was then, and it’s hard to believe that anyone should care about what happened between those two 10 years ago. In fact, in the last four years, the two had faced off just three times, and Santiago had gone 0-for-3 and hit into a double play. Not only was the batter/pitcher match-up data of questionable use, it was almost entirely from a time when the two players were at very different points in their careers.

This is the kind of data that just isn’t useful, which is why I decried its usage on Twitter. However, I want to make clear that I’m not saying that there are no scenarios where I believe a specific batter could have an advantage over a specific pitcher, or vice versa. We know certain hitters do better against certain pitch-types, and that platoon splits are very real, so we’d expect a left-handed masher to do very well against a right-handed side-armer. I’d even be open to hearing good arguments about why a specific player could have success against a specific pitcher beyond generalities like handedness and velocity.

However, I’d suggest that this is an area where the evidence would need to be based on something other than the data. As with high school statistics, the numbers are essentially useless, which is why no one in a draft room spends any time quoting the results of a player’s high school performance. That doesn’t mean that we can’t differentiate between amateur players, only that we’ve recognized that we need tools other than their performance to help us understand who is likely to succeed and who is not.

The same is true here. If you want to make a case that a specific batter has an advantage over a specific pitcher, go ahead and make that case. We’re not saying that there are no situations where that reality exists; we’re just saying that relying on the past results of batter/pitcher confrontations is not going to help you find those specific situations. The data tells you what happened in the past, but it sheds no light on what will happen in the future, and for the purpose of deciding who should play and who should not, it should just be ignored.







Dave is a co-founder of USSMariner.com and contributes to the Wall Street Journal.


90 Responses to “When You Should Ignore the Data”

  1. mister_rob says:

    who was the alternative and what are his numbers vs CC/CC type pitchers?

    -9 Vote -1 Vote +1

    • jake says:

      Raburn…career OPS vs lefties much better than Ramon. Been the third best hitter on the Tigers since AS break this year.

      +5 Vote -1 Vote +1

      • mister_rob says:

        The same Ryan Raburn who holds a 167/231/250 slash line with 9 K’s in 26 PAs against Sabathia?

        Makes Ramon’s mid 600’s ops look Miggy-ish in comparison

        -53 Vote -1 Vote +1

      • T-Roll says:

        Wow, way to completely miss the point, Mr. Rob.

        +21 Vote -1 Vote +1

      • mister_rob says:

        What point?
        Leyland chose Santiago over Raburn. Santiago career OPS is 200 points higher vs CC than Raburn’s. And Santiago went on to go 2-3 against CC
        So, does anyone HONESTLY think that Raburn, who had previously put up a Ted Lilly-esque slash line in his career vs CC, and has had 4 hits EVER vs Sabathia would have done better than Santiago did? Its not like he benched Miggy. We are talking about Ryan Raburn, he of the sub300 obp on the year

        Dont see how anyone could come to that conclusion. Therefore Leyland’s decision was the correct one

        -35 Vote -1 Vote +1

      • Matthias says:

        Individually against CC, both these hitters have an incredibly small sample size. And, as mentioned, many of Santiago’s PAs came against Sabathia nearly 10 years ago.

        However we can look at how these batters do against similar pitchers to get a bigger sample size and more confidence in our result, and it sounds like that data points to Raburn.

        +6 Vote -1 Vote +1

      • mister_rob says:

        And which lefties would you consider “similar” to CC? Because to me, what Raburn has done against guys like Rowland-Smith means nothing in this discussion
        How about Cliff Lee? Raburn has a 263 ops vs him in 20 PAs
        How about David Price? Raburn has a 607 ops vs him in 14 PAs
        Kazmir before he was bad? Raburn has a 643 ops in 14 PAs

        Seems to me Raburn has a real problem hitting hard throwing lefties. But hey he lights up rowland-smith, bruce chen, and aaron laffey. Thats got to count for something

        -14 Vote -1 Vote +1

      • mister_rob says:

        And gee, I just looked and (albeit small samples) Ramon Santiago has beat up on guys like Price and Kazmir
        So maybe, just maybe, Santiago is a better hitter vs hard throwing lefties than Raburn is

        -12 Vote -1 Vote +1

      • CircleChange11 says:

        This, to me, is getting to where it’s at.

        While the broadcaster says “Santiago vs. Sabathia”, doesn’t it seem logical that teams compile stats on their players versus certain pitchers/types/pitches?

        Wouldn’t it be fairly obvious to them which players can’t hit a curveball or changeup?

        The players mentioned that Santiago does well against are non-curveball lefties. So as long as it doesn’t move a lot, Santiago may be able to make good contact. When it’s slower, with more bend, perhaps not.

        I was re-reading the chapter in BTN last night about PECOTA and how it views player types.

        It probably would not take long to look at how Santiago does against the fastball-slider-cutter lefties and how he does against the curveball-changeup lefties. We may find something useful.

        I just don’t have much affinity for the “ignore it because it’s a small sample” stuff regarding situations that will never accumulate enough PAs to have high confidence levels. It’s a simplified (perhaps lazy) way of saying “we can never know”. Well, for a manager that has to make a decision “Aw shucks, it’s all luck anyway” is not a satisfactory reason.

        This is where statistics gets interesting to me, how players do against various player types. Since baseball essentially comes down to a pitcher-batter confrontation, it is perhaps the most important data in the game … but we act like we can never know much about it. I think the more we dig, the more we’ll find that we do detect patterns.

        I’m doubtful that in Dave Duncan’s binder he has page after page of “It doesn’t matter, it’s just small sample size” or “anything can happen”.

        +7 Vote -1 Vote +1

      • CircleChange11 says:

        Sorry, the point I was getting at is that we should not inherently self-limit our research and commentary to “versus lefties”. To me, that does not represent advanced analysis.

        Certainly, we know that all lefties are not equal, let alone similar.

        While it’s true that we’re all great lovers, we throw different pitches at different speeds with varying levels of ability.

        Likewise, as we can even see from our pages, there are batters that absolutely kill fastballs but little else, and other guys that murder changeups, but cannot hit a slider to save their life. Wouldn’t that play into the situation as much, if not more, than handedness?

        Vote -1 Vote +1

      • mister_rob says:

        That is what I was trying to get at, and earned a bunch of dislikes for my efforts. Maybe Dave should have presented who the alternative was, what he had done vs CC, or what he had done vs hard throwing lefties. and what Santiago had done vs other hard throwing lefties. But he didnt, maybe because it went against his point
        The dislikes should be directed to Dave on this one. At the very least a lazy piece of writing

        and someone let me know the next time Raburn goes 2-3 against CC. Even if it is July of 2017

        -15 Vote -1 Vote +1

      • dnc says:

        “So, does anyone HONESTLY think that Raburn, who had previously put up a Ted Lilly-esque slash line in his career vs CC, and has had 4 hits EVER vs Sabathia would have done better than Santiago did?”

        No, the chances are that Raburn wouldn’t have done better. Of course, if you play that game again, the chances are very strong that Santiago wouldn’t be able to do that either.

        You can’t take a 2-3 in one game and use it as some kind of evidence. The performance is an outlier (just as Raburn going 2-3 would have been).

        +7 Vote -1 Vote +1

      • CircleChange11 says:

        You can’t take a 2-3 in one game and use it as some kind of evidence.

        If it’s not evidence then what is? It IS evidence … just not as much as we want, nor the quality we seek.

        What managers are looking for is successful performance in various situations. We can say whatever we want, but Santiago continues to perform well against CC Sabathia.

        Does that mean he’ll continue to hit well against him in the future? Well, it’s no guarantee … but I like his chances versus a similarly talented player that has had terrible success against him.

        I’m trying to figure out what you guys want. You HAVE to play SOMEONE against CC Sabathia. Someone has to bat against him. You have Raburn and Santiago. If there’s a game 5 and for some reason CC starts, who do you play? Santiago or Raburn? Why?

        ——————————-

        Dave brings up an interesting point that I wish he would delve into.

        He mentioned that half of Santiago’s at bats are against the 2002 version of CC Sabathia. The insinuation is that because of this, the overall results don’t matter or are inconclusive.

        Couldn’t one look at it like “He could hit him in 2002 and he can still hit him in 2011?”. CC has gotten a lot better over the last 8 years, and Santiago is having some success against him.

        I don’t know either way, just pointing out how there are multiple ways of perceiving and analyzing small sample data.

        Like I mentioned earlier, all of us would love to have loads of data that make the confidence level very high, but in baseball we rarely have that. So, you start with true talent … which is why guys like Howard and Granderson still play against lefties, but when true talents are about equal (or both below average), then you start looking for the “little edge” you can gain.

        We say that we don’t value small sample data or that we shouldn’t base decisions on it, but if we’re in the manager’s spot, we likely do the very same thing.

        Vote -1 Vote +1

  2. alex says:

    So obviously virtually everything you say is true. However, when we have a sample of batter v pitcher match-ups, it is, in fact, a sample, which will be a certain number of standard deviations away from the mean given the sample size. There was a thread about this on Brotherly Glove earlier this season (http://www.brotherlyglove.com/2011/07/06/owning-the-opposition/). My comment is the fifth one, and I’d be interested what you guys think.

    Vote -1 Vote +1

    • Matthias says:

      @Alex

      I like the breakdown in 28-plate-appearance buckets to get an idea of his standard deviation. However, I’m assuming each of those 28 PA buckets is coming in sequence, meaning against an assortment of pitchers. Facing different types of pitchers likely changes his true likelihood of success, and a constantly changing parameter can often lead to even more variance.

      So what I’m trying to say is that against one single pitcher, like Volstad, his variance is likely to be lower in any 28 PA than it would be against an assortment of pitchers, and thus the true Z-score may actually be a little higher, making it a little more convincing that Howard has something on Volstad (25% p value is a little sketchy, but I could handle maybe 10%).

      Thanks for the work!

      Vote -1 Vote +1

      • alex says:

        Right.. I randomized all his PA 10,000 times, so the buckets were probably often against 25 or more different pitchers.

        Vote -1 Vote +1
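
The bucket exercise described in this exchange can be sketched roughly as follows. Everything here is an assumption for illustration: the career record is synthetic (380 times on base in 1,000 plate appearances, not any real player's line), the helper name `bucket_rate_spread` is made up, and 200 shuffles are used instead of 10,000 to keep it quick.

```python
import random
import statistics

def bucket_rate_spread(outcomes, bucket_size, n_shuffles, seed=0):
    """Mean and std. dev. of the success rate across shuffled fixed-size buckets."""
    rng = random.Random(seed)
    pool = list(outcomes)
    rates = []
    for _ in range(n_shuffles):
        rng.shuffle(pool)
        # Full buckets only; leftover plate appearances are dropped.
        for start in range(0, len(pool) - bucket_size + 1, bucket_size):
            bucket = pool[start:start + bucket_size]
            rates.append(sum(bucket) / bucket_size)
    return statistics.mean(rates), statistics.pstdev(rates)

# Synthetic career: 1 = reached base, 0 = out (made-up totals).
career = [1] * 380 + [0] * 620
mean_rate, spread = bucket_rate_spread(career, bucket_size=28, n_shuffles=200)
print(f"mean {mean_rate:.3f}, std dev across 28-PA buckets {spread:.3f}")
```

The spread is what a z-score for a single 28 PA matchup would be measured against. Because shuffling mixes all opposing pitchers together, it says nothing about whether variance against one pitcher would be lower or higher.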

  3. Tangotiger says:

    It’s not completely random, and sample data is valid data. But you need to REGRESS the observed to get to the true.

    How much you have to regress 30 PA, I don’t know, but at the very least it’s going to be at least 90% toward the mean.

    Not to mention, as David said, is that the hitter’s observed performance was below average to begin with.

    +6 Vote -1 Vote +1
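
A minimal sketch of what regressing 90% toward the mean does to an observed number. The baseline of .260 is a hypothetical stand-in for whatever the hitter would be expected to do against that kind of pitcher; only the 7-for-24 and the 90% figure come from the discussion above.

```python
def regress_toward_mean(observed, baseline, regression):
    """Shrink an observed rate toward a baseline.

    regression=0.9 means 90% of the gap to the baseline is removed.
    """
    return baseline + (1 - regression) * (observed - baseline)

observed = 7 / 24   # Santiago's .292 against Sabathia
baseline = 0.260    # hypothetical expected average, for illustration only

estimate = regress_toward_mean(observed, baseline, regression=0.9)
print(f"observed {observed:.3f} -> estimate {estimate:.3f}")
```

If the regression fraction is modeled as k / (n + k), which is an assumption about the functional form, then 90% at n = 30 PA corresponds to k = 270 PA of baseline performance added to the sample.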

    • alex says:

      Tango, question:

      You wouldn’t regress the 30 PA *and then* figure out how many standard devs away from the mean the sample is, right? ie, determining how far from the mean the sample is is basically a laborious way of regressing 90%, or whatever, no?

      Vote -1 Vote +1

    • alex says:

      ….because the question we’re trying to answer is “How much of the Observed is true?”

      Vote -1 Vote +1

  4. Jon E says:

    Well….as a Tigers’ fan who has seen a lot of Leyland’s moves over the years, I’m at least fairly confident that Leyland wanted Santiago in the two-hole mostly for bunting purposes behind Austin Jackson and in front of his 3-4-5 guys. He didn’t have Santiago in the lineup on Friday night in the aborted game against Sabathia….so I don’t think the .292 BA was the reason he was inserted last night because he would have been in the Game 1 lineup if that was the case.
    I think Leyland saw that CC looked good Friday and noted with the series moving to Comerica that runs would be at a premium (at least in his opinion I’d bet)….so he was going to go into “manufacture” mode (as shown by Santiago’s failed sac-bunt effort prior to his big hit!).
    None of this is my personal endorsement of Leyland’s moves….I think he bunts way too much….just my guess at “why” Santiago was where he was. Hey….it worked out nicely. I went to bed happy.

    +14 Vote -1 Vote +1

  5. adr3 says:

    I absolutely agree that this is one of the most egregious and most frequently observed misuses of statistics…. The only slight advantage in some of these scenarios is that the players often report the confidence boost that comes with the awareness of good splits against a pitcher. Anything that makes a batter more comfortable in the box is a slight advantage….so the moral is as with everything else: moderation and good judgment with its application.

    Vote -1 Vote +1

    • Choo says:

      Isn’t this what managers are for? Regardless of the batting average nonsense, somewhere in Leyland’s memory bank were visions of Santiago putting good swings on the ball against CC. The fact that Santiago had zero walks against Sabathia shouldn’t be too shocking when you consider the matchup – aggressive hitter who rarely walks vs. aggressive strike-throwing machine

      Vote -1 Vote +1

      • CircleChange11 says:

        I think there is something to what you are saying. Managers just can’t sit around waiting for 1000 PA samples in various situations in order to make decisions.

        Point is, Santiago has made good contact against Sabathia. Leyland put him in the lineup and he continued to hit well against Sabathia.

        Would managers like more data? Sure. Do they have to make decisions based on the data they have? Yes. Was it a pretty good decision to play Santiago? Probably.

        No batter ever racks up enough PA against a single pitcher in a short enough time span to have a statistically reliable sample. I don’t understand how the conclusion is “it doesn’t matter”.

        It reminds me of BP’s article regarding what Mike Redmond knows about Tom Glavine, to the tune of being a .430+ career hitter against him. Do I, as a manager, really care if it’s Redmond’s true talent? If he gets lucky against Glavine? No. All I know is that whenever Redmond faces Glavine he gets on base a lot. Based on the information I have, and in the sample sizes I have, I can use that.

        There has to be some difference between 2-for-5 and 16-for-40, within reasonable time spans.

        Vote -1 Vote +1

    • Bill says:

      But, then how does one explain Tango’s Mano a Mano study in The Book? This study seems to indicate that the effect of any confidence boost is too small to measure.

      Vote -1 Vote +1

      • Choo says:

        I have no doubt that true talent washes out the luck and small-sample jive over time. But the human confrontation still exists. When you face a pitcher that just “feels right” it’s the exact same feeling you get when you have good chemistry with another person – that rhythm or cadence or whatever you want to call it is synchronized and completely natural.

        Do I believe Santiago can barrel up Sabathia’s entire repertoire on any given night? Hell no. But I wouldn’t be surprised if Sabathia’s unique blend of tempo, release point, velocity, break, etc. creates a “good chemistry” scenario for Santiago.

        Vote -1 Vote +1

      • Blue says:

        20 to 30 plate appearances is enough to trigger statistical significance with non-parametric tests–the notion that you need a sample of hundreds or thousands to achieve statistical significance is simply incorrect.

        If I’m a manager and I have a guy who hits .300 and is, say, 2-for-30 against a specific pitcher I’m going to be pretty inclined to give him a rest that day…and the statistics support the choice. I’m even more inclined to do so if there is some observed causal reason for the relationship (e.g., their guy’s best pitch is right in the hole in my guy’s swing).

        Vote -1 Vote +1
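
For the 2-for-30 example, the exact one-sided binomial probability is easy to compute. This sketch assumes at-bats are independent with a constant .300 hit probability, and it only applies to a matchup singled out in advance, not one found by scanning many matchups for the worst line.

```python
from math import comb

def binomial_tail_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# A .300 hitter going 2-for-30 or worse, treating at-bats as independent.
p_value = binomial_tail_at_most(k=2, n=30, p=0.300)
print(f"P(2 or fewer hits in 30 AB) = {p_value:.4f}")
```

A small tail probability here says the line would be unusual for a true .300 hitter against this pitcher. Whether that predicts the next game is the separate question the article is arguing about.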

      • Blue says:

        That was meant to be a general reply.

        Vote -1 Vote +1

      • JDanger says:

        This is the thing, though. You are welcome to speculate that, sure, but you’ve offered no evidence to support your speculation.

        A number of studies, notably the Tango/MGL “Mano a Mano” chapter in The Book mentioned above, have shown there is little to no predictive value in batter vs pitcher matchups, especially at that sample size.

        How about a link to a non-parametric test saying 20-30 is statistically significant, or something?

        Vote -1 Vote +1

      • Blue says:

        JDanger, many medical trials use non-parametric tests and may establish robust statistical findings with 20 to 30 patients.

        Vote -1 Vote +1

      • Steven Ellingson says:

        Blue,

        there’s a difference between an experiment, where you are looking for a certain effect, and just parsing through past data.

        If you had reason to think that Santiago was going to be better against CC, and THEN you watched 30 PAs, that could be significant.

        If instead, you have two guys, one with a better OPS vs. lefties, and another with a better OPS against that pitcher in 30 PAs, you are going to be correct 90% (or something) of the time if you pick the guy with better OPS vs. lefties.

        Obviously managers know more than just OPS vs. lefties and OPS vs. pitcher, but that doesn’t mean that they are weighting all the factors correctly and making a good decision based on all the given information.

        Vote -1 Vote +1

  6. Sean says:

    What if there are extreme BB/K splits, indicating that a batter is particularly comfortable or uncomfortable vs a given pitcher?

    Vote -1 Vote +1

  7. GMH says:

    Thank you, Dave. I particularly like your comments regarding platoon splits and an individual hitter’s success against certain pitch types. Managing lineups and pinch hitters – and its flipside, managing a pitching staff and pitching changes – boils down to trying to gain the upper hand on match ups. And Jim Leyland should know his players’ strengths and weaknesses better than anyone. The fact that a small sample size of numbers corroborates what the manager already suspects to be true is almost immaterial. One hitter may simply see the ball better off of a particular pitcher, even if the vast majority of other hitters do not. And I would suspect that a thoughtful manager like Jim Leyland would recall the quality of swings a particular player has had against a particular pitcher and rely on his memory as much as a stat sheet consisting of only 20 or so at bats scattered across disparate seasons.

    Vote -1 Vote +1

  8. Notrotographs says:

    Torii Hunter vs. Roger Clemens is my favorite mano a mano.

    0-28, 2 BB, 15 K

    Vote -1 Vote +1

    • Choo says:

      Wow, I can actually smell the pee trickling down Hunter’s leg.

      Vote -1 Vote +1

    • Bill says:

      Sample size be damned, I think that line is predictive

      Vote -1 Vote +1

      • Notrotographs says:

        Roger could probably crawl out of bed right now and make Torii look silly at the plate. Those were some of the most embarrassing plate appearances I’ve ever seen.

        Vote -1 Vote +1

      • Blue says:

        Which is the other point–the numbers can be used in conjunction with observed, qualitative data.

        Vote -1 Vote +1

    • CircleChange11 says:

      How the hell did he walk him twice?

      Vote -1 Vote +1

      • Mr. Jones says:

        The one I like is Frank Thomas v. Nolan Ryan. Thomas was 0 for 12 with 2 walks and 11 K’s. SSS be damned, I think that is saying something.

        Vote -1 Vote +1

    • Blue says:

      The chance of that being random variation is very, very small…a suitable non-parametric test would probably be significant at something like the alpha=0.0001 level.

      Vote -1 Vote +1

      • jake says:

        Not sure about the exact numbers. You might in fact be right. But given that Hunter has faced 59 guys at least 25 times in his career, his being on the extremes (amazing success and amazing failure) with a few of the pitchers might not be too shocking.

        In other words, even with 59 identical pitchers, you could get the same stratification (utter failure vs one guy, utter success vs another). I am not saying that this is definitely what happened. But it seem possible, and may not be all that unlikely. (Someone smarter than me can run the numbers if they want).

        Vote -1 Vote +1
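
The "someone smarter can run the numbers" part can be roughed out analytically. This sketch assumes the 59 matchups are independent and that each has the same chance of producing an "extreme" line purely by luck; the helper name is hypothetical.

```python
def chance_of_at_least_one(per_matchup_prob, n_matchups):
    """P(at least one 'extreme' result) across independent matchups."""
    return 1 - (1 - per_matchup_prob) ** n_matchups

n_pitchers = 59  # pitchers Hunter has faced at least 25 times, per the comment

for alpha in (0.05, 0.01, 0.001):
    prob = chance_of_at_least_one(alpha, n_pitchers)
    print(f"alpha={alpha}: P(at least one extreme matchup) = {prob:.3f}")
```

The looser the definition of "extreme", the closer to certain it is that at least one of the 59 matchups qualifies by chance. A line far out in the tail is much less likely to show up that way, so how bad the failure is still matters.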

      • Blue says:

        Depends on how bad the failure, actually. That’s the whole point of non-parametric tests. Think about it this way–if I flip a coin 40 times and it NEVER comes up heads, how secure are you in believing that the coin is fair?

        Vote -1 Vote +1

      • Steven Ellingson says:

        Blue,

        the coin analogy doesn’t quite work.

        First, if p was .31 instead of .5, it is MUCH more likely to come up zero times. I realize it was just a simple analogy, but that makes a huge difference. If it was 28 coin flips, there really is no argument against that.

        5% of all batter pitcher matchups with 28 PAs would come up “significant” at the .05 level even if the game was played by computers. If you just pick out the very extreme cases of course they’re going to have a very low p-value!

        Now, if you get the population of all pitcher-batter matchups of 25-30 PAs or something, and look at the distribution as compared to a binomial distribution, then we can have a better idea of how significant that really is.

        Vote -1 Vote +1
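
The difference between a coin and a hitter is easy to see numerically. A minimal sketch, assuming independent trials and using the p = .31 and 28-trial figures from the comment above:

```python
def prob_zero_successes(p, n):
    """P(no successes in n independent trials with success rate p)."""
    return (1 - p) ** n

n = 28  # Hunter's hitless at-bats against Clemens
for p in (0.50, 0.31):
    print(f"p={p:.2f}: P(0 for {n}) = {prob_zero_successes(p, n):.2e}")

ratio = prob_zero_successes(0.31, n) / prob_zero_successes(0.50, n)
print(f"the drought is about {ratio:,.0f} times more likely at p=0.31")
```

Either probability is small for one matchup picked in advance. The point about selection is that this matchup was picked because it was the most extreme one available.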

  9. baty says:

    I get the impression that in lots of cases, managers are maybe just using information like this to justify their instinct.

    Vote -1 Vote +1

  10. Sean says:

    David Ortiz vs Bartolo Colon is good too

    5 for 43, .116/.191/.209, 4 BB vs 15 K

    Although he did pop his one and only HR vs Colon this past season

    Vote -1 Vote +1

  11. adohaj says:

    Roy Halladay has a distinct advantage over Jamey Carroll. I dare you to find a counter example using stats

    Vote -1 Vote +1

  12. David Kelly says:

    Using an app called “Muggsy” (iTunes), I simulated the expected score of the game with Santiago and then with Raburn batting 2nd, holding everything else constant, including CC vs Verlander. With Santiago, the Yankees would be expected to win 56.7% of the time by an average score of 4.73-3.68. With Raburn, the Yankees are expected to win 55.3% of the time by an average score of 4.75-4.04. Muggsy calculates pitcher-batter-fielder matchups, i.e., conditional probabilities, using the technique recommended by Tango (and Bill James), which basically involves the use of “likelihood ratios” to deal with sample size issues. Interesting to note that the Tigers make 2X more errors on average.

    Vote -1 Vote +1
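
The comment does not spell out the calculation. One common formulation that fits the description is the odds ratio method, closely related to Bill James's log5: combine the batter's rate, the pitcher's rate allowed, and the league rate on the odds scale. Whether Muggsy does exactly this is an assumption, and all three input rates below are hypothetical.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1 - p)

def odds_ratio_matchup(batter_rate, pitcher_rate, league_rate):
    """Expected rate for a batter/pitcher pairing via the odds ratio method."""
    matchup_odds = odds(batter_rate) * odds(pitcher_rate) / odds(league_rate)
    return matchup_odds / (1 + matchup_odds)

# Hypothetical on-base rates, for illustration only.
batter_vs_lefties = 0.340    # batter's OBP against left-handed pitching
pitcher_vs_righties = 0.300  # OBP the pitcher allows to right-handed batters
league = 0.320               # league OBP in comparable situations

expected = odds_ratio_matchup(batter_vs_lefties, pitcher_vs_righties, league)
print(f"expected OBP in this matchup: {expected:.3f}")
```

The appeal of this approach is that both inputs come from large samples (the batter against all similar pitchers, the pitcher against all similar batters) rather than from the 20 to 30 head-to-head plate appearances.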

  13. Adam says:

    Totally agree that these stats are overused. But Dave, you’re wrong. The worst use of statistics in the sport is batter/pitcher v. TEAM historically. Announcers always cite things like “Jon Lester is 14-0 vs. the Orioles” or “David DeJesus has hit .337 v. the Angels in his career” when that means pretty much nothing. It’s even more of a junk stat than hitter v. pitcher.

    +10 Vote -1 Vote +1

    • dnc says:

      Those numbers are terrible and might influence poor fantasy baseball decisions, but I haven’t seen evidence of managers basing decisions on them nearly as often as batter v. pitcher.

      I think you’re both right here.

      Vote -1 Vote +1

  14. manbearpig says:

    Dusty Baker uses this type of data way too frequently to justify his lineup construction, often with comically small sample sizes.

    Vote -1 Vote +1

    • BaseballDudeNYC says:

      Yeah… I remember Fred Lewis playing over Alonso, Sappelt and Frazier in LF, because he was 5/7 against some pitcher. I hope Dusty reads this site, but I doubt it.

      Vote -1 Vote +1

  15. CircleChange11 says:

    Let’s be realistic about this too. It’s not like Leyland kept Miguel Cabrera out of the lineup because Miggy was 2-19 against Pitcher X.

    Leyland used smallish sample data, combined with observations, to insert Santiago over Raburn.

    It wasn’t exactly a make or break decision, although it worked out well for Leyland.

    Ryan Theriot had 2 doubles against Cliff Lee. Baseball happens.

    Vote -1 Vote +1

  16. mister_rob says:

    Its just killing these guys that a team with what is perceived as an old school front office and an old old school manager ran away with their division and is about to advance to the ALCS

    Sure hope the Tigers dont win it all, that’d be two years in a row an antiquated management team brought home the hardware while the Epsteins, Beanes, and Seattles (LOL) of the world watched the postseason at home

    -11 Vote -1 Vote +1

    • KDL says:

      I know I’m not supposed to feed you, but I can’t resist.

      You know who else runs antiquated management style…Houston, Chicago (both teams), Baltimore…So, there’s that anecdotal evidence to add to your argument.

      Vote -1 Vote +1

  17. Blue says:

    20 to 30 plate appearances is enough to trigger statistical significance with non-parametric tests–the notion that you need a sample of hundreds or thousands to achieve statistical significance is simply incorrect.

    If I’m a manager and I have a guy who hits .300 and is, say, 2-for-30 against a specific pitcher I’m going to be pretty inclined to give him a rest that day…and the statistics support the choice. I’m even more inclined to do so if there is some observed causal reason for the relationship (e.g., their guy’s best pitch is right in the hole in my guy’s swing).

    Vote -1 Vote +1

  18. Richie says:

    ‘Building knowledge’ and ‘a decision aid’ are two separate things. If I submit something to an academic journal showing I’ve ascertained something at a 90% level of certainty, I may get laughed at. (assuming academics laughed at professional matters, which they don’t; trust me, they’re major grumps) A currency trader who could predict currency movements with 53% accuracy would quickly be the world’s richest man.

    Ergo, “barely better than average” can still be quite the decision aid. And regarding that, to rule out usefulness Tango would also have had to test at 2 years, 1 year, a half-year. (did he? may have, for all I know) Real-life decisionmakers have to decide every day and twice on Sundays working off of small samples. So moving any distance away from a coin flip is useful.

    Vote -1 Vote +1

  19. Flip says:

    Santiago was in the game over Raburn for defensive purposes, because the game was at Comerica Park. Leyland even said this. He said that Raburn and Santiago were about the same vs Sabathia hitting wise, so he went with the better defensive option this time. At Yankee Stadium, he preferred the better power hitter.

    Santiago usually bats 9th, but not when Inge plays and bats 9th. In that case Santiago almost always bats 2nd.


  20. David Kelly says:

    Just want to respond to Adam…to clarify, the results I generated were indeed Santiago vs Sabathia but not obtained by looking at the relatively few historical at-bats Santiago had against Sabathia. Instead, as described by Tango & James, Santiago’s stats were adjusted to account for Sabathia by looking at Santiago vs all lefty pitchers and Sabathia vs all righty batters. Tango and James show that this technique maps very well to actual performance for individual batters vs pitchers where the sample size is large enough. Is 20-30 at-bats enough? Dave Cameron mentions “The Book”, which goes through that analysis in pretty good detail.
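
    One common way to combine a batter’s rate against a pitcher type with a pitcher’s rate against a batter type is the odds-ratio method, sketched below. Whether this is exactly the calculation behind the results mentioned above is not stated, so treat it as an illustration only. The three rates are hypothetical placeholders, not Santiago’s or Sabathia’s actual splits, and the function names are invented for the example.

```python
def odds(p: float) -> float:
    """Convert a probability into odds."""
    return p / (1.0 - p)


def matchup_estimate(batter: float, pitcher: float, league: float) -> float:
    """Odds-ratio estimate of a batter's rate against a specific pitcher.

    batter  -- batter's rate vs pitchers of this handedness
    pitcher -- rate the pitcher allows vs batters of this handedness
    league  -- league-average rate for the same split
    """
    combined = odds(batter) * odds(pitcher) / odds(league)
    return combined / (1.0 + combined)


# Hypothetical inputs, chosen only to show the mechanics.
est = matchup_estimate(batter=0.300, pitcher=0.250, league=0.260)
print(f"Estimated matchup rate: {est:.3f}")
```

    When the batter, the pitcher, and the league all share the same rate, the estimate returns that rate unchanged, which is a quick sanity check on the formula.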


    • mister_rob says:

      So your results are faulty, because it makes no sense to take into consideration how these guys hit junkballer lefties when deciding how they will fare against CC.

      A righthander like Verlander is a more suitable comparison for CC than a lefty like Jamie Moyer.

      Yet I’m the one racking up all the dislikes. Go figure.


      • CircleChange11 says:

        [1] If you’re going to comment against the grain or against generic sabermetric thinking, then you’re going to have to tolerate the dislikes. If you feel you are correct, then state it with conviction and/or explain it more thoroughly in an attempt to get people to see a wider perspective.

        [2] In baseball, in terms of individual matchups, all we have are observations and small sample data. This is my reality as a former HS and travel coach. You have to notice things in each at bat, such as pulling out of the shoulder on breaking balls, not being able to reach the outside pitch, chasing high pitches, etc. It’s small sample data, but it’s all ya got. Actually when you combine observations of similar players over years of experience, you can get some rather large data samples.

        For example, it’s not uncommon to be able to look at a batter’s stance and swing and have a pretty good idea of what pitches he can handle and which ones are weaknesses. Same thing with pitchers. Most pitchers are going to have similar release points and movement of pitches. That’s where velocity really comes into play … the reduced reaction time and time to recognize pitches.

        If a batter cannot hit 94mph fastball-slider combos, then they probably cannot hit Zack Greinke or Adam Wainwright. The odds against a batter being able to hit one but not the other have to be astronomical.

        The “batter versus pitcher types” really interests me. If I hit the Little Lotto soon, I’m all over this.


  21. David Kelly says:

    Agree that the move appears to make sense from a defensive perspective, saving .36 expected runs.


  22. Blue says:

    One more thing…we’re assuming the “event” is a PA. But that’s not really true; the “event” is really a proffered pitch and a batter reaction. It should be possible to expand the sample size for pitcher versus batter interactions by using this lower level of aggregation.


  23. Bryz says:

    I like to think that a significant number of strikeouts or home runs (or extra base hits in general) may be a case where you can give a little more credit to the pitcher or hitter, despite the small sample size. For example, Paul Konerko was 5 for 6 with 4 HR against Jimmy Gobble (.833/.889/2.833). Then there’s Jacque Jones vs. Mark Buehrle, where Jones was 1-27 with 12 K (.037/.037/.037).

    This is something I’d like to see tested: if a large number of XBH or K could predict that a hitter was indeed owning or being owned by a pitcher, rather than just being a result of good or bad luck. Perhaps starting with batter/pitcher pairs in the same divisions would be a good place to start.


  24. jpg says:

    “The data tells you what happened in the past, but it shines no light on what will happen in the future.”

    My question is what statistic or data should Leyland have used? I agree with CC and others. Sometimes small samples, as unreliable as they may be, are all Jimmy’s got. Also I don’t think 20 or 30 PA against a pitcher is so small a sample that it’s insignificant. If one guy is 9-30 against Pitcher X, that tells me he is probably seeing the ball well against that pitcher. Using Notroto’s example above:

    “Torii Hunter vs. Roger Clemens is my favorite mano a mano.

    0-28, 2 BB, 15 K”

    I don’t need to see Hunter face Clemens 250 more times to know that he doesn’t have a shot against him.


  25. Sean says:

    A specific player could have success against a specific pitcher beyond generalities if the batter was able to get a read from the pitcher. Much like in poker, some people are better at reading people and situations than others. If a pitcher was particularly easy to read, then a batter who was able to pick up on certain signs and have a better idea of what pitch was coming would in theory have an elevated level of success, because against that pitcher he had an advantage he didn’t have against pitchers he couldn’t get reads on.


  26. Z.W. says:

    Dave- Typically I like your stuff and I am a numbers guy, but this shows a clear lack of actually playing the game. Some pitchers, for whatever reason, you just cannot pick up. The ball seems hidden until it is on you. Maybe the pitcher crosses his body, comes from an odd angle or specializes in a pitch you don’t see particularly well. Others, the ball seems to come out of their hand on a tray. Any hitter will tell you about THAT guy — good or bad. It’s unarguable, if you have played the game before.

    Also, I think you contradicted yourself:

    “Despite looking at the 30 most extreme examples of matched-pairs where the batter had dominated the pitcher over a three-year period, the group was barely better than average in the fourth season against those same pitchers.”

    Well, isn’t that kind of the idea? You’re better than average than the league against that guy? Right?

    Then you go on to say how it shouldn’t matter anyway because of how small the sample size is. But ONE season is an even smaller sample than the three previous. You’re disproving the very point you’re trying to use above it.

    Again, typically like your stuff, but this is pretty weak.


    • JDanger says:

      The reason statistics are used in the first place is to eliminate human emotional and sensory bias. It is very possible that you FELT you were owning a pitcher, but in fact you were just lucky. Our senses and emotions betray us all the time. That’s why we even bother with numbers– for another perspective.

      I can’t say that what you are saying isn’t true, maybe you DID own that pitcher. But there is still no statistical proof as of yet that such a thing exists. Because if it does exist, it would show up in the numbers. And it hasn’t.

      Until it shows up in the numbers, everything is just speculation. And we shouldn’t expect everyone to just believe someone because they ‘played the game’, and then just go home and call it a day. That would be irresponsible analysis.

      Statistics has verified most everything we FEEL is true about baseball, except for a few things, like specific batter/pitcher match-ups. This gives me enough reason to consider it a possibility that what we FEEL is true is in fact not entirely true.


      • CircleChange11 says:

        “It is very possible that you FELT you were owning a pitcher, but in fact you were just lucky.”

        Not likely. I say this humbly, but I think you’re using the word “owning” in too general of a way. When there is “ownership” of one player by another, there is generally very little dispute due to the lopsidedness of the results. Ownership is generally associated with events that the players’ skills highly influence (K, BB, HR, etc).

        “But there is still no statistical proof as of yet that such a thing exists. Because if it does exist, it would show up in the numbers. And it hasn’t.”

        It has shown up in the numbers, otherwise we wouldn’t be having this discussion.

        It’s a sample size issue, and it’s more troubling from a predictive standpoint than it is from the standpoint of analyzing past performance.

        Anyway, statistics have recorded the results of pitcher-batter matchups. The issue is how those stats can be used to predict future performance and at what confidence level. But, there is no doubt when looking at statistics that some players have dominated certain other players.


      • JDanger says:

        Firstly, How are you quoting? It makes for an esthetically-pleasing argument.

        Secondly, I was using ownership in a more general manner, yes. My definition of “ownership” there was ‘lopsidedness of the results for reasons other than statistical noise’, not limited to K, BB, HR.

        The lopsidedness of the results is not in dispute, yes. But there is, however, still a dispute over the reason for those results. One might say it is in fact “ownership,” but someone else may argue that it is just noise, randomness or outliers because of the sample size. Saying “not likely” one way or the other is still wild speculation.

        “It has shown up in the numbers, otherwise we wouldn’t be having this discussion.”

        In order to prove “that such a thing (ownership) exists” in the numbers, would that not require a sufficient number of occasions where clear ‘lopsidedness of results’ in specific batter/pitcher matchups was sustained over an acceptable sample size? Has that ever been established?

        Or is the case against “ownership” simply one that there is no provable predictive value to it?
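
        The “just noise” reading can be illustrated with a small simulation. It is a sketch under loud assumptions: every batter/pitcher pair is given the same true .300 average, so there is no real ownership anywhere, and each pair gets two independent 30 at-bat samples. The pair count, sample sizes, and function name are arbitrary choices for illustration and are not taken from The Book.

```python
import random


def simulate(pairs=5000, at_bats=30, true_avg=0.300, top_n=30, seed=0):
    """Return (first-sample avg, follow-up avg) for the top_n 'dominant' pairs."""
    rng = random.Random(seed)

    def hits():
        # One sample of at-bats at the shared true average.
        return sum(rng.random() < true_avg for _ in range(at_bats))

    # First sample: every pair has identical true talent.
    first = [hits() for _ in range(pairs)]
    # Select the pairs where the batter looked most dominant.
    best = sorted(range(pairs), key=lambda i: first[i], reverse=True)[:top_n]
    before = sum(first[i] for i in best) / (top_n * at_bats)
    # Follow-up sample for the same pairs, still at the same true average.
    after = sum(hits() for _ in best) / (top_n * at_bats)
    return before, after


before, after = simulate()
print(f"Selected pairs, first sample: {before:.3f}")
print(f"Selected pairs, follow-up:    {after:.3f}")
```

        Under these assumptions the selected group looks dominant in the first sample and falls back toward .300 in the follow-up, so lopsided lines in small samples are not by themselves evidence of ownership. The simulation cannot rule out that real ownership exists; it only shows what pure chance produces.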


      • Z.W. says:

        The problem with that argument is that sample size will never become available to the level it is needed. At some point, when numbers aren’t there to prove or disprove a point (i.e. sample size), you have to rely on first hand accounts, otherwise we are just throwing shit at the wall, like what Dave did here.


      • Z.W. says:

        I will say that, at times, there is a luck factor. One kid I faced in college dominated me. He wasn’t particularly good by the numbers, but I could not get a hit off him to save my life. He was a submarine pitcher with a ton of run. My eyes and hands just could not correlate from point A — where I saw it — to point B — where it ended up.

        However, I got lucky that he threw me a slider, once. It was just okay and I was able to see it forever. It stayed on the same plane. I hit a double. So, in that instance, I was lucky he threw me a slider. Not lucky in the sense that I roll over on a 2-seamer and it gets through the five hole.

        Again, I love numbers. Think they are huge. But they are one part of the game. All data, even first hand info, is important to form a complete analysis of any given situation.

        We are also neglecting the mental side of this thing, too. If you think a guy “owns” you, then, in all likelihood, he will. *Ahem* Adam Dunn *Ahem*.

        There are just some things — and not to sound like a dick or “that guy” — you just cannot explain or understand unless you have experience playing the game (or doing anything, for that matter).


      • JDanger says:

        Agreed, the sample size will never be there.

        And if there was nothing else to go with, I could understand justifying ‘going with first hand accounts’. But if it was truly a case of Raburn vs Santiago, then there was something else to go with. Ryan Raburn is a better hitter. And he is a better hitter vs LHP (.347 wOBA). So it’s not throwing shit at a wall, there is a plausible argument.


      • CircleChange11 says:

        Another aspect that we are forgetting … VIDEO. Video is paramount in analyzing situations.

        I think I am operating on a reasonable assumption that the batting coaches watch a lot of video of their players versus other pitchers/types. If they don’t, then I have to strongly question WHY.

        It is very reasonable, IMHO, that the batting coach could have detected that Santiago just makes very good contact off Sabathia (If we wanted to see this, we could look at the video ourselves), and then made the suggestion to Leyland.

        I don’t think our assumption should be that Leyland just happened to notice that Santiago had a high BA against CC Sabathia and put him in the lineup.

        I’m guessing Leyland is the type of manager that operates more on observations, discussions, etc than he does spreadsheets and data tables.


  27. eric says:

    Maybe it is a mental thing. Confidence plays a huge role in success and maybe Leyland was trying to spark something. By making it known that Santiago has hit .292 in his career against CC, he may have raised Santiago’s confidence and made him perform better.

    I think sometimes we have to remember that a player’s mentality goes a long way.


    • JDanger says:

      Confidence plays a huge role in t-ball, certainly, but at the MLB level I would doubt its role is “huge” anymore. These are professional athletes who have been doing it for a while; I’d bet skill trumps everything.

      It would make for an interesting project, to have a player rate his confidence level 1-10 before an at-bat and then measure the results afterward. (You would then have to adjust for the pitcher’s skill level.) I’d be very very interested to see that.


  28. Hurtlocker says:

    In many player interviews the player is asked which pitchers he saw the best and had success against. This has value in my opinion, and as stated above, confidence can make the difference. Just looking at a stats line and saying he never walked or never hit a homer doesn’t diminish the fact that the player sees the pitcher well. Maybe the pitcher bears down harder knowing the guy can hit him?? (Which may also have value)


  29. William says:

    I agree in general. But don’t you think there are exceptions? For example, Scutaro and Rivera.


  30. dwayne says:

    can we get a comment from joe maddon on this? i’d *love* to know how he picks dan johnson to bring in versus corey wade…


  31. mister_rob says:

    Seems to me every time I watch or listen to a ballgame and there is an ex-player in the booth and this topic comes up, he can, without hesitation, name the guy he really did well against, and maybe even quicker name the guys he really struggled against. And usually the booth staff will check the numbers and it’s true.
    So if a guy like Gary Matthews or Keith Moreland can instantly tell you the guy he hated to face 30 years after the fact, don’t you guys think he was well aware of it at the time? And that it may have hurt his confidence at the time?


    • GiantHusker says:

      You really are dumb, Rob, or perhaps just obstinate.
      Of course, a player remembers whom he did well against. But that doesn’t prove any predictive value.


  32. Mozelle's says:

    What about arm angle during a pitch delivery? It seems like certain players may see the ball better from a pitcher with an over the top delivery than a three quarter delivery etc. Additionally, some pitchers’ arm angle and arm action are different for different pitches. If a player is good at recognizing these differences he may do well against pitchers where he can pick up the pitch type a split second earlier.

    Also, along the same line as certain players doing well against pitch types, is it possible that players do well against certain pitchers because of pitch strategy? For instance, batters without much patience who swing at a lot of pitches might do well against pitchers who “attack the strike zone” and throw a lot more strikes. Since they are already swinging more, they are actually swinging at more hittable pitches. But they might do worse against pitchers who throw more offspeed stuff (not for strikes), because they aren’t walking much anyway and aren’t getting a lot of strikes to swing at, so they don’t get on base very much.

    Just some thoughts. It’s pretty hard to measure any of this without a really large sample size though, and even harder considering strategies of pitchers and hitters change over time, both in general and in regard to the matchup.


  33. Alby says:

    When Dave gets to manage a team, he can manage it however he likes. I go back to Earl Weaver, who was compiling such data before he had computers to help him. Seemed to work for him.

    If we want to attack small samples, how about the study Dave cites? Rather than look at ALL matchups, it only looked at the ones at the edges of the bell curve, and Dave then extrapolates from that. SABRmetrician, heal thyself.


  34. GiantHusker says:

    Bruce Bochy relies very heavily on matchup stats when making his lineup decisions.
    The Giants have one of the worst offenses ever.
    QED

