Baseball Events != Isolated and Guaranteed

There are a few aspects of baseball broadcasting that really irk me, including the presentation of data for a player against a specific team in his career, batter-pitcher matchups being treated as gospel (especially with fewer than 10 such occurrences), and definitive claims based on small sample sizes. I mean, does it matter that Jamie Moyer has an XXX ERA vs. the Milwaukee Brewers over his 400-year career, when there were dozens of different iterations of the Brewers lineup? Does Alfonso Soriano going 4-for-7 off Doug Davis mean anything at all if the seven plate appearances are spread over five seasons? The one that bothers me the most, though, is the idea that every event in a baseball game is isolated, and therefore guaranteed to occur regardless of the preceding circumstances.
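To put a rough number on how little seven plate appearances can say, here is a minimal sketch that computes a 95% Wilson score interval for Soriano's four hits in seven trips against Davis. The choice of the Wilson interval and the function name `wilson_interval` are my own assumptions for illustration, and treating each plate appearance as an independent trial with a fixed success rate is a simplification.

```python
import math

def wilson_interval(hits, at_bats, z=1.96):
    """Approximate 95% Wilson score interval for an underlying success rate."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    p_hat = hits / at_bats
    denom = 1 + z**2 / at_bats
    center = (p_hat + z**2 / (2 * at_bats)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / at_bats + z**2 / (4 * at_bats**2)
    ) / denom
    return center - half_width, center + half_width

# Four hits in seven plate appearances, as in the Soriano vs. Davis example.
low, high = wilson_interval(4, 7)
print(f"4-for-7 is consistent with a true rate from {low:.3f} to {high:.3f}")
```

Under those assumptions the interval runs from roughly .250 to .840, which covers everything from an ordinary hitter to an impossible one. That is about as uninformative as a matchup number can get.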

Consider this example: Jimmy Rollins on first, Ryan Howard up to bat. Rollins gets caught stealing, Howard hits a homer. The announcers are bound to say something like – “Well, if Rollins stayed put, the Phillies would have scored two on the Howard homer.” Fans do it all the time as well, buying into this idea that Rollins being caught seemingly had nothing to do with the subsequent pitch selection, location, or anything else along these lines.

Forgive me for going all Butterfly Effect, but Rollins being caught in this example changes everything. For starters, the pitcher is throwing out of the windup rather than the stretch. With nobody on, he might be able to concentrate more on the hitter. He may decide to throw a steady supply of heaters as opposed to breaking pitches. We could go on and on about the different strategies that come into play when the situation shifts, but the point is that the situation DOES shift. I don’t care whether Howard has a better or worse chance of hitting a homer if Rollins does or does not get caught in this hypothetical, because the point remains that the situation has changed. The plate appearance is not the same, and Howard is in no way, shape, or form guaranteed to hit the home run if Rollins had not been caught on the bases.
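One way to picture this is to replay the Howard plate appearance many times under each scenario. The sketch below does that with made-up home run probabilities; both numbers are hypothetical placeholders rather than measured values, and the function name `replay_plate_appearance` is invented for this illustration.

```python
import random

def replay_plate_appearance(p_home_run, trials=100_000, seed=0):
    """Return the fraction of simulated replays that end in a home run."""
    rng = random.Random(seed)
    homers = sum(rng.random() < p_home_run for _ in range(trials))
    return homers / trials

# Hypothetical per-plate-appearance home run chances, NOT real data.
P_HR_BASES_EMPTY = 0.07  # Rollins caught stealing, pitcher in the windup
P_HR_RUNNER_ON = 0.08    # Rollins stays put, pitcher in the stretch

share_empty = replay_plate_appearance(P_HR_BASES_EMPTY)
share_runner_on = replay_plate_appearance(P_HR_RUNNER_ON)

print(f"Homer with the bases empty: {share_empty:.1%} of replays")
print(f"Homer with Rollins on first: {share_runner_on:.1%} of replays")
```

Whichever of the two placeholder numbers is larger, neither is anywhere near 100 percent. The announcer's "they would have scored two" only holds in the small share of replays where the homer still happens.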

The worst part is that this is not even a tough concept to grasp, yet it gets ignored by almost everyone. It is way too easy to fall into this trap, thinking that events are not tied together, but they are, and need to be treated as such. This isn’t like trying to get announcers to use wOBA instead of BA, but rather trying to get them to understand simple logic. Events in a game are not isolated. They might not be completely, 100 percent, dependent on surrounding circumstances but they are certainly not isolated and guaranteed to occur no matter what.

Eric is an accountant and statistical analyst from Philadelphia. He also covers the Phillies at Phillies Nation and can be found here on Twitter.

34 Responses to “Baseball Events != Isolated and Guaranteed”

  1. David says:

    I get your point, but I feel like the Rollins example isn’t a great one. When you have a speedy runner on first base, the pitcher will a) be pitching out of the stretch; b) have some part of his mind concentrating on holding the baserunner; c) probably throw more strikes to avoid having runners on 1st and 2nd; and d) throw fewer breaking balls. All of these things would seem to actually increase Howard’s chance of hitting a HR in your example. Meanwhile, stealing 2nd (even successfully) I would imagine lowers that chance, because now the pitcher won’t be so worried about giving up a walk and putting a slow baserunner on 1st.

  2. David says:

    To finish my last sentence – “putting a slow baserunner on 1st, since there was already a runner in scoring position.”

    • Eric Seidman says:

      To be perfectly frank, as long as you get the gist, the examples aren’t important. The idea is important, that events are interdependent and not guaranteed to occur. For all we know, Howard could have been hit by a pitch if Rollins didn’t get caught stealing, or IBBd, etc. There is no guarantee, no matter how well you can show that his chances of succeeding increased or stayed the same.

  3. Jon says:

    It may not be a tough concept to grasp but it is one that makes someone give up their convictions. When someone gets caught stealing and we think it was a bad idea, a homer provides even more justification to pile on the scorn towards the idea to steal. It is a manifestation of confirmation bias we all fall into in making all sorts of judgements.

  4. Joe R says:

    Point understood, terrible example. Usually a home run comes off a mistake pitch anyway, so the example assumes that he made the mistake because no one was on.

    Better example:
    Rollins steals 2nd, Howard gets walked, and the announcer says “Well, it’s obvious Rollins didn’t need to run,” when pretty much the reason Howard walked was that once 1st became open, the pitcher began to throw outside of the strike zone, since walking Howard with a man on 2nd does less damage than walking him with a man on 1st.

    • Eric Seidman says:

      Another would be if a runner is thrown out at home on a questionable green light, and then the next batter hits a single. Announcers will say that if he had held, he would have scored anyway. The single following the play at the plate is not guaranteed.

    • Gee says:

      Actually it’s a perfectly fine example and he explained it well in this article. You’re falling into the same thought process as the announcers he’s criticizing.

      Eric, THANK YOU, this is one of my biggest pet peeves from not just announcers, but from fans too. Actually, not just in baseball either. It’s SOOO annoying.

  5. Greg says:

    A specific batter versus a specific pitcher – at what point are the sample sizes relevant? It’s got to be smaller than batter versus generic pitchers, right? If I’m a hitter that is terrible at recognizing change-ups, I’ll do poorly against a fastball/changeup pitcher and better against a fastball/breaking ball pitcher. At least better/worse than the FIP vs wOBA would suggest. The small sample size batter vs pitcher data is a poor estimation for this concept I admit, but when can we recognize that maybe there is something there?

    • Eric Seidman says:

      Depends how you look at it. For instance, we have pitch type linear weights here that show what pitches batters do well against, quantified in terms of runs. That is certainly relevant, but the specific matchup data isn’t. For instance, I’d rather say that Player A might struggle against Pitcher B because he has stunk vs. changeups over the last three years, as opposed to he will struggle because he is 0-5 in his career against him.

      • Greg says:

        That assumes you know my example reasoning is causing the results, right? What if the reason is something else? What if I’m a hitter who picks up pitches better based on delivery or release point? I could ‘own’ a specific pitcher who varies his delivery or release point between pitches, while another batter who picks up pitches based on spin alone will be terrible against him. I’d like to be able to have the data tell me something without having to use my biased/flawed observation skills about the matchup to identify the advantage. If I have to wait until they’ve faced each other 50 times, I lose the possibility of that advantage.

      • Kincaid says:

        Generally, if a hitter is seeing something a pitcher is doing to tip pitches, he’ll say something about it to his team, so a manager would already know that. Things like release point and spin are tracked by PITCHf/x, so a team could do analysis on those too if they wanted. Teams could get video scouts to profile similar pitching deliveries and then get statisticians to lump those samples together. The simplest thing to do would just be to ask the player if he is seeing anything from the pitcher that lets him pick up pitches better.

  6. Tim_the_Beaver says:

    So you’re saying if Dice-K throws a gyroball in Japan it could ultimately cause the Cubs to miss the playoffs. Boy they’re screwed.

  7. Andy S says:

    Yeah I agree, but at the same time these occurrences are worth noting when the umpire makes a bad call.

  8. Ewdewald says:

    Announcers probably understand the truth in these matters more than we give them credit for. They’re hired to play the role of dramatic narrator when broadcasting. They make false speculation to appease the larger fanbase. The average fan is bored by the drama that exists in facts, and is intrigued by fantasy. Drama for most is believing Jamie Moyer dominates the Brewers past and present.

  9. Michael says:

    Really, the whole pitch selection or pitching out of the stretch thing is irrelevant. Actually, the fact that it’s baseball is irrelevant. You mentioned the Butterfly Effect, but that’s really what’s in play here. When you alter an event the subsequent events are not going to be exactly the same, whether it’s for tangible reasons or not.

  10. Jim says:

    The problem is, if you assume everything is dependent, there is nothing to talk about. And if you assume the events are not independent, that implies that Rollins getting caught helped Howard hit a home run, which is also nuts. If you had the same situation happen 10 billion times, you would expect the probability of Howard hitting a home run to be the same whether or not Rollins gets caught.

    I would agree on the irritating use of splits and vs records. Particularly annoying in interleague, as it’s quite possible the players have not faced each other in quite a while.

    • Marquis Grission says:

      I think the most irritating thing is mentioning W-L records and batting averages vs. teams, and the Moyer vs. Brewers example is a great one. When Moyer first faced Milwaukee, the Brewers were still a decade away from switching leagues and had players like Greg Brock, Rob Deer, and Bill Spiers. To suggest that this has any relevance when Moyer is facing Prince Fielder, Ryan Braun, and their pitcher is insane.

  11. LG says:

    The original example was absolutely fine. How are people not getting it? The point is simply that you can’t say Rollins would have scored on that homer because you don’t know that the homer would even be hit if Rollins was on base. A lot of people make claims like that and to someone who understands baseball, those claims are annoying.

  12. Brian says:

    I’m going to send this to Michael Kay.

  13. Fresh Hops says:

    This is exactly right.

    It’s worth mentioning, by the way, that there is a context for player vs. team discussions. If I’m a Red Sox fan, I care what Josh Beckett’s lifetime record against the Yankees is. Not as a matter of predicting the future, but as a matter of reveling in the past. It’s basically just trash talk. I am a UNC fan and I want to know the record of UNC-Duke match-ups before every Carolina-Duke game. That’s not what announcers are doing when they tell you “Joey Votto was 3-7 during the sixth inning in night games at home against NL Central opponents last year.” (Everyone adjust your expectations accordingly!) For almost any event in baseball, you can find a description of that event where the batter was 3-7 thus described last season.

  14. verd14 says:

    I loved this piece. For some reason I really enjoy knowing what pisses people off, and from your responses to some of the comments this is nothing short of that. It’s humorous to see all this useless debate over such a simple point. Either way keep up the good work and I wouldn’t complain if there were more of the Peter Griffin-esque “Grind my Gears” columns. Here’s a suggestion for the next article…Joe Buck has his own TV SHOW!!!

  15. dbuff says:

    The example is kind of far-fetched since Jimmy Rollins hardly ever gets on base. For someone posing as a professional his .262 OBP is laughable.

    • Joe R says:

      That being said, watch him go .350/.475/.650 or something in September and be praised for his clutchy mcclutchness as Philly attempts to steal another MVP award.

      It’s borderline insane, every Philly fan I know praises Rollins and Howard way more than the obviously more valuable and as Joe Morgan would say “consistent” Utley.

  16. Bill says:

    Probably the one that gets me the most is when people assume things would always be the same when things happened many innings before.

    I.e., “Man, if Carlos Beltran didn’t drop fly ball X in the third, the Phillies wouldn’t have scored a run, and the Mets would have won 3-2.” There’s no recognition that the score, and the different situations that follow a different score, affect the game: for example, different pitchers coming in based on the situation (up a run in the 9th, bring in the closer; tied in the ninth, bring in a different pitcher; down 5 in the ninth, bring in terrible pitcher X).

    • Evan says:

      And yet, ERA is based on that very assumption.

      If that error hadn’t happened, the inning would have been over following the two subsequent strikeouts and those 7 consecutive home runs wouldn’t have happened. Ever.

  17. Samuel says:

    I agree that pitcher vs. team statistics over a long period seem to have little relevance, but there are some uncanny cases where they seem to hold true.

    Take Randy Wolf vs. the Mets. No matter how he’s doing overall, he basically always pitches well against them. 11-5 3.30-ish ERA. Small sample fluke with no discernible causal mechanism? Maybe. But it sure is maddeningly consistent for Mets fans.

    As is Derek Jeter’s like .400 lifetime avg against them in interleague play. Is that luck of the draw or is Captain Intangibles just transcendentally predisposed to rake them?

    • Eric Seidman says:

      You just fell into the same trap. Want to know when it matters? When it’s a Jamie Moyer vs. Marlins situation, where he has faced them like 12 times in 2.5 years, with virtually the same team intact. What Randy Wolf did against an entirely different Mets team in 2000 has NO BEARING WHATSOEVER on what he did against them this year.

      • Joe says:

        That’s just not true. Maybe Wolf was spit on by Keith Hernandez on his way home from a Mets game and is reminded of the event every time he pitches. His adrenaline gets a pumpin’ and he throws a little better than usual.

        It seems odd to suggest that a professional athlete’s effort will vary from game to game, but they are human and it’s almost inevitable. I’m not suggesting it’ll be a huge obvious thing, but it certainly could have SOME BEARING WHATSOEVER. The lineup may change, but usually the parks and cities don’t. Maybe the visitor clubhouse at Shea happened to be conducive to Wolf pitching well that day. One can never know all these things, but one also can’t just assert they cannot exist.

  18. Dan says:

    I absolutely agree with the common ‘fallacy of assumption’ committed by all too many broadcasters as you described. However, I also believe you have lumped 2 rather different issues together when adding in the topic of a player’s lifetime stats against a particular hitter/pitcher. On this topic, I think people get much too bogged down/concerned with sample size – especially on (an otherwise tremendous) site like this, where the focus is geared toward such. Yes, any of Jamie Moyer’s lifetime stats against a team, batter, etc. should be taken with a grain of salt, but simply dismissing them due to sample size (or how long it took to accrue that sample) is, to me, potentially even more erroneous. Yes, obviously we’re talking about different lineups over the years, but teams often build lineups in the same general structure due to their ballpark. Starting pitchers in particular can be tremendous creatures of habit; while it may not be as ‘quantifiable’ as we’d like, I don’t think it should be ignored that some guys may simply prefer a city, a city’s weather, the mound, the ballpark’s prevailing winds, etc. If we simply dismiss Randy Wolf’s success as ‘random’, and thus essentially meaningless, we run a far greater risk of not noticing that something is definitely THERE. The Reds’ lineup has changed rather dramatically over the last 4-5 years; think Roy Oswalt (23-1) cares at this point? Which brings us to another intangible, impossible to quantify, but perhaps the most important (“behind-the-scenes”) factor in MLB – Confidence. Statistically a guy who is 4-for-7 against Johan Santana lifetime probably hasn’t truly proved he “owns” him, or even really “hits” him; but compared to the 0-for-7 guy, who do you think brings a better, more positive approach on at-bat number 8?
    Yes, 2 bloop singles can make a .200 hitter into a .400 hitter within 10 at-bats, so there is (again) plenty of salt to be taken with such numbers – but, again – which stats are more likely to produce a hitter with a better, more positive approach on that next at-bat… and thus a better chance of creating success?
    Finally, another reason to discount sample size (albeit not entirely). As was alluded to before, a batter sometimes simply “sees” the ball better from some pitchers, has a better comfort level against the particular stuff thrown, etc. Now, think about the dynamics here – early in the ‘relationship’, a pitcher has a decided advantage over the hitter… hitter hasn’t seen him, doesn’t know what’s coming, etc. Now, how ’bout this for utter blasphemy: if a batter has distinct success against a pitcher, especially within the first (say) 8 to 10 at-bats, doesn’t that perhaps indicate a higher potential for success against that particular pitcher in the future, for reasons that transcend ‘simple mathematics’? (However, you’ll have to step out of your statistician skin before answering, forget regression to the mean, etc., etc.). Honestly, I don’t know when such numbers or sizes become meaningful – I just know that on this ‘micro’ level, a few at-bats can be (potentially) far more telling than their sample size would ordinarily indicate.

  19. Joe says:

    This article annoys me far more than the broadcasters who make this mistake. Yeah, of course you can’t know that Howard would have hit the homer if Rollins stole successfully, but you also can’t know that he wouldn’t have still hit it anyway. Yes, every minor event affects another, but since it’s impossible to know how, why get so tangled up in the specifics? What’s the big deal in speaking about what we do actually know? It may not be factually incorrect, but like I said, it is impossible to make a factually correct statement when altering prior events.

    All the broadcasters are usually trying to do in those cases anyway is highlight the ill effects of an undeniably ill effecting play.

  20. Brian says:

    Ken Singleton just said Brett Gardner getting picked off really hurt because Cervelli hit a home run.

  21. jpdtrmpt72 says:

    What I think was missed is the fact that if Jimmy Rollins was alone on base and got caught stealing, it would be the third out and Howard would be leading off the second, because Manuel will never bat someone other than Rollins first, and someone other than Howard 4th.

    Liked the article. It was a good example, but some of the people who comment on this site the most frequently are real tight asses.
