How Well do Players Predict Challenges?

Friday night, the A’s were playing the Orioles in Baltimore, and it was tied up in the bottom of the tenth when Adam Jones singled with Nick Markakis on second. Markakis rounded third and tried to score, but Brandon Moss managed to throw him out, Derek Norris applying the tag a millisecond before Markakis swept the plate. The Orioles challenged the ruling, and one of the broadcasters noted that Markakis didn’t really respond negatively to the call, implying he didn’t think he was safe. The ruling was upheld, the inning ended with Nelson Cruz getting thrown out trying to straight-up steal home, and the A’s subsequently won in the 11th. As close as the Orioles came, they could only reflect on missed opportunities.

Intuitively, it makes sense that players would respond more emphatically if they felt like they were wronged by a call. It follows that player response might be a worthwhile indicator of eventual replay-review outcome. Sometimes, plays are challenged after a potentially wronged player reacts demonstratively. Sometimes, plays are challenged after a potentially wronged player doesn’t do that. Is there anything we can learn from what we’ve seen to date? Let’s find out, using the phenomenal Baseball Savant Instant Replay Database.

The database has challenged-play information, and it also includes a lot of links to video. So I decided to watch a lot of videos of challenged tag plays and force plays. Not every play has a linked video, and not every linked video includes enough information, but in all, I watched and made records from more than 200 video clips. I was interested in two things: reaction of the player against whom the initial call went, and outcome of the challenge process.

At first, I had three classifications:

  1. no emphatic response
  2. mild emphatic response
  3. extreme emphatic response

I figured that there might be differences between the two emphatic-response sub-groups. It became clear, though, that I should just go with two classifications: no real response, or some kind of response. It’s still subjective, and I might’ve mis-classified a clip or three, but I think I grouped them well enough. Usually, it was obvious.

Before the data, let’s look at some examples. Four .gifs will follow.

No real response, call upheld

MarkakisNoUpheld.gif.opt

No real response, call overturned

WhiteSoxNoOverturned.gif.opt

Negative response, call upheld

DavisYesUpheld.gif.opt

Negative response, call overturned

TaverasYesOverturned.gif.opt

Bonus Carlos Gomez .gif (call upheld)

GomezYesUpheld.gif.opt

Now to get into the numbers. We’ll keep the tag plays and the force plays separate, and you’ll see why.

Tag Plays

  • No real response: 5 overturned, 16 upheld (24% overturn rate)
  • Negative response: 16 overturned, 15 upheld (52% overturn rate)

Force Plays

  • No real response: 42 overturned, 44 upheld (49% overturn rate)
  • Negative response: 38 overturned, 31 upheld (55% overturn rate)
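For anyone who wants to check the percentages, here is a minimal sketch that recomputes the overturn rates from the counts listed above. The dictionary layout and the `overturn_rate` helper are illustrative assumptions, not part of the original tally.

```python
# Counts transcribed from the two lists above: (overturned, upheld).
counts = {
    ("tag", "no real response"): (5, 16),
    ("tag", "negative response"): (16, 15),
    ("force", "no real response"): (42, 44),
    ("force", "negative response"): (38, 31),
}


def overturn_rate(overturned, upheld):
    """Share of reviewed calls that ended up overturned."""
    return overturned / (overturned + upheld)


rates = {key: overturn_rate(*pair) for key, pair in counts.items()}

for (play, reaction), rate in rates.items():
    print(f"{play:5s} | {reaction:17s} | {rate:.0%}")
```

The rounded values should line up with the percentages quoted in the lists.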

Obviously all the samples are limited, and because the classifications are subjective there are some real error bars here, but this is mostly for fun so let’s take the numbers seriously. With tag plays, we see a possibly meaningful difference. If there’s a review, and if the unfortunate player didn’t react emphatically, there’s been a 1-in-4 chance of an overturned call. However, if there’s a review, and if the unfortunate player did react emphatically, there’s been a 1-in-2 chance of an overturned call. It’s worth remembering that a lot of tag plays happen at home, and because runs are so important, it will take less for a manager to attempt a challenge.
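To put a rough size on those error bars, here is a sketch that wraps an approximate 95% Wilson score interval around each tag-play overturn rate. The Wilson interval is an assumption chosen for illustration (it behaves reasonably with small samples), and `wilson_interval` is a hypothetical helper name. It only captures sampling noise, not the subjectivity of the classifications.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Tag plays: overturned calls out of all reviewed calls in each group.
tag_none = wilson_interval(5, 5 + 16)
tag_negative = wilson_interval(16, 16 + 15)

print(f"no real response:  {tag_none[0]:.0%} to {tag_none[1]:.0%}")
print(f"negative response: {tag_negative[0]:.0%} to {tag_negative[1]:.0%}")
```

With only 21 and 31 clips in the two groups, both intervals come out wide, which is the point of the caveat.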

Still, it’s roughly a coin flip, even if the player doesn’t like it. Probably, this has to do with the fact that it’s hard to focus on both a tag and on timing of touching a base. If you’re a player sliding into home, you might not know precisely when you’re actually on the plate, instead of the dirt. If you’re a player applying a tag, you might not know precisely when the runner got his hand or foot in. Remember, a challenged play has to be close enough to warrant a challenge in the first place. They’re usually bang-bang plays, and so players serve as only so much of a reliable indicator.

It’s different with force plays. Again, if there’s a review, and if the unfortunate player did react emphatically, there’s been about a 1-in-2 chance of an overturned call. But if there’s a review, and if the unfortunate player didn’t react emphatically, there’s also been about a 1-in-2 chance of an overturned call. Straight-up, the numbers show a slight difference (55% against 49%), but it isn’t very big so it’s hard to believe in. Player response still essentially leads to a coin flip, but in the absence of a player response, it’s been a coin flip, too.
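One way to ask whether either gap is bigger than chance would produce is a two-sided Fisher's exact test on each 2x2 table of counts. The sketch below is an assumption about how one might check it, not something done in the original analysis; `fisher_exact_two_sided` is a hypothetical pure-Python helper built on `math.comb`.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(
        p for p in (prob(x) for x in range(lo, hi + 1))
        if p <= observed * (1 + 1e-9)  # small tolerance for float ties
    )
    return min(1.0, total)


# Rows: no real response, negative response. Columns: overturned, upheld.
p_tag = fisher_exact_two_sided(5, 16, 16, 15)
p_force = fisher_exact_two_sided(42, 44, 38, 31)

print(f"tag plays:   p = {p_tag:.3f}")
print(f"force plays: p = {p_force:.3f}")
```

The tag-play p-value comes out considerably smaller than the force-play one, which fits the read above: the tag-play gap might be real, while the force-play gap looks like noise.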

With force plays, there is no tag. There’s no direct player-player interaction, so a player doesn’t get the sensation of physical contact. It’s pure timing, and a player just has to listen for a baseball arriving in a glove, or for a cleat touching a base. For tag plays, the people who know best might be the players involved. For force plays, the people who know best might be observers elsewhere on the field or in the dugout. At least, they seem to know about as well as the involved players do. Emphatic player response still predicts outcomes about as well as it does on tag plays, but the absence of such a response doesn’t mean the absence of a good overturn possibility.

In all, even when a player responds emphatically, it’s led to an overturn just over half the time. Meaning those players have frequently been wrong. With a tag play, it’s worth considering how a player responds. With a force play, it’s less so. One thing players are not is objective. They’re also not impartial observers of the plays in which they’re involved, armed with the additional benefit of slow-motion instant replay.

One last thing:

Tag Plays vs. Force Plays

  • Tag plays: 31 negative responses, 52 total (60% response rate)
  • Force plays: 69 negative responses, 155 total (45% response rate)
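These totals follow directly from the overturned and upheld counts reported earlier. A short sketch of the arithmetic, with the dictionary keys and the `response_summary` helper as illustrative assumptions:

```python
# (overturned, upheld) counts from the earlier tag-play and force-play lists.
counts = {
    ("tag", "none"): (5, 16),
    ("tag", "negative"): (16, 15),
    ("force", "none"): (42, 44),
    ("force", "negative"): (38, 31),
}


def response_summary(play):
    """Return (negative responses, total clips, response rate) for a play type."""
    negative = sum(counts[(play, "negative")])
    total = negative + sum(counts[(play, "none")])
    return negative, total, negative / total


tag = response_summary("tag")
force = response_summary("force")
all_clips = tag[1] + force[1]

print(f"tag:   {tag[0]} of {tag[1]} ({tag[2]:.0%})")
print(f"force: {force[0]} of {force[1]} ({force[2]:.0%})")
print(f"clips in total: {all_clips}")
```

The combined total also squares with the "more than 200 video clips" mentioned at the top.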

Players have been more likely to respond negatively to close tag plays than to close force plays. The difference isn’t enormous, but it’s there and it’s to be expected. For one thing, players apply direct tags or have tags applied directly to them, so they feel like they have more information. Also, there’s just more at stake, as most force plays occur at first, while a lot of tag plays occur at home. There’s more emotion and investment involved, because the leverage of the decision is higher. A close call at first might or might not change the score. A close call at home does change the score, so players are going to care more about the judgment. Certain players around the league are just non-confrontational sorts no matter what, but among the rest, you can expect a higher reaction rate at home. Not that that tells you a lot about whether or not they’re right. They just have a greater desire to be right.







Jeff made Lookout Landing a thing, but he does not still write there about the Mariners. He does write here, sometimes about the Mariners, but usually not.

11 Responses to “How Well do Players Predict Challenges?”

  1. Evan says:

    Presumably there are additional plays where the player reacted emphatically but the coaching staff declined to challenge after making its video review, meaning that the rate of overturn following an emphatic reaction is even lower than what’s reported here.


  2. Orsulakfan says:

    Don’t you think the players’ temperament is a factor? Nick Markakis is famously quiet; Carlos Gomez is famously un-quiet. That Nick didn’t respond to the call isn’t really surprising.


    • It is some kind of factor, but I set a low threshold and many players will issue some kind of response if they feel like a call went against them. I’ve seen Ichiro do it. Included in the data set was Joey Votto doing it. A player didn’t need to respond super emphatically in order to count, by my judgment.


    • Jon L. says:

      This is true, but differences in temperament would just water down the results, making the fact that there is still a robust distinction in tag plays significant.


  3. cass says:

    Sometimes the replay reviewers get the call wrong or there wasn’t enough evidence to overturn. So just because a player is emphatic and the call gets upheld, it doesn’t mean the player was incorrect.


  4. awolgs says:

    I wonder: Is there a difference between emphatically responding taggers / forcers and emphatically responding tagged / forced? For example, are fielders more accurate in their assessment of calls than baserunners?


  5. DavidKB says:

    Interesting look at challenges! I believe MLB is releasing whether the upheld calls were “confirmed” or just “upheld” without confirmation. It would be interesting to see how that affects the picture. I could definitely see some tags being upheld due to lack of unambiguous evidence where the player was in fact correct.

    Also, for another flavor, you could look at whether the player complaining was the runner or the fielder. I wonder if one side is more trustworthy than the other.


  6. tz says:

    “…but this is mostly for fun so let’s take the numbers seriously.”

    This should be the tagline for all your articles. Maybe for all of Fangraphs.


  7. SucramRenrut says:

    I would argue that Adrian Gonzalez is raising his hands to his helmet in frustration, which would be a response.

