Defense and Slumps

A statistically minded baseball fan typically does not have to go far to find criticisms of the defensive metrics available. Certainly, some of that criticism is valid – defensive metrics are by no means perfect, and they particularly have flaws when looking at single seasons of data. This piece at Baseball Prospectus (subscription required) by Colin Wyers does a fantastic job of framing the problems and biases involved with defensive metrics, as well as a possible answer.

One criticism of defensive metrics that I believe is invalid, however, is the idea that inconsistencies in defensive ratings from year to year somehow render defensive systems useless, whether it be UZR, TotalZone, +/-, or any other stat. The idea seems to rest on a traditional baseball belief that while pitching and hitting can slump, speed and defense both tend to hold constant. According to this idea, seeing a player post UZRs of -3, +6, -9, and then +2 in four consecutive seasons would represent a problem with the metric as opposed to simply the ups and downs of a single season. Such fluctuations in wRAA aren’t uncommon, but they can be explained away by hot streaks and slumps over the course of multiple seasons.

Sky Andrecheck analyzed the idea of “defense doesn’t slump” last year. He reasoned that because the distribution of probabilities of outs on balls in play is bimodal – that is, most are either sure hits or sure outs – the standard error for fielders is smaller than that for hitters.

From an individual player’s standpoint, the average fielder has about 500 balls in play in his area over the course of the season (of course, this varies by position, and we can adjust accordingly). Using the numbers above, we see that the average fielder has a standard error of about .23*SQRT(500) = 5.14 outs over the course of a season. This means that he is prone to make about 5 or so more or 5 or so less plays in a season than his true talent would usually call for. This corresponds to a difference of about 4 runs in a season. While this is fairly small, it does show that random variability can play a part in a fielder’s performance just as it can for hitters.
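The arithmetic in that passage can be checked with a short sketch. The per-ball standard deviation (.23) and the 500 chances come from the quote. The runs-per-play value of 0.8 is an assumption of this sketch, chosen only because it is consistent with the quote's "about 4 runs," and the function name is hypothetical.

```python
import math


def fielding_standard_error(per_ball_sd: float, chances: int) -> float:
    """Standard error, in outs, of a fielder's season total.

    Treats each ball in play as an independent chance with the same
    per-ball standard deviation, so the season total scales with sqrt(n).
    """
    return per_ball_sd * math.sqrt(chances)


# Figures taken from the quoted passage.
PER_BALL_SD = 0.23
CHANCES_PER_SEASON = 500

# Assumption (not in the source): runs saved per extra play made.
RUNS_PER_PLAY = 0.8

se_outs = fielding_standard_error(PER_BALL_SD, CHANCES_PER_SEASON)
se_runs = se_outs * RUNS_PER_PLAY

print(f"standard error: {se_outs:.2f} outs, roughly {se_runs:.1f} runs")
```

Because the standard error grows with the square root of chances, a position that sees fewer balls in play has a smaller spread in raw outs but a larger spread relative to its number of opportunities.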

Sky’s work, to me, effectively proves that we should expect some variability in defensive abilities. There’s also another element to the variation we see in defensive metrics that he doesn’t look at, and that’s the physical and mental aspects of the game. If a player tweaked his hamstring but doesn’t tell his manager, we could see a decrease in his range that would be unexplained by the information available. If a player is uncomfortable in a certain park, for whatever reason, he could slump if he receives an abnormal amount of chances at that park in that season. There are probably a multitude of other reasons as well.

The issue of chances is likely one of the reasons that we don’t typically recognize fielding slumps. I posed the question of why slumps can’t occur on defense on Twitter, and the omnipresent Colin Wyers pointed out that it is “because a player’s ability to get to a batted ball informs our thinking about whether or not he should have made a play.” A player who is slumping on defense won’t get to balls that he may have gotten to normally, and that’s terribly difficult for fans to point out. Announcing crews aren’t going to have convenient stats like “Prince Fielder has one hit in his last 24 at bats” for defense.

Basically, I think that there’s plenty of reason to believe that defense slumps, and although we don’t really know the magnitude, I also don’t see why it wouldn’t be near that of the slumps we see for batters, or even greater, given that defensive chances are neither created equal nor distributed on a regular schedule. Similarly, I would imagine that defense sees hot streaks as well – perhaps my focus on poor fielding is because I’m a pessimist. Regardless, as our ability to evaluate defense evolves, I predict that we won’t look at fluctuations in defensive metrics as a sign of incorrectness. Instead, we will learn to accept that, for whatever reason, it’s neither fair nor reasonable to expect a player to be the same quality defender against every ball in play in every game in every season.

Jack Moore's work can be seen at VICE Sports and anywhere else you're willing to pay him to write. Buy his e-book.

Guest
Jeff
6 years 1 month ago

A perfect example of the unreported injury affecting defense is Alfonso Soriano. His first two years with the Cubs, his UZR/150 (in LF) was very good at 38.7 and 25.5. He was clearly hurt throughout last year and his UZR/150 fell to -5.2. This year, he is healthy again and has posted 12.3 so far. Injury sure seems to be part of the reason at least.

Guest
MattBerger
6 years 1 month ago

I think a perfect example of this is Barmes. While he’s generally a very good fielder, I’ve seen points where he’s slumped at second base this year and didn’t make plays he usually did.

Guest
6 years 1 month ago

There is one major problem with this theory. A “defensive” slump can be 20 runs from one season to another. This “slump” goes unnoticed to the average fan. Yet the average fan is able to notice the 20-run difference between Adam Dunn and your average outfielder quite easily. So why wouldn’t he notice the slump?

Defensive metrics are quite volatile even with no change in defensive performance. I think most of what proponents of fielding metrics want to dismiss as a slump is simply volatility in the number which amounts to nothing.

Guest
Someanalyst
6 years 1 month ago

It must be possible to check this out… I hope someone manages to analyze this objectively and I hope I find that article…

Guest
DavidCEisen
6 years 1 month ago

Which 20 run defensive slumps are you referring to that were unnoticeable?

Guest
funketown
6 years 1 month ago

I can definitely tell that Corey Hart is a significantly worse outfielder than he was three years ago. 20 runs is a lot, it’s pretty hard not to notice.

Guest
Someanalyst
6 years 1 month ago

“because a player’s ability to get to a batted ball informs our thinking about whether or not he should have made a play”

This excellent statement by Colin Wyers also explains why the FSR data have virtually no value.

Thanks for the work; a very useful observation I think. To think that all across baseball there are defensive hot and cold streaks that are just noise to us today… but surely not tomorrow.

Guest
MikeS
6 years 1 month ago

And his ability to get to a ball is affected by too many variables – positioning, knowledge of what pitch is coming, how hard it is hit, spin, hop (good, bad, short, easy), runners moving, knowledge that the fielder on your right or left is a statue and cheating that way.

So it’s hard to know if he did what he should have done when he got there and it’s hard to know if he should have gotten there. No wonder defensive metrics are no good.

Quick question. Say a team employs a shift against a David Ortiz or Jim Thome. He cues one right where the third baseman usually is but it dribbles into left field for a hit, maybe a double. Is this a negative for the 3B since that ball went through his usual zone or does UZR take that into account? On the other hand, if the second baseman throws him out from 70 feet into short right field does he get UZR points even though it is hit right at him? In short, does the zone move with positioning or not?

Guest
matt w
6 years 1 month ago

Here’s an explanation. It looks to me as though in your case the third baseman isn’t dinged for the play, though it’s not quite moving the zone with the shift.

Guest
matt w
6 years 1 month ago

And when I say “here,” I mean “in a hyperlink on that sentence, which is hard to see at least on my computer.”

Guest
MGL
6 years 1 month ago

The data do not know where the players are positioned other than the fact that the “buckets” are created according to several factors that are proxies for defensive positioning, such as handedness of batter, speed of batter, and the power of the batter.

As far as “shifts” go, the data (that UZR crunches) does indicate whether a shift is on and whether it made a difference (such as Ortiz getting a base hit on a ground ball hit to the normal 3B position), and if so, the play is ignored as far as the UZR engine is concerned.

Guest
notdissertating
6 years 1 month ago

Very interesting stuff. I think (without RTFA) that Sky Andrecheck’s analysis relies not only on the binary outcome assumption but also on the assumption that a player’s ability (i.e. probability of fielding a given ball in play) is constant over the period of analysis. Isn’t this part of the reason why standard errors on hitting and fielding metrics aren’t standard? It is a nice simplifying assumption to claim that all slumps are simply due to a string of bad luck, but isn’t there something to the idea of an “adjustment at the plate?” If something similar is going on for fielding, Andrecheck’s standard error is biased upwards (too large). That said, it is an excellent starting point for understanding and interpretation of fielding metrics.

Guest
TheUnrepentantGunner
6 years 1 month ago

I am also skeptical. Injuries which might cause slumps at the plate can also cause slumps in the field, but let’s be honest, the ability to hit a baseball thrown at 90 miles per hour with movement squarely is relatively rare and unique.

It requires everything calibrated, a good state of mind, etc.

Pretty much all of us who washed out at the age of 16 (or beyond) did so for this very reason (that or you beaned 12 batters your last year pitching but I digress).

To this day, there are armies of softball teams with former minor leaguers with great gloves. Those athletic skills don’t really fade, and as long as you are alert you can still field relatively well. I can easily see though someone just having their timing off, or opening their stance 10 degrees too much.

Color me skeptical, but the theory is partially plausible, but I would bet heavily that batters are more prone to high sigma slumps and hot streaks batting versus fielding.

Guest
maxwell
6 years 1 month ago

“Defense doesn’t slump”…perhaps I’m operating under false or made up pretenses, but it seems as if people turn to defensive metrics for a reflection of pure skill, as opposed to some blend of skill and performance. I am well aware of the intent of wOBA/(the former EqA) and pitching metrics like xFIP/SIERA, but these are often deployed as evidence of performance much more so than are defensive metrics, which are assumed to reflect some sort of steady/static state. I know that my expectations (at least in regards to what I would like these numbers to tell us) are quite different for hitting/pitching/baserunning metrics, as opposed to defensive ones.

Guest
gdc
6 years 1 month ago

There is always psychology in fielding balls at the boundary of your range, second-guessing whether you know the wind conditions or whether the coach is going to rip you for diving and missing or playing it on the hop. But with the infrequency of these plays for outfielders the psychology behind it (e.g. playing in a dome) can come and go without several chances, so I could be in a slump for a week during a road trip where I’m afraid of tweaking my hamstring and only affect one or two plays, while I got 30 PA. But now the weather warmed up at home I’m fine again.

Probably infielders having slumps with throwing is more noticeable.

Guest
Terry
6 years 1 month ago

Bill James weighs in on the issue in an old interview:

James T: Todd Walker was a perfectly serviceable second baseman in 2002
according to his zone rating. Then in 2003 he had a very poor year by that
same measure. I think fans perceive general fielding ability as a constant.
But is it? Does fielding efficiency fluctuate as much as hitting or pitching
ability?

Bill James: I would guess that it fluctuates more than hitting
ability, because fielding depends on such a wide range of skills.
You have a hamstring problem, that’s going to effect your
fielding. You have a sore arm, a bad elbow, a sore shoulder, you
might be able to hit with it, but it will effect your fielding. If you
lose confidence, it effects your fielding. If you lose quickness, I
So I would guess that fielding is more variable, more
unpredictable, than hitting.

Another point in favor of this argument: look how much fielding
roles change over time. Bernie Williams is very much the same
hitter now that he was ten years ago, but nowhere near the
outfielder. Ruben Sierra had basically the same batting average,
on base percentage and slugging percentage last year that he
did in 1988, but he was a top defensive right fielder then. Now
he’s a DH.

People think of fielding as a constant because, for good reasons,
they don’t trust fielding stats, and don’t monitor them from day
to day. Since fielding stats are kind of a cipher, we’re not always
aware of changes in fielding performance, when, if a player’s
batting average dropped 20 or 30 points, we would certainly be
aware of that.

Guest
wobatus
6 years 1 month ago

That might explain why defense declines over time, but not why it fluctuates from year to year (aside from the injury issue).

I think it fluctuates mainly because the type of balls hit to the fielder fluctuate. Even if they are in the same general area, or buckets, there will be different velocities, trajectories and spin. I assume there are slumps, but that mostly the variability for healthy players is simply due to the variability of the balls hit to them.

Guest
Someanalyst
6 years 1 month ago

Can we imagine any data that would allow us to “describe” the mix of balls hit to a given player? If we knew something about the distribution of balls hit to the player that could be used to adjust the assumption that all players face the same conditions, perhaps the ratings could be made both more predictive of future performance and more reliable on smaller sample sizes.

In essence, I am assuming, based on what wobatus said, that the reason the sample sizes for defensive metrics have to be so large before the data behaves is that until multi-year sample sizes are reached, the variation in the distribution of balls hit to players is too large for the underlying assumptions to work.

I am sure this observation has been made many times before – does anybody reading this know of any attempts to mitigate this?

Guest
rickie weeks
6 years 1 month ago

Jack, maybe you focus on poor defense not because you are a pessimist, but because you are a Brewers fan.

Guest
funketown
6 years 1 month ago

What? The Brewers have a bad defense??? Did you see Ryan Braun’s diving catch the other day, or his catch in the all star game?

*waits for responder who failed to catch sarcasm*

Guest
CJ
6 years 1 month ago

I have always felt that defensive results can be subject to slumps and hot spells. Defense requires a high degree of focus. Constantly waiting for the next play, day after day, can take its toll on concentration over the course of the season. Why wouldn’t mental fatigue affect defensive play? Although a player’s speed may not change, the player’s first step reaction can vary, and that can be more important than speed for defensive range. This type of range impact will be less perceptible to the fan, because it appears that the ball was hit to a spot that the fielder can’t get to at full speed. Just a minuscule loss of concentration or the effect of a mental distraction might affect that first step. Players’ defense is also dependent on their positioning and preparation for the next pitch. There are so many things which may affect that component of defense, it would be hard to list. Also, it wouldn’t surprise me if defense is more dependent on the player confidence than is the case for hitting. How many times does it seem like a player’s errors come in waves, with multiple error games?

Guest
phoenix
6 years 1 month ago

i think part of the reason why it is so hard to quantify defense is that it is impacted by so many variables. fatigue would obviously handicap range and that can be a real problem for a center fielder who is running all over the outfield everyday. an injury obviously to the lower half would handicap range or an arm injury could drag a throw wide. playing a shift and getting a batted ball that defies the shift can cause a momentary confusion for the fielder. and that’s all without mentioning bad hops or spin. also weather can be an issue. if it’s raining then the ball could slip out of a fielder’s hand or if it’s a windy day, an outfielder could have difficulty tracking a ball. on a cold october day, im sure fielders have less range than they did in july just because they are stiffer from the cold. imagine running around every single day in the outfield, being that tired and then having a nagging tightness in your hamstring, then it being a very cold, windy, rainy day and tell me it’s the same as playing on a warm summer afternoon after a restful off day. defense is not static.

Guest
phoenix
6 years 1 month ago

also a player who was recently traded to a team whose field they are uncomfortable or just unfamiliar with could cause them to not be as confident of how far they can run or when they should jump. especially in the outfield this can be apparent as walls and shapes differ so much. i mean playing left in fenway for a couple years then moving to citi field can be a huge change. especially if a player changes leagues to a field they have never played in.

Member
6 years 1 month ago

Rather than explaining fluctuations in defensive stats by saying that the player is slumping, can’t a lot of this just be explained by the fact that defensive stats take longer to stabilize than offensive stats? If you flip a fair coin 100 times, the proportion of heads is going to be very close to 50%, but if you flip it only ten times, the proportion of heads can vary a lot more. So if you flip a coin ten times and get 70% heads, rather than assuming there’s some intrinsic bias in the coin, you’d be better off just chalking it up to random chance. It seems to me that offensive stats are akin to flipping a coin 100 times, but defensive stats are like flipping a coin 10 times. So if you see a player post UZRs of +4, +6, -7, and +5 in four consecutive seasons, I wouldn’t necessarily say that’s because he slumped in year 3. It could be simply due to random chance (unless of course it had been reported that he had been dealing with some injury that year).
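The coin-flip comparison in that comment can be put in numbers with a minimal sketch. It assumes a fair coin and independent flips, and the helper names are hypothetical. Nothing here claims that fielding chances behave exactly like coin flips. The point is only how sample size changes the spread.

```python
from math import comb, sqrt


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Exact binomial probability of k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def proportion_sd(n: int, p: float = 0.5) -> float:
    """Standard deviation of the observed success proportion over n trials."""
    return sqrt(p * (1 - p) / n)


for flips in (10, 100):
    threshold = int(0.7 * flips)  # 70% heads or better
    print(
        f"{flips:>3} flips: sd of proportion = {proportion_sd(flips):.3f}, "
        f"P(at least {threshold} heads) = {prob_at_least(threshold, flips):.5f}"
    )
```

Seventy percent heads is unremarkable over ten flips and extremely unlikely over a hundred, which is the commenter's point about reading too much into a single small sample.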

Guest
Carligula
6 years 1 month ago

@BigDaddyGunz – Good point… it may be, as well, that we’re used to describing offensive performance as runs-above-replacement, and fielding as runs-above-average, so fluctuations in the latter look bigger, even though it’s the same amount in terms of runs saved/scored. E.g., in your example, if we were talking about hitters and +20 runs=league average, we’d see that defensive performance, if it were an offensive performance, as +24, +26, +13, and +25. And I don’t think anyone would raise an eyebrow at that.
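The re-baselining described in that comment is simple enough to sketch. The four UZR values and the +20 offset come from the thread, while the variable names are hypothetical. Shifting every season by the same constant changes how the numbers look but leaves the year-to-year spread untouched.

```python
from statistics import pstdev

# Season UZRs from the example above, expressed as runs above average.
uzr_above_average = [4, 6, -7, 5]

# Offset used in the comment: treat league average as +20 runs.
BASELINE_SHIFT = 20

shifted = [runs + BASELINE_SHIFT for runs in uzr_above_average]

print("above average:", uzr_above_average)
print("re-baselined: ", shifted)
print("spread before:", round(pstdev(uzr_above_average), 2))
print("spread after: ", round(pstdev(shifted), 2))
```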

Guest
Carligula
6 years 1 month ago

Edit – yes, I realize that, on this site, runs scored are denoted as +/- average, with a positional adjustment then tacked on. I am referring to other sources which build in the positional adjustment, and where batters are therefore commonly compared to replacement level (for their position). So feel free to mock me for other reasons than that.

Guest
pft
6 years 1 month ago

Performance in the field is partly due to ability, but there is also some luck at play, similar to hitters’ BABIP. Some games balls are just barely out of your range, and other games you just seem to get to most balls despite there being no difference in your range. You might get more or less bad hops one year to the next.

Some of this is due to positioning and the pitchers hitting their spots, runners on base, etc.

While you need 3 years fielding data to measure ability, you can use much smaller samples to assess performance (outcome), even if part of that performance is driven by luck (that’s true of hitting as well), so what. So long as you understand 1/2 years worth of data may be driven more by luck (bad or good) than ability, you won’t misuse it. In some cases it may be an indicator of injury that a team or player is not disclosing.

I tend to agree that SSS for fielding is not useless or tells you nothing, it just needs to be used with care.

Guest
pft
6 years 1 month ago

The biggest problem with being able to see slumps and streaks in fielding is the lack of play by play or game by game data, or even week by week.

