Seeing and UZR and Teixeira

This weekend I received enough e-mails about Mark Teixeira and his 2009 UZR of -0.8 that I thought it was worth discussing in a public post instead of answering each e-mail individually. I can only assume that this debate was spurred by a blog post on the New York Times website by Tyler Kepner:

[...] and his defense has been off the charts.

I say off the charts because I’m convinced there is no chart that accurately measures defense. The attempt is a noble one; defense is easily the most underrated ingredient in how games are won. But I don’t fully accept it.

People often cite Ultimate Zone Rating, a metric that tries to measure range and errors and how they affect runs allowed or prevented. But how can that statistic be valid when it says Teixeira has had a negative defensive impact?

Teixeira makes tremendous plays every game. He smothers everything near him, and his throwing arm is fantastic. Maybe he seems better than he is because the previous Yankees first baseman, Jason Giambi, was so adventurous in the field. But it would be hard to overstate the importance of Teixeira’s defense.

Kepner is quick to dismiss everything about UZR on what amounts to his own observations on one player. Then he leaves himself an opening in saying the equivalent of “maybe I’m biased because I’m not used to watching a good first baseman?”

What does UZR have to say about Jason Giambi then? He’s been 24 runs below average since joining the Yankees in 2002 (including his 2009 with the Athletics so far). Not a good defender. And what about Teixeira since 2002? He’s been 14.4 runs above average.

Well that’s strange. UZR agrees with what Kepner is absolutely sure he is seeing: that Teixeira is at the very least better than Giambi. In fact, UZR thinks he’s considerably better than Giambi. I wonder what Kepner would say about that?

The claim that Teixeira has a negative defensive impact is a bit misleading too, considering he has a -0.8 UZR on the season so far. In my book, that’s pretty much average. Kepner never even bothers to mention how negative it is, and with the way he’s discrediting UZR, you’d think Teixeira was rated the very worst first-baseman out there.

In truth, Teixeira over the 2008 and 2009 seasons has been rated the #2 first-baseman by UZR at +9.8, so UZR has actually liked the guy a whole lot the past two seasons. But, I don’t want the point of this post to be for me to try and validate UZR.

Advanced baseball stats often paint a contrarian picture of baseball. Whether it be a player’s value or a player’s skill level, they often do not agree with popular and mainstream thinking. Sometimes they do agree, but when they don’t, that alone doesn’t mean anything is wrong with the statistic.

Imagine trying to gauge a player’s offensive value without using any stats. Do you think you’d remember all 600 plate appearances the guy had during the season? You probably wouldn’t. You might remember the big hits or the times he really screwed up and your opinion of the player would be biased based on a small sampling of what you could remember.

This is pretty much the same point I’m going to make about the state of fielding statistics. There is no way you remember every single play Teixeira or anyone else has made during the course of the entire season; you might only remember the big plays, or you might only remember the plays that killed your team. It’s also possible that Teixeira makes the easy plays look difficult and you’re just not realizing it. There are really a number of ways your memory of what Teixeira has actually done could fail you.

But this is not to say that what you see is completely useless. Studies like the Fan’s Scouting Report (by Tangotiger) have shown that through the wisdom of the crowds (many eyes and not just yours), you can get a good read on how good a player is defensively.

If everyone out there agrees that Teixeira has been the absolute best first-baseman out there this season, then that’s fine, and there’s definitely value in that. The underlying data in UZR isn’t perfect, and with time the imperfections get sanded out, but it’s perfectly reasonable to put some error bars on the 4 months of data used to calculate Teixeira’s -0.8 UZR on the year.
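As a rough illustration of what "error bars" on a partial season might look like, here is a minimal Python sketch. It is not how UZR itself is computed. It assumes the -0.8 can be treated as a per-season rate, that a full season carries an error of about 5 runs (a figure one commenter below recalls, not an official number), and that the error grows with the inverse square root of playing time. The function name and inputs are hypothetical.

```python
import math

def partial_season_interval(observed, full_season_error, season_fraction):
    """Return (low, high) around an observed rate stat for a partial season.

    Assumes the error scales with 1 / sqrt(playing time), so a partial
    season carries a wider band than a full one.
    """
    if not 0 < season_fraction <= 1:
        raise ValueError("season_fraction must be in (0, 1]")
    error = full_season_error / math.sqrt(season_fraction)
    return observed - error, observed + error

# -0.8 is the figure from the post; the 5-run full-season error and the
# 4-of-6-months fraction are illustrative assumptions, not published values.
low, high = partial_season_interval(-0.8, 5.0, 4 / 6)
print(f"{low:.1f} to {high:.1f}")  # -6.9 to 5.3
```

Under those assumptions the band comfortably covers both "a bit below average" and "clearly above average", which is the point of treating a four-month number with caution.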

It’s also worth noting that UZR is not the only stat that thinks Teixeira has been basically average. John Dewan’s +/- (Fielding Bible) has him at +1 run above average (also basically average in my book) and for those of you still holding onto Range Factor, he’s the 3rd worst qualified first-baseman.

In any event, when looking at these advanced fielding statistics, please use your brain and don’t be so quick to jump to conclusions just because your eyes tell you differently.




David Appelman is the creator of FanGraphs.

134 Responses to “Seeing and UZR and Teixeira”

  1. Nathan says:

    Nice post. For me, the poster child for this syndrome is Jay Bruce. UZR has had him well positive all season (+8.8 total, +12.2/150) but every time I watch him play, he’s an adventure in right field. However, I’ve seen only a few Reds games this year and I’m sure his mishaps in the games I’ve seen in person (notably three terrible plays in the 22-1 loss to the Phils in which I was sitting fairly close to the field on the first-base side) just stick out much more in my mind and aren’t representative of his ability.

    • Fresh Hops says:

      This is a nice way to illustrate the problem with a few, select observations: in the few games I watched him play this season, I thought Bruce looked really good in the outfield. Really, it’s just dumb luck that I agree with UZR and you disagree, because (if I understand you correctly) we both only saw him in a half dozen games or so.

      • Chris says:

        I watched Bruce every day. He’s above average in RF. Not perfect, but better range than most, and he goes after EVERYTHING hard. Very aggressive.

        He also has an exceptional arm (combined with the hard-charging), but that’s not part of UZR, unless I’m mistaken.

      • Joe R says:

        It is, actually; of course a lot of RF’s have good arms, and arm strength is relative to the position.

  2. Mike K says:

    Or Nick Swisher. He looks awkward as hell and most Yankee fans – even intelligent ones – think he is *awful* in RF and next year should be the DH/4th OF. UZR says he really isn’t all that bad, and most of his negative is his arm. Move him to LF, and he’s probably average or better (definitely better than Damon).

    • Rob in CT says:

      Absolutely. Nick Swisher just looks clumsy & goofy out there. Not graceful at all. And he’s made a couple of boneheaded (looking) plays that really stuck out. And yet, on balance, *if I look for them* I notice that he runs down balls that other guys might not catch. Overall, he’s nothing special out there, but there is quite the disconnect between appearance and UZR. Same thing, in the opposite direction, with Tex. I watch nearly every game. I know what Kepner is talking about: he looks great out there. At least some of that is simply the fact that Tex >> Giambi. We’ve gone from one of the worst to one of the best (who may, or may not, be having a bit of an off year with the glove), and the natural reaction is “WOW, this guy is GREAT!”

      • Wally says:

        I think we’re giving too much credit to what we see here. Sure, popular opinion of talent is often correct, but is it better than UZR? Is it even close? I doubt it. For one thing, all we can do when we observe casually is award some qualitative value, such as good, clumsy, goofy, or off the charts. How are we supposed to understand what that really means from one player to another, and what’s the confidence interval? Can “good” range from just average to great? Furthermore, your eyes are biased. If a crappy player makes a play look awesome, you give him bonus points. For example, when a slow OFer makes a diving catch on a ball that, say, Andruw Jones would be waiting under, you may be tricked into thinking that poor outfielder made a good play, when really he was just getting to a ball most other players would have gotten anyway. The opposite of this seems to be Swisher’s case. As an A’s fan I’ve been able to watch a lot of him, and no, he often doesn’t look very pretty out there. But that seems to be a product of pushing himself so hard. In the end it appears to me that his UZR of just above average for RF is dead on. He’s not the quickest guy, but he’s fast enough, he seems to get decent reads, and he doesn’t do a whole lot of stupid things that would lead to errors either.

  3. JoeyO says:

    “Imagine trying to gauge a player’s offensive value without using any stats. Do you think you’d remember all 600 plate appearances the guy had during the season? You probably wouldn’t.”

    I never understood people making an argument based off what they see with their own eyes anyway. There is no way you saw everything that happened; it is impossible unless your entire time is spent studying one player.

    A semi-case in point: last year Fukudome hit .292/.397/.428/.825 at home in Chicago. Based off that, a Wrigley Field employee might have been able to proclaim him one of the best pure hitters in the game. Unfortunately, that line was accompanied by a .225/.322/.333/.656 mark on the road, leaving us with a below average hitter overall. The employee, not being in a situation to watch the road games, would struggle to understand how anyone could call him below average. His eyes would have told him otherwise.

    • daniel says:

      I’m so tired of this argument.

      Hitting stats are much, much, much, much, much more accurate than fielding stats. Even the most ardent aficionado of the advanced fielding metrics would admit that they are nothing more than a work in progress. And yet, they are treated as gospel by some sites (including this one).

      Here’s another thing. Fielding stats just quantify what a guy did. They don’t quantify how good a fielder he is. Sometimes a good fielder gets more tough plays than normal and looks bad. Sometimes a poor fielder gets more easy plays than normal and looks good. Sometimes guys get lucky, or unlucky. There is no way a stat calculated in the manner of UZR, etc. can truly measure a guy’s true talent level without a gazillion plate appearances, and over that time period his true talent level will have changed so we won’t be measuring the same thing anyway. If I watch a player take ground balls for half an hour, or shag fly balls for half an hour, I have an idea of what plays they can make and what plays they can’t make – I have an idea of their true ability. Sometimes luck and other factors overshadow ability when the game starts, which is why stats like UZR can be deceptive.

      • phil says:

        the problem is that people have confirmation biases and ignore data to the contrary.

        “Fielding stats just quantify what a guy did. They don’t quantify how good a fielder he is.”

        Does that make any sense to you? How else would you determine how good a fielder is? Do you measure offensive players by batting practice? The repeated disclaimer of this site and many others is that UZR requires sample size, and not to overreact to fluctuations. Some players get lucky, some players get unlucky, but no one is continually lucky or unlucky. Large sample sizes tend to minimize luck. It’s called regression to the mean; it’s in every other post on FanGraphs.
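The claim that large samples tend to minimize luck can be checked with a small simulation. This is only a sketch under simple assumptions: a hypothetical fielder converts every chance into an out with the same fixed probability (0.80 here, an invented number), and chances are independent of each other. The function name is made up for illustration.

```python
import random
import statistics

def observed_rate_spread(true_rate, chances, trials=1000, seed=1):
    """Standard deviation of the observed out rate across simulated samples.

    Every simulated sample comes from the same fielder with the same true
    talent, so any spread in the observed rate is pure luck.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(trials):
        outs = sum(rng.random() < true_rate for _ in range(chances))
        rates.append(outs / chances)
    return statistics.pstdev(rates)

# Same true talent, three different sample sizes.
for chances in (50, 200, 800):
    print(chances, round(observed_rate_spread(0.80, chances), 3))
```

The spread should roughly halve each time the number of chances quadruples, which is why a few dozen plays can look very different from the underlying ability while a few seasons of plays mostly cannot.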

      • JoeyO says:

        “If I watch a player take ground balls for half an hour, or shag fly balls for half an hour, I have an idea of what plays they can make and what plays they can’t make – I have an idea of their true ability.”

        That is one of the more laughable things I have probably ever read in my life. That is 30 minutes of casual warm-up on one day, where players are often joking around and almost everything is hit directly at them. Yet you can tell us what a player is capable of based off this? Can you also tell us how many HR a player will hit this season based off his being in a batting cage for 30 pitches?

      • B says:

        You have a point that even people who note the sample size issues in UZR like to use it when there isn’t a big enough sample to draw any real conclusion from, but the fact is seeing things with your eyes has its problems, too. First of all: confirmation bias. We tend to see the things that fall in line with our established opinion. Secondly, bias towards the unusual: we tend to remember things that are out of the ordinary, in either direction, to a greater degree than their actual impact. Everyone will always remember Bill Buckner’s error, but I’m sure if you went back and watched that game you could find a defensive play where one of his teammates didn’t make an out some other players in baseball might have made. Third, and this one is very important, even assuming what we see is objective fact, how are we supposed to weight each component of defense? There is no way someone can see a player’s range, arm and other fielding abilities and properly weight the importance of each. We are simply incapable of doing that.

      • Kincaid says:

        Hitting stats also just quantify what a guy did. We estimate how good each player is, both at the plate and in the field, largely by gathering data on what he did. It is also fully possible for hitting stats to fall victim to the same factors of luck. A poor hitter can get more fat pitches or benefit from more poor defensive plays than normal and look good. Of course, we expect that over a large enough sample, those factors tend to even out so that we put a certain amount of stock in the stats, but for a given sample size, we have a certain level of uncertainty about how well our data measures a player’s true talent. That is true of both fielding and batting metrics. In general, it takes about twice as many games of the fielding stats to get about the same level of certainty as hitting stats. Whether that is “much, much, much, much, much more accurate” is a matter of opinion of what “much, much, much, much, much more accurate” means, I suppose. If the level of certainty you want for fielding stats requires a gazillion plate appearances, then I’m not sure what hitting stats you can trust, because none of them reach that level of certainty either.

        Fielding metrics do have more difficulty approximating true talent level than hitting stats, but the gap is even bigger for the eye test: it is more difficult to quantify fielding value by watching than it is hitting value. I can see Derek Jeter hit a double or strike out and have an idea of how good that is and how much it is worth. If I watch enough games and pay enough attention, I can start to get a pretty good idea of how good a hitter he is. However, when I see him reach for a routine ground ball toward the hole and gracefully jump-throw the guy out at first, or when I see a ball skip by his glove up the middle, it is very difficult to gauge the difficulty or value of those plays or non-plays. I can get an idea just like with hitting, sure, but that idea is much, much, much, much, much less accurate.

  4. This is a nice piece on a subject I thought I was already totally sick of. Thanks.

    As far as how to handle the advanced fielding metrics: While I’m not actually in the “fielding metrics suck!” camp, I don’t trust any of them all that much individually. For me, a lot of their value comes in comparing how the different systems rank players. Or, to throw back to Mike K’s comment, thus far I haven’t seen anything that quantifies the feeling I get when I see a fly ball head out toward Nick Swisher.

  5. Derek says:

    Who is Tyler Kepner anyways??

    Seriously though. My biggest problem is with people saying Teixeira should be the AL MVP this year, when he clearly isn’t. He isn’t even the best player on his team (Jeter) or the best 1B (Youkilis).

    I mean he has been great but not MVP great.

  6. Shawon says:

    If you need 600 words to justify and explain a statistic, maybe the statistic is flawed.

    • To even suggest that you should know everything about how a particular statistic works with 1 minute of work on your part is lunacy.

      I’m not even sure if you read the article.

      • Shawon says:

        Lunacy? A tad dramatic, are we?

        Here’s my problem with UZR – everyone (as you did yourself in the article) admits that “it isn’t perfect” and there are “imperfections in the data.” So on the one hand, you congratulate UZR for naming Teixeira as the #2 rated 1B over the past 2 years, but on the other, there’s no explanation for why UZR thinks that he’s only an average defensive player this year, other than “hey, UZR’s not perfect.”

        I’ve read Rob Neyer use the same logic with Bobby Abreu’s epic -25.6 2006 season (to paraphrase, it’s “Abreu’s a bad defender, but not THAT bad.”). So my feeling is that if you have a statistic that’s sometimes right, but not all the time, but most of the time, it’s not that great of a stat. (And just so you know that I actually am literate and capable of reading words on the computer, I love OPS+ and I think BABIP is wonderfully useful. UZR? Not so much, because I don’t know how much I should trust the numbers.)

      • The point is that over time the imperfections don’t matter because you’re going to have a large enough sample size. That’s why I used the 2008 & 2009 numbers.

        Bottom line is you shouldn’t be looking at a single season’s UZR statistic in a vacuum. You shouldn’t be looking at any single-season statistic in a vacuum, batting or fielding.

        Here’s the deal: Obviously you’re drawing your conclusions about a particular player’s defense from somewhere. Whether it’s the scouts, or your friends, or what you see with your own eyes. But you’re getting your information about a player from somewhere.

        UZR is another data point for you to consider and the more years of UZR data you have, the more you should weight that data point.

      • B says:

        I think you need to read up a bit on margin of error, Shawon. The point is defensive statistics have a larger variance than offensive stats, and a season of defensive stats will still leave us with a margin of error on a player’s actual performance. The reason Neyer doesn’t think Abreu’s season was that bad is that -25.6 is terrible, and given the margin of error on UZR, it’s likely that it was better than that by a decent amount (the concept of regression to the mean is the reason it’s more likely to be better than worse, and the margin of error is the reason it’s probably substantially better). It’s not like the margin of error is so big that we can’t reasonably conclude it was still a poor defensive year for Abreu, though.
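One common way to express regression to the mean is to shrink an observed value toward the league average, with less shrinkage as the sample grows. The Python sketch below illustrates that idea only; it is not the method Neyer or UZR uses. The function name and the regression constant (one season's worth of data) are assumptions made up for the example, not published values.

```python
def regress_to_mean(observed, sample_size, regression_constant, mean=0.0):
    """Shrink an observed value toward the mean.

    The weight on the observation is n / (n + k): with little data the
    estimate stays near the mean, with lots of data it approaches the
    observed value.
    """
    weight = sample_size / (sample_size + regression_constant)
    return mean + weight * (observed - mean)

# One season at -25.6 with a hypothetical regression constant of one season:
# still clearly below average, but much less extreme than the raw number.
print(regress_to_mean(-25.6, sample_size=1.0, regression_constant=1.0))  # -12.8
```

Whatever the right constant is, the direction of the adjustment is the same: an extreme single-season figure is more likely to overstate than understate how bad (or good) the fielder really was.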

      • Kincaid says:

        If your feeling is that if you have a statistic that’s sometimes right, but not all the time, but most of the time, it’s not that great of a stat, then why are you bothering to look at stats like OPS+? It does not sound like that great of a stat by your definition. I mean, it’s possible something like Adrian Beltre going from 88 to 163 to 93 in three consecutive years happens without imperfections in the data, but it’s far more likely that those imperfections are there in this stat, as well as in any stat that relies on sampling (which is basically all of them). If you aren’t comfortable with imperfections in the data, then statistics are probably not for you.

    • …yes, or maybe it’s just very thorough and intricate. The amount of explanation a statistic requires has absolutely nothing to do with how accurate it is.

    • Not David says:

      Read more, post less.

  7. To clarify my previous comment: Speaking as a fan of the Yankees, I don’t think Swisher is awful in the field. However, I think he makes a great case for the development of a new stat called Cringe Factor.

  8. antone says:

    I was just starting to place some value in UZR ratings until I saw this:

    Pedroia:
    193 PO/311 A/6 E /61 DP/ .988 FP/4.5 RF-G/4.5 RF-9/138 DG/251 EO/8.8 UZR

    Cano:
    231 PO/313 A/5 E /67 DP/ .991 FP/4.6 RF-G/4.7 RF-9/138 DG/251 EO/-3.3UZR

    Cano has made more put-outs, turned more double plays, and is slightly better in fielding percentage and range factor, yet UZR has Pedroia at 8.8 UZR and Cano at -3.3 UZR. Now I’m not suggesting Cano is a better fielder than Pedroia, but the stats suggest they should be ranked pretty much equal no?

    Is there something I’m missing here?
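For reference, the fielding percentages in the two stat lines above can be reproduced from putouts, assists and errors with the standard formula. The short Python sketch below does that. Note what the formula leaves out: it has no term for balls a fielder never reached, which is the kind of information a play-by-play metric tries to capture.

```python
def fielding_percentage(putouts, assists, errors):
    """Standard fielding percentage: (PO + A) / (PO + A + E)."""
    chances = putouts + assists + errors
    if chances == 0:
        raise ValueError("no fielding chances")
    return (putouts + assists) / chances

# PO / A / E taken from the two stat lines quoted in this comment.
pedroia = fielding_percentage(193, 311, 6)
cano = fielding_percentage(231, 313, 5)
print(f"{pedroia:.3f} {cano:.3f}")  # 0.988 0.991
```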

    • phil says:

      what you are missing: “opportunities” and “context”

    • Mike says:

      Pedroia is a great fielder so I don’t think that’s a great comparison. I’m a big stats guy and I hate to sound like an MSM clone, but I’m shocked by his UZR as well. It’s average, but I have such a tough time believing he is an average 2B.

      First off, his arm is amazing, which is why I’m so shocked about his low DP rating for UZR. I know having a strong arm at 2B isn’t like having a strong arm at C or RF, but it will help on a close play or turning two. And if I was voting on the Fans, I’d give him good rankings. He’s always making plays on the SS side of second base (which I thought would help with his range ratings).

      The only flaw I’ve seen with his defense is that he can’t dive. Instead of moving forward, he flops and stays in the same spot.

      Whatever, I know I probably sound like Tyler Kepner right now. I think it’s got to be statistical noise this season. It’s not just him; there are a lot of players who are typically really good according to UZR, who have looked good this season, and have had bad ratings.

  9. antone says:

    Also…almost every 2B has a RF/9 higher than Pedroia but his RngR calc for UZR is 8.3 which is the highest of any qualified 2B….I don’t get it

    • Erik says:

      You’re not getting that UZR doesn’t use the basic fielding stats. It uses play-by-play data.

      • antone says:

        So you’re saying they take every hit ball and try to determine if the average player would have made that play?

      • Yes, that’s the basic idea of what UZR does

      • MarkInDallas says:

        They don’t try to determine if an average fielder would have made the play as I understand it. They just count the balls that come into their area, and see what percentage they turned into outs. If they turn a higher percent into outs, they’re a better fielder, and if it’s lower, they’re worse.

        The formula is more complex than I just described, but that’s the crux of the logic.

        Similarly, if a batter hits a grounder through the infield, you don’t know whether the ball would have been caught had the second baseman not been covering for a hit and run, etc. You just know it was a hit. In the end, the percentages even out and you get a picture of how much the player gives his team a chance to win.
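The logic described in this exchange (count the balls hit into a fielder's area, see how many became outs, and compare with what an average fielder converts) can be sketched in a few lines of Python. This is a toy version, not the actual UZR formula, which includes many more adjustments. The bucket data, the league out rates and the 0.8 runs-per-play value are all invented for illustration.

```python
def runs_above_average(buckets, runs_per_play=0.8):
    """Toy zone-rating calculation.

    buckets: list of (chances, outs_made, league_out_rate), one entry per
    type or location of batted ball. For each bucket, compare the outs the
    fielder made with the outs an average fielder would be expected to make,
    then convert the difference in plays to runs.
    """
    total = 0.0
    for chances, outs_made, league_out_rate in buckets:
        expected_outs = chances * league_out_rate
        total += (outs_made - expected_outs) * runs_per_play
    return total

# Hypothetical fielder: sure-handed on routine balls, weak on medium ones,
# slightly above average on the toughest chances.
fielder = [
    (100, 95, 0.90),  # routine: 5 plays more than expected
    (50, 20, 0.50),   # medium: 5 plays fewer than expected
    (20, 4, 0.10),    # difficult: 2 plays more than expected
]
print(round(runs_above_average(fielder), 2))  # 1.6
```

In this made-up case the fielder looks spotless on the routine plays and still ends up close to average overall, because the balls he did not turn into outs count against him just as much.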

  10. Bill says:

    Excellent article, but I have to admit it seems like even FanGraphs writers distrust the advanced stats at times. One article that comes to mind was this:
    http://www.fangraphs.com/blogs/index.php/chase-utley-is-good

    The writer essentially dismisses Utley’s +/- to being about half of what it actually was, calling it noise in the data. The problem with that is, a 22 point swing in +/- could make an average player a horrible one or a horrible player really an average one. I understand dismissing outliers, but at some point you have to trust your data.

  11. ThankYouMichaelLewis says:

    I love fangraphs, and the idea of UZR, but the Teixeira UZR has frustrated me all season.

    I know that a player looks better when making routine plays look difficult, but Teixeira has been amazing this season. I may have missed 5 games all year, and I can only remember one or two instances where Teixeira did not get a ball that he should have. Additionally, nearly every game I am impressed by at least one of his plays, whether it was a tremendous scoop and stretch or a great diving stop. Plus, I have watched enough baseball to know that those weren’t all routine plays.

    Cano is another player whose UZR shocks me this season. I don’t think any second baseman turns the double play more quickly, and his arm strength is elite, which has converted many outs on would-be singles up the middle.

    Is it possible that the Yankee rotation is affecting the UZR of the right side of the infield?

    • Michael says:

      I have a strong feeling if MGL sees this, he’s going to come on and put up the same post he’s been putting up about the potential errors in UZR.

      Yes, there are both sampling and measurement errors involved in taking an in-season value of UZR and using it as a true-talent value for a player; I recall there being a +/-5 error in UZR/150 at a season’s worth of time. Defensive metrics need large samples; it’s the nature of the beast (or in this case, the buckets). Teixeira might be +5, he might be -5, he might be a lot better or worse in terms of true talent. But it doesn’t mean the metric is broken.

      Sure, offensive metrics are far more accurate than defensive metrics. For defense it would be better if we sampled multiple metrics, including more than a few scouting eyes, a la Fan’s Scouting Report. I don’t think anyone’s disagreeing with that and saying “UZR is the only determining factor on defense,” when it comes to talking about a player. But to ignore it as a broken metric because it doesn’t match your scouting eye makes you just as bad as the person who uses UZR only to determine defense. And to declare something like that based off evaluating ONE PLAYER, as Tyler Kepner did, is absolutely absurd.

  12. Mike says:

    What would you (or anyone else) say about the statistical noise on the data for UZR this season?

  13. WY says:

    Well done, Mr. Appelman. There are some really good lines in this post.

  14. Matt says:

    Am I not supposed to be relying on range factor anymore?

    But it’s an actual number, with math mere mortals don’t understand. Are some of these complex formulas not fully valid?

  15. Matt says:

    On the team I watch the vast majority of the time, UZR agrees with my observations of Alexei Ramirez at SS: that he’s average to good.

    I’m sick of seeing in the media and the comments/letters sections how bad Ramirez is because he makes a fair amount of errors.

    Yea, he also gets to a lot of balls and turns a lot of double plays that guys with no range and no arm don’t.

  16. David84 says:

    Correct me if I’m wrong, please, but it seems in this whole hubbub over Teixeira’s UZR people are missing two important factors. First, as far as I understand, even one full season can constitute a rather fluky sample; at the very least, you need a full season and not 3/4 of one to really gauge a player’s defensive ability. Second, UZR is measured against average defense, not a more static baseline like replacement, so is it possible that first basemen around the league have simply improved while Tex has remained the same? Then, Kepner’s eyes aren’t deceiving him as far as Tex is concerned; he just isn’t seeing that the rest of the league maybe caught up to him.

    Again, I might be off base on this . . . just seems like the general argument is missing the bigger picture about UZR.

    • Michael says:

      There’s been discussion about the average baseline before. MGL said it shouldn’t be enough to cause a huge change in UZR. But you’re right on the sampling issue. A season can still have a lot of error in UZR, as in many other metrics.

  17. teamboras says:

    If Tyler Kepner were intelligent he could talk about how UZR has a little less importance when judging a first baseman.

    Because a part of defense for a first baseman is handling fielders’ throws.

    So he could argue that with an average UZR Tex could still be a plus defender.

  18. K.B.D. says:

    First base is a tough defensive position to evaluate solely through UZR as a large part of your job revolves around receiving the ball from other fielders, which (I believe) UZR doesn’t account for.

  19. Ian says:

    My problem with UZR is this: it attempts to rank a player’s defensive “runs saved” as an individual performance. Unfortunately, I don’t think defense is something you can accurately quantify on an individual basis since it is more of a team effort. Thinking in those terms, some of the obvious “blips” in UZR become pretty easy not only to understand, but also predict (and that’s big).

    It’s always seemed obvious to me, but then again what the hell do I know.

  20. Steve says:

    For the most part, UZR agrees with what I see with my eyes. However, one player that I just don’t get is Yunel Escobar. This is the 3rd straight season UZR has had him right about average or slightly below. I don’t agree with that. This guy covers a ton of ground and has a great arm. That’s what my eyes see at least. I understand that I can’t possibly see everything but Fielding Bible had him at +21 and #2 SS in MLB last year. UZR had him middle of the pack.

    Anyways, great article.

  21. Edwin Nelson says:

    Well there may be an argument that UZR isn’t the best defensive metric, but if you take multiple metrics and combine them a pretty clear picture starts to reveal itself. Tex’s 2008 Dewan +24 shows that he’s a great 1B and corroborates what UZR says about him.

    I dare anyone to pick up a copy of the Fielding Bible 1+2, see the work that Baseball Info puts into measuring defense, and say it can’t be done accurately.

    Maybe that’s some of the issue. Dewan puts forth such a compelling case in his books, including an incredibly in-depth examination of his methodology, when “selling” +/- as a defensive metric, that what UZR really lacks is a comparatively fleshed out explanation of what it (UZR) brings to the table.

    Fangraphs book anyone?

    • WY says:

      Actually, a Fangraphs book is a great idea. I know I come on here and nitpick some of the arguments and some of the writing, but I like what the writers are trying to do. I think a “best of” sort of book that took some of the most original/popular/insightful/controversial posts (and maybe edited/expanded on them and spruced up the occasional rough edges in the writing) could be really interesting.

  22. Frug says:

    It’s probably worth noting that BP’s FRAR1 has Tex worth 10 runs above average and 18 above replacement with a Rate1 of 110.

  23. JoeyO says:

    “First, as far as I understand, even one full season can constitute a rather fluky sample; at the very least, you need a full season and not 3/4 of one to really gauge a player’s defensive ability”

    Why has no one mentioned this yet?

    2005, -1.5 UZR/150
    2006, -1.8 UZR/150
    2007, -5.2 UZR/150
    2008, +9.3 UZR/150
    2009, -0.8 UZR/150

    If there is a “fluky” season, it seems to be 2008. But 2007 + 2008 brings us back to a fairly normal range. 2009 Teixeira, based off this, seems to be exactly the same guy as always.
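The arithmetic behind that claim can be checked with a short sketch. It uses only the five UZR/150 figures quoted above and takes a plain unweighted mean, which is an assumption: it ignores how much each season's playing time differed.

```python
from statistics import mean

# UZR/150 figures exactly as quoted in the comment above.
uzr_150 = {2005: -1.5, 2006: -1.8, 2007: -5.2, 2008: 9.3, 2009: -0.8}

# Unweighted averages (assumption: every season counts equally).
overall = mean(list(uzr_150.values()))
two_year = mean([uzr_150[2007], uzr_150[2008]])

# How far each season sits from the mean of the other four seasons.
gaps = {
    year: value - mean([v for y, v in uzr_150.items() if y != year])
    for year, value in uzr_150.items()
}
most_unusual = max(gaps, key=lambda y: abs(gaps[y]))

print(f"five-year mean: {overall:.2f}")
print(f"2007-2008 mean: {two_year:.2f}")
print(f"season furthest from the rest: {most_unusual}")
```

On these numbers the five seasons average out to roughly zero, and 2008 is the season furthest from the other four.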

    Also, to all the people talking about seeing the plays he has made and such: he always gets hit hard in Range. Too many balls consistently get by him. This might look, to a person’s eye, like it is just a solid hit. But the average first baseman statistically shows a much higher rate of snagging them just the same. Even if you use the prehistoric Range Factor, you can actually see he is 3rd from the bottom of qualified 1B. Controlling everything you come in contact with always looks great to the eye. But if you are not getting to very much compared to the average player, it may go unnoticed.

    • Rob in CT says:

      If that’s true, my eyes are definitely lying to me. I see great range from Tex. Maybe it’s just the improvement from Giambi to decent.

      • B says:

        Maybe he also has poor positioning. That was one of Jeter’s problems for a while, he played more shallow than most shortstops and his fielding suffered because of it. It may also be one of the reasons he’s seemingly improved despite being older this year (as it would be an easy thing to correct).

  24. Let’s not forget that Kepner’s been covering the Yankees since 2002, which was also the start of the Giambi Era in NY. Compared to that thong-wielding lunatic’s gelatinous ass, Tex is Keith Hernandez.

  25. ThankYouMichaelLewis says:

    So then when Fangraphs uses UZR to calculate WAR or value, why doesn’t it use a three-year average, hoping that a larger sample size gives a better picture of a player’s defense?

    • JoeyO says:

      WAR calculates what you DID, not what you are capable of.

      If you were a -10 UZR guy because you had some bad luck, you will have a lower WAR for that season because of it. Just like an extreme BABIP over one season will create a huge WAR spike for a hitter. Over a couple of seasons, these Batting or Fielding Runs will even themselves out.

      • But like I point out below, WAR is inconsistent about that; it doesn’t punish pitchers for bad luck on balls-in-play, but it does punish hitters for it.

      • ThankYouMichaelLewis says:

        I know that. My point is that Fangraphs should use a three-year average for UZR so that there is a larger sample size to calculate a UZR to convert into WAR.

      • JoeyO says:

        “I know that. My point is that Fangraphs should use a three-year average for UZR so that there is a larger sample size to calculate a UZR to convert into WAR”

        How does factoring in what he did 3 seasons ago tell you how valuable he was this season?

        Again, WAR is solely about what you did in any specific season; it does not attempt to calculate what you are capable of. If you are a 0 UZR player, and reach that mark by being +2 in 2000 for the Mets, +2 in 2001 for the Dodgers and -4 in 2002 for the Rangers, you were more valuable to your NL teams in the first two seasons than you were in the last for the Rangers.

      • WY says:

        I think it’s fine to show the individual years as far as UZR goes. If people want to calculate three-year averages using those stats, then that is fine. But lots of players don’t have three years of data to go on, either because they have changed positions or they just haven’t been in the league that long.

      • B says:

        The problem with your argument, JoeyO, is the inaccuracies in measuring fielding ability. UZR over one year may not fully account for what a player actually did. Simply put, it isn’t as accurate as wOBA accounting for singles, doubles, triples, walks and HR’s that we know for a fact is exactly what happened. A 3 year average is probably a better idea in theory but too difficult (and with too many additional problems) to work well in reality.

    • Honestly, I think it might be good to start trying to do something like that as WAR is refined; I feel like the biggest inconsistency right now is determining how much you want it to measure true talent level, and how much you want it to measure actual performance. For example, pitcher WAR is calculated based on FIP, so it basically takes BABIP out of the equation, and doesn’t punish/reward a pitcher for having good/bad luck on balls in play, but it makes no such adjustment for hitters. Granted, this is at least partially because hitters have been shown to have more control over their BABIP than pitchers do, but there still is definitely such a thing as a hitter having bad/good luck on balls in play.

      I think this is the main problem most people have with fielding metrics; while we have a pretty good idea of how to separate out true talent level and luck for hitting and pitching, we really can’t do that much at all with fielding, at least not yet.

      • JoeyO says:

        Pitchers suffer their BABIP differences because of the defense behind them, the same 12 or so guys fielding all the pitches. Extremely poor fielders at 2B and short can drastically affect BABIP for a pitcher. Factoring out BABIP for pitchers puts their value as if they were pitching in front of the Average Fielder.

        For hitters, the BAbip goes fairly evenly among all fielders in their league. For every horrible fielder at 2B, there should be a fantastic one just the same. Their BAbip is, theoretically, the result of the Average Fielder already, no use to factor for it.

      • Well, no, the BABIP differences aren’t just about defense, they’re about luck, too, and hitters can have bad luck just as well as pitchers, see Nick Swisher in 2008, for example

      • In this case, the reason we use FIP for pitcher WAR is that it excludes defense. That’s exactly what it does and that’s exactly why we chose it.

        This lets us not count defense twice in a team’s total WAR.

  26. AndrewYF says:

    From what I can gather, UZR is a statistic made from three or four people in the stands at each and every game. Basically, for each play, they all determine, with a consensus, whether a play should have been made or not.

    That’s terrible.

    The +/- system, from what I’ve read, is made from splitting the field into 64 zones, and finely interpreting video data to see in what zone which player made the play. Then, they compare it to how many players made that play, and assign a point value based on how hard (not many make that play) or how easy (almost everyone makes that play) a play was made. Hard plays give you lots if you make it, and take away little if you don’t. Vice versa for easy plays.
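The scoring rule in that description can be sketched in a few lines. This is only an illustration of the rule as the comment states it (credit of one minus the league make rate for a made play, a debit equal to the make rate for a missed one); it is not Baseball Info Solutions' actual implementation, and the sample chances are hypothetical.

```python
def plus_minus_credit(made: bool, league_make_rate: float) -> float:
    """Credit for one batted ball, given how often fielders convert similar balls.

    league_make_rate is the share of fielders who make this kind of play
    (0.10 = a hard play, 0.95 = an easy one).
    """
    if not 0.0 <= league_make_rate <= 1.0:
        raise ValueError("league_make_rate must be between 0 and 1")
    # A made play earns what the average fielder would NOT have delivered;
    # a missed play costs what the average fielder WOULD have delivered.
    return (1.0 - league_make_rate) if made else -league_make_rate


# Hypothetical chances: (did the fielder make the play?, league make rate)
chances = [(True, 0.10), (False, 0.10), (True, 0.95), (False, 0.95)]
total = sum(plus_minus_credit(made, rate) for made, rate in chances)
print(f"net plays above average: {total:+.2f}")
```

Making the hard play is worth +0.90 and missing it costs only 0.10, while the easy play is worth +0.05 if made and costs 0.95 if missed.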

    This is the way to do defensive data, not some subjective eyes in the stands. UZR has been made obsolete by this much better defensive data system. And once we get ‘Hit F/x’, we can apply the +/- system that much better. And UZR will go the way of RF, and we will look back and wonder why we ever even considered UZR valuable in the first place.

    I really hope Fangraphs updates their ‘value’ system to take advantage of much better defensive data.

  27. Aaron B. says:

    With all the defensive, err, “excitement” Fred Lewis has brought us Giants fans this season, one would think that he’d be below average in UZR. Fortunately, his plus range in left makes his defensive blunders a little more palatable.

  28. ThankYouMichaelLewis says:

    UZR only seems to work when looking at a career rating. As a result, I have a problem using it to calculate a seasonal WAR/$ value.

    I would rather fangraphs used the player’s career UZR/150 (pro-rated by time spent at each position). This would at least stabilize a player’s UZR and in my opinion, would provide a more accurate picture of a player’s value/WAR.

    • Davidceisen says:

      Defensive ability does increase and decrease, though. Using a player’s career UZR/150 for a 40 year old player does not make sense, for example. Nor would it make sense for finding out how valuable a player battling injuries is.

      The same argument that you are using for averaging UZR could be used to average wOBA when considering offensive value.

    • Fresh Hops says:

      What I would like is a UZR/150* that regresses the player’s defense at a position they’ve played to the mean, based on their last three years of play. Such a statistic would likely be very accurate.
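One way such a statistic might look is sketched below. Everything about the recipe is an assumption made for illustration: the 3/2/1 season weights, the size of the league-average "ballast", and the use of rate figures without any playing-time adjustment. The Teixeira inputs are the UZR/150 values quoted elsewhere in this thread.

```python
def regressed_uzr(seasons, weights, ballast):
    """Weighted average of seasonal UZR/150, shrunk toward a league mean of 0.

    seasons: UZR/150 values, most recent first.
    weights: one weight per season (hypothetical; recent seasons count more).
    ballast: weight given to the league-average prior of 0
             (hypothetical; 0 means no regression at all).
    """
    if len(seasons) != len(weights):
        raise ValueError("need one weight per season")
    weighted_sum = sum(s * w for s, w in zip(seasons, weights))
    # The prior contributes 0 * ballast to the numerator, so it only
    # shows up in the denominator and pulls the estimate toward 0.
    return weighted_sum / (sum(weights) + ballast)


tex = [-0.8, 9.3, -5.2]  # 2009, 2008, 2007 as quoted in this thread
estimate = regressed_uzr(tex, weights=[3, 2, 1], ballast=3)
unregressed = regressed_uzr(tex, weights=[3, 2, 1], ballast=0)
print(f"regressed: {estimate:+.2f}, unregressed: {unregressed:+.2f}")
```

The larger the ballast, the more a short or noisy record is pulled toward average; choosing it properly would require knowing how reliable one season of UZR really is.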

    • JoeyO says:

      “UZR only seems to work when looking at a career rating. As a result, I have a problem using it to calculate a seasonal WAR/$ value.”

      Why are you continually trying to make WAR something other than what they did in any given season – that is its entire point! It is beyond reason. WAR is what any given player does in any given season, not what they did on average over the last three!

      Here, we will give an example. Let’s say I make 50 errors in the 2009 season after making 0 the previous 2 seasons. Do you think I should be able to say my 2009 defensive value to my team was strong solely because I was amazing in 2007 and 2008? How does my 2009 team benefit from how good I was in 2007 or 2008?

      Looking at a three year span might tell me that my 2009 is an outlier year, but that does not mean I was any more “valuable” to my team. If I made 50 errors, I cost my team more than I helped, so my WAR should not take into consideration a time where my defense was much better.

      So one last time, WAR is Wins Above Replacement for any given year. It takes what you did that season, and assigns a Run/Win total. If you want 3 year averages, do WAR over a three year span. But if you were amazing one season and poor two others, you are not average or above – you were amazing one season and poor two others. WAR will show us that.

      • ThankYouMichaelLewis says:

        I could be completely missing the boat here. But if UZR only measures what a player did and not ability, then it seems that a one-year UZR is more of a product of that team’s pitching, and therefore the opportunities created.

        Otherwise, it makes absolutely no sense that “what a player actually did” could be so erratic from year to year. It seems that someone’s true fielding talent would be more consistent than true hitting talent, so why would what the player actually did in the field be so much more unstable?

        Take Ichiro’s UZR/150 as a RF from 2002-2009 (excl 2007): 5.7, 19.1, 12.9, 4.4, 17.4, 4.9, 8.7

        How does a great fielder have such major drop offs? I’m not sure how these vacillations compare with his offense, but I doubt great hitters are this streaky from year to year.

      • JoeyO says:

        “I could be completely missing the boat here. But if UZR only measures what a player did and not ability, then it seems that a one-year UZR is more of a product of that team’s pitching, and therefore the opportunities created.”

        No, it factors what you should have done in those chances, regardless of how many there were.

        “Otherwise, it makes absolutely no sense that “what a player actually did” could be so erratic from year to year.”

        Fielding is not a repeatable action; there are so many variances. What you did, or in other words how many plays you made versus the average fielder, is all that should be counted for a player’s single season value. Nowhere does UZR say it gives a true ability ranking; it gives your results in numbers. If you want to find ability, then factor in multiple seasons for a larger sample size.

        If you look at Tex over a three year span, you will realize he is average. If you look solely at this season, he looks average. If you look solely at last season, he looks fantastic, but the year prior was rather poor. Prior to 2007 he was average.

      • Kincaid says:

        Why would someone’s true fielding talent be more consistent than his true hitting talent? It may be, but I’m not aware of any compelling evidence to that end, and it doesn’t intuitively seem to me that that should be the case, at least if you account for the spread of talent in each area.

        Just looking at Ichiro’s wRAA from 2002-2009:

        12.8, 11.5, 31.4, 6.5, 12.8, 22.6, 6.6, 17.5

        I’m sure UZR varies more from year to year than wRAA, at least relative to the respective scales each is on as far as spread of talent, but not as much as it seems. UZR being expressed in runs relative to average makes shifts seem larger. Seeing a change from 0 to +10 runs looks like a huge shift. Seeing a change from .351 to .368 in wOBA doesn’t necessarily register that way, but for Ichiro from 2006 to 2007, that difference was 10 runs. Compared to the spread of talent in hitting, that’s a smaller variation, but it’s also pretty common to see larger swings in wOBA than that.
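The conversion from a wOBA gap to runs can be sketched with the standard wRAA relationship, runs = (wOBA difference / wOBA scale) × plate appearances. The scale of 1.15 and the 700 plate appearances below are assumed round numbers for illustration, not Ichiro's actual totals or the exact constant for those seasons.

```python
def woba_gap_to_runs(woba_delta, plate_appearances, woba_scale=1.15):
    """Runs implied by a wOBA difference over a number of plate appearances.

    woba_scale is an assumed value; the real constant changes a little
    from season to season.
    """
    return woba_delta / woba_scale * plate_appearances


# The .351 -> .368 swing quoted above, over a hypothetical 700 PA season.
runs = woba_gap_to_runs(0.368 - 0.351, plate_appearances=700)
print(f"runs implied by a 17-point wOBA swing: {runs:.1f}")
```

Under those assumptions, 17 points of wOBA over a full season is worth about 10 runs, the same size as a UZR swing from 0 to +10.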

        That’s not to say that there isn’t more measurement error in UZR than in good hitting stats, but the peaks and valleys in UZR aren’t really that much more extreme than what you can see in hitting stats once you express them in the same way.

  29. Boomer says:

    Jebus people, Tex is average this year at 1B, what’s the big deal?!? He was great last year and the year before; he’s having a down year (for him) this season. BIG WHOOP!

    It happens with the bat, it happens in the field, it happens on the mound, it happens on the basepaths, it happens in the dugout, it happens in the front office, it happens everywhere!

    Does this mean Tex is just an average defensive 1B? No. It means this year he’s *been* average.

    • JoeyO says:

      No. He was great last year, but was really poor in 2007 and average between 05-06. Overall, one has to conclude he is merely average.

      • AndrewYF says:

        Or, more likely, that UZR is a flawed metric and should not be taken as gospel.

      • JoeyO says:

        “Or, more likely, that UZR is a flawed metric and should not be taken as gospel.”

        It doesn’t say what you want it to so it must be flawed? Interesting…

        Look, it takes what you had the opportunity to do and what you did with those opportunities, then factors them against the league norm. It is not much different than OPS+, another stat which has a tendency to vary quite a bit year to year depending on what you DID. If you want better, more consistent rates – play better and more consistently. That is the only option.

    • B says:

      “Does this mean Tex is just an average defensive 1B? No. It means this year he’s *been* average.”

      The point of the discussion is it doesn’t necessarily mean he’s been average this year. There is measurement error in UZR, and in all other defensive metrics at this point. He may have been above average this year and UZR simply doesn’t reflect that because it has a lot of variance. Hitting stats rely on results we know for a fact (even though the process that produces them is uncertain); with fielding metrics we do not know for certain what the results were that the measurement is based on. Basically, to repeat what AndrewYF said, UZR has flaws and should not be taken as gospel (especially with a sample size smaller than a season). Defensive stats these days do a good job, but they still have room for improvement.

  30. Rob in CT says:

    Is +/- available free for 2009? If so, where?

  31. truantbuick says:

    I don’t think defensive metrics are perfect, but I do believe Tex’s defense is incredibly overrated. The guy is a textbook “makes plays look harder than they are”. He’s not bad, but aside from last year (which was the outlier if you look at every year except it), he’s not been remarkable.

    Even as a Yankee fan, I find it sickening the kind of daily praise he gets. Almost everybody seems to think he’s the god of fielding materialized in an earthly avatar and it drives me nuts.

  32. Evan says:

    You have to look at multiple seasons and regress to the mean with UZR, which correlates very weakly year-to-year and has more than a 10-run error band. A player’s UZR can be expected to swing by as much as 12 runs in either direction from one year to the next. Teixeira could be a +11 run fielder and post a -0.8 UZR.

    The problem with writers who use numbers like these but don’t understand statistics is that they like to compare numbers from one player to another, not realizing that, frequently enough, the Derek Jeters of the world will post an impressive UZR, like this year’s 8.1, and a good fielder like Teixeira will post a “negative” UZR like the one mentioned in this article.

    The argument in this article is obnoxious, to say the least. You rail against one erroneous way of thinking (“it doesn’t LOOK like Teixeira is a bad fielder”) without actually explaining how one should look at the numbers. UZR is only useful if the context is a very large number of games played. Commenting at all on the number he’s posted this year is an exercise in futility, like saying “Randy Johnson ONLY managed a 16-14 record in 2004.”

    • ThankYouMichaelLewis says:

      How can “the best” defensive metric evaluate what a player has done in the field yet correlate so weakly year-to-year?

      I mean, how is the correlation for wOBA?

      • JoeyO says:

        2005, -1.5 UZR/150
        2006, -1.8 UZR/150
        2007, -5.2 UZR/150
        2008, +9.3 UZR/150
        2009, -0.8 UZR/150

        How does it translate so weakly year to year? 4 average to poor seasons, one very good season – there is only one true outlier here.

      • Doug Melvin says:

        @JoeyO:

        David Wright:

        2004: +4.1
        2005: -5.0
        2006: -9.4
        2007: +4.9
        2008: +3.4
        2009: -7.6

      • Brian says:

        @ Doug

        Look at the error rates, the DP rates, the exO to PO, etc. Then start looking around at the other people in the league that year, and see if you can figure out where he falls in relation.

        If you can come up with a situation showing he made the same number of plays every single year, and that the league average didn’t vary, then you will have a point. If his performance varies though, then you do not – this statistical measure points out said variances in his play.

        Does David Wright showing yearly variances mean the system is flawed, or that he just does not produce the same exact results year after year? How many players produce the exact same rates of anything year after year? Very, very few, correct? So why should we think variances would never show in statistical results?

      • B says:

        “How can “the best” defensive metric evaluate what a player has done in the field yet correlate so weakly year-to-year?”

        First of all, there is no theoretical reason it should correlate well (not to mention the fact that we didn’t even establish a baseline for what “well” means). It could just be that baseball is just that random of a game. Second of all, defensive metrics still have a ways to go. If you look into the process you’ll find the potential for measurement errors whereas in offensive stats, a single is a single, a double is a double, etc. We measure those accurately.

    • Evan says:

      “First of all, there is no theoretical reason it should correlate well.” Really? Here are three.

      1) Defensive play in baseball is considered a skill. One of the primary attributes of a skill, statistically speaking, is that it is repeatable. So year-over-year correlation should be strong.

      2) Baseball statistics are supposed to represent some fundamental fact about a player. If they are just random numbers attributed to a player each year, and you judge players based on that number, then they are worthless. How are we supposed to compare one player to another if these numbers are just chance? What are you doing on this site?

      3) If you compare two things that are within a margin of error of one another, your comparison is moot. How many articles on this site compare two players who have a UZR within twelve of one another? A lot. They should probably find a defensive metric with a better correlation.

      • B says:

        Evan what you haven’t taken into account is the variance of the issue in question. Yes, it should be a skill, and you would expect the mean value from year to year to be close to the same. That doesn’t mean the actual value will be close to the same, though. With a big enough standard deviation you may see results, from year to year, that deviate strongly. My point is there’s no theoretical reason the variance should be small enough for strong correlation on a year to year basis. A year simply may not be a big enough sample size to produce those results.
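The variance point can be illustrated with a small simulation. It assumes a deliberately simple model in which each season's rating is a fixed true talent plus independent noise; the talent spread of 5 runs and the noise of 10 runs are hypothetical values, not estimates for UZR. Under that model the expected year-to-year correlation is talent variance divided by total variance.

```python
import random


def expected_correlation(talent_sd, noise_sd):
    """Year-to-year correlation implied by the talent-plus-noise model."""
    return talent_sd**2 / (talent_sd**2 + noise_sd**2)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(talent_sd, noise_sd, players=20000, seed=0):
    """Give every simulated fielder the same talent in both years, plus fresh noise."""
    rng = random.Random(seed)
    talents = [rng.gauss(0, talent_sd) for _ in range(players)]
    year1 = [t + rng.gauss(0, noise_sd) for t in talents]
    year2 = [t + rng.gauss(0, noise_sd) for t in talents]
    return pearson(year1, year2)


# Hypothetical spreads: real skill exists, yet single seasons are noisy.
print(f"expected:  {expected_correlation(5, 10):.2f}")
print(f"simulated: {simulate(5, 10):.2f}")
```

In this toy setup every fielder's skill is perfectly stable, yet single seasons correlate only weakly, because one season of noise is larger than the spread in talent.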

        As for point #3, it may just be we don’t have a defensive metric better at this point. Sure, the comparisons might not be good, but let’s be honest here, what else would we talk about?

      • Evan says:

        I completely agree with you that this metric needs to be measured over multiple years with the caveats stated.

        The problem is that people like to make claims such as “Player A will get you about 10 runs this year on defense, while Player B has negative defensive value.” All I’m trying to say is that making points based on UZR without giving your audience proper context is no better than using batting average, wins, or shoe size.

      • B says:

        Well then, at this point we seem to agree, Evan. It seems the majority out there fall under the camp that “statistics are witchcraft” and don’t want to give things like UZR any credit at all, or the camp that grossly misuses statistics without fully understanding concepts like sample size. The problem is, both these groups are offbase, though as someone with a lot of interest in statistics, I’m more likely to side with the group that doesn’t understand what they’re using, because at least they’re trying, right?

  33. truantbuick says:

    By the way, for what it’s worth, Keith Hernandez doesn’t think Tex’s defense is particularly special either, though that might just be former player hubris.

    http://www.nj.com/yankees/index.ssf/2009/06/keith_hernandez_mark_teixeira.html

  34. Dave says:

    I grew up watching Keith Hernandez and Don Mattingly at 1B. I have suffered through Giambi, but I still know what an above average 1B looks like. Tex is a very very good defensive player. Anyone who says otherwise either has some type of bias or hasn’t watched him play enough.

  35. walkoffblast says:

    This is one of those divisive issues where people get too caught up in the wrong parts. The whole point of the stat is to give you some data that is NOT about what you see with your eyes. What everyone who is saying that fails to realize is that it is hard to see range with your eyes. You might think you see it on a play here or there but that is hardly a measure of consistent action. The issue here in particular is people not understanding how to use the stat, but that has been hashed out pretty well in this thread.

    There can be an inconsistency in most stats when measuring what a player’s talent IS versus what they ARE doing in a given year. Just think of the BABIP factor. The inconsistencies in many simpler measures are due to luck, not the stat being flawed. That does not mean those numbers are worthless and should be completely disregarded. It just means you have to know how to look at them in the right context, which seems to be the main misunderstanding with UZR.

  36. Nick says:

    There is a convincing argument that defensive metrics are more accurate than offensive ones. For example, the “bins” that offense is measured by, namely singles, doubles, triples and homers, are subject to a HUGE amount of luck. If you have a guy who gets robbed at the wall, and a guy who gets an infield hit, the latter will be rated better by linear weights, even though the former is a much better hitter based off of that sample. Often, that doesn’t even out over the course of a season, which is why we see such variation in BABIP.

    UZR actually does a better job of making more realistic bins, based on location and how hard the ball was hit, with some pretty solid adjustments to smooth out measurement errors. The only reason UZR doesn’t appear to correlate as much as wOBA is that UZR deals with a much smaller sample size per season.

  37. ThankYouMichaelLewis says:

    Here’s my problem as well… and I know that our eyes and confirmation bias can deceive us.

    But when I watch Teixeira play (and I watch nearly all games), I am surprised constantly by plays I do not expect him to get. Additionally, I can scarcely recall times when he fails to make a play that I thought he should have made.

    As a result, I can’t understand how he could have played at a below average level. In other words, I am not sure how it’s possible that he missed so many plays whereas at least half of the first basemen in the league would have made more of the opportunities if in Tex’s situation.

    • Nick says:

      Yes, because you have seen every single play that Tex has made this year. Also, he hasn’t played “below average”. UZR doesn’t profess to be that accurate. Still, over 2/3 of a season, it does a pretty good job of defining *value*, NOT skill. It’s likely that he has played anywhere from -5 to +5 runs this year. You probably think it’s the latter.

    • JoeyO says:

      I think the biggest problem here is the fact that so many people seem to think that “hardly ever” (or “scarcely recall”, as you specifically used here) doesn’t have much effect on the results. If I hardly ever strike out, it doesn’t mean I never strike out. And when the majority of players hardly ever strike out as well, my hardly ever could still be more frequent than everyone else’s hardly ever.

      Really, 10 plays can make a huge difference in a season and hardly be noticeable to the eye – it is fewer than 2 plays a month! This holds true on defensive measurements, as well as those for offense. An example:

      .323/.390/.467/.857 – that is Derek Jeter this season

      .302/.371/.446/.817 – that is Derek Jeter this season with just 10 fewer singles

      See the extreme difference 10 singles can make? But are you really going to notice one or two singles being removed a month if you were to watch Jeter every single day? No, it’s impossible; your brain is not able to calculate stats like that.
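The size of that effect is easy to reproduce. The sketch below uses a hypothetical stat line chosen only to land near the quoted .323/.390/.467; these are not Jeter's actual counting totals. It then turns 10 singles into outs while leaving at-bats unchanged.

```python
def slash_line(ab, singles, doubles, triples, homers, walks, hbp, sf):
    """Return (AVG, OBP, SLG) from basic counting stats."""
    hits = singles + doubles + triples + homers
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    avg = hits / ab
    obp = (hits + walks + hbp) / (ab + walks + hbp + sf)
    slg = total_bases / ab
    return avg, obp, slg


# Hypothetical season totals (an assumption, picked to resemble the quoted line).
season = dict(ab=480, singles=118, doubles=20, triples=1, homers=16,
              walks=50, hbp=3, sf=2)

before = slash_line(**season)
# Same at-bats, but 10 of the singles become outs.
after = slash_line(**{**season, "singles": season["singles"] - 10})

for label, (avg, obp, slg) in (("as is", before), ("10 fewer singles", after)):
    print(f"{label:>17}: {avg:.3f}/{obp:.3f}/{slg:.3f}  OPS {obp + slg:.3f}")
```

Each of the three rates falls by about 20 points, so OPS drops by roughly 40, in line with the gap between the two lines quoted above.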

      So, under the same situation, how do you think you would notice 1 or 2 defensive plays a month that the player didn’t make where the average person would? It too is impossible.

      Then, more specifically, are you able to keep track of how many balls are hit within range of Tex, and how many plays he made on these balls, while simultaneously keeping track of how many balls are hit into everyone else’s range plus the number of plays they make?

      You can see, this is getting harder for you to calculate in your head than even counting singles. You cannot physically watch all players at once, so you cannot realistically know what “average” looks like to the naked eye to begin with. You only know what you see, which is one guy maybe 60 times a season if you watch a ton of games, plus maybe another 10-15 guys you see 3-5 times each as they play against the team you are focused on – and I doubt you are paying the utmost attention to those 10-15 guys anyway, as your team’s runner on 3rd is running on the play or whatever.

      So you don’t know what average looks like, meaning you don’t know with any certainty the number of plays your player should make. Plus you don’t see every single play your player is involved in; it’s impossible unless you travel with the team and focus solely on his play. So how can you realistically argue against a statistic which points to the player being merely average with the glove?

  38. razor says:

    One guy’s rating that has me confused is Yunel Escobar. He really does appear to me to be an above average defensive shortstop. It was mentioned that UZR uses the same data as the +/- over at Baseball Info Solutions. Does anyone happen to know where Dewan & Co. currently have Yunel in +/-?

  39. B says:

    Based on your assertions it sounds like they’re biased towards good teams and famous players, man who would guess they’d market those things…

    (They really are east coast biased, but whatever, who watches ESPN anyways)…

  40. MerryGoByeBye says:

    So… If a guy watches all 162 games and sees a well above average fielder, and then sees a fielding stat (and those are not all that accurate because there is not a perfect way to measure fielding skills) saying otherwise he should go fuck himself? Really?

    The easiest way to sound dumb is to try and make the other guy sound dumb. His point was that Tex was a better fielder than his UZR says he is. That’s hard to argue, isn’t it? I guess that’s why you had to go out of your way to show how bad Giambi was with the glove, and that wasn’t even his argument.

    • JoeyO says:

      “So… If a guy watches all 162 games and sees a well above average fielder, and then sees a fielding stat (and those are not all that accurate because there is not a perfect way to measure fielding skills) saying otherwise he should go fuck himself? Really?”

      It is impossible to know what average is if you solely watched one player over 162 games. All you would know is what that single player did, as well as some of what his opponents provided on the specific days you saw them. (Also, he never actually claims Tex is above average – he just says he is convinced defense can’t be measured because UZR has him at “negative defensive impact” – which you and I know just means average.)

      To make the conclusion that any single player is truly above average, you would need to watch all 2430 games and focus solely on 1B (or whatever position the player you are attempting to gauge plays). You would then have to come up with some kind of way to count the number of plays, errors, DPs, assists, and so on based off the number of possibilities for every player to see time at the position. Next, you would probably need to put all those stats into a computer (since your brain can’t comprehend that type of information) and have an average produced. Then, you must compare your average versus the player you are rating. Your end results would look much like the UZR given to you to begin with, as it is doing little more than taking the impossible work out of your hands.

      As far as your argument against the story: his point was that UZR must be wrong (he’s “convinced”) because it has Teixeira at a negative impact. He implies the statistic means he is a poor fielder. He doesn’t state that the stat is measured against the average. He doesn’t mention the fact that Tex was merely “average” in that final number/statistical conclusion – and I mean perfectly average among qualified defenders; 11 better, 11 worse so far this season. He doesn’t mention how he specifically gauges a player’s defensive value; what he compares it to or how focused and critical he has been of him. He doesn’t really say anything other than stating he has watched Tex and he makes tremendous plays. His entire evaluation is merely “He smothers everything near him, and his throwing arm is fantastic”. He doesn’t mention how many other players the very same things can be said about. He doesn’t expand on “near him”, or mention balls outside of his reach that he manages to get to (the biggest factor in Mark’s overall scores).

      He really just wrote an article attacking, or at the least attempting to discredit, something he obviously doesn’t understand, while providing no evidence for his theory other than his opinion. I might as well write an article stating Hondas are horrendous cars because, from my experience, I have seen poor ones.

      Lastly, he brings the conversation about Giambi upon himself with the following statement:

      “Maybe he seems better than he is because the previous Yankees first baseman, Jason Giambi, was so adventurous in the field. But it would be hard to overstate the importance of Teixeira’s defense.”

      If we were to form our argument off this statement he provided, we could agree with him that Tex is extremely important to the Yankees when compared to the defense of Giambi – that just doesn’t make Teixeira a superb defender. He is average, and average means a value around -0- on a RAA scale.
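
The procedure described in the comment above (count every fielder's plays and chances at a position, derive a league average, then compare one player against it) can be sketched in a few lines. This is only an illustration of the "runs above average" idea, not how UZR itself is computed; the function name and all the totals are made up for the example.

```python
def plays_above_average(player_plays, player_chances, league_plays, league_chances):
    """Plays made beyond what a league-average fielder would make.

    Hypothetical helper for illustration only. A result near 0 means
    "average", which is how a small negative rating should be read.
    """
    if player_chances < 0 or league_chances <= 0:
        raise ValueError("chances must be non-negative, league chances positive")
    # League-wide success rate on chances at this position.
    league_rate = league_plays / league_chances
    # Plays an average fielder would have made on this player's chances.
    expected_plays = league_rate * player_chances
    return player_plays - expected_plays


# Made-up totals: the player converts chances at exactly the league rate,
# so the result lands at (approximately) zero, i.e. average.
average_fielder = plays_above_average(270, 300, 9000, 10000)

# Made-up totals: the same chances with 15 more plays made.
good_fielder = plays_above_average(285, 300, 9000, 10000)

print(f"average fielder: {average_fielder:+.1f} plays")
print(f"good fielder:    {good_fielder:+.1f} plays")
```

Real systems go further, for instance by weighting each play by its run value and by how hard the ball was to reach, but the comparison against a league baseline is the part that no amount of watching a single player can supply.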

      • MerryGoByeBye says:

        I get that the writer’s argument wasn’t perfect, otherwise it wouldn’t be on the NY Times. My point is this: While I’m no scout, I’m pretty sure I can tell good defense from bad defense. If I watch every game a guy plays, I guess I’ll be able to say on my own that he is or isn’t good with the glove. I’ll watch other teams and all, but not all games, however I don’t really think I need to.

        UZR sometimes overrates (Joe Crede? Chone Figgins?) some guys, and underrates others. It’s not a perfect stat, and should not be defended as such.

      • B says:

        “It’s not a perfect stat, and should not be defended as such.”

        Exactly. One problem you have to acknowledge when judging with your eyes, though, is how you rate what impact the player is having. You can get a feel for things like range and arm strength, but trying to figure out exactly how good a player is at these things and how much of an impact it has basically requires a statistical analysis of some kind; we’re simply incapable of figuring that out on our own.

  41. walkoffblast says:

    It seems like the real problem is people do not like or understand the type of results statistical analysis yields. They do not want to understand or know how to deal with a situation where say someone comes up to bat and the infographic says they should have a .290-.310 BA this season. UZR is not as black and white as above .300 or below .300. Just like when people get mad at pythagorean numbers because they try and interpret them as the sole prediction of wins for a season as opposed to what they actually are.

  42. jud says:

    Hi,

    I think that Kepner has a valid point wrt Teixeira. As a Yankee fan, I can say that after reading ~7 years ago about Jeter’s poor defense, I started paying more attention to all of the horrible diving attempts on balls to his left (aka pastadiving). The opposing fielder made the same play without difficulty. What the advanced stats were saying and my eyes were seeing were completely aligned. But with Tex, I just don’t see it that way. He is constantly making above average plays.
    Advanced fielding stats have to overcome the issues of discretionary fielding (ball-hogging), poor binning and small sample size. The fact that the numbers jump wildly from year to year is also an indication of reliability issues. So while correction factors are inserted, this is still a work in progress.

  43. UZR (and all other fielding metrics) does not take into account one of the most important jobs of the 1st baseman – catching balls thrown from other infielders.

    When it does it will be much more useful in judging defense by infielders.

    For outfielders UZR can be lowered just because another fielder caught a ball in your zone. In other words a ball hog can lower your UZR even though you were in position to catch the ball.

    ALL fielding metrics are flawed. When we have Hit f/x available for every batted ball and can tell exactly what a fielder’s range is, in feet, from where he was positioned to each and every ball he got to, rather than by an arbitrary zone in which half of the players at his position got to balls, then the defensive metrics can be well defined.

    Until then a well trained eye is just as good as the metrics.

    I still use UZR in some cases and I do not in others.

    1B is one of the positions where I don’t use it, because it misses so much of what good defense at 1B is supposed to be.

    • Basil Ganglia says:

      The issue of 1B contributions in receiving thrown balls has been evaluated and considered in some detail. As I recall, the answer is that it’s not at all as significant as most people think it is, and as your comments suggest you believe. Range, or lack thereof, is by far the biggest factor for a 1B.

      With the AL using the DH there is a home now for players who are atrocious defenders. This has likely had the effect of reducing the overall spread in capabilities of glove abilities at 1B. If the spread in abilities is narrowed, then the importance of that factor in run prevention will diminish.

      You might want to google around a bit to try to find them. I would post links if I had any close at hand. Perhaps other commenters might have some links close at hand.

    • Kincaid says:

      You are confusing UZR with Zone Rating. UZR addresses the issues you have with ZR.

  44. DC Stack says:

    I love statistics. I make my living as a statistician. Despite my life being statistics I am the first to admit people fall in love with metrics too quickly and without enough scrutiny. Fielding metrics are at an early age in their development. UZR is one of the two or three best metrics out there right now. However, treating it as if it accurately measures defense is foolish. It is a great effort at doing that but these metrics (UZR included) are so immature right now that at best they are reasonable estimates.

    When I try to evaluate a metric like this my first instinct is to see how stable the metric is for a player from year to year. The ability to defend does not disappear and reemerge suddenly from year to year. Putting UZR to this test you see there are wild fluctuations for most players from year to year, often going from reasonably large positive numbers to reasonably large negative numbers. That makes me question the value of the metric.

    I recommend all readers look at metrics like this as just one way of evaluating players’ abilities. Use all sources of data, including your own eyes, to determine where you believe the true mean resides.

    • Joe R says:

      One way Bill James came up with estimating the defense of a 1B is unassisted putouts at the team level.

      Putouts at 1st base – (Assists at 2nd + Assists at 3rd + Assists at SS), which is obviously an estimate, but it does have Teixeira playing well. FRAA says he’s slightly above average (+3).
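
As a quick illustration, the estimate can be written out exactly as the comment states it. The function name and the team totals below are invented for the example, and the formula is taken from the comment as given, so it may not match Bill James's published version in every detail (it ignores assists by pitchers and catchers, for one).

```python
def estimated_unassisted_putouts(putouts_1b, assists_2b, assists_3b, assists_ss):
    """Team-level estimate of unassisted putouts at first base.

    Follows the comment's formula: first-base putouts minus the assists
    credited to the other three infielders. Hypothetical helper name.
    """
    return putouts_1b - (assists_2b + assists_3b + assists_ss)


# Invented team season totals, for illustration only.
estimate = estimated_unassisted_putouts(
    putouts_1b=1450,
    assists_2b=480,
    assists_3b=330,
    assists_ss=470,
)
print(estimate)  # 1450 - (480 + 330 + 470) = 170
```

A higher number suggests the first baseman is fielding more balls himself, which is why it serves as a rough proxy for range.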

      But UZR has admitted its flaws, and that usually it’s best to view a 3 year sample, and that adding a player’s UZR/150 to their offensive season is a better indicator of the player’s real value. But even if you see, like, 15 Yankee games in a season, Teix will have, what, 40 attempts? No shot one can draw a logical conclusion about his abilities there, especially since 40 is probably high-balling by a lot. Metrics are in their infancy, but they still beat using only the “eye check”, which is tainted by our inability to remember the ordinary on its own, and of course, bias. Guaranteed if you bring in 2 groups of people who’ve never seen baseball played, tell one Teixeira is an awesome defender, and another that he sucks, and have them watch Teixeira and other 1B’s, they’ll just end up regurgitating what they were told.

      And I’m jealous of your career, I wanted to go into statistics, but I got stuck in cost analysis for the time being. I’ve done all of 1 statistical study in about 5.5 months on the job.

      • DC Stack says:

        I threw in the “including your own eyes” comment too flippantly. Because I always think statistically I almost naturally try to collect all the data without bias. As a result I am probably much less susceptible to selection bias. Although I will not back down from the notion that good observation is still an appropriate data collection method, it just has to be given the appropriate weighting when you do the meta analysis of all the data collection methods you have employed.

        The stat work I do tends to be quite different than the fun baseball analysis. I work with academics. Sometimes the research is great fun, interesting, and truly revealing…at other times it is unbelievably mundane.

      • Joe R says:

        I’d still rather do pure math than be the logistician’s fall guy in presentations.

    • Kincaid says:

      UZR has been in development for probably about 20 years. The metric itself is not at an early stage of development. You could consider the data available to calculate it to be, though.

      Define “wild fluctuations for most players”. Are you just estimating from what you’ve seen, or have you run correlations on all the available data and compared that to stats you consider reliable? UZR has to be regressed just like any other statistic if you want to estimate a player’s talent level. You can’t just take one year of any stat and say that is how good a player is, whether it’s fielding or batting or pitching. All stats can fluctuate quite a bit from year to year, both because talent changes (for fielding as well as hitting and pitching) and because of sample errors. That’s not something unique to fielding stats. You just have to know how much weight to give your sample.

      I agree that people tend to give too much weight to stats without considering the weight of the sample, but that does not mean it is foolish to treat UZR as an accurate statistic.
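
The point about regressing a one-year number before treating it as talent can be sketched with the standard shrinkage formula: pull the observed value toward the league mean by a weight that reflects how reliable the sample is. The weight of 0.5 below is purely an assumption for illustration, not a published reliability figure for UZR, and the function name is hypothetical.

```python
def regress_to_mean(observed, league_mean, reliability):
    """Shrink an observed stat toward the league mean.

    reliability is a weight between 0 and 1: 1 trusts the sample fully,
    0 ignores it and returns the league mean. Hypothetical helper.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return league_mean + reliability * (observed - league_mean)


# Teixeira's 2009 UZR from the post is -0.8; the league mean is 0 by
# construction. The 0.5 weight is an assumed value for the example.
talent_estimate = regress_to_mean(observed=-0.8, league_mean=0.0, reliability=0.5)
print(f"regressed estimate: {talent_estimate:+.2f} runs")
```

The smaller the sample, the lower the weight should be, which is why a single season tells you less than a three-year total.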

      • DC Stack says:

        Wow! Major “my bad” on my part. This is what happens when you get too cocky about your own abilities to discern fact from fiction. Right after I read your reply I realized I was doing what so many amateurs do. I was making a statement of fact based on a hand selected few players. I knew at that point I needed to come back and say I was wrong. But before doing that I wanted to see if I was wrong in my methods but right in my conclusions. Well I was wrong in both.

        I decided to do a simple test about the serial reliability of UZR. I used UZR/150 for all eligible players from 2008 and 2009. There were 83 players eligible from both years. I ran a simple correlation between their 2008 and 2009 numbers. It came back with a fat Pearson’s r of .729. I could already feel the egg on my face. I then wanted to see if this is bigger/smaller/same as other stats that are far less controversial. I did the same procedure for OPS (unadjusted). The correlation came back in the low .5s. Not only is UZR consistent from year to year, it is more consistent than OPS – at least in the two years I looked at.

        HUGE CAVEAT: This is a very small sample size. To get a better feel for the reliability of this metric this really should be done across multiple years. My analysis took me about 15 minutes and that was all I was willing to dedicate to it. If someone wants to take an hour or so to do the full proper tests I say go for it.
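
For anyone who wants to repeat the year-to-year test, here is a minimal sketch of the Pearson correlation it relies on, using only the standard library. The UZR/150 values are invented placeholders, not the 83-player sample from the comment, so the result will not reproduce the .729 figure.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


# Invented UZR/150 values for the same six players in consecutive seasons.
uzr_2008 = [12.1, -3.4, 5.0, -8.2, 0.6, 7.3]
uzr_2009 = [9.5, -1.0, 2.2, -10.4, 3.1, 4.8]

r = pearson_r(uzr_2008, uzr_2009)
print(f"year-to-year r = {r:.3f}")
```

Running the same function on another stat for the same players, as the comment does with OPS, gives a like-for-like comparison only if the player pool is identical in both tests.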

      • DC Stack says:

        Correction – OPS 2008/2009 correlation was high .5s not low. I didn’t remove the catchers so it had a higher n than the UZR test, which means it wasn’t an exact one-to-one comparison.

      • joser says:

        Kudos, DC Stack, for doing the work — and especially for graciously admitting when you were wrong (there’s not enough of that around here).

  45. jp says:

    How does UZR handle a 1B’s ability to handle throws from infielders? My eyes say Tex has been exceptionally good at digging balls out of the dirt. Maybe this skill is not properly accounted for?

  46. OldYanksFan says:

    When you have a stat that is given yearly, it tends to imply that annual data is valid/accurate. If UZR can be inaccurate (too large an error factor) over a single year, maybe it should be expressed as a career average, with maybe an annual fluctuation factor?

    If a 2008 AND 2009 average says he was a VERY GOOD fielder, yet 2009 says he was AVERAGE, then in 2008, he must have been SPECTACULAR!

    Also, how does throwing come into the equation? Does it? How about scooping balls? Is that in there? And what of positioning? Certainly positioning can cost or make plays. Maybe the problem is we are weighing UZR too heavily without looking at other aspects (throwing, scooping, positioning, etc.) of fielding at 1B.

    I love stats and think they are fun and basically accurate. And while the MATH is always correct, a formula and/or application might have flaws. I don’t know how many different defensive stats there are, and if they cover ALL aspects of play (throwing, scooping at first, blocking the plate for a Catcher, etc). Maybe we need a stat that combines ALL defensive stats into one, so at least the flaws in any one stat are somewhat mitigated.

  47. Brian says:

    Teixeira’s defense was consistent all year last year and I remember 90% of all the plays from the Yanks’ season. He let some balls get by him down the line that I would like to see get scooped up, and he didn’t dig out some balls that I felt I could probably handle, but all in all he had a rock solid defensive year and the proof was in the Yanks’ team defense last year. AROD didn’t get to many balls, but Jeter and Cano both were helped out by Tex’s ability to catch practically anything they threw his way.

  48. joseph pittman says:

    David Appelman says:

    “Scoops are not included in UZR, but they are also not a major part of a 1B defensive ability. At the very most you’re looking at half a win on either side.”

    then you sir are retarded….considering that the vast majority of a 1st baseman’s plays are receiving throws from other members of the infield, if that aspect isn’t considered UZR is worthless.

    imagine a 1st baseman who made every fielding play off the bat that came his way…… but who missed even 3% of the throws of his teammates? he wouldn’t be at 1st for very long.

    guess that makes you either the stupidest man alive or very close

  49. Greg says:

    Personally, UZR has never agreed particularly well with the eye test with regards to first basemen. But even though UZR might not be perfect, I’m sure my judgment is even less so, so I’ll stick with UZR.

  50. Jon says:

    I’m going to go ahead and agree with Tyler Kepner. I see Mark Teixeira making good to great plays and missing none of the easy ones every single game, and I grew up watching Don Mattingly and Keith Hernandez. He looked like a doofus to me when he first joined the Yankees, and I couldn’t imagine a big slugger who looked like that could field well regardless of his reputation, and then he went and surprised me essentially every single time I saw him near the ball. Either he’s above average by a large margin, or the average team now fields a high-quality athletic player at first base instead of the usual mix of aging and slow-footed sluggers.

  51. Kimbal says:

    One problem for UZR really is the ability to dig out bad throws and convert them into outs. Tex has good range and looks good in the field so he gets Gold Gloves while the slow-moving Konerko is labeled a bad fielder. But if you watch him every day you see that Paul Konerko is a vacuum cleaner for errant throws and that adds greatly to his defensive value. If Dunn plays first instead of Konerko, watch the infield hits and infield errors on the White Sox ramp up!
