FanGraphs Baseball



  1. But is there an “asterisk” column in the fielding section on Fangraphs so we know which UZR scores we can accept at face value and which ones we need to adjust? Or will those adjustments be taken into account in some future refinement of UZR?

    Because right now if I’m trying to evaluate (say) one player vs another, I can compare their wOBA numbers and be fairly confident that has captured most of their offensive value and, therefore, I know which one is better at the plate. But — without doing further research — I don’t have as much confidence when comparing their UZRs, and since the value calculation here uses UZR as well, I can’t really trust that either. And yeah, we should all do more research and not do two-bit 30-second analysis, but that’s about all the time I have… and even if I have more time, I’m not even sure where I go to get the definitive “scout’s consensus” on any given player, let alone how to translate that into “regressing to a different mean.”

    So help a brother out: what should we be doing?

    Comment by joser — November 18, 2009 @ 2:44 pm

  2. Good work Dave, as always.

    The title though- you ended a sentence with a preposition. Bad!

    Comment by Logan — November 18, 2009 @ 2:55 pm

  3. This is exactly the kind of grammatical error up with which I shall not put!

    Comment by Dingo — November 18, 2009 @ 3:01 pm

  4. I can compare their wOBA numbers and be fairly confident that has captured most of their offensive value and, therefore, I know which one is better at the plate.

    How confident?

    One thing I would like FanGraphs to add to its impressive list of statistics is error bars, or a 95 percent confidence interval. The only writer who consistently speaks of the statistical uncertainty surrounding metrics and incorporates it in his analysis is Dave Allen. I would like to see more authors follow suit.

    Comment by Sam — November 18, 2009 @ 3:14 pm

  5. You know, I was looking into this recently and found that almost the only reason for this rule (i.e. Thou shalt not end thy sentence with a preposition) existing in English is because it exists in Latin. Same thing for splitting an infinitive.

    Thing is, it’s literally impossible to split an infinitive in Latin, because the infinitive form in Latin (amare – to love, ludere – to play) is only a single word. Likewise for prepositions at the end of the sentence — prepositions in Latin are followed immediately (with few exceptions) by either the ablative or accusative form of the noun in question. They don’t make sense otherwise.

    Till recently, I always made a point of placing my prepositions before the corresponding nouns — so that I could broadcast my boarding school education, if for no other reason.

    Now the only way I can broadcast said education is by providing long-winded and largely uninteresting observations on English usage in the comments sections of better baseballing websites.

    Such is the struggle of life.

    Comment by Carson Cistulli — November 18, 2009 @ 3:28 pm

  6. I can live with the opinions of scouts. Let’s not get scouts confused with casual fans and other baseball people. I’m looking at you, Joe Morgan.

    Comment by brian recca — November 18, 2009 @ 3:39 pm

  7. I think the error bars should only be given with projections. I don’t want to see error bars on accumulated stats. I just want to see what their production was. I believe Zips projections have the error bars if you go straight to the source. I don’t think it’s necessary for Fangraphs to carry them when the reader can find them himself. Fangraphs already does enough of the hard leg work.
    vr, Xei

    Comment by Xeifrank — November 18, 2009 @ 3:46 pm

  8. It would be interesting to do an analysis of the variability between UZR and scouting and try to see if there are systematic differences that can shed light on the disagreement.

    What elements of defense are scouts including that can’t be captured by UZR, such as backing up other players, relay throws, 1B scooping/stretching, etc.? What things is UZR picking up that scouts might be undervaluing, such as positioning or making out-of-zone plays that other players can’t even attempt?

    Are scouts likely to regress their assessment too far back to a player’s previous level of performance? What’s the impact of sample size — scouts are assessing ability moreso than performance and these presumably converge given a big enough sample. So what sample size are scouts using?

    I imagine that a good portion of the differences can be explained by a few consistent gaps in approach.

    Comment by Rick — November 18, 2009 @ 3:48 pm

  9. Yup.

    I’d only add that sometimes — sometimes — not ending sentences with prepositions forces you to put the preposition next to the noun it modifies and can clear things up.

    Isn’t “put up with” a set phrase, in any event, and not subject to that rule? (I get the reference).

    Comment by Xavier — November 18, 2009 @ 3:50 pm

  10. I like what Rick said there. I think it would be very interesting to see a detailed analysis of disagreements to try to find a commonality.

    Comment by Patrick — November 18, 2009 @ 3:52 pm

  11. Or it could be that UZR is not an especially good measure of 1B defense, since much of the ability to catch errant throws from other fielders is lost in the data.

    Comment by Bill — November 18, 2009 @ 4:01 pm

  12. I just want to see what their production was.

    Are these “productions” (like wOBA) measured with certainty, like runs scored? If not, then error bars are absolutely critical. And secondly, it is more important when it comes to defense.

    These productions that you talk about, are measured based on the average run value of each event using linear weights. These average run values are random variables, and therefore the linear combinations of these random variables are themselves random variables. Therefore, there should be error bars attached to them.

    Comment by Sam — November 18, 2009 @ 4:02 pm
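
The point about linear combinations of random variables can be sketched in Python. This is only an illustration: the run values, their standard errors, and the event counts are made-up numbers (not published linear weights), and the weight estimates are assumed to be independent. Real estimates would likely be correlated, which would add covariance terms.

```python
import math

# Hypothetical (run value, standard error) per event.
# Illustrative numbers only, not actual linear weights.
weights = {
    "single": (0.47, 0.005),
    "double": (0.77, 0.010),
    "home_run": (1.40, 0.015),
}

def runs_with_error(counts, weights):
    """Return (estimate, standard error) for sum(count * weight).

    Assumes independent weight estimates, so variances add:
    Var(sum c_i * w_i) = sum c_i^2 * Var(w_i).
    """
    estimate = sum(counts[e] * weights[e][0] for e in counts)
    variance = sum((counts[e] * weights[e][1]) ** 2 for e in counts)
    return estimate, math.sqrt(variance)

# Hypothetical season line for one batter.
counts = {"single": 100, "double": 30, "home_run": 20}
estimate, se = runs_with_error(counts, weights)

# Normal-approximation 95% interval around the estimate.
low, high = estimate - 1.96 * se, estimate + 1.96 * se
print(f"{estimate:.1f} runs, 95% interval {low:.1f} to {high:.1f}")
```

With small standard errors on each weight, the interval comes out narrow, but it is not zero width.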

  13. Mr. Cameron, you are on a roll lately. I agree that the Crasnick article was really good. Your analysis of the Belichick situation and the broader issues involved was also right on the money.

    Comment by WY — November 18, 2009 @ 4:05 pm

  14. I agree; the preposition rule is one that can be broken, especially since the alternative can sound so much more awkward and uptight. It’s not like using “your” instead of “you’re” or “there” instead of “their” or even “it’s” instead of “its”; those mistakes drive me nuts.

    Comment by WY — November 18, 2009 @ 4:10 pm

  15. I agree all the way. UZR seems to work great for OF and the 3 infield positions. But the results seem off for 1B, and it makes sense because such a large part of 1B defense is picking bad throws.

    I have no doubt that defensive metrics will continue to grow and eventually catch up to the offensive metrics, but they will not be able to until somebody finds a way to better quantify the defensive contribution of 1B, Catchers and Pitchers.

    Comment by jlong — November 18, 2009 @ 4:12 pm

  16. I, too, agree with the conclusion drawn here and analysis presented in favor of it.

    From a limited perspective of developing FanGraphs to more comprehensively measure the talents of players, I thought it would be interesting for the site to contract with either a current or former scout (“Scout X” so as to preserve anonymity) regarding performances of players where there is this sort of Teixeira-like disagreement. One of the more statistically-inclined writers would or could collaborate, bringing into the discourse both views on what right now is a statistically-heavy analysis.

    As many on here would agree, there is more to the SABR community that numbers without any form of observation, and continuing to recognize that ultimately puts one in a better position to make the best judgment possible, as was the point here. Well done.

    Comment by Big Oil — November 18, 2009 @ 4:16 pm

  17. 3rd paragraph: “that” to “than”.

    Comment by Big Oil — November 18, 2009 @ 4:17 pm

  18. I agree with Sam. I think we’d all just like to see what the player’s production was, but that doesn’t change the fact that many of these stats (such as runs saved on defense, to cite one example) are ultimately estimates. That doesn’t make them wrong, but it does mean that there is going to be some uncertainty surrounding them. That is unavoidable.

    I agree with Sam that it would be interesting to get a sense of how large or small the error might be for some of these stats. I also think it would help rein in some of the people who get a little too casual in tossing around WAR values to the nearest decimal point as if those numbers were set in stone. These stats are useful, but they are not monolithic or perfect, nor should we expect them to be.

    Comment by WY — November 18, 2009 @ 4:17 pm

  19. Man you people scare me.

    Comment by neuter_your_dogma — November 18, 2009 @ 4:22 pm

  20. Whew . . . and to think all season I insisted that Raul Ibanez was a better fielder than Jayson Werth.

    Comment by neuter_your_dogma — November 18, 2009 @ 4:24 pm

  21. People need to get away from the “can trust/can’t trust” model. UZR isn’t perfect, but that doesn’t mean you should write it (or WAR) off. You just speak in ranges rather than with precision.

    So, if someone has a +5 UZR over the last three years, then you’d expect their UZR to be between +0 and +10 the next year.

    As for the scouts’ consensus, Tango’s Fans Scouting Report is a decent proxy.

    Comment by Dave Cameron — November 18, 2009 @ 4:24 pm
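
Speaking in ranges rather than point values can be sketched in a few lines of Python. The 5-run margin is the rule of thumb from the comment above, not a computed confidence interval, and the two players are hypothetical.

```python
def uzr_range(estimate, margin=5.0):
    """Return a (low, high) range around a UZR estimate.

    The default margin of 5 runs is a rule of thumb, not a
    statistically derived interval.
    """
    return estimate - margin, estimate + margin

def ranges_overlap(a, b):
    """True if two (low, high) ranges share any value."""
    return a[0] <= b[1] and b[0] <= a[1]

# Two hypothetical fielders with multi-year UZR of +5 and -2.
player_a = uzr_range(5.0)
player_b = uzr_range(-2.0)

# If the ranges overlap, the data alone cannot confidently rank the two.
print(player_a, player_b, ranges_overlap(player_a, player_b))
```

Under this reading, a 7-run gap between two fielders is suggestive rather than decisive, because the two ranges still overlap.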

  22. This doesn’t really matter very much. MGL did a long post on scoops by first basemen, and the issue has been studied by a ton of people. There’s just not really much of a difference in the spread of that particular talent. It’s a couple of runs at most.

    UZR for first basemen is just fine.

    Comment by Dave Cameron — November 18, 2009 @ 4:27 pm

  23. Linear weights are presented as a context neutral statistic. Therefore, there is no “estimate” – we literally are saying we don’t care how many runs the hit actually produced. It’s an average because we’re intentionally removing the context of the play.

    Linear weights don’t need error bars. For UZR, we’ve repeatedly mentioned that even with a good sample, you want to estimate a player at +/- five runs.

    Comment by Dave Cameron — November 18, 2009 @ 4:30 pm

  24. Linear weights are presented as a context neutral statistic. Therefore, there is no “estimate” – we literally are saying we don’t care how many runs the hit actually produced. It’s an average because we’re intentionally removing the context of the play.

    Respectfully disagree. Removing context is not equivalent to removing statistical uncertainty, which may result from factors that are unmeasurable, or factors the statistician cannot control for. Because it is an average, an expected run value from a specific event if you will, it should have error bars or 95 (or 99) percent confidence interval surrounding it. It may turn out that the standard error will be extremely low due to huge sample sizes that are used to calculate the linear weights, but there still is uncertainty.

    Comment by Sam — November 18, 2009 @ 4:43 pm

  25. Thanks, Winston.

    Comment by don — November 18, 2009 @ 5:25 pm

  26. Also note that English is a Germanic language. Anyone familiar with German will see that they regularly end their grammatically-correct sentences with “separable prefixes,” which are functionally equivalent (and usually identical) to prepositions. Yet I still prefer to keep my prepositions in their Latin place.

    Comment by Newcomer — November 18, 2009 @ 5:52 pm

  27. I like your position on this, Dave, but how does this affect WAR? We see on this site all the time claims like David DeJesus is more valuable than Jason Bay, etc. because of the huge defensive value difference based on UZR. Should this view therefore lead to a softening of WAR as the absolute that it is often portrayed as? To the DeJesus-Bay comp, the “book” on David agrees with UZR, but do scouts really believe that Bay is “dipped in a vat of rotting skunk urine” bad as UZR says? I think this is where the error bars concept is intriguing, in cases like Bay. In other words, it is theoretically possible for a fielder to cost his team 70 or 80 runs a year on defense, but is that extreme outlier really realistic?

    Comment by Paul — November 18, 2009 @ 6:15 pm

  28. When there is a discrepancy among scouts and stats, I agree that the truth generally lies somewhere in between. However, scouts give bonus points for aesthetic attributes like soft hands, nifty footwork and other skills which are vital physical components to a calculable event (intercepting a ball at x location and creating an out) but are incalculable on their own (stdev for nifty feet = ?).

    Take Adam Kennedy and Felipe Lopez for example. The two have been nearly identical in all components of UZR during the previous three seasons at 2B, yet Kennedy (fundamentally sound and dipped in molasses) is a cult hero among the scouting community while Lopez (fundamentally suspect and dipped in pico de gallo) is generally regarded as something far less. One guy makes scouts drool, the other makes them cringe, and the ultimate result (outs) is exactly the same.

    Comment by Choo — November 18, 2009 @ 7:15 pm

  29. I wonder how much of the anecdotal evidence is based upon what happened in years previous. I’d be willing to bet that a good deal of the people who talk about Teixeira being a superb defender are basing that on what they’ve heard as much as what they’ve seen. Plus, there’s a confirmation bias in play – if you, or a scout, go to a game already having it in your head that Teixeira (whom I choose because he’s one of the few guys who’s commonly perceived as a great defender) is great, and he makes one great play, you’ve got your confirmation that he’s great. And if he makes mistakes, people are more likely to attribute it to his having a bad day.

    Yeah, defense is hard to measure, but absolute trust in either UZR or scouting isn’t the only problem facing its evaluation. I feel like we’ve also got to dispose of hearsay evidence that, especially with defense, seems to keep on being parroted for seasons on end.

    Comment by Padman Jones — November 18, 2009 @ 7:42 pm

  30. Is Yoda writing the Fangraphs headlines now?

    Comment by scatterbrian — November 18, 2009 @ 8:02 pm

  31. I notice that there is nothing really about regressing the metrics to a mean in this article – it’s more about tempering the values of two systems that give disparate information.

    Anyway, it’s never been proven that UZR is actually, you know, correct. It was built backwards starting with ‘I want a stat that will translate defense into runs’ and doing that caused it a lot of problems. When it was first introduced, I pointed out, I think, 7 logical and statistical flaws in it, which MGL to my knowledge never refuted or corrected.

    The 2004/2005 off-season gave a great way to test the value of it as a metric, because you had three shortstops swap teams. Eckstein, Renteria and Cabrera. As it turned out – by far the best predictor of how a team’s shortstop would do with UZR was not who the shortstop was, but who the team was. It’s sort of sad that no one has actually challenged it since. The general consensus seems to be ‘it’s got a lot of math in der, it must be correct.’

    Comment by The Real Neal — November 18, 2009 @ 8:17 pm

  32. The other important point is not to confuse what scouts believe with what a large mob of average joes believes. It always amused me to see people in disbelief of Jeter’s defensive issues ask scouts about it, and they would say something along the lines of “if I cannot tell he sucks going to his left then I need to be fired.”

    Comment by walkoffblast — November 18, 2009 @ 8:26 pm

  33. Dave will soon regress to the mean as well.

    Comment by Jacob Jackson — November 18, 2009 @ 8:55 pm

  34. I don’t want to speak for MGL, but I’ve read through much of the introductory post comments of BBTF regarding UZR, and I saw MGL carefully go through many of the valid points addressed.

    At the risk of opening a can of “stuff I don’t want out here,” what flaws did you see in the model?

    Comment by Michael — November 18, 2009 @ 10:50 pm

  35. Also note that English is a Germanic language

    Yes, and it has big chunks of Romance languages (thanks to the Norman conquest) and Scandinavian (thanks to Knute and his bunch) and Greek and Latin (thanks to the Renaissance and the Enlightenment) and little blobs of vocabulary from all over everywhere else. It’s a sprawling mongrel of a language with exceptions to every rule and a glorious tolerance for breaking them, which is why it’s been so successful.

    Man you people scare me.

    There’s nothing of which to be scared.

    Just mind your prepositions, and nobody gets hurt.

    Comment by joser — November 19, 2009 @ 1:20 am

  36. Actually Dave, UZR is probably more accurate (in terms of identifying true talent level) than wOBA. wOBA splits everything up into a few buckets. If a guy hits a rope that’s caught at second base, wOBA gives him a 0 for that play. UZR has a lot more buckets than wOBA, so it will be a better indicator of a player’s true ability than wOBA. That’s why we see UZR have similar year to year correlation as wOBA, despite a much smaller sample size for defense.

    Comment by vivaelpujols — November 19, 2009 @ 1:51 am

  37. Problem is, I’m not sure we want to listen to one, single scout. A consensus of a few is a lot better. And for many purposes, Tango’s Fan Scouting Report can fill in nicely in lieu of a professional.

    Comment by Sky Kalkman — November 19, 2009 @ 7:02 am

  38. On Bay, I’m not sure it’s as much the standard issue error bars as it is a potential for a systematic error with park issues.

    Comment by Sky Kalkman — November 19, 2009 @ 7:04 am

  39. Ryan Braun is the one who really sticks out in the UZR rankings. A few folks I know who watch the game carefully feel that Braun has good range and tracks balls quite well. UZR has him ranked as one of the worst LF in baseball in 2009.

    Comment by Jeff Lewandowski — November 19, 2009 @ 8:23 am

  40. The Chicago Manual of Style notes that infinitives can be split if not doing so makes the sentence sound awkward. Same goes for prepositions at the end. Also, as noted below, any phrase that naturally ends in a preposition is required to remain that way. Thus, “put up with” can end a sentence.

    Comment by DL80 — November 19, 2009 @ 8:48 am

  41. Braun also rates well with the fans. It’s one of those strange issues where it appears a player has all the tools to succeed, but maybe he’s just not using them all.

    Comment by Michael — November 19, 2009 @ 9:08 am

  42. But WHICH mean?

    Comment by JB — November 19, 2009 @ 9:35 am

  43. I still think Ellsbury is a much bigger point of contention than, say, Teixeira. Most of Ellsbury’s perceived value is his glove, but his UZR was terrible.

    And then you have some people who think he’s the best defensive CF in Red Sox history already, and others who think he plays like a blind man.

    I like Crasnick, though, his stuff is well worth reading.

    Comment by Joe R — November 19, 2009 @ 9:45 am

  44. Simple solution:

    “What Mean Do You Regress Defensive Metrics To, @ssh0le”

    Comment by ineedanap — November 19, 2009 @ 10:07 am

  45. I’ve heard that joke before.

    Comment by Logan — November 19, 2009 @ 11:34 am

  46. I have yet to see a credible source suggest Ellsbury did not let an above average amount of balls fall in front of him for whatever reason this year.

    Comment by walkoffblast — November 19, 2009 @ 1:25 pm

  47. Exactly. This isn’t actually a difficult thing to reason about. It’s not “can I trust this?”, it’s “How much can I trust this?” If you have a decent sample of UZR, the answer is +/- 5 runs; taking their three year average (if you have that much data) with a slight weight toward recent performance is a good idea. Finally, you should always expect players to be a little more average than they have been in the past–that’s just regression to the mean (or, if you have a very small sample, a lot more average.)

    Once you’ve done all that, you should use additional information you have about the player–recent injuries, recoveries, learning, aging, can all be factored in to suggest that a player will be a little different from the number we arrived at after reflecting on brute UZR data. Oh, and of course to the point of this article: there’s scouting information as well.

    Comment by Fresh Hops — November 19, 2009 @ 2:30 pm
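
The first two steps described above (a recency-weighted three-year average, then regression toward the mean) can be sketched in Python. Everything numeric here is an assumption for illustration: the 3/4/5 season weights, the league-average mean of 0, the `ballast` constant, and the sample sizes in hypothetical "defensive games". The shrinkage form n / (n + k) is one common way to regress, not necessarily what any published system uses.

```python
def weighted_average(values, weights):
    """Weighted mean of seasonal UZR values (most recent season last)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def regress_to_mean(observed, sample_size, mean=0.0, ballast=100.0):
    """Shrink an observed value toward the mean.

    Uses reliability = n / (n + k). `ballast` (k) is a hypothetical
    constant chosen for illustration, not an estimated one.
    """
    reliability = sample_size / (sample_size + ballast)
    return mean + reliability * (observed - mean)

# Hypothetical fielder: UZR of +2, +6, +8 over three seasons,
# weighted 3/4/5 so the latest season counts most.
recent = weighted_average([2.0, 6.0, 8.0], [3, 4, 5])

# A large sample keeps most of the observed value;
# a small sample is pulled much closer to average.
large_sample = regress_to_mean(recent, sample_size=300)
small_sample = regress_to_mean(recent, sample_size=30)
print(f"{recent:.2f} -> {large_sample:.2f} (n=300), {small_sample:.2f} (n=30)")
```

Scouting reports, injuries, and aging would then adjust the regressed number by judgment, which this sketch does not attempt to model.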

  48. I love any article that implores people to think for themselves and not treat ANY single piece of information as gospel, whether we’re talking about baseball or anything else. So, thanks.

    Comment by CH — November 19, 2009 @ 2:36 pm

  49. I’m not really sure how true this is. I know fans are instructed not to think about defensive metrics and not to think about what other people have said, but… why would you expect that to be any more effective with a jury voting on baseball skill than with a jury voting on convictions?

    There’s no way to get a pool of people who are untainted. Given that, I’d rather rely on experts (ideally several of them).

    Comment by Paul Thomas — November 19, 2009 @ 8:27 pm

  50. Well it was five years ago.

    Off the top of my head – line drives, throwing, the way that hits through the hole get apportioned responsibility to the fielders, advanced scouting and defensive alignment, the ability of pitchers to pitch to a gameplan… and then there’s the standard misuse of statistics. You can’t say “my sample size is 500 plays” and then start applying things like park factors and pitcher factors to your sample of 500 – because what you in reality are doing is cutting up your samples so that you no longer have samples large enough to apply any confidence interval.

    Even MGL guesstimated his SD to be 5 runs. The reason he can’t give an actual confidence interval is because the statistical analysis is so fudged.

    Comment by The Real Neal — November 19, 2009 @ 10:11 pm
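
The arithmetic behind the sample-splitting concern can be sketched in Python. The per-play spread is a made-up number and plays are assumed independent; this shows only how the standard error of a per-bucket average grows as 500 plays are divided up, and says nothing about how UZR's adjustments are actually estimated.

```python
import math

def standard_error(sd_per_play, n_plays):
    """Standard error of a mean over n_plays independent plays."""
    return sd_per_play / math.sqrt(n_plays)

SD_PER_PLAY = 0.4   # hypothetical per-play spread in runs, for illustration
TOTAL_PLAYS = 500   # the sample size quoted in the comment above

# Splitting the same 500 plays into more buckets leaves fewer plays
# per bucket, so each bucket's average is measured less precisely.
for buckets in (1, 5, 25):
    per_bucket = TOTAL_PLAYS // buckets
    se = standard_error(SD_PER_PLAY, per_bucket)
    print(f"{buckets:>2} buckets of {per_bucket} plays: SE per bucket = {se:.3f}")
```

Whether this matters in practice depends on where the adjustment factors come from: a factor estimated from a much larger league-wide sample would not suffer from the small per-player buckets shown here.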

  51. Or it could be that Mike Cameron steps in front of Braun and Hart to catch a lot of balls that both of the outfielders could catch. Which would:

    1. Explain the disparity in Braun’s score
    2. Explain why Hart who used to play center is a ‘bad’ right fielder
    3. Explain why Cameron who is clearly in the decline phase of his fielding career, like just about every other center fielder for the last 120 years, seems to be rejuvenated by joining the Brewers.
    4. Illustrate the other flaw with UZR that I forgot above.

    Comment by The Real Neal — November 19, 2009 @ 10:27 pm

  52. 5 bucks says Dayton Moore was the “AL GM” who said that Holliday sucks in the field.

    Also, did Yoda write the title to this post?

    Comment by Aqua Narc — November 20, 2009 @ 7:52 am
