FanGraphs Baseball

Comments

  1. Bugger off, Chone.

    Comment by Jason T — June 6, 2011 @ 9:50 am

  2. Colvin is 0 for 23 on ground balls. Maybe Darwin Barney can teach him how to get those to go through.

    Comment by Jay — June 6, 2011 @ 10:15 am

  3. So… are small sample UZR measurements valid or not? It seems that we see references to these tiny 2-month sample sizes daily in one article or another, despite being admonished and warned by other authors that these sizes are essentially meaningless at this point in the season.

    Are the 146 defensive innings played by Willie Harris or 73 by Magglio Ordonez thus far in 2011 really worth using?

    Obviously various authors have different opinions, but it seems that using UZR in such limited sample sizes is erroneous and misleading.

    Comment by Hayson Jeyward — June 6, 2011 @ 10:29 am

  4. People never get this. Small UZR samples have very low predictive value, but that doesn’t make them “invalid”. They are real data and should be cited as freely as anything else. What you must do is wait much longer before drawing conclusions about a player’s ability based on UZR samples smaller than ~3 seasons.

    This isn’t different from offensive stats except in the size of the sample required. You don’t take a week’s worth of PA and draw a firm conclusion about a hitter’s ability. But you also don’t consider that data “invalid”.

    Comment by Reverend Black — June 6, 2011 @ 10:37 am
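
To put rough numbers on the sample-size point in the comment above, here is a minimal simulation sketch. It is not how UZR works: it assumes a hypothetical fielder who converts every chance with the same fixed probability (`TRUE_RATE`, an invented value), and it only shows how much the observed rate wanders in short samples compared with long ones. The chance counts (30 and 900) are illustrative assumptions too.

```python
import random
import statistics


def simulate_observed_rates(true_rate, chances, trials, seed=0):
    """Simulate `trials` samples of `chances` plays and return each observed rate."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rates = []
    for _ in range(trials):
        successes = sum(1 for _ in range(chances) if rng.random() < true_rate)
        rates.append(successes / chances)
    return rates


# Assumption: a made-up fielder whose true conversion rate never changes.
TRUE_RATE = 0.80

small = simulate_observed_rates(TRUE_RATE, chances=30, trials=2000)
large = simulate_observed_rates(TRUE_RATE, chances=900, trials=2000)

# Spread of the observed rate around the true rate in each sample size.
small_spread = statistics.pstdev(small)
large_spread = statistics.pstdev(large)

print(f"spread over 30 chances:  {small_spread:.3f}")
print(f"spread over 900 chances: {large_spread:.3f}")
```

Every simulated sample here is a "real" record of what happened in that sample. What shrinks in the short samples is how much they tell you about the underlying rate, which is the reliability point being argued.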

  5. I think when you’re 1) looking at extremes, and 2) not really attempting to make scientific projections, SSS isn’t such a big deal

    Comment by gort — June 6, 2011 @ 10:41 am

  6. At the ML level, I would expect poor contact (medium speed) ground balls to be recorded for outs 23 times out of 23.

    Hard-hit ground balls are a different story, as are very weak ground balls.

    Actually, in terms of just getting on base, you’re likely better off at the ML level hitting a weak, weak, grounder rather than a routine one.

    Colvin’s also a lefty, and if he’s hitting grounders, he’s likely “rolling over” on outside half pitches. Those balls are not hit with authority.

    Batters have demonstrated that if you hit balls hard enough, you can get groundball hits even with a “lefty shift” on.

    Just saying “ground balls” isn’t specific enough. If we watched all 23 at bats where he grounded out, would we classify ANY of them as being good at bats or good contact?

    I really like Colvin (even as a cardinal fan), but the league adjusts, advance scouts scout, etc.

    I feel bad for Bill Hall … always seemed to be a good guy, gives a lot of effort, would play anywhere. I know he’s had a good career, and made a living playing ball … I just hate to see veteran guys struggle toward the end. I know it’s the “circle of life” and all that … just tough to watch sometimes.

    Comment by CircleChange11 — June 6, 2011 @ 10:46 am

  7. As a Cardinal fan, you should know better than anyone that just about any ground ball can go for a hit.

    Comment by Jay — June 6, 2011 @ 10:52 am

  8. What happened to Chone Figgins? Players with his skill set don’t typically fall off of a cliff like that. Defense, OBP, and Stolen bases aren’t exactly ‘old player’ skills

    Comment by j6takish — June 6, 2011 @ 10:53 am

  9. “This isn’t different from offensive stats except in the size of the sample required.”

    This isn’t right. It is not at all clear how well the defensive metrics are actually measuring what they’re attempting to measure. There’s no way to “check” and see that the data is actually measuring things correctly in small sample sizes. That’s not to say it’s entirely worthless in small sample sizes, but it’s very different from offensive statistics, which measure whatever they’re measuring with nearly perfect and verifiable accuracy.

    Comment by Luke in MN — June 6, 2011 @ 10:58 am

  10. Except that offensive events are much easier to classify. A single is a single, and except in the relatively rare cases where a fielder dove and missed and the scoring decision is a little subjective, there isn’t much more to it. A foot or two may be all that separates a sharp grounder an average third baseman would get to from one that would take a very good defensive play to turn into an out, and I don’t think the batted ball classification is quite precise enough to be sure that UZR is even a perfect measure of what has happened in a small sample.

    Comment by don — June 6, 2011 @ 11:00 am

  11. What the two commenters above me just said. Small sample sizes in UZR are going to have a lot of noise wherein some fielders will see a lot of balls more sharply hit than others within certain zones that will make the dataset messy. Over very large sample sizes, these things get smoothed out to the point that you can rely on it a bit more, but it’s a bit troubling to say that the data is perfectly accurate and acceptable at face value in small samples.

    I mean, for an example I’m familiar with, Jordan Schafer of the Braves has been in Atlanta for a total of 10 games. He’s made a total of 29 plays (19 in zone, 10 out of zone). UZR says that he’s already saved FOUR runs above AVERAGE. I fully believe he’s a really good defensive centerfielder, but to say that he’s been 4 runs better than an average centerfielder over 90 defensive innings is insane.

    Yes, I did pull out an ultra-small sample size, but that’s the point. It’s not just that it’s a non-predictive sample; it’s also a sample that I do not believe could accurately be describing what’s really going on: that somehow an average centerfielder (by all means a good defensive player) allows 4 more runs to score on just 29 total plays.

    Comment by Bronnt — June 6, 2011 @ 11:22 am

  12. Luke – you seem confused about what the word “this” refers to in the sentence you quoted. I haven’t said that UZR isn’t different from offensive stats, I’ve spoken to the issue of sample size in relation to validity. With respect to that issue, there isn’t a difference between offensive stats and UZR.

    There are validity issues that come with metrics like UZR, but they are related to measurement, not sample size.

    Comment by Reverend Black — June 6, 2011 @ 11:23 am

  13. Don – “A foot or two may be all that separates a sharp grounder an average third baseman would get to from one that would take a very good defensive play to turn into an out and I don’t think that the batted ball classification is quite precise enough to be sure that UZR is even a perfect measure of what has happened in small sample size.”

    Shifting the goalposts a bit. Offensive stats, as you had just acknowledged, aren’t going to perfectly measure (reflect) what happened either. I certainly haven’t claimed or implied here that UZR perfectly reflects anything. We are talking about validity only as it is affected by sample size.

    The issues you are pointing out are related to sample size only indirectly. In other words, they are issues related to the ‘mechanics’ of the metric itself, which are not mitigated significantly by a sample getting larger.

    Comment by Reverend Black — June 6, 2011 @ 11:27 am

  14. Last year, Figgins’ problems were largely a result of being asked to play a defensive position which he, historically, did not play very well. This year, his problems seem to be mostly BABIP related. He was always a guy that relied on a high BABIP and a very solid BB rate. Sometimes when guys are pressing hard, and their average goes down, they get more aggressive and don’t take as many walks as they should, which might be why his BB rate is declining. But right now he’s 121 points below his career BABIP. There might be some correlations in his batted ball data, but he’s not this bad a hitter.

    Comment by Bronnt — June 6, 2011 @ 11:30 am
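
For anyone who wants to check a BABIP gap like the one described above, here is a small sketch using the standard definition, BABIP = (H - HR) / (AB - K - HR + SF). The stat line in the example is invented for illustration and is not Figgins’ actual line; `points_gap` is a hypothetical helper name.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play in this sample")
    return (hits - home_runs) / balls_in_play


def points_gap(current, career):
    """Gap between career and current BABIP in 'points' (thousandths)."""
    return round((career - current) * 1000)


# Made-up season line, for illustration only.
season = babip(hits=40, home_runs=1, at_bats=220, strikeouts=30, sac_flies=2)
print(f"season BABIP: {season:.3f}")

# Gap between two hypothetical BABIP values, expressed in points.
print(points_gap(current=0.200, career=0.321))
```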

  15. Bronnt – “Over very large sample sizes, these things get smoothed out to the point that you can rely on it a bit more, but it’s a bit troubling to say that the data is perfectly accurate and acceptable at face value in small samples.”

    That would be a troubling thing to say. I’m glad I’ve never said it.

    “I fully believe he’s a really good defensive centerfielder, but to say that he’s been 4 runs better than an average centerfielder over 90 defensive innings is insane.”

    With all respect, this is a conclusion for which you haven’t shown your work. The “makers” of UZR have shown theirs.

    “Yes, I did pull out an ultra-small sample size, but that’s the point. It’s not just that it’s a non-predictive sample, it’s also a sample that I do not believe could accurately be describing what’s really going on”

    Skepticism is fine, but more is required to conclude that UZR data is invalid in small samples.

    Comment by Reverend Black — June 6, 2011 @ 11:34 am

  16. Another point to what the previous commenters said is that UZR doesn’t claim to measure on-field happenings. OBA, for instance, measures how often a player reached base safely per plate appearance (not counting fielders’ choices, I think). That’s an objective measurement of what happened on the field — if someone’s OBA is .300, you know they reached base 30% of the time, no matter what the sample size is.

    UZR (rolling the tape) “puts a run value to defense, attempting to quantify how many runs a player saved or gave up through their fielding prowess (or lack thereof).” That’s not a measure of what happened on the field; it’s a measure of what happened on the field compared to what would have happened on the field if an average fielder had been playing. You need a reasonable sample size to make any judgments about what would have happened on the field if an average fielder had been playing. So in a small sample size, there’s not enough basis for saying that fielder X saved (UZR many runs) compared to the average fielder.

    Comment by matt w — June 6, 2011 @ 11:35 am

  17. R. Black-

    UZR DOES have validity concerns, for the reasons Luke specified. There’s no objective, verifiable measure of comparison. I can’t recreate the circumstances of every game with the substitution of a league average defense. Even in a small sample size, I can say “This player reached base 4 times in 5 plate appearances,” and I can definitely say it’s better than average over the sample cited, even though it’s meaningless.

    Due to difficulties involving the data collection for UZR, there are reasons to be concerned that subjective elements are coloring the data. Considering that UZR entirely ignores things like pop-ups and line drives for infielders, may lack the proper camera angle at various points for accurate classifications, and runs the risk of a video scout overlaying his subjective views onto the data (which is obviously a concern, since otherwise BIS wouldn’t rotate their scouts), validity concerns increase in small sample sizes for UZR.

    Comment by Bronnt — June 6, 2011 @ 11:53 am

  18. Matt – “You need a reasonable sample size to make any judgments about what would have happened on the field if an average fielder had been playing.”

    Why do you say that? The average fielder is a useful mathematical fiction in the exact same way the average hitter and average pitcher are. (No questions about the validity of FIP- or wRC+ offered yet.) They are dynamic, season to season, but that’s not really a challenge to the validity of the metric; it’s just an important thing to remember about what the numbers are expressing.

    Comment by Reverend Black — June 6, 2011 @ 12:00 pm

  19. Bronnt – We seem to be talking right past each other now. I haven’t and won’t defend every aspect of UZR from every situation. I commented in response to the idea that it’s sample size specifically that undermines the validity of UZR data. Sample size only undermines the data’s reliability (predictive value).

    There are a handful of serious issues & limitations with UZR data in general – no disagreement there. But to say that the data isn’t valid in small samples because of them is to say that the data is never valid. Validity isn’t a sliding scale like reliability is.

    Comment by Reverend Black — June 6, 2011 @ 12:17 pm

  20. I think a lot of this data just goes to show that it’s not easy to find an average player these days. I like WAR and think it makes a ton of sense but these “replacement players” are starting to seem like a myth. You really don’t know what to expect anymore from a veteran utility player or a AAA callup, and if every team knew they would give them a mediocre performance, then no one would sign a guy like Hall for more than the minimum.

    Comment by Pat — June 6, 2011 @ 12:24 pm

  21. The small sample UZR numbers aren’t really valid. However, they do show what has happened so far, but should not be used to show what to expect in the future. E5’s UZR is terrible right now, in fact his 2011 UZR/150 at 3B is -73.6. His career UZR/150 is -13.1.

    Long story short, E5 has been absolutely horrific on defense thus far, but we can’t assume that it will stay that way for the rest of the season on the basis that he’s been a much better defender (albeit still bad) in his career.

    Comment by Bryz — June 6, 2011 @ 12:25 pm

  22. I was always amazed by the fact that David Ortiz once had -0.5 WAR in a season . . . in 25 plate appearances.

    Comment by Shattenjager — June 6, 2011 @ 12:25 pm

  23. If you want to see a replacement player, just look at half of the Tigers’ lineup on any given day.

    Comment by DO YALL WANT A HAM — June 6, 2011 @ 12:50 pm

  24. Or….His problem is he got his 36M guaranteed in 2010 and decided to retire.

    Comment by Neuter Your Dogma — June 6, 2011 @ 2:48 pm

  25. “The average fielder is a useful mathematical fiction in the exact same way the average hitter and average pitcher are. (No questions about the validity of FIP- or wRC+ offered yet.)”

    But FIP- and wRC+ don’t claim to measure the players’ performance versus what the average player would have done in their stead; they claim to measure the players’ performance versus the average of what other players actually did. FIP- can easily be expressed in terms of the average of others’ FIPs; UZR isn’t expressed in terms of the average of any statistic for other players.

    The problem isn’t the average, it’s the counterfactual nature of what’s being measured. FIP- works because it doesn’t say “An average pitcher would’ve given up this many runs compared to this pitcher,” it says “This pitcher’s FIP was a certain percentage above or below the average.” UZR says “This player made a certain number of plays more or less than the average player would have made given the balls that were hit to them.” How are you going to express that without the supposed mathematical fiction and without the counterfactual?

    Comment by matt w — June 6, 2011 @ 2:49 pm
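
The counterfactual described above ("plays more or less than the average player would have made given the balls that were hit to them") can be sketched in a few lines. This is a deliberately simplified illustration, not the actual UZR calculation, which is more involved. The bucket names, the league out rates, and the sample of batted balls are all invented assumptions.

```python
from dataclasses import dataclass


@dataclass
class BattedBall:
    bucket: str      # hypothetical difficulty label assigned to the ball
    converted: bool  # True if the fielder turned it into an out


def plays_above_average(balls, league_out_rate):
    """Sum of (actual result - expected result for an average fielder)."""
    total = 0.0
    for ball in balls:
        expected = league_out_rate[ball.bucket]
        total += (1.0 if ball.converted else 0.0) - expected
    return total


# Invented sample: 9 routine chances (1 missed), 3 tough chances (1 missed).
balls = (
    [BattedBall("routine", True)] * 8
    + [BattedBall("routine", False)]
    + [BattedBall("tough", True)] * 2
    + [BattedBall("tough", False)]
)

# Two assumed baselines that differ only in how often "tough" balls become outs.
baseline_a = {"routine": 0.95, "tough": 0.40}
baseline_b = {"routine": 0.95, "tough": 0.30}

result_a = plays_above_average(balls, baseline_a)
result_b = plays_above_average(balls, baseline_b)
print(f"baseline A: {result_a:+.2f} plays")
print(f"baseline B: {result_b:+.2f} plays")
```

The fielder’s record is identical in both runs; only the assumed "average fielder" baseline changes, and the answer moves with it. That is the sense in which the output depends on the counterfactual and not just on what happened on the field.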

  26. Think of this — suppose that I developed a statistic, WZR, designed to quantify how many runs a center fielder gave up in the field compared to how many Willie Mays would’ve given up in the field on the same plays. Willie Mays isn’t a fictional construct; but WZR would still be counterfactual, because we can’t compare the player’s performance to Mays’s performance on the same balls. We still need to come up with some way of measuring the counterfactual, WWWMHD (what would Willie Mays have done)? And looking at a small sample of fly balls, we wouldn’t be able to come up with an accurate measure of WWWMHD; so in a small sample WZR couldn’t be said to measure what actually happened on the field.

    That’s why the issue isn’t with the mathematical fiction of the average fielder; it’s with the counterfactual nature of the statistic.

    Comment by matt w — June 6, 2011 @ 2:56 pm

  27. Be careful not to confuse the “symptoms” for the “problem”.

    The “symptoms” are his defense, or his BABIP, etc.

    That’s not what the “problem” is. I don’t know what it is either, but low BABIP isn’t a problem … it’s the symptom of a problem.

    Pitchers seem to be in the strike zone more than in the past. Whether it’s due to a “new era” (if you will), the emphasis on “making them hit their way on”, or whatever. But, there’s really not a reason not to challenge Figgins on every pitch. Walking him would be the dumbest thing you could do.

    Comment by CircleChange11 — June 6, 2011 @ 3:37 pm

  28. Figgins :(

    Comment by beastwarking — June 7, 2011 @ 12:39 pm

  29. WWWMHD (what would Willie Mays have done)?

    That is exceptional!

    Comment by Joel — June 7, 2011 @ 1:44 pm
