Aggregate Defensive Ratings
These days it’s commonplace to look at multiple defensive metrics to try to get a good grasp of a player’s defensive value. On FanGraphs we carry four different defensive metrics: Mitchel Lichtman’s UZR, Sean Smith’s Total Zone (with Location), John Dewan’s DRS, and Tangotiger’s Fans Scouting Report.
All of them have different methodologies and the four rely on three different data sources.
To make comparing the four easier, there is now a new stats table on the player pages called “Aggregate Defensive Ratings” (ADR), which gives a weighted average of UZR, DRS, TZL, and FSR. It also gives a standard deviation and a standard error for the sample of four defensive metrics.
The general weighting is 1/3/3/3 for FSR/UZR/TZL/DRS. For years where FSR is not available, UZR, DRS, and TZL are weighted evenly. Only players with more than 50 innings played at a position are included in ADR.
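As a rough illustration of the weighting described above, here is a minimal Python sketch. The function name, the dictionary layout, and the sample run values are made up for the example; only the 1/3/3/3 weights, the even weighting when FSR is missing, and the 50-inning cutoff come from the post. Dropping any other missing metric and renormalizing is an assumption of this sketch, not something the post specifies.

```python
from typing import Optional

# Weights as stated in the post: FSR counts 1, the other three count 3 each.
WEIGHTS = {"FSR": 1.0, "UZR": 3.0, "TZL": 3.0, "DRS": 3.0}
MIN_INNINGS = 50  # a player needs MORE than 50 innings at the position


def aggregate_defensive_rating(ratings: dict, innings: float) -> Optional[float]:
    """Weighted average (in runs) of whichever metrics are present.

    Returns None if the innings cutoff is not met or no metric is available.
    """
    if innings <= MIN_INNINGS:
        return None
    present = {k: v for k, v in ratings.items() if k in WEIGHTS and v is not None}
    if not present:
        return None
    total_weight = sum(WEIGHTS[k] for k in present)
    return sum(WEIGHTS[k] * v for k, v in present.items()) / total_weight


# Hypothetical run values, not real player data.
with_fsr = aggregate_defensive_rating({"FSR": 10, "UZR": 4, "TZL": 6, "DRS": 2}, innings=900)
without_fsr = aggregate_defensive_rating({"UZR": 4, "TZL": 6, "DRS": 2}, innings=900)
print(with_fsr, without_fsr)
```

With FSR missing, the remaining weights are 3/3/3, which reduces to a plain average of UZR, TZL, and DRS.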
ADR stats are available on all the player pages and will be making their way to the leaderboards soon.
I love it! Thank you for making our lives easier, and hopefully giving writers a more neutral place to look when they cite defensive statistics. Any single statistic for a given player can be an outlier relative to the others, but averaging all of them is likely to tamp out that volatility and give a better picture of the player’s ability. I think there’s a decent argument to be made that ADR should be used instead of UZR for WAR calculations, but I doubt that’s in the near-term plans.
I agree that WAR needs work on the defense side. But in looking at a few of the aggregates I don’t think this metric is it. I think Dave astutely decided to display all of them and not just add in the aggregate somewhere.
I believe it was here not long ago where a fans bias analysis appeared. It was excellent, and showed that fans rate good players better than they are, and bad players much, much worse. An example is who else but Yuni Betancourt. He actually improved a bit last year according to UZR and TZL, but fans rated him the same as before and DRS was the same or worse. His ADR improved, but it was worse than the much disputed UZR.
I like the FANS scouting report very much, but it is not a true “wisdom of crowds” metric. There will be strong biases exhibited toward some players in that system, because among those who participate almost none believe what’s in the paper (Yuni is good!), if they read it at all, and they have certainly paid close attention to other defensive metrics that show he is not good. Don’t mean this to be a critique of the FSR, just saying that if you want to improve a metric you should avoid metrics with inherent strong biases at the extreme ends.
Since they seem to at least track each other’s direction pretty well, hopefully someone here will check how an aggregate of just UZR and TZL holds up against the others.
i don’t think the biases will show that strongly in the fans scouting report. tango breaks down the components well enough to avoid just an overall run value guess. and i am pretty sure he even regresses them before publishing.
I just disagree. If you have a reporting bias problem that population will understand the components of the system well enough to produce their desired outcome. In the context of the fan projections analyzed here recently, it seems that’s exactly what happened. Yuni at -25 in FSR (the same as the previous year) versus -11 UZR and similar in TZL, both improved over the previous year, seems to bear that out.
Why would he need to regress FSR?
But I want to emphasize that critiquing FSR in this context is not a criticism in general. I could be wrong, but I don’t think Tango devised it so that it could be a component of an aggregate metric. It was and is an experiment, and I doubt that the extremes are surprising to him at all. I think using it in this context is probably fine with respect to 95% of the population. But it’s going to strongly exert itself on the overall metric at the ends. Regression will smooth it out some, but at the ends it will still show up.
All you really need to do to confirm the bias theory of the FSR is look at Mike Cameron’s 2011 scores. Not only did his abdominal injury cause his arm strength and accuracy to decline, it also robbed him of his first step quickness, range, and amazingly it cost him a 19% hit in “Instincts”. If you ask a bunch of people walking out of Batman, The Dark Knight if they love comic book movies, don’t be surprised to conclude that 95% of people love comic book movies.
perhaps you have never participated in the report itself.
http://www.fangraphs.com/blogs/index.php/fans-scouting-report-part-2/ and http://www.tangotiger.net/scout/ provide some info.
a few notes:
- tango limits it to teams that you regularly watch. for most of us, that would be our favorite team and perhaps an in division rival.
- he explicitly says to avoid looking at any other numbers before filling out the form. of course you might have an idea of the numbers already (eg yuni betancourt)
- the way the report is broken down makes it harder to keep that bias. a bias for your home team would likely have a larger effect than a knowledge of uzr, drs, etc. and even then that bias is likely to hold only for the extreme cases. he also lets you choose dont know for each player.
- i cant find the link offhand, but im pretty sure tango has said he will toss out any clear outliers (jetes with a perfect score, yuni with all 0’s) and that he throws in a set number of perfectly average samples for each player to help regress.
- most people filling out the report will be regulars of fangraphs or tango’s blog and they will only do it for their own team.
is there bias? i am sure there is some. but i think you are definitely overstating it.
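To make the "perfectly average samples" idea in the notes above concrete, here is a small Python sketch of regressing a player's score by mixing in phantom ballots. Everything specific here is hypothetical: the 0 to 100 scale, the average of 50, the count of five added ballots, and the function name are chosen for illustration and are not taken from Tango's actual procedure.

```python
def regress_with_average_ballots(ballots, n_average=5, average_score=50.0):
    """Mean of the real ballots plus n_average phantom ballots at average_score.

    n_average and average_score are illustrative assumptions.
    """
    total = sum(ballots) + n_average * average_score
    return total / (len(ballots) + n_average)


# Five voters all give the lowest possible score: the result is pulled halfway to average.
few_voters = regress_with_average_ballots([0, 0, 0, 0, 0])
# Forty-five voters do the same: the phantom ballots barely matter.
many_voters = regress_with_average_ballots([0] * 45)
print(few_voters, many_voters)
```

The point of the sketch is that the pull toward average shrinks as the number of real ballots grows, so a player with many voters keeps most of his extreme rating.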
I have in the past and I do understand it. I don’t dislike the metric, I just think the bias is going to be much greater than you assume, because we are not talking about asking people to count an occurrence based on a defined zone or range. I think it is reasonable to expect less bias in someone counting an occurrence that is clearly defined, and perhaps getting paid to do it and probably having their work checked for accuracy, than people expressing an opinion based on some rules that would require them to accept an ethical approach to rating a player they are most familiar with, who may have jorked their season by booting numerous routine ground balls. The narrowing of the responders along with the assumption that they will follow the rules is actually an argument against it in my opinion.
I thought the FSR was analyzed with the conclusion that fans over-rate the players on their team and under-rate players on other teams.
yeah circle i couldn’t find the thread(s?) where tango discussed it offhand, sorry. i might be off with some of what i said.
Did Betancourt improve or did he get favorable chances? My understanding of defensive metrics is that they are going to fluctuate from year to year even if a defender doesn’t improve or regress.
seconding the use of ADR as opposed to UZR in WAR calculations (plus a similar method in pitching)
Several of the variables in ADR are not updated until after each season ends. If we switched to ADR as the defensive metric for WAR, we wouldn’t be able to calculate it in season.
Seems like an OK compromise, considering a half-season’s worth of UZR data being used to compute WAR is basically worthless anyway.
It’s very easy to simply look at the fielding component of WAR and replace UZR with whatever number you’d like to use for defensive value. That’s the nice thing about having the values broken out – if you don’t like what UZR says, just swap it out. That way, you get what you want, and everyone else still gets in-season WAR.
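A minimal sketch of that swap, in Python. It assumes the fielding component enters WAR as runs and that runs convert to wins at roughly 10 runs per win; that conversion figure is a common rule of thumb used here as an assumption, and the function name and sample numbers are hypothetical.

```python
RUNS_PER_WIN = 10.0  # rough rule of thumb; an assumption, not the site's exact figure


def war_with_alternate_fielding(war, uzr_runs, alt_fielding_runs,
                                runs_per_win=RUNS_PER_WIN):
    """Remove UZR's contribution from WAR and add a different fielding value."""
    return war + (alt_fielding_runs - uzr_runs) / runs_per_win


# Hypothetical player: 3.0 WAR with a UZR of -11, re-rated at -17 runs by another metric.
adjusted = war_with_alternate_fielding(3.0, uzr_runs=-11, alt_fielding_runs=-17)
print(round(adjusted, 1))
```

Only the difference between the two fielding numbers matters, so the rest of the WAR calculation does not need to be redone.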
Why not just average the UZR and DRS to get the defensive component of fWAR? These two are updated throughout the season.
They are not on the same scale in-season.
We’ve had a lot of discussions about these issues. This is the best way to do it for now.
DRS isn’t zeroed out during the season, so it’s harder to do a real comparison. Doesn’t mean it can’t be done, but I don’t see it happening this season.
You can even go further and use the average of UZR and DRS during the season but switch to ADR after the season. Why ask us to do this work when you can do it just as easily for all in your formulas?
i am sure the daves are very sorry that their outstanding free service isn’t tailored exactly to fit all of your needs.
“DRS isn’t zeroed out during the season, so it’s harder to do a real comparison.”
Is DRS ever zeroed out? If you sum the totals for 2002-2010, you get about 2,600 runs saved. UZR is about 10 runs over the same period of time, which can be attributed to rounding errors. I don’t think the same can be said for DRS.
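For anyone unsure what "zeroed out" means here, this Python sketch shows one simple way to re-center a runs-saved metric so the league total is zero: subtract the league's runs-per-inning rate, scaled by each player's innings. This is only an illustration of the concept with made-up players; it is not how UZR or DRS is actually computed, and a real version would presumably be done position by position.

```python
def zero_out(players):
    """Re-center runs saved so the group total is zero.

    players: list of (name, runs_saved, innings) tuples.
    Returns a list of (name, centered_runs) tuples.
    """
    total_runs = sum(runs for _, runs, _ in players)
    total_innings = sum(innings for _, _, innings in players)
    if total_innings <= 0:
        raise ValueError("total innings must be positive")
    league_rate = total_runs / total_innings  # runs saved per inning
    return [(name, runs - league_rate * innings) for name, runs, innings in players]


# Hypothetical three-player "league" whose raw totals sum to +9 runs.
centered = zero_out([("A", 10, 1000), ("B", 2, 500), ("C", -3, 500)])
print(centered)
print(sum(runs for _, runs in centered))
```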
Yes, the use of ADR in WAR was my second thought. The first was laughing at how Derek Jeter was the player linked at the end.
What do these numbers mean? If they are runs saved it seems to me they are not accurate with regards to the Giants I have looked at thus far.
Never mind, I see I was wrong. That said, why are you not providing something like a per 150 games scale so that they are actually useful for comparing one player to another?
there is a uzr/150.
but really, if a guy plays 50 games, extrapolating his defensive numbers out to a season’s worth is pretty useless data anyway
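For reference, a per-150-games figure can be sketched as simple proportional scaling, as in the Python snippet below. This is a naive illustration and not necessarily how UZR/150 itself is calculated; the function name and the sample numbers are hypothetical.

```python
def runs_per_150_games(runs, games):
    """Scale a runs total proportionally to a 150-game season."""
    if games <= 0:
        raise ValueError("games must be positive")
    return runs * 150.0 / games


# A hypothetical +4 runs in 50 games scales to +12 over 150 games.
print(runs_per_150_games(4, 50))
```

The scaling multiplies whatever noise is in the small sample by the same factor, which is the problem with extrapolating a 50-game player to a full season.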
They are in runs, just like all the other defensive metrics. You are going to have to elaborate a little bit. Is there a calculation problem, or do you just not agree with the numbers? If it’s the latter, not much I can do; they are what they are.
What is the main difference between the standard deviation and standard error? Zimmerman’s STDEV is 7 for 2010 and his ERR is +/- 4. If someone could walk through an example that’d be great.
I get the STDEV. For Zimmerman’s 2010 season, there is a 68% chance that his ADR was between 6 and 20. But I’m still trying to figure out the standard error…
I guess where I’m confused is that I thought the SE was the distance between the sample mean and population mean and the SD was the distance between a score and a sample/population mean. So it seems like you would need a SD for each individual defensive metric and then the SE would be the SD of all the metrics combined.
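A small Python sketch may help with the question above. In the usual textbook definitions, the standard deviation describes how spread out the individual metrics are around their average, while the standard error describes how uncertain that average itself is, and equals the standard deviation divided by the square root of the number of metrics. Whether the site computes its columns exactly this way (for example, weighted or unweighted) is an assumption here, and the four input values are invented.

```python
import math
import statistics


def spread_of_metrics(values):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD, uses an n - 1 denominator
    se = sd / math.sqrt(n)          # standard error of the mean
    return mean, sd, se


# Four hypothetical metric values (in runs) for one player-season.
mean, sd, se = spread_of_metrics([6, 10, 14, 22])
print(mean, round(sd, 2), round(se, 2))

# With a standard deviation of 7 across four metrics, the standard error is:
print(7 / math.sqrt(4))
```

Under these assumptions a standard deviation of 7 across four metrics gives a standard error of 3.5, which would be consistent with the +/- 4 quoted above once rounded.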
This is great. It would be even greater if, as part of the custom dashboards, we could create our own ADRs with custom weights. Personally, I’d like to weight FSR at least as much as each of the other metrics, maybe even as much as half of the overall ADR.
I wonder when new fielding stats based on the legendary Field f/x system will be introduced on the page. When they are, I couldn’t be more excited.
This is a great improvement, saves me a step having to average these out myself. I would also like to see WAR recalculated using an average of DRS and UZR, or using ADR, even if it can only be posted at the end of the year in this form.
Bravo.
Yes, yes, 1000 times yes! Thank you for publishing the standard error.
I have the same question as Brent. Any help guys?
Very much appreciated. Just got an iphone and immediately bought the fangraphs app, even though i’m always on the computer looking at the site.
Any chance we’ll be able to incorporate defensive stats on the customized dashboard? I understand the position element might make that difficult, but it would be nice to use ADR instead of merely Fielding value.