The Slash Stats
Last weekend, David Appelman posted the year-to-year correlations for UZR and wOBA by chances and plate appearances. The latter had an R^2 value of 0.30 for players with at least 500 plate appearances in both 2008 and 2009. Without a baseline for how other offensive statistics fare, this seems like a pretty weak correlation. To figure out where wOBA’s R^2 ranks, I similarly took all 101 batters with 500+ plate appearances in both seasons and ran the correlations on their batting average, on-base percentage, and slugging percentage. Here are the results:
Batting average: 0.1975 (0.444 R)
On-base percentage: 0.3673 (0.606 R)
Slugging percentage: 0.3653 (0.604 R)
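For readers who want to reproduce this kind of number, here is a minimal sketch of a year-to-year correlation in plain Python. The `pearson_r` helper and the five-player OBP lists are hypothetical stand-ins, not the 101-batter sample used above; the last loop only confirms that each reported R is the square root of its R^2.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical OBP values for the same five hitters in consecutive seasons.
obp_2008 = [0.340, 0.365, 0.310, 0.390, 0.355]
obp_2009 = [0.335, 0.350, 0.325, 0.400, 0.345]

r = pearson_r(obp_2008, obp_2009)
r_squared = r ** 2
print(f"R = {r:.3f}, R^2 = {r_squared:.4f}")

# Consistency check on the figures reported in the post: R = sqrt(R^2).
reported = {"BA": 0.1975, "OBP": 0.3673, "SLG": 0.3653}
reported_r = {stat: round(sqrt(r2), 3) for stat, r2 in reported.items()}
print(reported_r)
```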
Interestingly, OBP and SLG were nearly identical while BA shows just how volatile it is on a year-to-year basis. wOBA ranks just behind the two slash stats that make up OPS, so why do we use wOBA if it’s presumably less predictive on an individual basis than either of those components? Because it correlates to runs scored better than OPS. I ran the team production numbers versus runs scored and found these relationships (both R):
OPS: .958
wOBA: .960
This is why we use wOBA. It might not have the year-to-year relationship that OBP and SLG do, but it correlates with team runs scored about as well as or better than anything else around. Note that these numbers are in their raw forms, not adjusted for park or league.
(For fun I used the equation given to estimate each team’s run totals in 2009. The unluckiest teams: Mets, Yankees, Astros, Mariners, and Nationals; the luckiest: Athletics, Angels, Giants, Dodgers, and Twins. Deeper analysis may or may not reveal something about those teams.)
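The "luck" estimate in that parenthetical can be sketched as a least-squares line of team runs on team wOBA, with each team's residual (actual minus predicted) as the luck figure. The post does not give its equation, so the `fit_line` helper and the five made-up teams below are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical team wOBA and runs scored (not real 2009 data).
team_woba = {"Team A": 0.320, "Team B": 0.335, "Team C": 0.345,
             "Team D": 0.310, "Team E": 0.328}
team_runs = {"Team A": 700, "Team B": 790, "Team C": 830,
             "Team D": 640, "Team E": 760}

names = list(team_woba)
slope, intercept = fit_line([team_woba[t] for t in names],
                            [team_runs[t] for t in names])

# Residual = actual runs minus runs predicted from wOBA.
# Negative means the team scored fewer runs than its wOBA suggests ("unlucky").
residuals = {t: team_runs[t] - (slope * team_woba[t] + intercept) for t in names}

for team, resid in sorted(residuals.items(), key=lambda item: item[1]):
    print(f"{team}: {resid:+.1f} runs versus expectation")
```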

I find it pretty interesting that OPS, which has been hit pretty hard hereabouts due to its over-valuation of SLG, is only very slightly worse at predicting runs scored than wOBA. Just goes to show the thin margins by which we choose stats, I guess.
OPS misses a lot more on smaller samples (players vs. teams) and outliers. Aggregate team data doesn’t tend to do crazy stuff.
OPS components are less volatile than wOBA.
What exactly does that mean?
I would like to see the comparison based on RMSE.
I agree.
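A minimal sketch of the RMSE comparison requested here, assuming you already have run estimates from an OPS-based fit and a wOBA-based fit. The three lists are invented so the code runs; they are not results and say nothing about which metric actually wins.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between predicted and actual values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared_errors) / len(actual))


# Hypothetical team run totals and two sets of hypothetical estimates.
actual_runs = [700, 790, 830, 640, 760]
runs_from_ops = [712, 775, 841, 655, 748]
runs_from_woba = [705, 782, 835, 648, 755]

print(f"RMSE, OPS-based estimate:  {rmse(runs_from_ops, actual_runs):.1f} runs")
print(f"RMSE, wOBA-based estimate: {rmse(runs_from_woba, actual_runs):.1f} runs")
```

Unlike a correlation coefficient, RMSE is expressed in runs, so the gap between two metrics can be read directly as runs per team.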
“Because it correlates to runs scored better than OPS. I ran the team production numbers versus runs scored and found these relationships (both R):
OPS: .958
wOBA: .960
This is why we use wOBA.”
Surely you aren’t saying that a minuscule difference in r values (.958 versus .960) is your basis for preferring one stat over the other. That can’t be what you’re saying. What I don’t understand, then, is what this passage is supposed to mean.
Can someone enlighten me?
Maybe there is a typo? Maybe it’s .958 versus .860 or something like that?
I do think there is either a typo or else I’m misunderstanding. I’m just really bothered by the possibility of someone actually arguing that this “difference” in correlation values amounts to anything meaningful.
I agree with you that RJ didn’t put that very well. wOBA is better than OPS, but not because it correlates with team runs any better.
well, slightly better is better
For me, wOBA is better simply because you can tell exactly how many runs better between two players’ wOBA
Guys, just look a little higher on the comment page. I said pretty much the exact same thing, and Sky mentioned that the correlation is that high partially because of the huge sample size inherent in looking at team totals, which alleviated my concern to a large degree.
My new concern is my penchant for god-awful run-on sentences.
Kampfer: “well, slightly better is better.”
My response: Did you look at those numbers? .960 vs. .958? Do you realize how minuscule and meaningless that difference is? If you take these values to two decimal places, they’re identical. If you take the r^2 values to two decimal places, they, too, are identical. This slight difference amounts to essentially NOTHING and could very well be the result of nothing more than the effects of rounding numbers across correlations.
Please understand that I don’t have a stake in this OPS vs. wOBA debate. I understand them both, and while I realize that wOBA allows you to calculate differences in overall runs, that is not the issue here.
But I am bothered by people misusing statistics. Basically, I feel like it would be an egregious mistake if someone were to look at those r values and draw ANY conclusion from them about the superiority of one metric over the other — ESPECIALLY since it just takes into account one pair of seasons. Since I really don’t believe R.J. would do that, I was asking what was actually meant by that point.
Now, if the point is that wOBA is just as good, that’s fine. Better — that’s not fine; that’s just wrong, at least based on that particular evidence.
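The rounding claim in this comment is easy to check. The snippet below uses only the two r values quoted from the post; the variable names are mine.

```python
# Correlations with team runs scored, as quoted from the post.
r_ops = 0.958
r_woba = 0.960

# To two decimal places the r values are identical.
print(round(r_ops, 2), round(r_woba, 2))            # 0.96 0.96

# So are the r^2 values (share of variance in runs explained).
print(round(r_ops ** 2, 2), round(r_woba ** 2, 2))  # 0.92 0.92

# The actual gap in r^2 is about 0.004.
r_squared_gap = r_woba ** 2 - r_ops ** 2
print(round(r_squared_gap, 4))
```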
I think Sky said it well above. OPS generally works well for teams because there aren’t many outliers or anything that falls way off the norm. For individual players, it’s a whole different story. Consider the classic example of a .400 OBP .400 SLG player vs. a .300 OBP .500 SLG player.
OPS tells us they are the same, while wOBA tells us that the first player is better to the tune of about 25 points (.025), which is about 13 runs over a full season. We “know” that wOBA is going to be about right, due to the fact that it is linearly related to how runs score in the majors; however, OPS is more of a theoretical stat. Just look at how slugging is calculated – very arbitrary.
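The 13-run figure can be recovered with a short calculation, taking the gap as 25 points of wOBA (.025). Two inputs are assumptions not stated in the comment: a wOBA scale of roughly 1.15 (the commonly cited divisor for converting wOBA to runs per plate appearance) and a 600-PA season.

```python
# Assumptions (not given in the comment above):
WOBA_SCALE = 1.15          # approximate divisor converting wOBA to runs per PA
PLATE_APPEARANCES = 600    # a rough "full season" for a regular

woba_gap = 0.025           # 25 points of wOBA between the two hitters

# Runs above the other player = (wOBA gap / scale) * plate appearances.
runs_gap = woba_gap / WOBA_SCALE * PLATE_APPEARANCES
print(round(runs_gap, 1))  # 13.0
```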
Yeah. It’s supposed to be a rough measure of power, but triples aren’t an indicator of power, they’re more of an indicator of speed.
I agree with WY. I have no stake in this debate either, but to think that someone who works with stats for a living is pushing a .002 difference as being important? That’s disturbing. One thing that isn’t mentioned above is simplicity. Most baseball fans ‘get’ OPS. Even if they aren’t familiar with it, you can explain it to them in one short sentence. wOBA? Good luck.
OPS is simpler than wOBA so long as you don’t actually think about what it’s doing. If you actually do the crunching required to pick apart what OPS is doing (hint: it’s a lot of calculus) you’ll quickly find it’s not actually simpler at all.
There’s no “calculus” involved in OPS at all. There are calculations, and algebra, but not calculus. Calculus involves derivatives and/or anti-derivatives.
I’m not sure how anyone can argue with a straight face that wOBA is simpler, at least in the calculation sense, than OPS. OPS is basic arithmetic with numbers based on how many bases an event is worth in the game, which is observable to anyone who watches the game (i.e. – a double worth twice as much as a single). As opposed to how many runs an event is typically worth in a game, which is not.
Somewhere I have seen the correlation on runs scored with a series of popular metrics of team hitting, BA, OBP, SLG, OPS, OPS+ and then the newer statistical measures (but I can’t find it). I seem to recall being very unimpressed, as I am here, by the small improvements in correlation over and above OPS. Aside from the rare player far from the normal OBP/SLG splits, OPS is just fine.
OBP=(BB+H+HBP)/(PA)
SLG=(4*HR+3*3B+2*2B+1B)/AB
OPS=OBP+SLG
That’s pretty easy even if it doesn’t really “make sense” to add the two together mathematically.
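The three formulas above translate directly into code. This sketch follows the comment's simplified OBP, which divides by PA; the official definition uses AB + BB + HBP + SF as the denominator. The stat line is a hypothetical hitter.

```python
def obp(hits, walks, hbp, pa):
    """On-base percentage using the simplified formula above (PA denominator)."""
    return (hits + walks + hbp) / pa


def slg(singles, doubles, triples, homers, ab):
    """Slugging percentage: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / ab


# Hypothetical season line.
singles, doubles, triples, homers = 100, 30, 5, 25
walks, hbp = 60, 5
ab, pa = 500, 570
hits = singles + doubles + triples + homers

player_obp = obp(hits, walks, hbp, pa)
player_slg = slg(singles, doubles, triples, homers, ab)
player_ops = player_obp + player_slg

print(f"OBP {player_obp:.3f}  SLG {player_slg:.3f}  OPS {player_ops:.3f}")
```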
Aweb wrote: “OPS is basic arithmetic with numbers based on how many bases an event is worth in the game, which is observable to anyone who watches the game (i.e. – a double worth twice as much as a single).”
SLG is simple arithmetic based on the number of bases an event is worth. OBA counts each event that
OPS is simple only if you assume that OBA and SLG are already calculated for you. But if they are, you don’t ever see the value of each event isolated. In order to see how the events are actually weighted in OPS overall, you have to get rid of the different denominators, and you get something like:
((H+W+HB)*AB + TB*PA)/(AB*PA)
There’s nothing intuitive about OPS when you look at it through that prism. And the value of each event is not intuitive at all–it is intuitive for OBA as a standalone statistic and SLG as a standalone statistic, but not for OBA+SLG.
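The common-denominator form above can be checked with exact fractions, and the same arithmetic exposes the weight OPS implicitly gives each event. The stat line is hypothetical; note that the implied weights depend on the player's own AB and PA totals, which is part of what makes them hard to read off.

```python
from fractions import Fraction

# Hypothetical season line.
H, W, HB = 160, 60, 5      # hits, walks, hit-by-pitch
TB = 275                   # total bases
AB, PA = 500, 570

obp = Fraction(H + W + HB, PA)
slg = Fraction(TB, AB)

# OPS written over a single denominator, as in the comment above.
combined = Fraction((H + W + HB) * AB + TB * PA, AB * PA)
assert obp + slg == combined

# Implied value of one more event, holding AB and PA fixed:
# every on-base event adds 1/PA, and every base on a hit adds 1/AB.
weight_walk = Fraction(1, PA)
weight_single = Fraction(1, PA) + Fraction(1, AB)
weight_homer = Fraction(1, PA) + Fraction(4, AB)

print("single vs. walk:", float(weight_single / weight_walk))
print("homer vs. single:", float(weight_homer / weight_single))
```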
For the vast majority of teams, almost anything that makes a modicum of sense is going to have a high correlation. (From 1993-2008, the correlation between at-bats and team runs scored is .64. Simply looking at home runs gives you a .75 correlation. Home runs times at bats? .81 correlation!)
The reasons wOBA is preferable to OPS for individual hitter evaluation have almost nothing to do with correlation with team run scoring over a two-year time period.
R.J., instead of the straight ONB + SLG for OPS, could you compare wOBA to a modified OPS (= 1.7697 * ONB + SLG)? I would be interested to see how this mOPS does against wOBA in your comparison of team production numbers versus runs scored.
There is a method behind the 1.7697; it involves using the correlation function and goal seek in Excel.
Thanks.
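One way to look for a coefficient like the 1.7697 mentioned here is to try many values of a in a * OBP + SLG and keep the one whose correlation with runs is highest. The sketch below uses a plain grid search in place of Excel's goal seek, and the eight teams are hypothetical, so it will not reproduce 1.7697.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical team OBP, SLG, and runs scored.
team_obp = [0.320, 0.335, 0.345, 0.310, 0.328, 0.350, 0.315, 0.340]
team_slg = [0.410, 0.430, 0.450, 0.380, 0.400, 0.420, 0.440, 0.390]
team_runs = [720, 790, 850, 640, 730, 820, 735, 745]


def corr_for(a):
    """Correlation between a * OBP + SLG and runs for a given OBP weight a."""
    modified_ops = [a * o + s for o, s in zip(team_obp, team_slg)]
    return pearson_r(modified_ops, team_runs)


# Grid search over a = 0.50, 0.51, ..., 3.00 (a = 1.00 is ordinary OPS).
candidates = [i / 100 for i in range(50, 301)]
best_a = max(candidates, key=corr_for)

print(f"plain OPS (a = 1.00): r = {corr_for(1.0):.4f}")
print(f"best a on the grid = {best_a:.2f}: r = {corr_for(best_a):.4f}")
```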