Confessions of a conspiracy theorist: who’s watching the watchers?

One of the fundamental elements of the great Cardrunners debate from earlier this year – and one of the few that we have not beaten into the ground – was the question of whether the fantasy community has accurately determined the value of a player’s production, even when we have the luxury of pricing it retrospectively. I think most of us just assume that when we look at our league provider’s player rankings pages, the player who is ranked first overall has outproduced the player ranked second, and so forth. In reality, though, a back-end formula is making certain assumptions, choosing what to value and how heavily to value it. The formula is not gospel, though it remains largely unquestioned by the fantasy-playing universe.

Returning to the quants’ point from the Cardrunners debate, perhaps they are right. If I removed the rankings from the following two stat lines, which line would you determine to have more absolute value?

Player A: .314 100 R 31 HR 112 RBI 7 SB over 620 ABs
Player B: .323 109 R 15 HR 79 RBI 32 SB over 640 ABs

Player A enjoys a substantial advantage in the power categories to the tune of 16 homers and 33 RBI. Meanwhile, Player B has a slim nine-run advantage, a slight nine-point advantage in AVG across 20 more ABs, and a large advantage in SBs to the tune of 25 steals. I would guess that Player B would come out ahead in the player rater, but I can’t be certain.

These are totally made-up lines, so there’s no looking up their real counterparts and checking their respective rankings, and even if we could, that is not the point.

I assume Yahoo, ESPN or CBS could provide a “correct” answer or at least one justifiable by some objective standard. But just because the league providers do give us an answer, how are we to know that their conclusions are correct? Have they ever divulged their methodology and subjected it to the scrutiny of the Derek Cartys of the world?

To be clear, I’m not speaking from the “who would be more valuable to my team” perspective – we can’t expect Yahoo to know that. But, even in terms of value in a vacuum, how are we to know Yahoo’s answers are right?

I’m not going to attempt to derive my own formula; I’ll leave that to those with more statistical chops than I. But I do want to ask some largely rhetorical questions about how Yahoo thinks by looking at the stat lines of some of its top-ranked players.

Miguel Cabrera and Robinson Cano are the two top-ranked batters and they share similar statistical profiles. As of my writing of this article, they boast the following lines:

(1) Cabrera: .351 (73/208) 40 R 17 HR 52 RBI 2 SB
(2) Cano: .363 (82/226) 41 R 12 HR 45 RBI 2 SB

Right off the bat, we can pretty confidently answer one question some may have. It does not appear that Yahoo considers positional scarcity or eligibility in its rankings; the rankings seem to be based on absolute value. I draw this conclusion because Cano’s production from a 2B appears as if it would be more desirable to a team than Cabrera’s production from a 1B. This is a nice and clean example because neither player can boast a specific type of production that the other is deficient in.

Here are some actual comparisons that Yahoo is forced to rule on.

(6) Alex Rios: .318 (62/195) 38 R 12 HR 29 RBI 17 SB
(8) Evan Longoria: .312 (68/218) 37 R 11 HR 44 RBI 10 SB

So, here we see seven steals (and a tiny advantage in AVG along with one run and one homer) trump a 15 RBI advantage. That makes sense on the surface, given the relative value of an RBI versus a steal. However, Longoria is tied for fifth in all of baseball in RBIs, while 75 players have driven in more than Rios. Surely, Rios is getting credit for his across-the-board productivity here, but in this respect Longoria is actually the better balanced player.

Here’s another one.

(26) Troy Tulowitzki: .303 (64/211) 42 R 8 HR 29 RBI 6 SB
(27) Magglio Ordonez: .312 (63/200) 37 R 8 HR 41 RBI 1 SB

Here we see five steals and five runs overtake a small batting average advantage and a substantial RBI advantage. It seems the player rater is relying on a value system similar to the one it used above.

So far, I’ve focused on close calls, and I presume all of these are defensible. But comb the top-ranked players list and you do find some curious rankings relative to one another. For example, how do we explain the chasm between Brett Gardner and Elvis Andrus?

(20) Gardner: .311 (59/190) 41 R 3 HR 18 RBI 20 SB
(65) Andrus: .304 (63/207) 39 R 0 HR 16 RBI 18 SB

So, Gardner does have a small advantage in AVG, but Andrus’ comes over more at-bats and is therefore weightier (I assume Yahoo considers this to some degree or else Buster Posey would be considered a better AVG asset than Ichiro). Gardner also boasts two more runs, three more homers, two more RBIs and two more steals. Now, a combined five HR/SB advantage is a legitimate relative advantage, as one homer or steal is a lot more meaningful than a single run or RBI. But is this advantage really large enough that there are 45 players ranked in between them? How many David Wrights is Yahoo fitting on the head of this pin?
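One way to put a number on how "weighty" an average is would be to convert it into hits above a baseline rate over the same at-bats. The following is a minimal sketch, not anything Yahoo has published: the .270 baseline and the function name `hits_above_baseline` are assumptions made purely for illustration, while the hits and at-bats come from the two lines quoted above.

```python
# Hits gained relative to what a baseline hitter would collect in the same ABs.
# BASELINE_AVG is an assumed figure for illustration, not a Yahoo parameter.
BASELINE_AVG = 0.270


def hits_above_baseline(hits, at_bats, baseline=BASELINE_AVG):
    """Return hits above (or below) a baseline batting average."""
    return hits - baseline * at_bats


gardner = hits_above_baseline(59, 190)
andrus = hits_above_baseline(63, 207)
print(f"Gardner: {gardner:+.2f} hits, Andrus: {andrus:+.2f} hits")
```

Under that assumed baseline the two land within a single hit of each other, so AVG looks close to a wash between them. A different baseline would shift both numbers, which is exactly the kind of hidden choice a player rater has to make.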

One of the most curious comparisons and perhaps the one that gives me the strongest urge to question the system is Alex Rodriguez and Casey McGehee.

(43) Alex Rodríguez: .294 (63/214) 33 R 8 HR 43 RBI 2 SB
(63) Casey McGehee: .291 (62/213) 29 R 9 HR 43 RBI 1 SB

In this case, the two players have almost identical statistical profiles, including hits and at-bats. A-Rod has one more hit (in one more AB), four more runs and one more steal. McGehee has an extra homer. Yet, somehow Adrian Gonzalez, Ryan Zimmerman, Andrew McCutchen, Marlon Byrd, Michael Young, Josh Willingham, Kelly Johnson, Troy Glaus, and Adrian Beltre rank between them? Really?

How does Andrew McCutchen’s totally distinct profile of .314/34/7/17/13 compare to a more slugger-oriented stat line? Well, apparently it is more valuable than McGehee’s .291/29/9/43/1, but less valuable than A-Rod’s .294/33/8/43/2.

This isn’t even considering the fact that there are 10 pitchers grouped between them.

While I have no independent mechanism by which to rank McGehee and A-Rod, I feel I’m being reasonable and justified when I say that this triggers some reaction when put to the sniff test.

To be clear, the point of this piece is not to throw unsubstantiated mud at Yahoo’s player ranker, but just to raise questions in a way that could benefit our readers. If it’s possible that these rankers are wrong, that’s a gaping opportunity to be exploited, considering just about the whole fantasy universe uses these systems to attribute relative value to known player performance.

You could investigate the bias and rework the system, which might allow you to capitalize in the same way you might when a league’s categories are customized while the pre-ranks remain tied to the default categories.

Or, you can scour the boards to find what seem to be bargains. Now, it’s not news to find that McGehee can be had cheaper than A-Rod, but it is interesting to know that a system concerned only with the past and not interested in predicting the future can indicate such a sizeable gap in value between two nearly identical products.

This means that if I were equally confident that this performance represents both McGehee and A-Rod’s true talents, I should target McGehee because the mechanism we are using as the price guide seems to be off….unless of course it is right on McGehee, but off on A-Rod, which is why we should probably derive our own formula so we can determine if and where the disparity is occurring.
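For whoever does take this up, here is a minimal sketch of one common starting point: sum each hitter's z-scores across the five standard categories. To be clear, this is an assumption about how a rater could work, not Yahoo's formula. The pool is only the ten hitters quoted in this piece, which is far too small and too top-heavy for real valuation, and the function name `rate` is hypothetical. AVG is scored as hits above the pool's average so that at-bats carry weight.

```python
from statistics import mean, pstdev

# (hits, at-bats, runs, HR, RBI, SB) for the ten hitters quoted in this piece.
HITTERS = {
    "Cabrera": (73, 208, 40, 17, 52, 2),
    "Cano": (82, 226, 41, 12, 45, 2),
    "Rios": (62, 195, 38, 12, 29, 17),
    "Longoria": (68, 218, 37, 11, 44, 10),
    "Tulowitzki": (64, 211, 42, 8, 29, 6),
    "Ordonez": (63, 200, 37, 8, 41, 1),
    "Gardner": (59, 190, 41, 3, 18, 20),
    "Andrus": (63, 207, 39, 0, 16, 18),
    "Rodriguez": (63, 214, 33, 8, 43, 2),
    "McGehee": (62, 213, 29, 9, 43, 1),
}


def rate(hitters):
    """Return each hitter's summed z-score over AVG, R, HR, RBI and SB."""
    total_hits = sum(h[0] for h in hitters.values())
    total_abs = sum(h[1] for h in hitters.values())
    pool_avg = total_hits / total_abs

    # One column per category; AVG becomes hits above the pool average.
    columns = [{name: h[0] - pool_avg * h[1] for name, h in hitters.items()}]
    for idx in (2, 3, 4, 5):  # runs, HR, RBI, SB
        columns.append({name: h[idx] for name, h in hitters.items()})

    totals = dict.fromkeys(hitters, 0.0)
    for column in columns:
        values = list(column.values())
        mu, sigma = mean(values), pstdev(values)
        for name, value in column.items():
            totals[name] += (value - mu) / sigma
    return totals


scores = rate(HITTERS)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12}{score:+.2f}")
```

Printing the raw scores next to the ordering shows how wide or narrow the gaps between adjacent ranks are, which an ordinal list alone hides. Equal weighting of the five categories is itself a choice that a more serious version would need to justify.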

As the resident conspiracy theorist, I’ve done my job. OK, actually intelligent guys – get on that!


Seth

I have no idea exactly how it works either, but I’m guessing that a large piece is the scarcity of each stat. Especially at the beginning of the season, I could see this having a pretty large effect.

Ameer

Great article!  I think it’s pretty obvious that Yahoo has an O-rank component in its formula, which is probably not a good idea (I’m sure that’s debatable too).  What “player rater” type system does everyone around here seem to think is a good one?

Derek Ambrosino
By O-rank, I assume you mean that Yahoo recognizes A-Rod’s track record, thereby placing him ahead of a player like McGehee, implying that A-Rod is underachieving and McGehee is overachieving. I actually don’t think that’s the case, and for at least two reasons. 1. The sniff test again. Why then, wouldn’t A-Rod also be placed ahead of, say, Brett Gardner, who also has nothing in his history to make anybody believe he should continue to outproduce one of the greatest players in the history of the sport? How would it let Rios get past Longoria; if they’re objectively close…
batpig
I think one problem is that you are presuming there is a “sizeable gap in value” between #43 and #63, as though this ordinal ranking system were (1) absolute and (2) linear, in the sense that the “gaps” between rankings were equal. The fact of the matter is that players are going to be bunched very closely in value at the top (especially since we are still only 1/3 through the season so counting stats haven’t had time to distribute more widely). Assuming that “behind the scenes” there is some point ranking system that quantifies the player’s value, you could…
Derek Ambrosino

Oh, and here’s a simple exercise that might help here. I happen to only be in Yahoo leagues this year, but for those of you who are in leagues run by numerous different providers, are, say, the top 25 or 50 the same under different providers?

Derek Ambrosino
Batpig, I hear you, but I assure you that I am making no such assumption. In fact, I totally understand, concede, and expect the type of distribution you describe. But, frankly, it still fails the sniff test, IMO. To be clear, I’m only talking about sniff test standards here. But, consider these two points: 1. Even at 43 – 63, we are still toward the tip of the skew in terms of the player universe. We’re not 3+ standard deviations in, as we are with Miggy and Cano, but 43 is the 90th percentile of ROSTERED players in deep leagues.…
Derek Ambrosino
Oh, and sorry to be so chatty, but your hypothesis raises another point. Let’s assume it to be true. If it is, then doesn’t the ordinal ranking system seem to be a poor and likely disingenuous way of indicating relative player value? If there’s almost no realistic difference in the value of the 43rd and 63rd most valuable players, then shouldn’t our provider be showing us the raw scores instead, as opposed to using an ordinal ranking system that implies a significant disparity? And, further, if it’s possible that there is almost no absolute value difference between the 43rd and 63rd…
Andrew

Use the ESPN Player Rater or baseballmonster.com

Joe Mama

nobody cares about the cardrunners debate. get over it already

3FingersBrown

Both of my leagues are on Yahoo this year and I take their rankings with a grain of salt – and attempt to use them to my advantage when making trade offers.

Baseballmonster is a tremendous resource for ranking players, particularly if you play in a league with non-standard categories.

Derek Ambrosino

“…Yahoo this year and I take their rankings with a grain of salt”

So, when you look at the top producing players, you don’t assume that the 30th ranked player (for 2010 stats to date, and nothing else) has outproduced the 50th ranked player?

Assume for a second that baseballmonster didn’t exist and that you were in a 5×5 league. How then would you even determine who your own most (absolute) valuable players are, if you are throwing out your league provider’s rankings?

Evan
pretty interesting….to add more fuel to the fire here’s a list of CBS’ top 25 hitters:

Cabrera, Miguel 1B DET   4
Rios, Alex CF CHW   5
Cano, Robinson 2B NYY   6
Suzuki, Ichiro RF SEA   8
Longoria, Evan 3B TB   9
Morneau, Justin 1B MIN   10
Crawford, Carl LF TB   13
Guerrero, Vladimir DH TEX   16
Braun, Ryan J. LF MIL   17
Pujols, Albert 1B STL   18
Gardner, Brett CF NYY   19
Wells, Vernon CF TOR   26
Youkilis, Kevin 1B BOS   27
McCutchen, Andrew CF PIT   29
Votto, Joey 1B CIN   34
Kemp, Matt CF LA   37
Ethier, Andre RF LA   42
Andrus, Elvis SS TEX   44
Gonzalez, Carlos LF COL   47
Wright, David 3B NYM   50…
batpig
Derek – to muse on some of your comments: 1) you may be right that the “clustering” hypothesis is not totally tenable but I think you are underrating how tight many statistics are at this point in the season. Just as a quick example, looking at the stats leaderboard right now there are 41 players with 30-35 RBI’s and nearly 50 players with between 30-35 Runs. 2) HR’s and especially SB’s seem to be greatly weighted in Yahoo’s rankings (and rightfully so). If you take your Gardner vs Andrus example, I think at this point in the season an extra…
Derek Ambrosino
Batpig, Thanks for the responses. And, I think a good idea would be for me to revisit this article at the end of the season and see if the superficial problem persists. A couple of replies to your replies: – Does value need to be defined precisely enough to differentiate between the 20th and 60th most valuable player in [fantasy] baseball? Yes. Of course! 60th and 65th, probably not. But 20th and 60th is a huge difference, qualitatively. – What’s the connection between this and the allure of fantasy baseball? Well, if these value disparities are…
batpig
“Well, if these value disparities are so minute, then things like luck become a much bigger part of determining how things play out. I wouldn’t take this pursuit seriously if I figured it to be *largely* a game of chance.” I don’t think it follows at all that, just because there is a lack of ordinal certainty, fantasy baseball is *largely* a game of chance. That being said, it would be tremendously ignorant to deny that luck plays a significant role in fantasy baseball. I view it as being analogous to a stock portfolio, or better yet a poker hand – there…
Derek Ambrosino
I most certainly think that luck is involved. I think fantasy baseball has many, many similarities to poker. Stacking the odds in your favor is really all you can do. Regarding the one vs. dozens of choices, I was initially speaking in terms of retrospect. At the end of the season, there will only be one correct answer as to who the best ROI at draft slot 34 was (now, there may be other considerations why that player may not have been the optimal pick for you, but that’s a strategic discussion). And, of course, some of those answers, even…