- FanGraphs Baseball - http://www.fangraphs.com/blogs -

The Fans Versus The Algorithms

Posted By Dave Cameron On February 21, 2013 @ 4:01 pm In Daily Graphings | 39 Comments

Here on FanGraphs, we host several different projection systems, most of which are algorithms that take a player’s performance history and then mix in things like regression and aging curves to develop a forecast for 2013 production. But, we have one set of projections that is very different from the rest – the Fans Projections.

Instead of being based on any kind of mathematical model, these are simply crowdsourced from our readers, with you guys creating the projections with your various opinions about player performances for next year. While there are certainly some imperfections with any kind of crowdsourcing project, the wisdom of crowds has also been shown to do pretty well in situations like this. Over the years we’ve done the Fans Projections, we’ve seen that the system actually holds its own when stacked up against the algorithms, though it does require one manual adjustment in order to work properly: deflation.

Put simply, you guys are just too darn optimistic — I guess that’s why you’re called fans — and annually overproject total WAR by something like 20%. So, if you look at the data from the Fans Projections next to something like ZIPS or Steamer, you’ll see some huge discrepancies, but a lot of those simply have to do with the scale, and once the Fans Projections are deflated to create a more accurate overall total, many of the variances go away.
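For anyone who wants to try the deflation step at home, here is a minimal sketch. It assumes the adjustment is a simple proportional rescaling so that the Fans total matches a reference total; the exact method used on the site may differ. The `deflate` helper, the player names, and every number in the example are hypothetical and only there to show the mechanics.

```python
def deflate(fans_war, reference_total):
    """Scale fan WAR projections proportionally so they sum to reference_total.

    Assumption: one across-the-board multiplier, applied to every player.
    """
    fans_total = sum(fans_war.values())
    factor = reference_total / fans_total
    scaled = {name: war * factor for name, war in fans_war.items()}
    return scaled, factor


# Hypothetical raw fan projections (illustrative numbers, not real data)
raw_fans = {"Player A": 6.0, "Player B": 3.6, "Player C": 2.4}

# Hypothetical total WAR for the same players from an algorithmic system
reference_total = 10.0

deflated, factor = deflate(raw_fans, reference_total)

for name, war in deflated.items():
    print(f"{name}: {raw_fans[name]:.1f} -> {war:.2f}")
print(f"deflation factor: {factor:.3f}")
```

Because every player is multiplied by the same factor, the ordering of players within the Fans Projections is unchanged; only the scale moves.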

Not all of them, though. There’s one clear type of player where the Fans and the algorithms disagree, and it’s probably a telling area, given what we know about the fine line between hope and irrational exuberance. That type of player? Prospects.

To illustrate this point, I pulled the hitter projections from both the Fans and the Steamer system off of the site, dumped them side by side, and then added a column for player age. I deflated the Fans totals in order to put them on the same scale as Steamer, then sorted by the age column. I then broke the 326 players on the spreadsheet into four bins: 24 and under, 25-28, 29-33, and then 34 and up. This gives us approximately even-sized groups (both in number of players and projected plate appearances) for both young and old, and then also in pre-prime and post-prime.

Here are the total WAR projections from both Steamer and the Fans for all four groups.

Group   Number   PA       Fans WAR   Steamer WAR   Fans WAR/600   Steamer WAR/600
<=24    45       23,298   115        100           3.0            2.6
25-28   118      60,827   250        245           2.5            2.4
29-33   122      66,810   298        308           2.7            2.8
>=34    40       20,064   71         79            2.1            2.4
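The two WAR/600 columns are just each group's WAR total prorated to 600 plate appearances. Here is a small sketch that recomputes them from the aggregate rows in the table. The `age_bin` and `war_per_600` helpers are names made up for illustration, and the bin edges are taken from the description of the four groups above.

```python
# Aggregate rows copied from the table above:
# (group label, players, plate appearances, Fans WAR, Steamer WAR)
GROUPS = [
    ("<=24", 45, 23298, 115, 100),
    ("25-28", 118, 60827, 250, 245),
    ("29-33", 122, 66810, 298, 308),
    (">=34", 40, 20064, 71, 79),
]


def age_bin(age):
    """Map a player's age to one of the four age groups (hypothetical helper)."""
    if age <= 24:
        return "<=24"
    if age <= 28:
        return "25-28"
    if age <= 33:
        return "29-33"
    return ">=34"


def war_per_600(war, pa):
    """Prorate a WAR total to 600 plate appearances."""
    return war / pa * 600


# Fans and Steamer rates per group, rounded to one decimal like the table
RATES = {
    label: (round(war_per_600(fans, pa), 1), round(war_per_600(steamer, pa), 1))
    for label, _players, pa, fans, steamer in GROUPS
}

for label, (fans_rate, steamer_rate) in RATES.items():
    print(f"{label:>5}  Fans {fans_rate:.1f}  Steamer {steamer_rate:.1f}")
```

Working with rates rather than raw totals matters here because the groups differ a lot in size: the 29-33 bin has nearly three times the plate appearances of the youngest bin.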

While Steamer projects each group to produce at roughly the same rate regardless of age — keep in mind, fans generally don’t project bench players and scrubs, so we’re only dealing with the ~11 best position players on each team — the fans believe that the best performing group is actually those with the least amount of Major League service time, and the worst players are the ones with the largest data samples to pull from.

During the prime ages of 25-33, the fans and Steamer don’t really disagree much at all, at least in the aggregate. Sure, there are some established players like Miguel Cabrera where the two sides differ, but the big gaps are almost all found among the very young. Here are the 15 players that the Fans project for at least +1 WAR more than Steamer, even after deflating the total projections to put them on the same scale:

Name                 PA    Fans WAR   Steamer WAR   Diff   Age
Bryce Harper         678   5.1        3.0           2.1    20
Jurickson Profar     476   2.9        0.9           2.0    20
Franklin Gutierrez   518   2.4        0.7           1.7    30
Dustin Ackley        693   3.7        2.2           1.5    25
Andrelton Simmons    608   3.9        2.4           1.5    23
Desmond Jennings     673   4.2        2.9           1.3    26
Michael Saunders     586   2.1        0.8           1.3    26
Manny Machado        579   3.1        1.9           1.2    20
Jean Segura          528   2.1        0.9           1.2    23
Yonder Alonso        638   2.1        0.9           1.2    26
Alex Gordon          696   5.1        4.0           1.1    29
Brett Lawrie         609   4.8        3.7           1.1    23
Jason Kipnis         681   3.8        2.7           1.1    26
Joey Votto           654   6.7        5.7           1.0    29
Billy Hamilton       344   1.6        0.6           1.0    22

The average age of those 15 players? 24.5 years old. Gutierrez is the only guy on the list not in his twenties, and then after Votto and Gordon, nobody is over 26. Harper, Profar, and Machado are the only three 20-year-olds that the Fans projected, and all three show up on the list of guys that the fans like far more than Steamer. In fact, of the 25 players in the projection that are listed as 23 or younger, the fans are higher on 20 of them, and the entire difference between the two systems in the <=24 crowd can actually be found in those 23-and-under players, as the two systems are in almost perfect agreement on total WAR for players headed into their age-24 season.
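The age figures above are easy to recheck from the rows listed in the table. The sketch below recomputes each Fans-minus-Steamer gap and the average age. The variable and function names are hypothetical, and the gap is rounded to one decimal so floating-point noise doesn't interfere with the comparison against +1 WAR.

```python
from statistics import mean

# Rows copied from the table above: (name, Fans WAR, Steamer WAR, age)
PLAYERS = [
    ("Bryce Harper", 5.1, 3.0, 20),
    ("Jurickson Profar", 2.9, 0.9, 20),
    ("Franklin Gutierrez", 2.4, 0.7, 30),
    ("Dustin Ackley", 3.7, 2.2, 25),
    ("Andrelton Simmons", 3.9, 2.4, 23),
    ("Desmond Jennings", 4.2, 2.9, 26),
    ("Michael Saunders", 2.1, 0.8, 26),
    ("Manny Machado", 3.1, 1.9, 20),
    ("Jean Segura", 2.1, 0.9, 23),
    ("Yonder Alonso", 2.1, 0.9, 26),
    ("Alex Gordon", 5.1, 4.0, 29),
    ("Brett Lawrie", 4.8, 3.7, 23),
    ("Jason Kipnis", 3.8, 2.7, 26),
    ("Joey Votto", 6.7, 5.7, 29),
    ("Billy Hamilton", 1.6, 0.6, 22),
]


def gap(fans, steamer):
    """Fans WAR minus Steamer WAR, rounded to one decimal place."""
    return round(fans - steamer, 1)


GAPS = {name: gap(fans, steamer) for name, fans, steamer, _age in PLAYERS}
AVERAGE_AGE = mean(age for _name, _fans, _steamer, age in PLAYERS)
AGE_26_OR_UNDER = [name for name, _fans, _steamer, age in PLAYERS if age <= 26]

print(f"players listed: {len(PLAYERS)}")
print(f"average age: {AVERAGE_AGE:.1f}")
print(f"age 26 or under: {len(AGE_26_OR_UNDER)}")
```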

The gap is most pronounced when talking about the game's elite prospects. On a per 600 plate appearances basis, the fans project Mike Zunino to be every bit as good as Albert Pujols. They see Manny Machado as the equal of Jose Reyes, Jurickson Profar as a match for Jay Bruce, and Oscar Taveras as already as good as B.J. Upton. Steamer is not nearly as bullish on those four, grading them out as no better than average players, falling well short of the expected production of established stars.

Part of being a fan is dreaming about what could be in the future. It is much easier to dream about improvement from a talented young star-in-the-making than it is to dream about positive regression to the mean from an aging player coming off his worst season. Only one of those two things is exciting. But, as much fun as it might be to dream about how good Profar could be, it’s also useful to have a reality check like Steamer around to tell everyone to not get too carried away.

The same goes for players on the decline as well. A bad year from a player over age 34 is often taken as a sign of a marked loss of skills, and the fans expect that kind of age-related decline to continue into the future. You look at the list of guys that Steamer likes more than the fans, and you find guys like Derek Jeter, Marco Scutaro, Lance Berkman, and Michael Young. While fans jump off the bandwagon when a player passes 35, Steamer still sees value in formerly good players who just aren’t quite as good as they used to be. And, again, I think Steamer is correct here.

Overall, I think the Fans Projections look very reasonable once you just take out the across-the-board optimism that inflates the overall total, but it’s also worth noting where the differences lie. It’s great to be excited about prospects, but the evidence suggests that prospect hype has probably gone a bit too far, and we should rein in our expectations of how even elite young talents are going to do in 2013.


URL to article: http://www.fangraphs.com/blogs/the-fans-versus-the-algorithms/

Copyright © 2009 FanGraphs Baseball. All rights reserved.