# FanGraphs Baseball

1. Dave,

I have not read “Wisdom of the Crowds” but is that because the fans tend to have a natural regression in place that is similar to those in the statistical models?

Comment by JDSussman — November 30, 2009 @ 12:05 pm

2. Are we not concerned that we fans will rate players on our favorite teams and not on others, therefore inducing the bias you are trying to avoid? As a Rangers fan, I feel I could give good estimations of numbers for next year for my team, and maybe some division foes, but I wouldn’t dare make those estimations of teams, say, in the NL Central with no knowledge of them whatsoever. So, if all Ranger fans (with their bias) estimate numbers for the Rangers only, this does not represent a diverse enough sample group for the crowd theory to work.

Comment by Adam D — November 30, 2009 @ 12:11 pm

3. It’s just essentially the theory that when enough people use simple past performance and observation to judge a player, eventually you get a real consensus on said player.

And the fact that a sample of just 165 people can come to about the same conclusion as complex analytical processes shows the power of simple objectivity.

Comment by JoeR43 — November 30, 2009 @ 12:15 pm

4. My guess is that David and Tom will present not only the raw results of the survey, but some “processed” results. You already have to pick your favorite team in order to rate players, which will present the opportunity for adjustment. Plus you can easily compare the overall stat lines of hitters and pitchers to make sure things even out.

Comment by Sky Kalkman — November 30, 2009 @ 12:47 pm

5. Adam, this is why we ask you to enter your favorite team. Behind the scenes there’s a lot more going on than just aggregating all the projections, and I think we’ll probably have a decent sample size for non-team fans. You can actually see the breakout of fans/non-fans if you go to any of the detailed projection pages.

http://www.fangraphs.com/fanpdetails.aspx?playerid=8370&position=2B

Comment by David Appelman — November 30, 2009 @ 12:50 pm

6. How much of a crowd do you expect/hope for? 50, 100, 200?

Comment by Jimbo — November 30, 2009 @ 1:17 pm

7. It’d be nice to get at least 100 for each player. FanGraphs has a large enough audience where I think it’ll be pretty doable and we may end up getting a lot more for some of the more popular players.

Comment by David Appelman — November 30, 2009 @ 1:23 pm

8. The phenomenon of the wisdom of the crowds is well known but not well understood. There are many hypotheses about why it works and none are perfect. Some people offer a “cancelling out” explanation: the idea is that our personal biases cancel each other out (for every pessimist, there’s an optimist). I find this explanation partial at best, since it leaves out why the remainder after all our biases cancel tends toward reality.

Another explanation appeals to the Condorcet Jury Theorem, which is just a theorem of probability mathematics; it says that given a set of independent trials with a greater than .5 chance of getting the right answer to a yes/no question, the probability that the majority of trials is correct gets arbitrarily close to 1 as the number of trials approaches infinity. (There are ways to show that any complex question that isn’t a yes/no question can be reduced to a series of yes/no questions, so the theorem can be applied to a complex question like “How many home runs will Jeter hit?”)

There are cases where the Wisdom goes completely unwise. Here’s a fun example: if you ask a room full of people to secretly write the number of beans in a jar on a piece of paper, the average answer tends to be very close to the number of beans in the jar. If instead you ask each person, in sequence, to say out loud how many they think are in the jar so everyone hears everyone else’s answer, the average answer usually converges on the first answer that everyone heard. Crowds can be very stupid and sheepish under certain conditions. An explanation of this sort is apt to make the fans scouting report a lot like projection systems because many people will simply look at a few projection systems before offering their forecast. What is interesting is whether enough fans can beat the projection systems by knowing things that the systems don’t, e.g., injury history.

Comment by philosofool — November 30, 2009 @ 1:27 pm
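The Condorcet Jury Theorem mentioned in comment 8 can be checked numerically. The following is a minimal sketch, assuming independent voters who each have the same chance of being right on a yes/no question; the function name and the 0.55 accuracy figure are illustrative choices, not anything taken from the FanGraphs projection system.

```python
from math import comb


def majority_correct_probability(n_voters: int, p_correct: float) -> float:
    """Probability that a strict majority of independent voters is right.

    Assumes every voter is correct with the same probability p_correct
    and that votes are independent (the theorem's own assumptions).
    """
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters so there are no ties")
    needed = n_voters // 2 + 1  # smallest strict majority
    # Sum the binomial probabilities of every outcome with a majority right.
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


# Hypothetical accuracy of 0.55 per voter; the crowd sizes are arbitrary.
for n in (1, 11, 101, 501):
    print(n, round(majority_correct_probability(n, 0.55), 4))
```

With any per-voter accuracy above .5, the printed probability rises toward 1 as the crowd grows. The independence assumption is what the out-loud bean jar example breaks, so the sketch says nothing about that case.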

9. In my experience of running the Fans Scouting Report, the Community Playing Time survey, the Clutch Project, and god-knows what else I’ve done, all you need is 20 votes per player. Even 10 votes is pretty good. Things have a tendency to settle down pretty fast.

And yes, the entire reasoning behind asking for the Favorite team is to handle the issue of home-team bias.

Comment by tangotiger — November 30, 2009 @ 2:02 pm
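One way to see why results might "settle down" after 10 or 20 votes is the standard error of an average, which shrinks with the square root of the number of votes. This is only a rough sketch under the assumption that votes are independent; the spread of 6 home runs between individual fans is a made-up figure for illustration, and this is not a description of how tangotiger analyzes his surveys.

```python
from math import sqrt


def standard_error(spread: float, n_votes: int) -> float:
    """Standard error of the mean of n independent votes.

    spread is the standard deviation of a single fan's guess.
    """
    if n_votes < 1:
        raise ValueError("need at least one vote")
    return spread / sqrt(n_votes)


# Hypothetical: individual home run guesses vary with a standard deviation of 6.
ASSUMED_SPREAD = 6.0
for n in (1, 10, 20, 100):
    print(n, round(standard_error(ASSUMED_SPREAD, n), 2))
```

Under these assumptions, most of the gain comes early: going from 1 vote to 20 cuts the uncertainty far more than going from 20 to 100.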

10. Interesting stuff. Thanks.

Comment by JDSussman — November 30, 2009 @ 2:16 pm

11. Dave-

Challenge time. You and the Fangraphs writers should make your own projections and match them up against ours. I’m guessing you’ll hit the Mariner ones square on the head, but you’ll be too cynical towards the high RBI 1B’s on big market teams, Matt K will have the most accurate figures, but you’ll have to sort through 3×10^872938676209875029879 links to find out what they are, and Carson will have some really long post loaded with humorous observations on his picks, and everyone will bitch and moan about how his writing made them laugh too much.

…and go!

Comment by Logan — November 30, 2009 @ 3:10 pm

12. It was interesting to watch what you are talking about in the last paragraph play out favorably with Predictatron. It will likely be even more interesting to see how this experiment plays out.

Comment by walkoffblast — November 30, 2009 @ 3:11 pm

13. The Wikipedia article has a surprisingly good breakdown of the phenomenon and the criteria necessary for it to be successful. – http://en.wikipedia.org/wiki/The_Wisdom_of_Crowds

My biggest concern would be bias (Independence). Since this site already lists Bill James’ projections, the fans’, and in the following weeks, CHONE, ZiPS, etc., a lot of fans will probably be biased by those numbers. To be most effective, WOTC predictions need to be “blind”. Unfortunately I don’t think there is any way to realistically fix this. Fortunately those projection systems are pretty good, so this variable shouldn’t affect the results too negatively.

The other concern I have that has been mentioned is ensuring that everyone has good intentions. On the Internet there are always jerks who are willing to ruin a great thing. An “easy” way to correct this would be to put your money where your mouth is. Submissions could cost say \$.25-\$1 with the person whose projection was closest at the end of the year getting all of the winnings. This would incentivize people to only submit their true beliefs. Of course this would severely reduce the number of submissions as well as be a huge burden on Fangraphs to create such a system.

Overall I think this is an awesome idea (something that I unfortunately had been hoping to implement myself in the future) and as long as they can control for crazy submissions I think the final results should be quite accurate.

Comment by Toffer Peak — November 30, 2009 @ 3:25 pm

14. And so far, even home team doesn’t look all that relevant.
Examples:
Pujols, Cards fan vote: .331/.440/.633, 42 HR
Pujols, other fan vote: .330/.441/.630, 41 HR

Pedroia, Sox fan vote: .311/.383/.467, 15 HR
Pedroia, other fan vote: .310/.379/.463, 15 HR

Ichiro, Mariners vote: .336/.382/.434, 116 R
Ichiro, other fan vote: .332/.377/.422, 110 R

I’m sure some guys (most notably young players and guys in decline) will get some home crowd bias, but so far so good.

Comment by JoeR43 — November 30, 2009 @ 3:39 pm
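The home-team gap in comment 14 can be put into numbers by subtracting the "other fan" slash line from the home fan slash line. The sketch below uses only the AVG/OBP/SLG figures quoted in that comment; the variable and function names are illustrative, and three players is far too small a sample to draw conclusions from.

```python
# Slash lines (AVG, OBP, SLG) copied from comment 14.
votes = {
    "Pujols": {"home": (0.331, 0.440, 0.633), "other": (0.330, 0.441, 0.630)},
    "Pedroia": {"home": (0.311, 0.383, 0.467), "other": (0.310, 0.379, 0.463)},
    "Ichiro": {"home": (0.336, 0.382, 0.434), "other": (0.332, 0.377, 0.422)},
}
STATS = ("AVG", "OBP", "SLG")


def home_minus_other(player_votes):
    """Home fan projection minus other fan projection, per stat."""
    return {
        stat: round(home - other, 3)
        for stat, home, other in zip(
            STATS, player_votes["home"], player_votes["other"]
        )
    }


gaps = {name: home_minus_other(v) for name, v in votes.items()}

# Average gap across the three players for each stat.
mean_gap = {
    stat: round(sum(g[stat] for g in gaps.values()) / len(gaps), 4)
    for stat in STATS
}

for name, gap in gaps.items():
    print(name, gap)
print("mean", mean_gap)
```

A positive number means home fans were more optimistic than everyone else for that stat.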

15. Yeah, the projections for guys like Matt Wieters and David Ortiz should be, uh, entertaining.

Comment by Joser — November 30, 2009 @ 4:06 pm

16. When it comes to jerks, the best you can hope for is that jerks wash out by contradicting one another or simply get swamped by the vast majority of people who are just giving their best opinion.

Comment by philosofool — November 30, 2009 @ 5:18 pm

17. This is awesome…very user-friendly. One suggestion: can the actual value of the projection show up right under the actual stats (perhaps in a different color) so I don’t have to go to the “view my projections” screen to see the nominal output of my choices?

Comment by Shizane — November 30, 2009 @ 6:19 pm

18. Awesome stuff. Will look forward to the inevitable comparisons by Tango of these results vs. the other projected estimates.

Comment by Rudy Gamble — November 30, 2009 @ 8:27 pm

19. The thing I’m finding awesome is the obviously large number of M’s fans that read this site. Jack Wilson was the second SS to receive a projection, after Derek Jeter.

Comment by philosofool — November 30, 2009 @ 10:23 pm

20. Yeah, those crazy fans are sure to wildly inflate the Wieters projection. No projection system would ever do such a thing …

Comment by walkoffblast — December 1, 2009 @ 12:25 am

21. This looks like such fun. I hardly feel qualified to have an opinion on such matters… But it looks like a lot of fun is to be had with this.

Comment by Patrick — December 1, 2009 @ 10:00 am
