- Community – FanGraphs Baseball - http://www.fangraphs.com/community -

# Comparing 2011 Hitter Forecasts

This year, I’m going to look at the performance of 12 different baseball player forecasting systems. I compare them on two main bases: root mean squared error (RMSE), both with and without bias. Bias is important to consider because it is easily removed from a forecast and it can mask an otherwise good forecasting approach. For example, Fangraphs Fan projections are often quite biased, but they are very good at predicting numbers once this bias is removed.
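As a rough illustration of the two measures, here is a minimal Python sketch. The function names and the numbers are made up for the example. Subtracting the mean error is only one simple way to strip out bias; the fit table later in this post uses a regression instead.

```python
import math

def rmse(actual, forecast):
    """Root mean squared error between forecasts and actual results."""
    errors = [f - a for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def mean_bias(actual, forecast):
    """Average amount by which the forecast overshoots the actual values."""
    return sum(f - a for a, f in zip(actual, forecast)) / len(actual)

def rmse_bias_removed(actual, forecast):
    """RMSE after shifting every forecast down by the mean bias (an assumed, simple correction)."""
    bias = mean_bias(actual, forecast)
    adjusted = [f - bias for f in forecast]
    return rmse(actual, adjusted)

# Hypothetical home run totals: every forecast is exactly 5 too high.
actual = [20, 30, 10, 25]
forecast = [25, 35, 15, 30]

print(rmse(actual, forecast))               # 5.0
print(rmse_bias_removed(actual, forecast))  # 0.0
```

In this toy case the forecast looks poor on raw RMSE but is perfect once the constant overshoot is removed, which is the situation the bias comparison is meant to detect.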

Results-RMSE:

| Forecast | R RMSE (rank) | HR RMSE (rank) | RBI RMSE (rank) | AVG RMSE (rank) | SB RMSE (rank) | Average rank |
|---|---|---|---|---|---|---|
| Marcel | 22.958 (1) | 8.108 (2) | 24.186 (1) | 0.039 (3) | 7.054 (6) | 2.6 |
| Baseball Dope | 24.788 (3) | 8.506 (6) | 26.788 (5) | 0.036 (1) | 6.720 (1) | 3.2 |
| Will Larson | 24.474 (2) | 8.066 (1) | 25.113 (2) | 0.042 (12) | 6.729 (2) | 3.8 |
| CAIRO | 25.609 (5) | 8.488 (5) | 25.755 (3) | 0.037 (2) | 6.929 (5) | 4.0 |
| AggPro-System | 25.729 (6) | 8.377 (4) | 26.829 (6) | 0.040 (4) | 6.889 (4) | 4.8 |
| AggPro-Category | 25.107 (4) | 8.153 (3) | 26.316 (4) | 0.041 (11) | 6.779 (3) | 5.0 |
| RotoChamp | 29.162 (11) | 9.032 (9) | 27.048 (7) | 0.040 (5) | 7.746 (9) | 8.2 |
| ESPN | 27.575 (8) | 9.455 (12) | 28.568 (10) | 0.040 (6) | 7.257 (7) | 8.6 |
| Bill James | 28.092 (9) | 8.929 (8) | 28.334 (8) | 0.041 (7) | 7.973 (12) | 8.8 |
| Razzball | 26.766 (7) | 9.331 (10) | 28.791 (11) | 0.041 (8) | 7.961 (11) | 9.4 |
| Fangraphs Fans | 30.001 (12) | 8.918 (7) | 32.646 (12) | 0.041 (9) | 7.532 (8) | 9.6 |
| CBS Sportsline | 28.261 (10) | 9.387 (11) | 28.345 (9) | 0.041 (10) | 7.794 (10) | 10.0 |

For the second straight year, the Marcel projections have the lowest RMSE, as shown in the RMSE table above. The simple weighted formula used to create the projections was in the top 3 for each category except SB. Baseball Dope and my (Will Larson) forecasts were 2nd and 3rd. Rounding out the bottom were Bill James, Razzball, the Fangraphs Fans, and CBS Sportsline.

Results-Fit:

| Forecast | R fit (rank) | HR fit (rank) | RBI fit (rank) | AVG fit (rank) | SB fit (rank) | Average rank |
|---|---|---|---|---|---|---|
| AggPro-System | 0.295 (4) | 0.421 (1) | 0.326 (1) | 0.195 (2) | 0.696 (3) | 2.2 |
| AggPro-Category | 0.318 (2) | 0.419 (3) | 0.326 (2) | 0.192 (4) | 0.698 (1) | 2.4 |
| Will Larson | 0.323 (1) | 0.420 (2) | 0.325 (3) | 0.187 (6) | 0.692 (4) | 3.2 |
| Baseball Dope | 0.287 (6) | 0.400 (6) | 0.297 (4) | 0.266 (1) | 0.675 (7) | 4.8 |
| ESPN | 0.300 (3) | 0.393 (10) | 0.274 (8) | 0.178 (8) | 0.697 (2) | 6.2 |
| Fangraphs Fans | 0.227 (11) | 0.419 (4) | 0.272 (11) | 0.194 (3) | 0.691 (5) | 6.8 |
| CAIRO | 0.251 (10) | 0.395 (8) | 0.294 (5) | 0.191 (5) | 0.661 (8) | 7.2 |
| CBS Sportsline | 0.291 (5) | 0.396 (7) | 0.278 (7) | 0.165 (10) | 0.660 (9) | 7.6 |
| Bill James | 0.278 (7) | 0.413 (5) | 0.274 (9) | 0.179 (7) | 0.640 (11) | 7.8 |
| Marcel | 0.266 (9) | 0.393 (9) | 0.279 (6) | 0.158 (11) | 0.642 (10) | 9.0 |
| RotoChamp | 0.212 (12) | 0.375 (11) | 0.273 (10) | 0.175 (9) | 0.678 (6) | 9.6 |
| Razzball | 0.274 (8) | 0.364 (12) | 0.262 (12) | 0.124 (12) | 0.624 (12) | 11.2 |

The table above reports the r^2 of the simple regression: actual = b(1) + b(2)*forecast + e. The b(1) term captures ex-post bias, allowing b(2) to better capture the information content in the forecast. After correcting for ex-post bias, the two AggPro systems finished first and second, followed by my projections. What is interesting is that all three of these systems use weighted averages of forecasts. It is also interesting that the Fangraphs Fans went from 11th in the RMSE comparison to 6th, indicating that, as with last year, fan projections are fairly biased. Marcel is near the bottom, which shows that the Marcel projections are excellent at estimating average production and minimizing bias but not very good at predicting player-to-player differences. For most decision-making purposes, these player-to-player differences are the most important thing to measure accurately. This makes Marcel much less useful than something like the ESPN or Fan projections, which forecast much better than their overall RMSE would indicate.
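For readers who want to reproduce this kind of fit measure, the following is a minimal sketch of the regression actual = b(1) + b(2)*forecast + e in plain Python. The function name and the sample numbers are hypothetical, and this is not necessarily the exact code behind the table.

```python
def simple_regression_fit(actual, forecast):
    """Ordinary least squares of actual on forecast; returns (b1, b2, r2)."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    # Sums of squares and cross-products around the means.
    sxx = sum((f - mean_f) ** 2 for f in forecast)
    sxy = sum((f - mean_f) * (a - mean_a) for a, f in zip(actual, forecast))
    syy = sum((a - mean_a) ** 2 for a in actual)
    b2 = sxy / sxx              # slope: information content of the forecast
    b1 = mean_a - b2 * mean_f   # intercept: absorbs ex-post bias
    ss_res = sum((a - (b1 + b2 * f)) ** 2 for a, f in zip(actual, forecast))
    r2 = 1 - ss_res / syy
    return b1, b2, r2

# Hypothetical stolen base totals: the forecast is 5 too high for every player,
# but it orders and spaces the players correctly.
actual_sb = [10, 20, 30, 40]
forecast_sb = [15, 25, 35, 45]

b1, b2, r2 = simple_regression_fit(actual_sb, forecast_sb)
print(b1, b2, r2)
```

Because the intercept soaks up the constant overshoot, a forecast that is biased but separates players well still earns a high r^2, which is why this measure can rank systems so differently from raw RMSE.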

Takeaway:

Averaging works. If you want a forecast that’s easy to create, take as many forecasts as you can and average them. Extremes in each forecast will be averaged out, while the combined forecast still captures variation between players. If you have access to more sophisticated methods, weighted averages based on historical accuracy are best.
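A minimal sketch of both approaches is below. The system names and numbers are invented, and weighting each system by the inverse of its past RMSE is just one assumed scheme for illustration, not the method used by any of the systems in the tables.

```python
def average_forecasts(forecasts):
    """Simple average across systems; forecasts maps system name -> list of projections."""
    systems = list(forecasts)
    n_players = len(forecasts[systems[0]])
    return [
        sum(forecasts[s][i] for s in systems) / len(systems)
        for i in range(n_players)
    ]

def weighted_average_forecasts(forecasts, past_rmse):
    """Weighted average with weights proportional to 1 / past RMSE (an assumed weighting)."""
    weights = {s: 1.0 / past_rmse[s] for s in forecasts}
    total = sum(weights.values())
    n_players = len(next(iter(forecasts.values())))
    return [
        sum(weights[s] * forecasts[s][i] for s in forecasts) / total
        for i in range(n_players)
    ]

# Hypothetical home run projections for two players from two made-up systems.
hr_forecasts = {"system_a": [20, 30], "system_b": [26, 18]}
# Hypothetical RMSE of each system in a previous season (lower is better).
hr_past_rmse = {"system_a": 5.0, "system_b": 10.0}

simple_avg = average_forecasts(hr_forecasts)
weighted_avg = weighted_average_forecasts(hr_forecasts, hr_past_rmse)
print(simple_avg)    # [23.0, 24.0]
print(weighted_avg)  # roughly 22 and 26: pulled toward system_a, the more accurate system
```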

Next week I’ll do the same analysis for pitchers.

Happy forecasting!

–Will Larson

Notes:

Last year, I created forecasts based on weighted averages of other forecasting systems. These are the forecasts labeled "Will Larson" in the tables above. Many commenters were skeptical that they would perform well. The approach seems to have worked, at least for last year.

All of the non-proprietary numbers in this analysis can be found at my little data repository website, http://www.bbprojectionproject.com.