2014 Projection Review (Updated)

Update: The previous version of this post, published last week, contained a data error that has now been fixed. Steamer/Razzball and Pod projections have been added and the hitter sample has been corrected from the prior version of this article.

Welcome to my 5th annual forecast review. Each year, every projection submitted to me at http://www.bbprojectionproject.com is tested for error (RMSE) and overall predictive power (R^2), and is then ranked. I present both RMSE and R^2 because each has its uses. RMSE is a standard measure of forecast error, but it penalizes general optimism or pessimism about the run environment, even if a forecast has low error after controlling for that bias. For instance, Marcel is very good at predicting the run environment and the FanGraphs Fans are pretty terrible at it, so Marcel will usually have a better RMSE than the Fans. R^2, on the other hand, is a better test of how well a system separates players from one another, because it ignores any bias that runs through all of a system’s forecasts. Marcel tends to score lower on this metric than other systems due to its rigid formula, whereas more sophisticated methods like ZIPS or Steamer tend to do better.
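To make the RMSE vs. R^2 distinction concrete, here is a minimal Python sketch. It is not the Stata code used for this review, the home run totals are made up, and it assumes R^2 is the squared correlation between forecast and actual (which is what a one-variable regression of actual on forecast gives). Adding a constant dose of optimism to every forecast makes the RMSE worse but leaves R^2 untouched.

```python
import math

def rmse(forecast, actual):
    """Root mean squared error between forecasts and outcomes."""
    n = len(actual)
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / n)

def r_squared(forecast, actual):
    """Squared Pearson correlation between forecasts and outcomes.

    Assumption: this matches the R^2 from regressing actual on forecast.
    """
    n = len(actual)
    mean_f = sum(forecast) / n
    mean_a = sum(actual) / n
    cov = sum((f - mean_f) * (a - mean_a) for f, a in zip(forecast, actual))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov ** 2 / (var_f * var_a)

# Hypothetical home run totals for five hitters (not data from this review).
actual = [12, 25, 31, 8, 19]
unbiased = [14, 22, 28, 11, 20]
# Same forecast, but uniformly optimistic about the run environment.
optimistic = [f + 5 for f in unbiased]

print("RMSE unbiased:  ", rmse(unbiased, actual))
print("RMSE optimistic:", rmse(optimistic, actual))
print("R^2 unbiased:   ", r_squared(unbiased, actual))
print("R^2 optimistic: ", r_squared(optimistic, actual))
```

The two forecasts order and space the players identically, so R^2 treats them as equally good; only RMSE charges the optimistic one for missing the overall level.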

Comparisons are based on the set of players that every system projected. This amounts to 70 pitchers and 141 hitters for 2014. This is certainly limiting, but there is an inherent tradeoff in the number of projection systems that can be analyzed vs. the number of players that are projected by all systems. My policy is to consider as many projection systems as possible, as long as the number of players doesn’t get too low.
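The common-sample rule can be sketched in a few lines of Python. The system names, player names, and numbers here are placeholders rather than anything from the actual data, and the real evaluation may match players differently (by ID, for example). The point is only that each added system can shrink the set of players everyone projected.

```python
def common_players(projections):
    """Return the sorted list of players projected by every system."""
    player_sets = [set(players) for players in projections.values()]
    return sorted(set.intersection(*player_sets))

# Hypothetical projections: system -> {player: projected home runs}.
projections = {
    "System 1": {"Player A": 20, "Player B": 15, "Player C": 30},
    "System 2": {"Player A": 18, "Player C": 27, "Player D": 9},
}
shared = common_players(projections)

# Adding a third system with thinner coverage shrinks the common sample.
projections["System 3"] = {"Player C": 29, "Player D": 11}
shared_after = common_players(projections)

print(shared)
print(shared_after)
```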

Now, on to the contest!

This year certainly saw some interesting results. By the R^2 metric, the best forecaster for hitters (Dan Rosenheck) only published forecasts for hitter categories; evidently there’s some benefit in specialization when it comes to projecting baseball players. The best pitcher forecasts came from Mike Podhorzer’s Pod forecasts. The best composite score came from my own personal forecast brew, which is computed by an algorithm that estimates weights for other main-line forecasts. In a sense, this is not an original forecast, so I now mark forecasts that I know use other forecasts as inputs with an “*” (I realize that to some degree, most everyone calibrates their forecasts to what they see other people doing). The next two forecasts, AggPro and Steamer/Razzball, are of this same type. The top “structural” forecast was Pod, followed by ZIPS, Rotovalue, and CBS.

In terms of RMSE, Dan Rosenheck ran away with the hitters, and my weighted average did the best among pitchers.  The top overall performers across categories were MORPS, Marcel, Rotovalue, and AggPro.

Overall, there are a few interesting comparisons to be made between projection systems across different years. Among the open-source stats community, Steamer vs. ZIPS is always interesting to watch. In prior years, Steamer has been better. This year, however, ZIPS made huge gains and beat Steamer. Marcel had a typical year, with a very favorable ranking on RMSE but not on R^2. The FanGraphs Fans had a down year, finishing near the bottom in most metrics. CBS Sportsline is the top forecast from a major media company; such forecasts, in general, tend to do poorly. Finally, most every projection submitted beat the naïve previous-season benchmark, where the 2014 forecast is simply the actual performance in 2013. At least we’re all doing something right.

Thank you again to all who submitted projections. I invite anyone who is interested to submit their top-line hitter and pitcher projections to me at larsonwd@gmail.com. Your projection will be put up on http://www.bbprojectionproject.com as soon as I receive it, unless you want me to embargo it until the end of the season, which some people choose to do because of fantasy baseball or other proprietary reasons. All the code (STATA) and data for these evaluations are available upon request. If I’m using the wrong versions of anyone’s projections (which can happen!), please let me know.

R^2 Rankings:

Place Forecast System    Hitters  Pitchers  Average
N/A   Dan Rosenheck*        1.60         -     1.60
N/A   Beans                    -      5.00     5.00
1st   Will Larson*          6.60      5.25     5.93
2nd   AggPro*               8.40      6.25     7.33
3rd   Steamer/Razzball*     6.20      9.00     7.60
4th   Pod                  11.20      4.75     7.98
5th   ZIPS                 10.00      7.25     8.63
6th   Rotovalue             9.00      8.25     8.63
7th   CBS Sportsline       10.20      8.00     9.10
8th   ESPN                  9.40     10.50     9.95
9th   Steamer               9.60     11.50    10.55
10th  Fangraphs Fans       13.60      9.00    11.30
11th  Rotochamp             7.60     15.25    11.43
12th  Razzball             11.60     12.25    11.93
13th  MORPS                13.20     11.00    12.10
14th  Clay Davenport       14.60     11.50    13.05
15th  Cairo                 8.20     18.00    13.10
16th  Marcel               16.60     10.00    13.30
17th  Bayesball             9.80     20.50    15.15
18th  Guru                 16.80     14.00    15.40
19th  Oliver               16.40     15.00    15.70
20th  Prior Season         20.40     18.75    19.58
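The scores in the rankings table above are average ranks, not raw statistics. The formula isn’t spelled out in this post, but the numbers are consistent with the following: rank every system within each category, average a system’s category ranks to get its Hitters and Pitchers scores, then average those two. Here is a rough Python sketch under that assumption (again, not the actual Stata code). The function names are mine, and the tie handling (tied systems share a rank and the next rank is skipped) is inferred from the detail tables further down.

```python
def competition_rank(values, higher_is_better):
    """Rank values so that 1 is best; ties share a rank and the next is skipped."""
    ranks = []
    for v in values:
        if higher_is_better:
            better = sum(1 for other in values if other > v)
        else:
            better = sum(1 for other in values if other < v)
        ranks.append(1 + better)
    return ranks

def mean(xs):
    return sum(xs) / len(xs)

# For RMSE lower is better; for R^2 higher is better.
# Four stolen-base RMSE values, two of them tied:
sb_ranks = competition_rank([6.94, 6.33, 6.94, 7.04], higher_is_better=False)

# Will Larson's R^2 category ranks, copied from the detail tables in this post.
hitter_ranks = [10, 8, 5, 6, 4]   # R, HR, RBI, AVG, SB
pitcher_ranks = [3, 2, 11, 5]     # W, ERA, WHIP, SO

hitters = mean(hitter_ranks)      # shown as 6.60 in the table
pitchers = mean(pitcher_ranks)    # shown as 5.25 in the table
average = (hitters + pitchers) / 2  # shown as 5.93 in the table

print(sb_ranks, hitters, pitchers, average)
```

Note that a system with only hitter or only pitcher forecasts gets a single score and no Place, which is why Dan Rosenheck and Beans are listed as N/A.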

RMSE Rankings:

Place System             Hitters  Pitchers  Average
N/A   Dan Rosenheck*        1.40         -     1.40
1st   MORPS                 4.20      8.50     6.35
N/A   Beans                    -      6.50     6.50
2nd   Marcel                8.00      7.00     7.50
3rd   Rotovalue             8.60      7.00     7.80
4th   AggPro*               7.60      8.25     7.93
5th   ZIPS                  9.60      7.75     8.68
6th   Clay Davenport        6.60     10.75     8.68
7th   Steamer               7.80     11.00     9.40
8th   Cairo                 4.80     14.00     9.40
9th   Steamer/Razzball*     9.80     10.00     9.90
10th  Will Larson*         15.60      4.75    10.18
11th  Guru                  7.80     13.00    10.40
12th  Rotochamp            10.20     11.50    10.85
13th  Bayesball             7.20     15.25    11.23
14th  Pod                  15.80      8.75    12.28
15th  Razzball             16.20     13.00    14.60
16th  Oliver               14.40     15.25    14.83
17th  ESPN                 18.40     11.50    14.95
18th  CBS Sportsline       17.40     13.50    15.45
19th  Fangraphs Fans       19.40     13.25    16.33
20th  Prior Season         20.00     20.50    20.25

RMSE, Hitters:

System R Rank HR Rank RBI Rank AVG Rank SB Rank Mean Rank
Dan Rosenheck* 19.22 1 7.07 1 20.91 1 0.024 2 6.24 2 1.40
MORPS 20.56 2 7.70 3 22.35 2 0.027 13 6.13 1 4.20
Cairo 21.55 3 7.87 6 22.53 3 0.025 9 6.30 3 4.80
Clay Davenport 21.91 6 7.92 7 23.74 8 0.025 8 6.33 4 6.60
Bayesball 22.47 9 8.24 10 24.03 10 0.022 1 6.39 6 7.20
AggPro* 22.64 12 8.23 9 23.34 6 0.024 3 6.42 8 7.60
Steamer 22.58 10 8.22 8 23.37 7 0.025 7 6.41 7 7.80
Guru 22.62 11 7.74 4 23.76 9 0.025 6 6.88 9 7.80
Marcel 21.67 4 7.62 2 22.76 4 0.027 16 7.04 14 8.00
Rotovalue 22.03 7 7.77 5 23.02 5 0.026 10 7.07 16 8.60
ZIPS 22.11 8 8.46 11 25.30 14 0.024 4 6.94 11 9.60
Steamer/Razzball* 23.87 13 8.73 13 24.75 13 0.024 5 6.35 5 9.80
Rotochamp 21.73 5 8.49 12 24.60 12 0.026 12 6.93 10 10.20
Oliver 24.67 16 9.26 18 26.86 16 0.026 11 6.94 11 14.40
Will Larson* 24.88 17 8.75 14 24.37 11 0.029 19 7.08 17 15.60
Pod 24.23 14 9.10 16 26.54 15 0.035 21 7.04 13 15.80
Razzball 24.57 15 8.90 15 27.45 19 0.027 14 7.14 18 16.20
CBS Sportsline 26.28 19 9.94 21 26.90 17 0.027 15 7.06 15 17.40
ESPN 25.88 18 9.88 20 27.25 18 0.028 17 7.32 19 18.40
Fangraphs Fans 27.20 21 9.24 17 28.98 21 0.029 18 7.62 20 19.40
Prior Season 26.56 20 9.39 19 28.77 20 0.033 20 7.84 21 20.00

R^2, Hitters:

System R Rank HR Rank RBI Rank AVG Rank SB Rank Mean Rank
Dan Rosenheck* 0.267 1 0.329 1 0.181 1 0.373 2 0.679 3 1.60
Steamer/Razzball* 0.143 12 0.270 5 0.150 8 0.325 5 0.689 1 6.20
Will Larson* 0.162 10 0.263 8 0.165 5 0.320 6 0.676 4 6.60
Rotochamp 0.227 2 0.268 7 0.127 15 0.293 9 0.675 5 7.60
Cairo 0.166 7 0.259 10 0.165 4 0.288 12 0.659 8 8.20
AggPro* 0.129 15 0.269 6 0.141 11 0.352 3 0.660 7 8.40
Rotovalue 0.164 8 0.272 3 0.167 2 0.278 14 0.574 18 9.00
ESPN 0.166 6 0.253 12 0.166 3 0.273 16 0.656 10 9.40
Steamer 0.130 14 0.260 9 0.135 12 0.317 7 0.661 6 9.60
Bayesball 0.144 11 0.235 17 0.148 9 0.424 1 0.655 11 9.80
ZIPS 0.180 4 0.244 14 0.124 16 0.347 4 0.652 12 10.00
CBS Sportsline 0.162 9 0.243 15 0.151 7 0.266 18 0.682 2 10.20
Pod 0.183 3 0.271 4 0.128 14 0.111 21 0.641 14 11.20
Razzball 0.128 16 0.281 2 0.159 6 0.256 19 0.639 15 11.60
MORPS 0.174 5 0.217 19 0.132 13 0.288 13 0.636 16 13.20
Fangraphs Fans 0.103 19 0.255 11 0.116 18 0.289 11 0.657 9 13.60
Clay Davenport 0.134 13 0.237 16 0.143 10 0.271 17 0.622 17 14.60
Oliver 0.065 21 0.223 18 0.101 20 0.289 10 0.648 13 16.40
Marcel 0.119 17 0.250 13 0.122 17 0.275 15 0.515 21 16.60
Guru 0.118 18 0.210 20 0.109 19 0.311 8 0.555 19 16.80
Prior Season 0.094 20 0.206 21 0.093 21 0.197 20 0.525 20 20.40

RMSE, Pitchers:

System W Rank ERA Rank WHIP Rank SO Rank Mean Rank
Will Larson* 4.77 2 0.992 6 0.148 10 56.62 1 4.75
Beans 4.82 4 0.983 3 0.148 11 58.88 8 6.50
Marcel 4.90 8 1.003 11 0.143 4 57.93 5 7.00
Rotovalue 4.83 6 0.978 2 0.151 17 57.26 3 7.00
ZIPS 5.06 15 0.965 1 0.139 1 60.06 14 7.75
AggPro* 4.94 9 0.992 7 0.144 7 59.18 10 8.25
MORPS 4.71 1 1.026 18 0.149 13 56.69 2 8.50
Pod 4.82 5 0.995 10 0.144 8 59.75 12 8.75
Steamer/Razzball* 4.89 7 1.004 12 0.150 15 58.20 6 10.00
Clay Davenport 4.78 3 1.015 15 0.148 12 59.80 13 10.75
Steamer 4.94 10 1.006 14 0.150 16 57.89 4 11.00
ESPN 5.40 18 0.994 8 0.141 3 63.31 17 11.50
Rotochamp 5.04 14 0.989 4 0.145 9 64.18 19 11.50
Razzball 5.25 17 0.990 5 0.149 14 62.89 16 13.00
Guru 4.96 12 1.055 19 0.144 6 61.96 15 13.00
Fangraphs Fans 5.56 20 1.005 13 0.141 2 64.09 18 13.25
CBS Sportsline 5.47 19 0.995 9 0.143 5 67.18 21 13.50
Cairo 4.96 11 1.022 17 0.170 21 58.76 7 14.00
Oliver 5.12 16 1.019 16 0.151 18 59.73 11 15.25
Bayesball 5.04 13 1.082 20 0.163 19 59.11 9 15.25
Prior Season 5.64 21 1.157 21 0.169 20 64.99 20 20.50

R^2, Pitchers:

System W Rank ERA Rank WHIP Rank SO Rank Mean Rank
Pod 0.229 1 0.174 9 0.302 5 0.134 4 4.75
Beans 0.184 5 0.196 3 0.269 10 0.136 2 5.00
Will Larson* 0.194 3 0.199 2 0.269 11 0.133 5 5.25
AggPro* 0.190 4 0.190 6 0.287 7 0.121 8 6.25
ZIPS 0.137 12 0.207 1 0.331 2 0.102 14 7.25
CBS Sportsline 0.222 2 0.176 8 0.330 3 0.079 19 8.00
Rotovalue 0.158 9 0.183 7 0.242 16 0.179 1 8.25
Fangraphs Fans 0.122 16 0.161 13 0.372 1 0.125 6 9.00
Steamer/Razzball* 0.167 8 0.192 4 0.254 14 0.111 10 9.00
Marcel 0.137 13 0.146 14 0.302 6 0.122 7 10.00
ESPN 0.146 11 0.171 11 0.309 4 0.101 16 10.50
MORPS 0.181 6 0.112 18 0.236 17 0.134 3 11.00
Steamer 0.128 15 0.192 5 0.254 13 0.104 13 11.50
Clay Davenport 0.177 7 0.120 15 0.252 15 0.117 9 11.50
Razzball 0.154 10 0.174 10 0.257 12 0.097 17 12.25
Guru 0.115 17 0.106 19 0.281 9 0.109 11 14.00
Oliver 0.133 14 0.119 16 0.225 18 0.107 12 15.00
Rotochamp 0.079 20 0.170 12 0.283 8 0.037 21 15.25
Cairo 0.115 18 0.118 17 0.178 19 0.097 18 18.00
Prior Season 0.088 19 0.028 21 0.164 20 0.102 15 18.75
Bayesball 0.077 21 0.103 20 0.159 21 0.060 20 20.50

13 Comments
Rudy Gamble
9 years ago

Thanks Will for adding us in for the re-run.

One note on the Steamer/Razzball projections. I’d argue they are original. Steamer is just projecting rate stats and the two variations for 2015 are: 1) FanGraphs depth charts for playing time and 2) Razzball playing time. I make an adjustment on RHP/LHP PA % but I’d argue that the ‘structural’ label is best served for aggregates.

Thanks again!

Will
9 years ago

I see what you’re saying, but I’d counter that what you’re doing is still a combination of forecasts. Yours is X = (X_steamer / PA_steamer) * PA_razzball, whereas a strict weighted average is something like X = 0.4 * X_steamer + 0.6 * X_zips.

I also want to emphasize that forecast combination is a tried and true method of improving forecasts, and I think it is completely legit. The goal here is to predict what players will do before the season starts, full stop. So kudos for doing a great job this year.

Rudy Gamble
9 years ago
Reply to  Will

Fair enough. Thanks again for including us in the re-run!

willclark
9 years ago

At this point I assume most people in my league do some type of z-score/replacement from Steamer, Zips, or a hybrid. What’s your opinion on creating zscores by category for those projection engines that ranked highest here? Could that result in modest overall accuracy, or is it a fool’s errand?

Will
9 years ago

I’m assuming that by zscore/replacement, you mean you calculate zscores by category then re-base the zscore to be equal to zero for the replacement level. That’s the method that I use and it usually works quite well (some categories are more variable from year to year, so I deflate certain zscores, but that’s another article).

If I’m interpreting your question right, you’re suggesting using zscores from the top performer in Wins (Pod) then using the zscores from the top performer in ERA (Zips), and so on. I think this would help a bit for sure.

You may be able to improve on this by computing a simple average of all the forecasts you can get your hands on, say, in the top half. Then do the zscore of that average. This would be a simpler (but similar) version of AggPro, but would probably do almost as well.

willclark
9 years ago
Reply to  Will

Yes, you interpreted it correctly. Thanks, I’m off to spend a few hours in Excel trying this. I appreciate it.

jss
9 years ago

For hitters, can you add OPS?

Also I think there is a big difference between rate and counting stats, as some systems don’t really try to predict actual playing time.

Will
9 years ago
Reply to  jss

I’ve been asked that question a number of times, and right now, I’m sticking with top-line statistics to keep things simple. If it turns out that everyone is projecting OBP/SLG/OPS, then I may consider adding it for 2015 and evaluating it at the end of the season.

evo34
9 years ago
Reply to  jss

I agree. There should be two analyses: one that strictly measures playing time accuracy, and one that measures all stats as rates (per PA/IP).

Jared
9 years ago

Thanks Will! I have two questions about composite forecasts.

1. Are yours coming out soon?

2. If I’m creating my own, is there any reason not to divide all stats by their projection’s PA first (rather than averaging over the raw stats)?

Will
9 years ago
Reply to  Jared

1. I’ll be releasing mine on the last day before the start of the season. My projections use others as inputs, so I want my inputs to be as good as they can be!

2. That is a very good question, and one that I can’t answer. Academic forecasting research tends to suggest that forecasting an aggregate directly is better than disaggregating, forecasting the pieces, and then re-aggregating, but I can’t say for sure when it comes to baseball. Give both a shot, send them in to me, and I’ll run them at the end of the season to see how they both do.

cdispoto
9 years ago

I see no indication of 2015 models for many of the projection systems mentioned: BEANS, MORPS, Marcel, Oliver, etc. Are there any updates on these?

Josephrandall
8 years ago

Does anyone know how Baseball HQ compares? I am looking for a projection system to use for Runs and RBI throughout the season. I have been using a combo of Steamer/Zips but would like to get a bit more accurate.