Projection vs Projection

It’s almost opening day, and it seems like everyone is talking about projections.

When considering a projection, there are really two questions to be answered – what is the player’s “True Talent Level” right now, and how will he perform next year? Between now and the end of next year, his talent level very well might change, as he’s a year older and might recover from or succumb to injuries. Even then, there’s still the random variance of a single season performance. In this article I’d like to explore how some of the major projection systems work when predicting different subgroups of players.

I tested the following projections: PECOTA (2006-2009), ZiPS (2006-2009), CHONE (2007-2009) and my own Oliver (2006-2009).

By wOBA

The first test was to group the yearly projections to the nearest .010 of wOBA, and then see how that group of players actually performed. There were 468 players who had projections from all four systems and had at least 350 plate appearances in the major leagues in the following season. As 2009 is yet to be played, and CHONE is not available for 2006, these comparisons of projections against next-season results cover the 2007 and 2008 seasons. All four projections were tested on the same 468 players. The observed results were unadjusted major league stats, so that the results of the test would not be influenced by which park factors or MLE formulas I chose to normalize stats.

To read the results, take the .380 row. Among the players CHONE projected for a wOBA between .375 and .385 (averaging .380), 25 had 350 or more PAs in MLB in the following season, and those 25 players had an average wOBA of .363, so at that level CHONE was .017 high. Oliver was .008 high on 21 projections, PECOTA .027 high on 26, and ZiPS .014 high on 26. A positive error means the projection was too high, a negative error too low. The last line of the table shows the root mean square error (weighted by number of players). Oliver had the lowest rms error at .006, followed by CHONE at .011 and PECOTA and ZiPS at .012 each.

wOBA    CHONE             Oliver            PECOTA            ZiPS
Obs     Players   Error   Players   Error   Players   Error   Players   Error
0.250         0   0.000         0   0.000         0   0.000         1  -0.067
0.260         0   0.000         0   0.000         1  -0.041         1  -0.018
0.270         2  -0.057         1   0.001         3  -0.013         1  -0.043
0.280         2  -0.018         4  -0.036         2  -0.045         4  -0.022
0.290         8  -0.033         9  -0.017        11  -0.030        13  -0.020
0.300        14  -0.005        23  -0.010        20  -0.013        20  -0.012
0.310        29  -0.006        33  -0.002        31  -0.007        19   0.003
0.320        44  -0.005        53  -0.005        37   0.002        51   0.000
0.330        74   0.004        81  -0.002        58   0.003        56   0.000
0.340        91   0.000        87  -0.003        66   0.004        66   0.002
0.350        57   0.004        68   0.001        80  -0.004        74   0.001
0.360        50   0.009        48  -0.003        56   0.011        55   0.012
0.370        34   0.011        21  -0.004        33   0.012        36   0.012
0.380        25   0.017        21   0.008        26   0.027        26   0.014
0.390         9   0.003        10  -0.002        17   0.014        19   0.020
0.400        13   0.019         5   0.020        15   0.019         7   0.017
0.410         7   0.017         2   0.011         5   0.017         4   0.019
0.420         4   0.037         1  -0.049         5   0.027         6   0.029
0.430         2   0.047         1   0.001         1  -0.035         5   0.041
0.440         2  -0.009         0   0.000         1   0.018         3   0.023
0.450         1   0.025         0   0.000         0   0.000         1   0.026
rms         468   0.011       468   0.006       468   0.012       468   0.012
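
For readers who want to try the grouping themselves, here is a minimal Python sketch of the test as described: bucket each projection to the nearest .010 of wOBA, average projected minus observed within each bucket, then take a root mean square weighted by players. The function names are my own illustration, not code from any of the projection systems, and the exact formula behind the rms row is an assumption. The Oliver rows are copied from the table.

```python
from collections import defaultdict
from math import sqrt

def bucket_errors(pairs, width=0.010):
    """Group (projected, observed) wOBA pairs by projected wOBA rounded to the
    nearest `width`. Returns {bucket: (players, mean projected-minus-observed)}."""
    groups = defaultdict(list)
    for projected, observed in pairs:
        bucket = round(round(projected / width) * width, 3)
        groups[bucket].append(projected - observed)
    return {b: (len(e), sum(e) / len(e)) for b, e in sorted(groups.items())}

def weighted_rms(rows):
    """Root mean square of bucket errors, weighting each bucket by its player count."""
    rows = list(rows)
    total = sum(n for n, _ in rows)
    return sqrt(sum(n * err * err for n, err in rows) / total)

# (players, error) for Oliver, copied from the table; empty buckets omitted.
OLIVER = [(1, 0.001), (4, -0.036), (9, -0.017), (23, -0.010), (33, -0.002),
          (53, -0.005), (81, -0.002), (87, -0.003), (68, 0.001), (48, -0.003),
          (21, -0.004), (21, 0.008), (10, -0.002), (5, 0.020), (2, 0.011),
          (1, -0.049), (1, 0.001)]

print(round(weighted_rms(OLIVER), 3))
```

Run on the Oliver column, this comes out at about .006, in line with the rms row. With raw data, `weighted_rms(bucket_errors(pairs).values())` would run the whole test on a list of (projected, observed) pairs.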

By Age

The same 468 players, same rules, but now the players are grouped by age. The combined rms error is about the same for all, at .007 for Oliver and .008 for the other three. CHONE and ZiPS are a few points of wOBA high for most ages. Oliver underprojects younger (pre-peak) players by .005-.010 points of wOBA, and overprojects older players by about the same amount. PECOTA is the opposite, being a little high for the younger players and a little low for the older ones. Oliver shows the smallest overall bias, at -.002, but because its error correlates with age, Oliver also shows the highest r2 (squared correlation between age and error) at .206, measured over ages 21-35, which have 12 or more players each.

Age   Players      PA   CHONE  Oliver  PECOTA    ZiPS
19          1     411  -0.004  -0.007   0.016  -0.022
20          3    1485   0.017   0.022   0.026   0.014
21         12    6587   0.003  -0.002   0.005   0.006
22         23   12205   0.002  -0.006   0.011   0.004
23         38   21423   0.001  -0.009   0.001   0.001
24         36   20677  -0.002  -0.009   0.002   0.002
25         37   20538   0.000  -0.006   0.001   0.002
26         39   21891   0.011   0.005   0.014   0.016
27         44   23580   0.003  -0.004   0.007   0.007
28         35   19038  -0.010  -0.011  -0.008  -0.005
29         34   17434   0.010   0.001   0.006   0.008
30         32   18491   0.007  -0.006   0.003   0.004
31         37   19013   0.020   0.008   0.011   0.015
32         24   13975   0.002  -0.004  -0.004   0.000
33         18    9702   0.003   0.004  -0.001   0.003
34         17    8545   0.005   0.004   0.012   0.014
35         13    7063  -0.001  -0.003  -0.003  -0.002
36          7    3714   0.000   0.001  -0.004   0.001
37          5    2295   0.007   0.010  -0.011   0.005
38          5    2580  -0.010   0.009   0.008   0.000
39          6    2699   0.009   0.008   0.023   0.007
40          1     548   0.026   0.031   0.037   0.036
41          1     434   0.016  -0.061  -0.005   0.011
rms       468  254328   0.008   0.007   0.008   0.008
bias      468  254328   0.004  -0.002   0.004   0.005
r2        468  254328   0.031   0.206   0.037   0.033
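
The bias and r2 rows can be approximated from the age rows themselves. The sketch below assumes bias is the player-weighted mean error and r2 is the squared Pearson correlation between age and error, unweighted, over the ages with 12 or more players. Those formulas are my reading of the table, not something spelled out above, and the function names are illustrative. The Oliver column is copied from the table.

```python
# (age, players, Oliver error) copied from the By Age table.
OLIVER_BY_AGE = [
    (19, 1, -0.007), (20, 3, 0.022), (21, 12, -0.002), (22, 23, -0.006),
    (23, 38, -0.009), (24, 36, -0.009), (25, 37, -0.006), (26, 39, 0.005),
    (27, 44, -0.004), (28, 35, -0.011), (29, 34, 0.001), (30, 32, -0.006),
    (31, 37, 0.008), (32, 24, -0.004), (33, 18, 0.004), (34, 17, 0.004),
    (35, 13, -0.003), (36, 7, 0.001), (37, 5, 0.010), (38, 5, 0.009),
    (39, 6, 0.008), (40, 1, 0.031), (41, 1, -0.061),
]

def weighted_bias(rows):
    """Mean error across all ages, weighting each age by its player count."""
    total = sum(n for _, n, _ in rows)
    return sum(n * err for _, n, err in rows) / total

def r_squared(rows, min_players=12):
    """Squared Pearson correlation of age vs. error, using only ages with at
    least `min_players` players (ages 21-35 in this table)."""
    kept = [(age, err) for age, n, err in rows if n >= min_players]
    mean_age = sum(a for a, _ in kept) / len(kept)
    mean_err = sum(e for _, e in kept) / len(kept)
    sxy = sum((a - mean_age) * (e - mean_err) for a, e in kept)
    sxx = sum((a - mean_age) ** 2 for a, _ in kept)
    syy = sum((e - mean_err) ** 2 for _, e in kept)
    return sxy * sxy / (sxx * syy)

print(round(weighted_bias(OLIVER_BY_AGE), 3))
print(round(r_squared(OLIVER_BY_AGE), 3))
```

On the rounded figures in the table, both values land close to the published -.002 and .206, which suggests the assumed formulas are at least near what was used. A low bias paired with a high r2 is what the text describes: errors that cancel out overall but trend with age.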

In the final part of this series, I’ll look at how minor league performances are evaluated.





Brian got his start in amateur baseball, as the statistician for his local college summer league in Johnstown, Pa, which also hosts the annual All-American Amateur Baseball Association. A longtime APBA and Strat-o-Matic player, he still tends to look at everything as a simulation. He has also written for StatSpeak and SeamHeads. You can contact him at brian.cartwright2@verizon.net

19 Comments
Paul Scott
15 years ago

Just a few suggestions:

I think two other “projection” systems should always be added to these “projection surveys.” The first is just a mean of each player’s last three years. This would serve as a baseline test. Secondly, Marcels should be included, given its simplicity. My feeling on this one is that if you can’t do better than Marcels by a reasonable degree then you really need to evaluate whether it is worth your time to do the projection.

Finally, though I understand this is much harder, I like the look you are taking on your age correlation. Since all of these systems are very likely to be close in overall accuracy, the most interesting and meaningful factor is bias. You should consider breaking out more categories – a recent (if dense) study (linked to from and discussed on The Book Blog) indicated PECOTA likely has a bias overvaluing speed. I’d like to see a lot more done in evaluating biases of projection systems as this sort of thing could lead to an understanding of some real effect in baseball.