FanGraphs Baseball - Comments on How Good Is That Projection?


Cool. Some good stuff in here that I haven’t read/seen previously. That factor of .7 thing is pretty interesting too.

When will we get park factored wOBA (the one used on this site that includes baserunning etc) on the projection page? This information could be useful.

Comment by bikozu — March 17, 2009 @ 3:51 am

Reminds me of one of the points I need to lead the comparative article with: Oliver's wOBAs are park adjusted, representing the player in a neutral park. Each batting component (SI, DO, TR, HR, BB, SO) is normalized by ballpark, league and age, the components are reassembled into a batting line, and the rate stats, including wOBA, are then recalculated from the adjusted line. To the best of my knowledge, ZiPS and CHONE are adjusted to the player's home park at the time the projection was made, and Marcel does not park adjust at all. Oliver will therefore have lower wOBAs than ZiPS or CHONE for players in Colorado, Milwaukee, Philadelphia, Cincinnati, etc. I am also working on a new formula for HR factors which will not dock the premier power hitters as much when they play in a small park, but will dock the medium- and lower-power hitters more.

Comment by Brian Cartwright — March 17, 2009 @ 11:05 am
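The component-by-component park adjustment Brian describes can be sketched roughly as follows. This is only an illustration: the linear weights approximate published Tango-style wOBA weights of that era (HBP, reached-on-error, etc. omitted for brevity), and the park factors are invented values, not Oliver's actual factors.

```python
# Simplified wOBA weights (approximate published linear weights;
# HBP, reached-on-error, etc. omitted for brevity).
WEIGHTS = {"BB": 0.72, "1B": 0.90, "2B": 1.24, "3B": 1.56, "HR": 1.95}

def neutralize(components, pa, park_factors):
    """Divide each counting component by its (hypothetical) park factor,
    then recompute wOBA from the park-neutral batting line."""
    adj = {k: v / park_factors.get(k, 1.0) for k, v in components.items()}
    return adj, round(sum(WEIGHTS[k] * adj[k] for k in adj) / pa, 3)

# A hitter in a homer-friendly park (an HR factor of 1.15 inflates observed HR):
line = {"BB": 60, "1B": 100, "2B": 30, "3B": 3, "HR": 35}
adj, neutral_woba = neutralize(line, pa=600, park_factors={"HR": 1.15, "2B": 1.05})
_, raw_woba = neutralize(line, pa=600, park_factors={})
print(raw_woba, neutral_woba)  # the neutral-park wOBA comes out lower
```

This is why Oliver's neutral-park wOBAs run lower than home-park-adjusted projections for hitters in favorable parks: the same observed batting line, deflated component by component, reassembles into a lower rate stat.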

> What we have here is an equation x – y = z. We have worked to reduce the mean error of x in order to reduce z – but y is still the same. The result of a computation is only as accurate as its least accurate input. No matter how perfect any one projection system is, as long as its mean error is less than that of comparing any two consecutive seasons, that increased accuracy will be masked by the noise inherent in a single season measurement.

This statement seems ridiculous to me. If you have x – y = z, where x is year 1 performance and y is year 2 performance, and you know the error range for a single year's performance is +-10, then the error range for z is going to be +-20. If you instead use multiple years, adjusted for aging, so that x more closely represents true talent, and you are successful so that the error range of x is decreased to +-5, then the error range of z is going to be +-15. The actual mean error will depend on the distribution curves of each, but you should end up with the mean error of the second example being about 75% of the first. That you did not is not a result of "the increased accuracy being masked by the noise inherent in a single season measurement." It is a result of the particular manipulations you performed to build your projection having failed to produce any better predictor than the previous year's stats.

Comment by Peter Jensen — March 17, 2009 @ 2:07 pm
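Peter's error-propagation argument can be checked with a quick Monte Carlo sketch. The assumptions here are mine, not from either comment: independent Gaussian errors on x and y, with the +-10 and +-5 ranges treated as standard deviations.

```python
import math
import random

random.seed(0)
N = 50_000

def rms(xs):
    return math.sqrt(sum(v * v for v in xs) / len(xs))

def diff_error(sd_x, sd_y):
    # Error of z = x - y when x and y carry independent Gaussian errors.
    return rms([random.gauss(0, sd_x) - random.gauss(0, sd_y) for _ in range(N)])

base = diff_error(10, 10)    # single-season x vs single-season y
better = diff_error(5, 10)   # improved x (multi-year, age-adjusted) vs same y
# Independent errors combine in quadrature, so the analytic ratio is
# sqrt(5**2 + 10**2) / sqrt(10**2 + 10**2), roughly 0.79.
print(round(better / base, 2))
```

Under these assumptions the improvement is closer to 79% than 75%, but the qualitative point is the same: shrinking the error in x only partially shrinks the error in z, because the single-season noise in y is untouched.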

What I did not explicitly state was that in the first two charts, when comparing two rates, the sample size is the lower of the two. So it was 350 or more PA in year 1 compared to 325-375 PA in year 2, then 400 or more PA in year 1 compared to 375-425 PA in year 2, etc. It is standard procedure to express the results by the lower of the two sample sizes. General matched-pairs methodology does the same: the counting stats of the larger sample are scaled down to the size of the smaller sample, or to the harmonic mean of the two sample sizes, which will not be much larger than the smaller. This is because the unreliability of the smaller of the two samples will determine the outcome. I was saving the comparison of projections for a later article, but I could have shown that ZiPS and PECOTA gave just about the same results as in chart 3. For the various subgroup tests that I have run, the rms error is always just about .030 wOBA, regardless of which of the three projections is used. Therefore, I've concentrated on total error, looking for any biases.

Comment by Brian Cartwright — March 17, 2009 @ 5:44 pm
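The matched-pairs scaling Brian describes (scale the larger sample's counting stats down to the smaller sample size, or to the harmonic mean of the two) looks like this in miniature. The 600/400 PA seasons and the hit total are made-up numbers for illustration.

```python
def harmonic_mean(n1, n2):
    # Harmonic mean of two sample sizes; dominated by the smaller one.
    return 2 * n1 * n2 / (n1 + n2)

def scale_counting_stat(stat, n_from, n_to):
    # Scale a counting stat from sample size n_from to sample size n_to.
    return stat * n_to / n_from

n1, n2 = 600, 400
h = harmonic_mean(n1, n2)
print(h)  # 480.0, not much larger than the smaller sample (400)

# Scale 180 hits in the 600 PA season down to the matched size:
print(scale_counting_stat(180, n1, h))  # 144.0
```

Scaling both seasons to a common size keeps the comparison honest: the noisier, smaller sample sets the effective reliability of the pair either way.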

Brian – Perhaps I have misunderstood what you are doing. Could you explain again exactly what you are comparing in chart 2 and how it differs from chart 3?

Comment by Peter Jensen — March 18, 2009 @ 12:52 am

Are Oliver projections available for previous seasons? If so, where can I find these?

Comment by Ben — March 27, 2009 @ 1:07 pm

Oliver projections for previous seasons have not been posted on the Internet, but I can run them for all the players currently in my database, generally back to 1998. The further back you go, however, the smaller the percentage of players represented. Soon (weeks) I will have a complete set of minor league stats from 2005-2008 (GameDay data), and will add that to the complete major league records for 1998-2008, with scattered records before those dates.

I will be glad to send you an email with projections for previous seasons.

Comment by Brian Cartwright — March 27, 2009 @ 1:33 pm