Projection Systems

In an attempt to predict the future, projection systems use a player’s past statistics, age, home ballpark, and other variables to estimate how well a player will perform in the upcoming season. Since projections cannot account for luck and random variation, they are never 100% accurate. Instead, these systems are best viewed as an estimate of a player’s current, underlying true talent level. If a player is projected for a .370 wOBA, that suggests he has the talent level of a .370 wOBA hitter. It’s therefore likely that he will finish with a wOBA somewhere around that mark, but it’s not a certainty.

There are several projection systems worth knowing about. Each system uses slightly different inputs and weights, so you will see variation from one system to the next. Here are the major systems that you should know, including general pointers on each:

● Marcel - Developed by Tom Tango, Marcel is a simple projection system that is still quite reliable. I’ll let Tango do the explaining:

“The Marcel the Monkey Forecasting System (or the Marcels for short) is the most advanced forecasting system ever conceived.  Not.  Actually, it is the most basic forecasting system you can have, that uses as little intelligence as possible. So, that’s the allusion to the monkey. It uses 3 years of MLB data, with the most recent data weighted heavier. It regresses towards the mean. And it has an age factor.”

Theoretically, projections that do more work than Marcel (like ZiPS, Bill James, CAIRO, Oliver, PECOTA) should be more accurate, but in practice the other systems have improved on it by only a small margin. Even though it is very basic, the Marcel system is still quite accurate and serves as a good reference point when looking at other projections. It can be found on FanGraphs, and you can also calculate your own quick-and-dirty Marcel projections using this calculator.

● Bill James - Created by Baseball Info Solutions, the Bill James projections use at most eight seasons of data per player, with a strong focus on the previous three. While the exact methodology is proprietary, the Bill James projections are based on past performance, age, home park, and expected playing time. These projections tend to be the most optimistic of all the major systems, especially with young players. They can also be found on FanGraphs.

● ZiPS - The work of Dan Szymborski over at Baseball Think Factory, the ZiPS projections use weighted averages of four years of data (three if a player is very old or very young), regress pitchers based on DIPS theory and BABIP rates, and adjust for aging by looking at similar players and their aging trends. It’s an effective projection system, and it is displayed at FanGraphs for both off-season and in-season projections.

● Fans - During the off-season following the 2009 season, FanGraphs began the Fan projections, which rely upon a “wisdom of the crowds” approach to evaluating a player. Fans are asked to fill out ballots on various players, rating how they expect those players to perform in the upcoming season. Ballots are then compiled and averaged for each player, giving us their Fan projection. These projections are normally quite optimistic, but in some cases they can add real value for players who may follow an unusual career path. They’re also a good way to estimate a player’s potential playing time, which is a variable that most projection systems struggle with. These can, obviously, be found on FanGraphs.

● Oliver - This system was created by Brian Cartwright and is available over at The Hardball Times. It’s a comparatively simple projection system – using weighted averages of the past three seasons of data, and adjusting for aging and regression – but it calculates its major league equivalencies (MLEs) in a different way than most systems, taking the raw numbers and adjusting them based on park and league. Since most projection systems simply try to adjust for the transition between each minor-league level, Oliver’s projections tend to be better at showing how young players will perform at the major league level. This is also the only projection system to include a fielding and WAR component.

● CAIRO - A system developed by the folks at Revenge of the RLYW, the CAIRO system starts with a basic Marcel projection model, but then includes minor league statistics, adjusts for park and league effects, adjusts the aging curve depending upon the statistic, takes age and position into account when regressing a player’s performance, and uses four years of data instead of three. These projections are then put into the Diamond Mind simulator, and team projections are estimated using the results of 50,000 simulations.

● PECOTA - Developed by Nate Silver and Baseball Prospectus, PECOTA is one of the more complicated projection models, using a player’s statistics and historical statistics of similar ballplayers to arrive at a projection. PECOTA also does projections on a team level and creates a list of comparable historical players for each projection. You can find PECOTA at Baseball Prospectus.

● CHONE - Developed by Sean Smith, this system used four years of data for hitters and three years for pitchers. It adjusted for park, league, and aging effects, and it also used batted ball data and minor league statistics. CHONE was widely considered one of the most accurate projection systems, but it is no longer available to the public.
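
To make the Marcel recipe described above a little more concrete, here is a minimal Python sketch of its three steps: weight the last three seasons with the most recent counting most, regress toward the league mean, and apply an age factor. The specific numbers (the 5/4/3 season weights, the 1,200 plate appearances of league-average performance, the peak age of 29, and the per-year age steps) are illustrative assumptions rather than figures taken from this article, and the `Season` class and `marcel_style_rate` function are hypothetical names. Treat it as a rough sketch of the idea, not as Tango's actual implementation.

```python
from dataclasses import dataclass

# Illustrative constants. These are assumptions for the sketch,
# not values stated in the article.
SEASON_WEIGHTS = (5, 4, 3)  # most recent season first
REGRESSION_PA = 1200        # plate appearances of league-average play blended in
PEAK_AGE = 29               # age at which no adjustment is applied
YOUNG_STEP = 0.006          # per-year boost for players younger than PEAK_AGE
OLD_STEP = 0.003            # per-year penalty for players older than PEAK_AGE


@dataclass
class Season:
    pa: int             # plate appearances in that season
    rate: float         # the player's rate stat that season (e.g. wOBA)
    league_rate: float  # league average of the same stat that season


def marcel_style_rate(seasons, age):
    """Project a rate stat from up to three seasons, most recent first."""
    weighted_pa = 0.0
    weighted_value = 0.0
    weighted_league = 0.0
    for weight, season in zip(SEASON_WEIGHTS, seasons):
        weighted_pa += weight * season.pa
        weighted_value += weight * season.pa * season.rate
        weighted_league += weight * season.pa * season.league_rate
    if weighted_pa == 0:
        raise ValueError("need at least one season with plate appearances")

    # Step 1: weighted league baseline over the same seasons.
    league_mean = weighted_league / weighted_pa

    # Step 2: regress toward the mean by adding league-average playing time.
    regressed = (weighted_value + REGRESSION_PA * league_mean) / (
        weighted_pa + REGRESSION_PA
    )

    # Step 3: simple age factor, larger for young players than for old ones.
    step = YOUNG_STEP if age < PEAK_AGE else OLD_STEP
    age_factor = 1 + (PEAK_AGE - age) * step
    return regressed * age_factor


if __name__ == "__main__":
    history = [Season(600, 0.370, 0.320) for _ in range(3)]
    print(f"{marcel_style_rate(history, age=29):.3f}")  # prints 0.363
```

Notice that a hitter who posted a .370 wOBA in three straight full seasons still projects a bit below .370 in this sketch, because the regression step pulls him toward the league average. The less playing time a player has, the stronger that pull becomes.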

Links for Further Reading:

Looking at Baseball Projection Systems – FOX Sports

Rich Hill and 50th Percentile Projections – FanGraphs

2009 Forecast Evaluations – Steamer Projections