The Projection Rundown: The Basics on Marcels, ZiPS, CAIRO, Oliver, and the Rest
Now that football season is over and baseball is once again close at hand, Projection Season is well underway. Fantasy players, analysts, bloggers, and plain ol’ fans – everyone turns to projections to help them this time of year. The Hot Stove has cooled down and Spring Training has just started, so really…what else is there to do?
With that in mind, I’ve got a handful of posts on projections in the works for the next week. This is the first one, and in it I deal with a basic question: what are the different projection systems available, and how is each of them calculated? To use a projection properly, it helps to understand what data it takes into account and how that data is used. Remember: there is no one “gold standard” for projection systems. Each system will tell you something slightly different, so whenever you try to draw conclusions from projections, it’s best to use as many sources as possible.
You can also find this new information on the Library’s Projections page, so it’ll always be available there as a reference.
– Marcel – Developed by Tom Tango, Marcel is a simple projection system that is still quite reliable. I’ll let Tango do the explaining:
“The Marcel the Monkey Forecasting System (or the Marcels for short) is the most advanced forecasting system ever conceived. Not. Actually, it is the most basic forecasting system you can have, that uses as little intelligence as possible. So, that’s the allusion to the monkey. It uses 3 years of MLB data, with the most recent data weighted heavier. It regresses towards the mean. And it has an age factor.”
Theoretically, projections that do more work than Marcels (like ZiPS, Bill James, CAIRO, Oliver, PECOTA) will be more accurate, but in the past, other systems have only added a small increase in accuracy. Even though it is very basic, the Marcel system is still quite accurate and serves as a good reference point when looking at other projections. 2011 Marcels projections can be found here and on FanGraphs.
– Bill James – Created by Baseball Info Solutions, the Bill James projections use at most eight seasons of data per player, with a strong focus on the previous three. While the exact methodology is proprietary, the projections are based on past performance, age, home park, and expected playing time. They tend to be the most optimistic of all the major systems, especially with young players.
– ZiPS – The work of Dan Szymborski over at Baseball Think Factory, the ZiPS projections use weighted averages of four years of data (three if a player is very old or very young), regress pitchers based on DIPS theory and BABIP rates, and adjust for aging by looking at similar players and their aging trends. It’s an effective projection system, and it is displayed at FanGraphs for off-season and in-season projections.
– Oliver – This system was created by Brian Cartwright and is available over at The Hardball Times. It’s a comparatively simple projection system – using weighted averages of the past three seasons of data, and adjusting for aging and regression – but it calculates its major league equivalencies (MLEs) in a different way than most systems, taking the raw numbers and adjusting them based on park and league. Since most projection systems simply try to adjust for the transition between each minor-league level, Oliver’s projections are better when showing how young players will perform at the major league level. This is also the only projection system to include a fielding and WAR component.
– CAIRO – A system developed by the folks at Revenge of the RLYW, the CAIRO system starts with a basic Marcel projection model, but then includes minor league statistics, adjusts for park and league effects, adjusts the aging curve depending upon the statistic, takes age and position into account when regressing a player’s performance, and uses four years of data instead of three. These projections are then put into the Diamond Mind simulator, and team projections are estimated using the results of 50,000 simulations. 2011 projections can be found here.
– Fans – During the off-season following the 2009 season, FanGraphs began the Fan projections, which rely upon a “wisdom of the crowds” approach to evaluating a player. Fans are asked to fill out ballots on various players, ranking how they expect those players to perform in the upcoming season. Ballots are then compiled and averaged for each player, giving us their Fan projection. These projections are normally quite optimistic, but in some cases they can add real value for players who may follow an unusual career path. They’re also a good way to estimate a player’s potential playing time, which is a variable that most projection systems struggle with.
– PECOTA – Developed by Nate Silver and Baseball Prospectus, PECOTA is one of the more complicated projection models, using a player’s statistics and historical statistics of similar ballplayers to arrive at a projection. Colin Wyers has done work in recent years to improve PECOTA’s accuracy, and a stripped-down version of PECOTA has been shown to be as effective as the Marcels projection system (implying that the full PECOTA would be slightly more accurate). PECOTA also does projections on a team level and creates a list of comparable historical players for each projection. You can find PECOTA at the Baseball Prospectus website.
– CHONE – Developed by Sean Smith, this system used four years of data for hitters and three years for pitchers. It adjusted for park, league, and aging effects, and it also used batted ball data and minor league statistics. CHONE was widely considered one of the most accurate projection systems, but it is no longer available to the public.
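To make the Marcel recipe from the list above concrete, here is a minimal Python sketch of its three ingredients: recency-weighted seasons, regression toward the league mean, and an age factor. The 5/4/3 season weights, the 1,200 plate appearances of league-average ballast, the peak age of 29, and the per-year age adjustments are illustrative assumptions rather than figures from this article, and the class and function names are made up for the example. Treat it as a rough illustration, not Tango's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Season:
    pa: int  # plate appearances in that season
    hr: int  # home runs in that season


def marcel_rate(seasons, league_rate, age,
                weights=(5, 4, 3),      # assumed weights, most recent season first
                regression_pa=1200,     # assumed amount of league-average "ballast"
                peak_age=29):           # assumed peak age
    """Project a per-PA home run rate, Marcel-style.

    `seasons` must be ordered most recent first; at most three are used.
    """
    # 1. Weight recent seasons more heavily.
    weighted_hr = sum(w * s.hr for w, s in zip(weights, seasons))
    weighted_pa = sum(w * s.pa for w, s in zip(weights, seasons))

    # 2. Regress toward the mean by blending in league-average plate appearances.
    regressed = (weighted_hr + regression_pa * league_rate) / (weighted_pa + regression_pa)

    # 3. Apply a simple age factor (assumed: +0.6% per year under the peak,
    #    -0.3% per year over it).
    if age < peak_age:
        age_factor = 1 + 0.006 * (peak_age - age)
    else:
        age_factor = 1 - 0.003 * (age - peak_age)

    return regressed * age_factor


if __name__ == "__main__":
    history = [Season(pa=600, hr=30), Season(pa=600, hr=24), Season(pa=600, hr=18)]
    rate = marcel_rate(history, league_rate=0.03, age=29)
    print(f"Projected HR per PA: {rate:.4f}")
```

With three 600-PA seasons of 30, 24, and 18 home runs and a league rate of 0.03 HR per PA, a 29-year-old projects to about 0.040 HR per PA under these assumptions. A player with no major league history simply gets the league rate, which is the sense in which the system regresses toward the mean.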
For more on the accuracy of each projection system, I recommend reading Tom Tango’s recent study.
Piper was the editor-in-chief of DRaysBay and the keeper of the FanGraphs Library.
I just use the average of all these systems, except for Fangraphs fans because for deep leagues they do not account for players who are close-to or at replacement level.
Is there anywhere to find the accuracy of each system from previous years?
The best you’ll find is the study I linked at the bottom of the article. Tango just did some excellent research: http://www.insidethebook.com/ee/index.php/site/article/testing_the_2007_2010_forecasting_systems_official_results/
Really, you can’t go wrong with whatever you choose. The most recent incarnation of Oliver is pretty good, especially with young players, and Marcels actually came off looking pretty darn good considering its simplicity.
Great article – thanks Steve!
Wonderful article Steve! Why isn’t CHONE available anymore? Does Sean charge for it?
Thanks! I’m glad you enjoyed it.
Sean got snatched up by a team this off-season, so I believe the common assumption is they want him to keep data like that private. CHONE’s not available anywhere, pay or no pay, which makes me very, very sad.
That said, the other systems are still quite awesome. Marcels is good with established players, and the Oliver system is really the tops with the youngins.
What’s the deal with the Rotochamp projections included on the projections page? How does that system stack up?
Oh geez, I’d forgotten about them. Here’s a link to their full description: http://www.rotochamp.com/Help/ProjectionsFAQ.aspx
Basically, they don’t give out too many details on how they calculate their projections. All I got from that page was that they take a weighted average of a player’s performance over the past three years, try to adjust minor league stats for players with less than 400 major league PAs, and regress players based on their expected LD, GB, and FB rates.
Thanks — doesn’t sound like they’ll add much unique value, but it’s always nice to have more options.
So is ZiPS the only one, then, that takes batted ball data (e.g. BABIP) and DIPS theory into account when projecting?
Shameless self-promotion: See my articles below
http://www.fangraphs.com/community/index.php/comparing-2010-pitcher-forecasts/
http://www.fangraphs.com/community/index.php/comparing-2010-hitter-forecasts-part-1-which-system-is-the-best/
http://www.fangraphs.com/community/index.php/comparing-2010-hitter-forecasts-part-2-creating-better-forecasts/
Awesome!
Great article! Just one thing, in the Marcel section you accidentally wrote Tom Tiger instead of Tom Tango. Other than that, great job!