I've written a few articles that use player projections to forecast team and player performance for the upcoming season, but I'll be the first to admit that I am far from a statistics or projection expert. So I thought it would be interesting to look not at the projections themselves but at the history of predicting a player's performance and the methods used to make those predictions, and to talk with Sean Smith, a leading creator of player projections (who also happens to be a big Angels fan).
I first noticed player projections in the mid-'80s while reading the first Bill James / Stats Inc. Player Handbook. I was a die-hard fantasy baseball addict, and I noticed that the book published James' projections for the forthcoming season. Any tool that helps dominate fantasy baseball is worth looking into, especially one that could predict which players were going to break out or suffer a decline in production. Accurate player projections would absolutely give a person an edge in their league. James' method was based on the similarity scores of comparable players. Basically, the system looks for previous players whose stats at the same age are comparable to those of the player being projected. By analyzing what those comparable players did in the seasons that followed, it identifies career trends and uses them to project performance. To quote the Wikipedia entry for similarity scores,
"...players often follow similar career trajectories to their most similar players, so the historical similar players' performance in years after the active player's current age should be a good predictor of that active player's future production."
Another method of predicting future performance is the one used by Sean Smith, who has been doing player projections since November of 2005. "Casey Kotchman was my first projection, when the team was contemplating signing Paul Konerko. Casey didn't do anything in 2006, but the projection and commentary fits well with his 2005-2008 seasons (link to story)," he said. Smith's projections are now one of three sets of projections featured on the popular website FanGraphs. His method (named CHONE after Angels player Chone Figgins) uses a player's own previous performance to predict future stats.
"I don't look at comparable players. I've never been sold on the advantage of doing so, as I don't think that the stats we use can really determine how similar two players are, and even if they did whether we can really say that a player will follow the trajectory of his similar players." Sean stated when asked about the two methods. He added, "I'd hate to penalize a player just because he has similar statistics to a couple of players who snorted their way out of the league with cocaine. I think the people who use similarity scores try to get the sample large enough to reduce this problem, but I don't trust the method, and my results are as accurate as anyone's, so it's not like I need to embrace the idea to keep up."
The CHONE system takes many factors into account. "I look at the last 4 years of stats, adjust for ballparks and age, apply regression, and use some external characteristics (player weight and speed scores) to fine tune the results," said Sean, adding, "I am considering adding batted ball data (grounders / flyballs / line drives / popups) into the equations for hitters. I already do this for pitchers. Last year I did that for major league pitchers, this year I'm doing it for minor league pitchers as well."
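Sean didn't share his actual weights or formulas, so the following is only a rough sketch of what "last 4 years, adjust, apply regression" could look like in Python. The season weights (4, 3, 2, 1), the 1200 plate appearances of league average used for regression, the age factor, and the function name project_rate are all assumptions of mine, not CHONE's real values, and the input rates are assumed to be park-adjusted already.

```python
def project_rate(seasons, league_rate, weights=(4, 3, 2, 1),
                 regression_pa=1200, age_factor=1.0):
    """Project a rate stat (such as OBP) from up to four past seasons.

    seasons: list of (plate_appearances, park_adjusted_rate), most recent first.
    All default values here are illustrative assumptions, not CHONE's settings.
    """
    # Weight recent seasons more heavily, and weight each season by playing time.
    weighted_pa = sum(w * pa for w, (pa, _) in zip(weights, seasons))
    weighted_total = sum(w * pa * rate for w, (pa, rate) in zip(weights, seasons))

    # Regression to the mean: blend in a fixed amount of league-average performance,
    # so players with less playing time are pulled harder toward the league rate.
    regressed = (weighted_total + regression_pa * league_rate) / (weighted_pa + regression_pa)

    # Crude age adjustment: above 1.0 for a young, improving player, below 1.0 for an aging one.
    return regressed * age_factor


# A hypothetical hitter with a steady .360 OBP in a league that averages .330.
steady = [(600, 0.360), (600, 0.360), (600, 0.360), (600, 0.360)]
print(round(project_rate(steady, league_rate=0.330), 3))
```

In this sketch the steady .360 hitter projects a little below .360 because of the regression step, and a player with only a handful of plate appearances would project very close to the league average.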
Anyone with a spreadsheet can come up with projections; getting the numbers right is the hard part. In an unbiased study, Tom Tango (who also produces a projection system of his own called Marcel) compared four projection systems: CHONE, Baseball Prospectus’s PECOTA, Hardball Times, and Dan Szymborski’s ZiPS. CHONE proved to be the most accurate. "By this measure, all three systems beat PECOTA. And Chone simply trounced PECOTA. It’s hard to tell with the absolute errors, but this head-to-head, which is REALLY what we want anyway, makes things crystal clear," Tango stated on his website.
However, judging a system's accuracy by which individual players it got right or how many it missed can be misleading. "Being right or wrong on a particular projection means little. If you project over 1000 players as I do (this year I've done about 3500) you are going to be dead on ninja accurate on some, and horribly off on others. Like the Kotchman projection from 2005-2006. I judge the system as a whole," Sean said.
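For readers curious how "absolute errors" and a "head-to-head" comparison differ when a system is judged over its whole player pool, here is a small Python sketch. It is not Tango's actual methodology, and the two systems, the players, and the home run totals are all invented for illustration.

```python
def mean_absolute_error(projected, actual):
    """Average size of the miss across every player in the pool."""
    return sum(abs(projected[p] - actual[p]) for p in actual) / len(actual)


def head_to_head(system_a, system_b, actual):
    """Count, player by player, which system came closer to the actual result."""
    wins_a = wins_b = ties = 0
    for player, value in actual.items():
        err_a = abs(system_a[player] - value)
        err_b = abs(system_b[player] - value)
        if err_a < err_b:
            wins_a += 1
        elif err_b < err_a:
            wins_b += 1
        else:
            ties += 1
    return wins_a, wins_b, ties


# Hypothetical home run totals for four players.
actual = {"P1": 30, "P2": 12, "P3": 21, "P4": 5}
system_a = {"P1": 28, "P2": 14, "P3": 20, "P4": 15}
system_b = {"P1": 25, "P2": 16, "P3": 18, "P4": 6}

print(mean_absolute_error(system_a, actual), mean_absolute_error(system_b, actual))
print(head_to_head(system_a, system_b, actual))
```

In this made-up data, System A comes closer on three of the four players yet has the larger average error because of one big miss, which shows how the two measures can tell different stories and why a single bad projection says little about a system as a whole.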