September 05, 2009
Correlation Between Prior and Current Year Performance
First off, a quick blog note: I'm going to move all of the technical statistical stuff after the "jump". Before the jump, I will try to summarize the statistical finding in layman's terms. After the jump, I'll lay out all of the stats work. Onward.
Today's Question: What is the correlation, if any, between a player's current-year performance and his prior-year performance? How good a predictor is the prior year? If it were 100%, the prior year's performance would exactly predict the current year. If it were 0%, there would be absolutely no correlation (and you should ignore all prior stats when drafting).
The Answer: Roughly between 50 and 65%. So, when drafting a player, the correlation between his previous year's stats and his performance this year is only about 0.50 to 0.65. This seems pretty low. There are a few relevant factors in the modern NFL that contribute to this:
- High Prevalence of Injuries. Think Tom Brady in 2008.
- Variation in Team Performance. The league is set up so that the "playing field" is pretty level, so teams may perform well one year and terribly the next.
- Fairly Small Sample Size. They only play 16 games per year. Compare this to the NBA (82), MLB (162) or NHL (82).
- Plaxico Burress / Michael Vick Effect. Guys get into off-the-field trouble.
What does this mean for your draft strategy?
- Prior year performance IS relevant, but it should be discounted, carrying only about 50 to 65% of the weight in your choice
- Pay attention to non-performance metrics: Is the guy healthy? Who does he have around him? Where is the team positioned? What is their schedule like? Does the QB have a good offensive line?
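One way to picture the "discount" is to pull last year's total toward the league average by the correlation. The sketch below is only an illustration: the function name `discounted_projection` and all of the point totals are made up, and it assumes the average and spread of scores are about the same from one year to the next.

```python
def discounted_projection(prior_points, league_mean, r):
    """Shrink last year's total toward the league average by the correlation r.

    Assumes (hypothetically) that the mean and spread of scores are
    similar in both seasons, so r acts as the shrink factor.
    """
    return league_mean + r * (prior_points - league_mean)


# Hypothetical numbers, not taken from the 2008 data set.
league_mean = 120.0   # assumed average fantasy points for the position
prior_points = 220.0  # assumed total for a star player last year

for r in (0.50, 0.65):
    projection = discounted_projection(prior_points, league_mean, r)
    print(f"r = {r:.2f}: projected {projection:.1f} points")
```

With a correlation of 1.0 the projection would equal last year's total, and with 0.0 it would fall all the way back to the average, which matches the two extremes described above.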
To get these results, I compiled a list of the top 300 players in 2008 and examined the correlation between their 2008 and 2007 performance, 2007 and 2006, and so on. Since the list is the top 300 players of 2008, the results get less accurate the further back we go. The results of the correlation tests are below, shown as 95% confidence intervals; all are significant at the .001 level or better:
2007->2008 = 0.5029116, 0.6530814 [p-value < 2.2e-16]
2006->2007 = 0.5120332, 0.6600707 [p-value < 2.2e-16]
2005->2006 = 0.5514062, 0.6899482 [p-value < 2.2e-16]
Going back further than 2005, the data won't be very accurate.
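For anyone who wants to try this kind of test on their own numbers, here is a minimal sketch of a Pearson correlation with a 95% confidence interval. The post does not say which tool or method produced the intervals above, so the Fisher z-transform used here is an assumption (it is a common choice), and the player totals are invented for illustration.

```python
import math
from statistics import NormalDist, mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for r via the Fisher z-transform.

    Requires n > 3 and -1 < r < 1.
    """
    z = math.atanh(r)                               # transform r to z
    se = 1.0 / math.sqrt(n - 3)                     # standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Made-up season totals for six hypothetical players.
prior = [210.0, 180.0, 95.0, 140.0, 60.0, 175.0]
current = [190.0, 120.0, 110.0, 150.0, 85.0, 130.0]

r = pearson_r(prior, current)
low, high = fisher_ci(r, len(prior))
print(f"r = {r:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Note that squaring a correlation gives the share of variance explained, so a correlation of 0.50 to 0.65 corresponds to roughly 25% to 42% of the year-to-year variation.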
Posted by haydenth at September 5, 2009 09:41 AM
Comments
I would love a copy of the statistics in CSV format.
email: fsjay@hotmail.com
thanks
frank
Posted by: fsjay@hotmail.com at November 4, 2009 11:44 PM