Welcome to the first in a series of posts that will review the 2009 season (as well as 2008) from an adjusted stats perspective. Some of you will read these posts, fewer will read all of these posts, and only HenryJames will print them out and post them on his wall.
The first several posts will be based on analyses that I've run using the data. As always, any thoughts on the studies or especially ideas for new topics are appreciated. Some of these, such as today's first part, will be a little lengthy. The studies that I've put in the schedule below have already been run, but suggestions for new studies are welcome and I will run the ones that appeal to me and write on them after the initial ones have been published. Later posts will refer back to ideas or results from the earlier ones, so keep that in mind.
Tentative Publication Schedule
Part I: Which Stats Correlate Best to Winning? - 6/24/10
Part II: Drilling Down and Regression Models - 6/28/10
Part III: Passing Efficiency Formula - 6/30/10
Part IV: Conference Strengths and Pace - 7/2/10
Part V: Testing Conventional Wisdom - 7/16/10
Part VI: Team Matchups - 7/20/10
Part VII: Year-to-Year Changes - 7/22/10
Part VIII: Points per Yard Efficiencies - 7/26/10
Part IX: Data Dump - Team Rankings - 7/28/10
Part X: Data Dump - Team Reports - 7/30/10
Which Stats Correlate Best to Winning?
I touched on this issue last year based on an initial effort by the Texas coaching staff. But now I have more data from 2008 as well as a second full season in 2009 to use in the analysis. I won't repeat their results, which can be read at the provided link. Instead, I'll fast-forward to the point where I used points-per-game statistics as a control group to verify the methodology and proceed from there.
The Control Group
The adjusted stats data includes points scored, points allowed, and point margin on per-game, per-possession, and per-play bases. This leaves us with nine statistics to correlate to winning, all of which should sit at the top of the list of all stats. As expected, this was indeed the case: these statistics occupy the top nine positions in the correlation study.
Rank | Statistic | 2009 Coefficient | 2008 Coefficient | Average |
---|---|---|---|---|
1 | Point Margin per Game | 0.943 | 0.935 | 0.939 |
2 | Point Margin per Possession | 0.937 | 0.936 | 0.936 |
3 | Point Margin per Play | 0.930 | 0.916 | 0.923 |
4 | Points per Play | 0.836 | 0.831 | 0.833 |
5 | Points Allowed per Game | -0.836 | -0.811 | -0.823 |
6 | Points per Game | 0.824 | 0.816 | 0.820 |
7 | Points Allowed per Play | -0.824 | -0.801 | -0.812 |
8 | Points Allowed per Possession | -0.804 | -0.810 | -0.807 |
9 | Points per Possession | 0.786 | 0.788 | 0.787 |
The statistics are sorted by the absolute value of their average correlation coefficient between 2008 and 2009; this effectively ranks them by the strength of the correlation. Negative coefficients signify relationships where a higher value for one variable correlates to a lower value in the other. As an example from the above results, the higher a team’s points allowed per game statistic, the lower their winning percentage is expected to be. The absolute value was taken after averaging the coefficients. As we’ll see later, this moves statistics that oscillated between positive and negative correlation down the ranking list. This is desirable, however, as such inconsistent correlative behavior indicates a weak correlation or none at all.
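The ranking rule described above can be sketched in a few lines of Python. This is only an illustration: the coefficients are copied from the tables in this post, while the function names are my own. It also shows why the order of operations matters for a stat whose sign flips between seasons.

```python
# Coefficients copied from the tables in this post, as (2009, 2008) pairs.
coefficients = {
    "Point Margin per Game": (0.943, 0.935),
    "Points Allowed per Game": (-0.836, -0.811),
    "Points per Game": (0.824, 0.816),
    "Forced Fumbles per Game": (-0.040, 0.143),
}


def average(values):
    """Plain arithmetic mean of a sequence of numbers."""
    return sum(values) / len(values)


def rank_by_strength(coeffs):
    """Sort stat names by |average coefficient|, strongest first."""
    return sorted(coeffs, key=lambda name: abs(average(coeffs[name])), reverse=True)


ranking = rank_by_strength(coefficients)

# Averaging first lets opposite signs cancel each other out;
# averaging the absolute values would hide the sign flip.
forced_fumbles = coefficients["Forced Fumbles per Game"]
abs_of_avg = abs(average(forced_fumbles))
avg_of_abs = average([abs(c) for c in forced_fumbles])

print(ranking)
print(round(abs_of_avg, 4), round(avg_of_abs, 4))
```

Taking the absolute value after averaging leaves Forced Fumbles per Game with a smaller score than it would get if the two seasons' magnitudes were averaged, which is the intended penalty for inconsistency.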
Per Game Stats
These points-related statistics have been presented here merely for reference. We should have expected a highly positive correlation between point margin and winning, of course; wins are assigned based on the point margin of each game. What may be more illuminating are the more detailed statistics and their individual correlations with winning. The next table shows the results for the same stats used in the initial study, with the exception of Scoring Offense and Scoring Defense, which are included with the points-related results above.
Rank | Statistic | 2009 Coefficient | 2008 Coefficient | Average |
---|---|---|---|---|
1 | Rushing Yards Allowed per Game | -0.751 | -0.764 | -0.757 |
2 | Yards Allowed per Game | -0.741 | -0.693 | -0.717 |
3 | Yards per Game | 0.620 | 0.607 | 0.613 |
4 | Turnover Margin per Game | 0.636 | 0.421 | 0.528 |
5 | Rushing Yards per Game | 0.509 | 0.483 | 0.496 |
6 | Passing Yards Allowed per Game | -0.348 | -0.137 | -0.243 |
7 | Passing Yards per Game | 0.190 | 0.236 | 0.213 |
Once again we see rush defense ranking at the top of the list. Furthermore, it is somewhat confusingly ahead of the total yardage statistics. Additionally, we see pass defense rated very near the bottom of the list, ahead of only pass offense. Is defending the run really that much more important than defending the pass? Or is it much more likely that teams that are winning simply face fewer rush attempts as their opponents play catch-up? We may be looking at causation in reverse when it comes to rush and pass defense on a per-game yardage basis.
Rate Stats
Let’s look at the previous table using the relevant rate stats instead. I have used turnover margin per possession instead of per play, as it seems more pertinent to a rate-based view of the game. Furthermore, I have substituted total passing and total rushing for their standard statistical equivalents. This moves sacks and sack yardage from rushing to passing to better reflect performance in the run game versus the pass game.
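In standard college box scores a sack is recorded as a rushing attempt, and the yards lost are charged to rushing. The sketch below shows one way the "total" versions could be derived from a standard stat line. The exact formulas are my assumption for illustration, and the function name and sample numbers are hypothetical.

```python
def total_rate_stats(rush_yards, carries, pass_yards, attempts, sacks, sack_yards):
    """Reassign sacks from the run game to the pass game.

    rush_yards and carries are the standard (sack-inclusive) rushing totals;
    sack_yards is the positive number of yards lost on sacks.
    Returns (total rushing per carry, total passing per attempt).
    """
    total_rush_yards = rush_yards + sack_yards  # take the sack losses out of rushing
    total_carries = carries - sacks             # sacks no longer count as carries
    total_pass_yards = pass_yards - sack_yards  # charge the losses to passing
    total_pass_plays = attempts + sacks         # each sack was a called pass play
    return (total_rush_yards / total_carries,
            total_pass_yards / total_pass_plays)


# Hypothetical single-game stat line, for illustration only.
rush_per_carry, pass_per_attempt = total_rate_stats(
    rush_yards=120, carries=35, pass_yards=250, attempts=30, sacks=5, sack_yards=30
)
print(rush_per_carry, round(pass_per_attempt, 2))
```

With these made-up numbers the standard line would read 120/35 yards per carry and 250/30 yards per attempt, so moving the five sacks raises the rushing rate and lowers the passing rate.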
Rank | Statistic | 2009 Coefficient | 2008 Coefficient | Average |
---|---|---|---|---|
1 | Yards Allowed per Play | -0.730 | -0.696 | -0.713 |
2 | Total Passing Allowed per Attempt | -0.711 | -0.682 | -0.697 |
3 | Total Passing per Attempt | 0.730 | 0.660 | 0.695 |
4 | Yards per Play | 0.686 | 0.646 | 0.666 |
5 | Total Rushing Allowed per Carry | -0.643 | -0.616 | -0.629 |
6 | Turnover Margin per Possession | 0.589 | 0.521 | 0.555 |
7 | Total Rushing per Carry | 0.468 | 0.432 | 0.450 |
Some interesting changes have now taken place. Total defense has moved to the top of the list, very slightly ahead of the passing game, both offensively and defensively. The running game has dropped down the list, with rushing offense being the least correlated to winning of the listed statistics. The weakest coefficients have also risen considerably (the lowest average is now 0.450, versus 0.213 in the per-game table), so every stat on this list is now meaningfully correlated to winning.
Miscellaneous Items of Interest
Below is a table containing a handful of statistics which yielded results that were either interesting to me, counterintuitive, or antithetical to standard football thought.
Statistic | 2009 Coefficient | 2008 Coefficient | Average |
---|---|---|---|
Net Kickoff Average | 0.543 | 0.469 | 0.506 |
Punt Return Average | 0.448 | 0.435 | 0.442 |
Kickoff Return Average | 0.417 | 0.266 | 0.342 |
Time per Offensive Play | 0.309 | 0.346 | 0.327 |
Net Punting Average | 0.289 | 0.298 | 0.294 |
Time of Possession | 0.235 | 0.301 | 0.268 |
Penalties per Game | 0.010 | -0.083 | -0.037 |
Plays per Game | -0.052 | 0.031 | -0.011 |
Penalty Yards per Game | 0.028 | -0.041 | -0.007 |
Possessions per Game | -0.011 | 0.020 | 0.004 |
The special teams stats at the top of the above list bring up another question of causation. Do teams win more because they’re better at kickoff coverage, or do they become better at it because they get more practice (i.e., they’re scoring more points)? Perhaps the reason you would rather be good at kicking off than at returning kickoffs is that you want more practice at the former. Similarly, being good at returning punts is more strongly correlated with winning than being good at covering them.
Overall pace of a team’s games is basically uncorrelated to winning. Whether measured in plays per game or possessions per game, the coefficient is close to zero.
Where a correlation exists, however, is with the statistics that are related to holding onto the ball. Both time of possession and time per offensive play (measured in seconds per play, so a higher value indicates more possession time) show modest but consistent positive correlations to winning. As was the case with rushing yards per game, though, we should be careful about drawing too many conclusions from this relationship, because the causal direction may be the opposite of what we first consider. Once again, teams that are winning are more likely to run the ball in an effort to burn clock. Teams that are losing are more likely to pass the ball in an effort to score quickly and conserve time.
Correlation coefficients tell us nothing about which variable is the cause and which is the effect. We merely know that both time of possession and time per offensive play are positively correlated with winning. The numbers don’t tell us whether ball control causes winning or winning causes ball control. If we were to measure home fan happiness during games we would find an extremely positive correlation between happiness and winning. Obviously it’s not that a generally jovial fanbase causes team success but instead that team success creates happy fans.
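The point about direction can be made concrete. This post doesn't spell out which coefficient is used, so the sketch below assumes the usual Pearson coefficient, and the five-team sample is invented purely for illustration. Because the formula treats both variables identically, swapping them returns the same number.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five teams, not taken from the real data set.
time_of_possession = [27.5, 29.0, 30.5, 31.0, 33.0]  # minutes per game
win_pct = [0.250, 0.500, 0.417, 0.667, 0.750]

forward = pearson(time_of_possession, win_pct)
backward = pearson(win_pct, time_of_possession)
print(round(forward, 3), round(backward, 3))
```

The two printed values match, so nothing in the coefficient itself distinguishes "ball control causes winning" from "winning causes ball control".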
The next (large) table lists the rest of the available statistics’ correlation coefficients.
Statistic | 2009 Coefficient | 2008 Coefficient | Average |
---|---|---|---|
Passing Efficiency Allowed | -0.707 | -0.723 | -0.715 |
Yards Allowed per Possession | -0.726 | -0.699 | -0.712 |
Passing Efficiency | 0.728 | 0.669 | 0.698 |
Yards per Attempt | 0.699 | 0.662 | 0.681 |
Yards Allowed per Attempt | -0.703 | -0.631 | -0.667 |
Yards per Possession | 0.640 | 0.638 | 0.639 |
Sacks per Game | 0.586 | 0.609 | 0.598 |
Interceptions per Game | 0.586 | 0.537 | 0.562 |
Yards per Carry | 0.566 | 0.501 | 0.533 |
Turnover Margin per Play | 0.640 | 0.421 | 0.531 |
Turnovers Forced per Play | 0.514 | 0.541 | 0.527 |
Red Zone Scoring % | 0.552 | 0.490 | 0.521 |
Turnovers per Play | -0.543 | -0.453 | -0.498 |
Rushing Yards per Game | 0.509 | 0.483 | 0.496 |
Turnovers Forced per Game | 0.472 | 0.505 | 0.488 |
Sacks per Attempt | 0.464 | 0.465 | 0.465 |
Turnovers per Game | -0.518 | -0.410 | -0.464 |
Turnovers Forced per Possession | 0.440 | 0.469 | 0.454 |
Turnovers per Possession | -0.499 | -0.394 | -0.446 |
Interceptions Thrown per Game | -0.475 | -0.407 | -0.441 |
Interceptions per Attempt | 0.451 | 0.427 | 0.439 |
Sacks Allowed per Game | -0.488 | -0.381 | -0.435 |
Red Zone Scoring % Allowed | -0.418 | -0.435 | -0.426 |
3rd Down Conversion % Allowed | -0.709 | -0.137 | -0.423 |
Field Goals per Game | 0.417 | 0.307 | 0.362 |
3rd Down Conversion % | 0.557 | 0.164 | 0.360 |
Interceptions Thrown per Attempt | -0.365 | -0.329 | -0.347 |
4th Down Conversion % Allowed | -0.274 | -0.382 | -0.328 |
Total Passing Allowed per Game | -0.402 | -0.228 | -0.315 |
Fumbles per Carry | -0.346 | -0.236 | -0.291 |
4th Down Conversion % | 0.249 | 0.332 | 0.290 |
Field Goal Percentage | 0.313 | 0.261 | 0.287 |
Sacks Allowed per Attempt | -0.322 | -0.221 | -0.272 |
Total Passing per Game | 0.241 | 0.268 | 0.254 |
Forced Fumbles per Carry | 0.141 | 0.315 | 0.228 |
Touchback Percentage (unadjusted) | 0.171 | 0.175 | 0.173 |
Fumbles per Game | -0.236 | -0.088 | -0.162 |
Forced Fumbles per Game | -0.040 | 0.143 | 0.051 |
Defensive Plays per Game | -0.034 | 0.066 | 0.016 |
Offensive Plays per Game | -0.036 | 0.032 | -0.002 |
Thoughts and comments are welcome if you've made it this far. As stated above, the listed studies in the schedule have already been completed so it's possible that some of the questions that people may have will be addressed in the future posts.