Adjusted Stats 2009 Year in Review Part II

By Huck L Berry Jun 29, 2010, 7:28am CDT

This is the second in a series of posts that will review the 2009 season (as well as 2008) from an adjusted stats perspective.

Tentative Publication Schedule

Part I: Which Stats Correlate Best to Winning? - 6/24/10
Part II: Drilling Down and Regression Models - 6/28/10
Part III: Passing Efficiency Formula - 6/30/10
Part IV: Conference Strengths and Pace - 7/2/10
Part V: Testing Conventional Wisdom - 7/16/10
Part VI: Team Matchups - 7/20/10
Part VII: Year-to-Year Changes - 7/22/10
Part VIII: Points per Yard Efficiencies - 7/26/10
Part IX: Data Dump - Team Rankings - 7/28/10
Part X: Data Dump - Team Reports - 7/30/10

Drilling Down and Regression Models

The previous post provided correlation coefficients between individual statistics and adjusted winning percentage, but I did not delve into them in detail because I believe the best way to attack the question of what contributes most to winning is level by level. The first level consists of the wins and losses themselves. The second level is points scored and points allowed, because these are the statistics that contribute directly to winning and losing. So instead of attempting to correlate everything directly to winning, which statistics correlate best to scoring points, and which to allowing (or preventing) them? Let’s start with scoring points on a per game basis. Leaving out the other point-scoring calculations, here are the offensive stats and their correlations with points scored per game (note that I have left out conversion percentages and other statistics that are themselves a result of the contributing stats below):

Rank	Statistic	2009 Coefficient	2008 Coefficient	Average
1	Yards per Play	0.870	0.880	0.875
2	Yards per Game	0.860	0.880	0.870
3	Passing Efficiency	0.854	0.858	0.856
4	Total Passing per Attempt	0.836	0.857	0.847
5	Yards per Possession	0.818	0.844	0.831
6	Yards per Attempt	0.778	0.832	0.805
7	Yards per Carry	0.665	0.604	0.635
8	Total Rushing per Carry	0.603	0.558	0.580
9	Rushing Yards per Game	0.545	0.493	0.519
10	Turnovers per Play	-0.525	-0.495	-0.510
11	Total Passing per Game	0.453	0.556	0.505
12	Total Rushing per Game	0.521	0.474	0.497
13	Sacks Allowed per Game	-0.520	-0.434	-0.477
14	Passing Yards per Game	0.403	0.519	0.461
15	Interceptions Thrown per Attempt	-0.479	-0.406	-0.443
16	Turnovers per Possession	-0.452	-0.416	-0.434
17	Turnovers per Game	-0.448	-0.418	-0.433
18	Sacks Allowed per Attempt	-0.452	-0.399	-0.425
19	Interceptions Thrown per Game	-0.472	-0.358	-0.415
20	Punt Return Average	0.373	0.410	0.391
21	Kickoff Return Average	0.529	0.147	0.338
22	Fumbles per Carry	-0.326	-0.308	-0.317
23	Fumbles per Game	-0.172	-0.148	-0.160
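For readers who want to compute coefficients like those in the table above, here is a minimal sketch. It assumes the coefficients are ordinary Pearson correlations between a team-level stat and points per game, which the post does not state explicitly. The five-team sample is made up purely for illustration and is not 2008 or 2009 data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers for five hypothetical teams, NOT real season data.
yards_per_play = [4.8, 5.2, 5.6, 6.1, 6.7]
points_per_game = [19.0, 24.5, 23.0, 31.0, 36.5]

r = pearson_r(yards_per_play, points_per_game)
print(round(r, 3))
```

Running the same function once per statistic, per season, would produce one column of a table like the one above.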

It’s immediately apparent that the per game stats are generally less correlated than the rate, or per play, stats. Yards per play appears at the top of the list, as we should have expected. It may seem that we should treat yards per play as the next level and then determine correlations to that number. However, I am going to skip that level because yards per play follows directly from the total passing per attempt and total rushing per carry numbers, weighted by a team’s play selection ratio. Another thing worth mentioning is that passing efficiency, included above, is simply a combination of other stats: Yards per Attempt and Interceptions Thrown per Attempt, which appear in the table, plus completion percentage and touchdowns per attempt, which do not. Determining a better Passing Efficiency formula is a study of its own.
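To make the "combination stat" point concrete, here is a sketch of the standard NCAA passing efficiency rating, on the assumption that this is the formula behind the Passing Efficiency rows above (the post does not spell it out). The function names and the stat line are hypothetical.

```python
def passing_efficiency(attempts, completions, yards, touchdowns, interceptions):
    """NCAA passing efficiency rating from raw passing totals."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return (8.4 * yards + 330 * touchdowns + 100 * completions
            - 200 * interceptions) / attempts

def passing_efficiency_from_rates(yards_per_attempt, completion_pct,
                                  td_per_attempt, int_per_attempt):
    """Same rating built from rate stats; completion_pct is on a 0-100 scale."""
    return (8.4 * yards_per_attempt + completion_pct
            + 330 * td_per_attempt - 200 * int_per_attempt)

# Hypothetical stat line: 60 of 100 for 800 yards, 5 TD, 2 INT.
from_totals = passing_efficiency(100, 60, 800, 5, 2)
from_rates = passing_efficiency_from_rates(8.0, 60.0, 0.05, 0.02)
print(round(from_totals, 1), round(from_rates, 1))
```

The second form shows the four rate stats the rating is built from, each with a fixed weight. Those fixed weights are what a regression-based formula would replace.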

And how would we attempt to improve that formula? Regression analysis is a quick and easy way thanks to available technology. The first place I wanted to use it, though, was to generate a formula approximating the number of points per game a team should expect to score based on the most important contributing statistics: Total Passing Yards per Attempt (TPYPA), Total Rushing Yards per Carry (TRYPC), and Turnovers per Play (TOPP). Here are the results for each of the past two seasons and for both combined:

2009 – Points per Game = 3.69*TPYPA + 2.02*TRYPC - 249.46*TOPP
2008 – Points per Game = 4.10*TPYPA + 1.85*TRYPC - 267.42*TOPP
Combined - Points per Game = 3.88*TPYPA + 1.93*TRYPC - 254.40*TOPP
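As a worked example, the combined formula can be applied like this. The coefficients are copied from the line above; the team's inputs and its actual scoring average are hypothetical, chosen only to show the arithmetic.

```python
# Coefficients copied from the combined 2008-2009 season-level formula.
COMBINED = {"tpypa": 3.88, "trypc": 1.93, "topp": -254.40}

def predicted_points_per_game(tpypa, trypc, topp, coef=COMBINED):
    """Predicted points per game from the three season-level rate stats."""
    return coef["tpypa"] * tpypa + coef["trypc"] * trypc + coef["topp"] * topp

# Hypothetical team: 7.0 total passing yards per attempt, 4.5 total rushing
# yards per carry, and 0.03 turnovers per play (3 per 100 plays).
predicted = predicted_points_per_game(7.0, 4.5, 0.03)

# Hypothetical actual scoring average, used to show the residual.
actual = 31.0
residual = actual - predicted

print(round(predicted, 1))  # 27.16 + 8.685 - 7.632 = 28.213, so about 28.2
print(round(residual, 1))   # about 2.8 points per game above the prediction
```

A positive residual like this one is the kind of gap that defense, pace, or special teams would have to account for.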

All of the regression coefficients were statistically significant. For the combined version, with a sample size of 240, the P-values for TPYPA, TRYPC, and TOPP were 1.53E-36, 3.87E-15, and 1.91E-35, with t-statistics of 15.10, 8.41, and -14.77, respectively. Using this information, we would expect that teams scoring noticeably above or below their predicted points per game have extraordinary defensive numbers, pace numbers, special teams numbers, or a combination thereof. Here are the defensive numbers, both correlation (with points allowed per game) and regression:

Rank	Statistic	2009 Coefficient	2008 Coefficient	Average
1	Yards Allowed per Game	0.916	0.909	0.912
2	Yards Allowed per Play	0.904	0.893	0.899
3	Passing Efficiency Allowed	0.869	0.866	0.867
4	Yards Allowed per Possession	0.874	0.844	0.859
5	Total Passing Allowed per Attempt	0.872	0.830	0.851
6	Rushing Yards Allowed per Game	0.821	0.873	0.847
7	Total Rushing Allowed per Game	0.810	0.867	0.838
8	Yards Allowed per Attempt	0.857	0.788	0.823
9	Yards Allowed per Carry	0.800	0.845	0.822
10	Total Rushing Allowed per Carry	0.756	0.793	0.774
11	Sacks per Game	-0.653	-0.604	-0.628
12	Interceptions per Game	-0.645	-0.599	-0.622
13	Turnovers Forced per Play	-0.581	-0.588	-0.585
14	Net Kickoff Average	-0.564	-0.522	-0.543
15	Total Passing Allowed per Game	0.626	0.457	0.542
16	Sacks per Attempt	-0.585	-0.497	-0.541
17	Interceptions per Attempt	-0.549	-0.520	-0.535
18	Turnovers Forced per Game	-0.528	-0.526	-0.527
19	Turnovers Forced per Possession	-0.502	-0.513	-0.507
20	Passing Yards Allowed per Game	0.573	0.375	0.474
21	Net Punting Average	-0.277	-0.320	-0.299
22	Fumbles Forced per Carry	-0.213	-0.273	-0.243
23	Fumbles Forced per Game	-0.010	-0.075	-0.042

2009 – Points Allowed per Game = 3.68*TPAPA + 2.42*TRAPC - 318.12*TFPP
2008 – Points Allowed per Game = 3.62*TPAPA + 2.89*TRAPC - 348.70*TFPP
Combined - Points Allowed per Game = 3.61*TPAPA + 2.70*TRAPC - 331.02*TFPP
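For anyone curious how formulas of this shape can be produced, here is a rough sketch of a least-squares fit with no intercept term. Fitting without an intercept is an assumption inferred from the published formulas; the post does not say which tool or settings were used. The data below is synthetic, generated exactly from the combined defensive coefficients so the fit can be checked against them, and is not real team data.

```python
def fit_no_intercept(rows, targets):
    """Ordinary least squares without an intercept, via the normal equations.

    rows: list of equal-length predictor lists; targets: list of outcomes.
    Returns one coefficient per predictor.
    """
    k = len(rows[0])
    # Build the augmented matrix [X^T X | X^T y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * y for r, y in zip(rows, targets))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Synthetic inputs: [passing allowed per attempt, rushing allowed per carry,
# turnovers forced per play] for five hypothetical defenses.
rows = [
    [6.0, 3.5, 0.020],
    [7.5, 4.0, 0.030],
    [5.5, 4.8, 0.025],
    [8.2, 3.9, 0.015],
    [6.8, 5.1, 0.035],
]
TRUE_COEF = [3.61, 2.70, -331.02]  # combined defensive formula above
targets = [sum(c * x for c, x in zip(TRUE_COEF, row)) for row in rows]

fitted = fit_no_intercept(rows, targets)
print([round(c, 2) for c in fitted])
```

Because the synthetic targets contain no noise, the fit should return the coefficients it was generated from. With real team data the same call would return fitted estimates, though P-values and t-statistics would need a statistics package.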

The most readily apparent issue is that the offensive and defensive coefficients differ. What’s important to keep in mind is that this is not a game-by-game analysis. What we have done here is attempt to put together a formula that predicts season-long scoring performance on a per game basis. That prediction is based on the strength of a team’s individual passing, running, and ball protection capabilities. Admittedly, though, it does raise the question of what a game-specific correlation and regression analysis would show. Using game data, then, here are the results for each season and both together. Keep in mind that the results for points scored and points allowed are necessarily identical here, since the points one team scores in a game are the points its opponent allows:

Rank	Statistic	2009 Coefficient	2008 Coefficient	Average
1	Yards	0.756	0.762	0.759
2	Yards per Play	0.739	0.743	0.741
3	Passing Efficiency	0.671	0.684	0.678
4	Yards per Possession	0.650	0.644	0.647
5	Total Passing per Attempt	0.641	0.637	0.639
6	Yards per Attempt	0.597	0.597	0.597
7	Rushing Yards	0.543	0.531	0.537
8	Total Rushing	0.525	0.514	0.520
9	Yards per Carry	0.524	0.514	0.519
10	Total Passing	0.444	0.462	0.453
11	Total Rushing per Carry	0.450	0.443	0.446
12	Passing Yards	0.408	0.426	0.417
13	Turnovers per Play	-0.328	-0.354	-0.341
14	Sacks	-0.350	-0.305	-0.327
15	Turnovers per Possession	-0.309	-0.334	-0.322
16	Interceptions	-0.296	-0.296	-0.296
17	Turnovers	-0.280	-0.309	-0.295
18	Sacks per Attempt	-0.286	-0.276	-0.281
19	Interceptions per Attempt	-0.274	-0.260	-0.267
20	Fumbles per Carry	-0.196	-0.185	-0.190
21	Punt Return Average	0.112	0.188	0.150
22	Kickoff Return Average	0.187	0.092	0.140
23	Net Kickoff Average	0.132	0.095	0.113
24	Fumbles	0.095	0.080	0.087
25	Net Punting Average	-0.071	-0.051	-0.061

2009 – Points = 2.72*TPYPA + 2.65*TRYPC - 127.05*TOPP
2008 – Points = 2.84*TPYPA + 2.78*TRYPC - 142.41*TOPP
Combined - Points = 2.78*TPYPA + 2.71*TRYPC - 134.61*TOPP

Generally speaking, we see lower correlation values across the board when we look at the games individually. My first thought is that a single game shows more variation and more influence from other inputs, such as a team’s defense and its effect on points scored. This includes not only defensive and special teams scores but also field position, turnovers forced, etc. These factors may wash out over the course of the year, so that season-long performance stats are more directly related to a unit’s own capability and less correlated with other units’ performance, although the other units clearly will still have an impact.

Another interesting note is that turnovers cost teams more points in their season averages than they do in an individual game: the combined TOPP coefficient is -254.40 in the season-level formula versus -134.61 in the game-level one. Similarly, efficient passing games create more points over the long haul than in a single matchup (3.88 versus 2.78 for TPYPA). I’m not completely sure what to make of that aspect of the results, but intuitively it seems that turnovers can be overcome in a single game, while teams that have a consistent problem with gifting possession to their opponents find it difficult to overcome the issue every time. On the passing side of the coin, perhaps efficient passing produces fewer points within a single game because leading teams revert to shorter passes and their run game to milk the clock. That one is a bit trickier to understand.
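To put the turnover gap in points, here is a small sketch comparing the two combined formulas. The coefficients come from the formulas earlier in the post; the figure of 70 offensive plays per game is an assumption picked for illustration, not a number from the data.

```python
# Combined 2008-2009 coefficients from the season-level and game-level formulas.
SEASON_COEF = {"tpypa": 3.88, "trypc": 1.93, "topp": -254.40}
GAME_COEF = {"tpypa": 2.78, "trypc": 2.71, "topp": -134.61}

def turnover_cost(coef, plays_per_game):
    """Predicted points lost from one extra turnover per game.

    One extra turnover raises turnovers per play by 1 / plays_per_game,
    holding the number of plays fixed.
    """
    return -coef["topp"] / plays_per_game

PLAYS_PER_GAME = 70  # assumed, for illustration only

season_cost = turnover_cost(SEASON_COEF, PLAYS_PER_GAME)
game_cost = turnover_cost(GAME_COEF, PLAYS_PER_GAME)

print(round(season_cost, 2))  # 254.40 / 70, about 3.63 points
print(round(game_cost, 2))    # 134.61 / 70, about 1.92 points
```

Under that assumption, an extra turnover per game is worth roughly 3.6 points in the season-level model but only about 1.9 in the game-level model, which is the gap described above.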

Football will remain the most difficult of the popular sports to analyze statistically because every play involves simultaneous action by 22 players on the field. All of these actions take place completely in parallel on any given play, leading to a tremendous amount of interdependency as the players accrue their individual statistics. This helped lead me to tackle the team statistics first; they are far simpler to handle because they reduce to the interaction between two teams in discrete events, be they plays, possessions, or games. Is it impossible to rate football players with better statistical analysis? No, but the amount of data acquisition would be staggering, and even that can come only after careful consideration of which interactions we should track. For now I am resigned to stop at the team level and hope that we can glean at least some useful conclusions there.

Thoughts, comments, requests, and suggestions welcome.
