Many rating systems are based on the idea that a sports team's (or athlete's) performance is normally distributed about its true strength. That true strength is then estimated as the mean of the normal distribution, and it is what the system spits out as the team's final rating. The idea that a team has a true strength around which its performances are normally distributed is both logical and convenient for the computers.
This raises the question, though, of the standard deviation of performances. A consistent athlete or team should show a low standard deviation in its performances, while an inconsistent team, of course, should show a high one.

As usual, I'll begin by issuing a few warnings about the information I'm going to present. To calculate a team's standard deviation (or consistency), you have to assume that its opponent in each game was playing at its mean. With the limited number of data points in a college football season, it seems entirely possible that a team could catch more opponents on bad days than good days (or vice versa), and do so often enough to skew its consistency rating. And that leads to the warning that should always accompany statistical analysis of college football: there aren't enough data points to be completely sure about most of the things we care about. Sure, there's enough data to tell me that Florida should practically never lose to North Texas, but delineations between teams of similar strength don't come with a great amount of confidence in this sport. (Playoff, anyone?)
I calculated each team's consistency rating from its power score in each game played. The power score can essentially be viewed as the margin of victory in each game compared to the margin an average team, playing an average game, would have had against that same opponent on the opponent's average day. That's a lot of average; I almost feel like I'm discussing Texas A&M recruiting. For this analysis I focused only on Division 1A teams and matchups.
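As a sketch, the consistency calculation boils down to the standard deviation of a team's per-game power scores. The numbers below are made up for illustration; the real inputs would come from the rating system:

```python
import statistics

# Hypothetical per-game power scores for one team over a 12-game season
# (invented values, not from the actual ratings).
game_scores = [24.1, 8.7, 31.5, 15.2, 22.9, 12.4,
               27.8, 18.0, 9.5, 25.3, 14.6, 20.1]

mean_rating = statistics.mean(game_scores)    # the team's power rating
consistency = statistics.stdev(game_scores)   # lower = more consistent

print(f"power rating: {mean_rating:.2f}, consistency (std dev): {consistency:.2f}")
```

Note that this uses the sample standard deviation; with only a dozen or so games per team, the choice between sample and population versions nudges the numbers a bit either way.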
[Table: Most Consistent Teams]

[Table: Least Consistent Teams (surviving rows below)]

| Rank | Team | Std Dev |
|------|------|---------|
| 2 | San Diego St. | 20.49 |
| 7 | North Carolina St. | 18.10 |

[Table: Top 10 Power Teams w/ Consistency]
A few items about the tables:
- Texas A&M - the most consistent team in the nation. I'm sure this will become a point of pride, but as I'll discuss later, a bad team doesn't really want to be consistent.
- Georgia - a little surprising they were this consistent, until you realize that they beat the teams they should have beaten and got whipped by the teams that should have whipped them.
- Texas Tech - No surprise at all.
- Ohio State - Normally viewed as a consistent team, they are slightly less consistent than average. I chalk this one up to the Pryor and Wells situations.
- As you can see, every team in college football had a standard deviation of more than a touchdown, all the way up to over three touchdowns for Tulsa.
Pros and Cons
Normally when people discuss college football teams, everyone talks about wanting to be a consistent football team. What they mean, or should mean, is that they want to be a consistently good football team. Consider the problem of maximizing your team's wins over the season: if you are a good football team whose opponents are all, or nearly all, worse than your true strength (the mean of your distribution), then you want to be extremely consistent, because upsets are more probable when an inconsistent team is involved. By the same token, if you are a bad football team that will play mostly opponents better than you, consistency isn't something to strive for if you want to win the most games possible. A team that is bad and plays at a consistent level will not get up for an upset. A team that is bad and plays inconsistently might put out a real stinker, but it is also more likely to upset a better team.
Take Texas A&M for example. They were a bad football team this year, but their performances were the most consistent in the nation. If we know that the Aggies are a team whose performances averaged out to their power rating of 19.48, doesn't it make sense that they would rather play half their games at a 34.48 level and half at a 4.48 level instead of all at their average strength? Against a schedule that averaged 35.45 in power rating, a 19.48-rated team will win more games by being inconsistent than by being a consistent squad. The bottom line is that if you're going to suck, you want to do it erratically.
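A quick Monte Carlo sketch of the Aggie example, assuming opponents' single-game performances are normal around the 35.45 schedule average (the 12-point opponent standard deviation is my assumption, not a figure from the ratings):

```python
import random

random.seed(0)

OPP_MEAN, OPP_SD = 35.45, 12.0   # schedule average from the article; SD assumed
N = 100_000

def win_rate(performances):
    """Fraction of simulated games won when the team's performance is drawn
    uniformly from the given levels and the opponent draws from a normal."""
    wins = 0
    for _ in range(N):
        perf = random.choice(performances)
        if perf > random.gauss(OPP_MEAN, OPP_SD):
            wins += 1
    return wins / N

steady  = win_rate([19.48])         # every game at the average level
erratic = win_rate([34.48, 4.48])   # half great, half awful, same average

print(f"steady win rate:  {steady:.3f}")
print(f"erratic win rate: {erratic:.3f}")
```

The erratic version wins noticeably more often: the great games become near coin flips against the tougher schedule, while the awful games cost almost nothing extra, since the steady version was losing those games anyway.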
On the other hand, there's Texas Tech. A team with a relatively high power rating will want to play consistently against lower-rated teams; I won't elaborate here, as the reasons should be obvious. It should be noted, though, that a team's schedule matters more than its power rating in determining whether it should be consistent. Tech, while #6 in the power ratings, actually played two teams with higher ratings, and being inconsistent against those two opponents increases Tech's expected win total. And in reality, that's what happened with the Red Raiders this year. With no standard deviation, they would have been expected to lose to both Texas and Oklahoma by comfortable margins. Instead, they upset Texas in a close game and got destroyed by Oklahoma thanks to inconsistency (known, at least partially, as the Lubbock factor in Tech's case).
Expected Wins in a True Round Robin
My power ratings only display the mean of each team's distribution. Previously, I took the distribution of all teams' power ratings and normalized each team's specific rating, which let me give a percentage chance that one team would beat another. That method, though, did not account for the consistency of each team, and we've shown that consistency has an effect. Now that we have identified each team's rating as a normal distribution of its own, we can recalculate that percentage in each case using the fact that the difference of two independent normal variables is itself normal, with variance equal to the sum of the two variances.
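In code, that property reduces each matchup to a single normal CDF evaluation. The ratings and standard deviations below are illustrative assumptions, not values from the tables:

```python
from math import erf, sqrt

def win_prob(mu_a, sd_a, mu_b, sd_b):
    """P(team A outperforms team B) when each team's single-game
    performance is normal. A's performance minus B's is itself normal
    with mean mu_a - mu_b and variance sd_a**2 + sd_b**2, so the win
    probability is one normal CDF evaluation at the rating gap."""
    gap = mu_a - mu_b
    sd = sqrt(sd_a**2 + sd_b**2)
    return 0.5 * (1 + erf(gap / (sd * sqrt(2))))

# Illustrative matchups (assumed numbers):
print(win_prob(35.0, 10.0, 28.0, 14.0))   # clear favorite
print(win_prob(28.0, 14.0, 28.0, 14.0))   # even matchup
```

Note how a bigger combined standard deviation pushes any matchup's probability toward 50/50, which is exactly why inconsistency helps underdogs and hurts favorites.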
Furthermore, thanks to my favorite toy, Excel, we can simulate a complete round robin of Division 1A football using this calculation. There are many things we can look at after doing this, including seeing how changing a specific team's consistency changes their expected win total out of the 119 matchups. We can also compare a team's rank in expected wins to their rank in power rating. We would expect that highly rated teams with poor consistency would fall in the rankings and that lowly rated teams with poor consistency would rise.
So let's begin with the Aggies and Red Raiders. In a full round robin, based on their current ratings and standard deviations, Texas A&M's expected win total would be about 45.5 according to this calculation, while the Red Raiders would have an EV of 94.6 wins. If we simply swap their standard deviation values, the Aggies' EV rises to 48.2 wins while the Red Raiders' rises to 101.0. This makes sense given the theory that consistency helps a good team's expected wins and hurts a bad team's expected wins.
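A minimal sketch of the round-robin expected-wins calculation, using a tiny hypothetical league in place of the full Division 1A field (all ratings and standard deviations here are made up):

```python
from math import erf, sqrt

def win_prob(mu_a, sd_a, mu_b, sd_b):
    """Normal-difference win probability, as described above."""
    gap = mu_a - mu_b
    sd = sqrt(sd_a**2 + sd_b**2)
    return 0.5 * (1 + erf(gap / (sd * sqrt(2))))

# Hypothetical league of (power rating, std dev) pairs; the real inputs
# would be every Division 1A team's rating and consistency.
teams = {
    "Tech-like": (35.0, 10.0),
    "A&M-like":  (19.5, 8.0),
    "Average-1": (28.0, 14.0),
    "Average-2": (30.0, 12.0),
    "Weak":      (15.0, 16.0),
}

def expected_wins(name):
    """Expected wins in a round robin = sum of win probabilities
    against every other team."""
    mu, sd = teams[name]
    return sum(win_prob(mu, sd, m2, s2)
               for other, (m2, s2) in teams.items() if other != name)

for name in teams:
    print(f"{name:10s} expected wins: {expected_wins(name):.2f}")
```

Rerunning this after tweaking one team's standard deviation, with everything else held fixed, is the "swap the deviations" experiment from the Aggies/Red Raiders comparison.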
Below is a table of the Top 25 in the power ratings along with their power rating, power rating rank, expected winning percentage in a true round robin, and their expected winning percentage rank. Thankfully it just so happened that nobody went into or out of the Top 25.
According to this model, Florida is a little over a one-point favorite tonight, has a 52.2% chance of winning, and has a 39.6% chance of covering the current 6-point line.
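For anyone curious how the win and cover numbers fall out of the model, here is a sketch; the 1.1-point edge and the 18.5-point game standard deviation are my assumptions, chosen to be roughly in line with the quoted figures, not the model's actual inputs:

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# Assumed inputs: Florida's edge in points and the combined std dev
# of the game margin (both illustrative).
edge, sd, line = 1.1, 18.5, 6.0

p_win   = norm_cdf(edge / sd)            # P(margin of victory > 0)
p_cover = norm_cdf((edge - line) / sd)   # P(margin of victory > 6)

print(f"win prob:   {p_win:.1%}")
print(f"cover prob: {p_cover:.1%}")
```

The cover probability is just the same normal evaluated 6 points further out, which is why a small favorite can have a near-coin-flip chance to win yet a clearly sub-50% chance to cover.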