One of the things that pops up annually in college football discussions is how terrible the computer ratings are. We are now in the fifteenth season of computer input in determining the college football national champion, and each year the conversation follows a nearly identical path over the course of the season.
Early in the season, somebody somewhere will point out a ludicrous result from a computer ratings set (hey, who is this jackass that has Cincinnati #1?!?!?) based only on early-season results. Others join the conversation and point out other aspects of the ratings set that don't agree with common sense, and for the rest of the season that rating algorithm is considered untrustworthy and absurd. Nobody bothers to investigate why the ratings look like that early in the year, and certainly nobody can be bothered to articulate an objective reason why they shouldn't look like that based on the inputs to date.
Now don't get me wrong: I disagree with the methodology of some computer ratings out there, but disagreements should be based on the structure or design of the system, not on a snapshot of the results when the season is still young. To help you get a feel for the data the computers are working with and how it changes throughout the season, here are FBS schedule graph images for the 2012 season. We start at the top with the teams laid out in a circle and progress one week at a time, with some comments at important points in the season. Each image adds a week's worth of games to the image before it.
The image after week 3 directly above is where we currently stand. This is the point at which the graph becomes connected for 2012, meaning every team can be linked to every other team through some chain of games, so I was able to run a basic power rating set.
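For anyone who wants to check connectivity on their own data, here is a minimal sketch of the idea. It assumes you have a list of team names and a list of played games as pairs of teams; the function name `is_connected` and the toy teams below are made up for illustration and are not what I used to build the images.

```python
from collections import defaultdict, deque


def is_connected(teams, games):
    """Return True if every team is linked to every other by a chain of games.

    teams: iterable of team names (hypothetical input format).
    games: iterable of (team_a, team_b) pairs for games already played,
           assumed to involve only teams from the `teams` list.
    """
    teams = list(teams)
    if not teams:
        return True

    # Build an undirected adjacency list: a game links both opponents.
    neighbors = defaultdict(set)
    for a, b in games:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Breadth-first search outward from an arbitrary starting team.
    seen = {teams[0]}
    queue = deque([teams[0]])
    while queue:
        team = queue.popleft()
        for opponent in neighbors[team]:
            if opponent not in seen:
                seen.add(opponent)
                queue.append(opponent)

    # Connected only if the search reached every team.
    return all(team in seen for team in teams)


# Toy example with four made-up teams.
toy_teams = ["A", "B", "C", "D"]
week_1 = [("A", "B"), ("C", "D")]   # two separate islands
week_2 = week_1 + [("B", "C")]      # one game bridges the islands

print(is_connected(toy_teams, week_1))  # False
print(is_connected(toy_teams, week_2))  # True
```

Until this check passes, there are groups of teams with no chain of results tying them to the rest, so any rating that compares them across groups is leaning on something other than the games.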
This is where I start to consider the ratings fairly stable or reliable, but there is still a long way to go. Before this they are basically worthless for analytical purposes.
The first release of the BCS standings takes place. You can already see the huge difference between the data available when computer ratings start to trickle out and when the BCS numbers are first published.
After Thanksgiving weekend. It's interesting to me that the periphery of the graph is much more affected this week than the central portion. Why? I laid the teams out in alphabetical order, so during this week, when Georgia plays Georgia Tech, Florida plays Florida St., etc., many of the games are between teams close to each other on the graph.
After conference championship weekend. I included the scheduled games for that week as well as presumptive conference championship games here but did not include the Army/Navy game the following week. This is the schedule graph for the data available to the computers when the selections are made.
Anyway, I thought it was interesting, and it reinforced some ideas about college football and computer ratings. One is that there really is a lot of data by the end of the season, and computer ratings are a useful tool because they are able to consider all of that data when arriving at a conclusion. The other, somewhat contradictory, idea is how little data we really have when trying to pick just two teams for a championship game. That final graph may look extremely busy, but in this context it's really not. The perfect schedule is a full round robin, because in that case you can simply compare teams based on their win/loss records with no bias or subjectivity necessary. Below is what the schedule graph would look like if the 124 FBS teams played a full round robin. As expected, it's a big black blob at this zoom level.
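To put a rough number on that gap, here is a small sketch of the arithmetic. A full round robin among n teams has n(n-1)/2 games. The 12 games per team figure below is my own illustrative assumption (and it pretends every game is against another FBS team), not a count taken from the actual 2012 schedule, and the function names are made up.

```python
def round_robin_games(n_teams):
    """Number of games if every team plays every other team exactly once."""
    return n_teams * (n_teams - 1) // 2


def schedule_coverage(n_teams, games_per_team):
    """Fraction of round-robin matchups actually played.

    games_per_team is an assumed average; each game involves two teams,
    so the total number of games is n_teams * games_per_team / 2.
    """
    games_played = n_teams * games_per_team / 2
    return games_played / round_robin_games(n_teams)


FBS_TEAMS = 124               # from the post
ASSUMED_GAMES_PER_TEAM = 12   # illustrative assumption, not real schedule data

print(round_robin_games(FBS_TEAMS))  # 7626
print(f"{schedule_coverage(FBS_TEAMS, ASSUMED_GAMES_PER_TEAM):.1%}")  # 9.8%
```

Under that assumption, even the busy-looking final graph covers only about a tenth of the matchups a round robin would give us.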
So here's a zoomed-in look for reference.
That's all, just a visual aid to help us all better understand the change in the amount of data available throughout the season.