
Designing a New BCS Computer Ratings System


As everyone should know by now, I am a computer ratings geek. I have my own set of ratings, but I recently set out to design a computer ratings system that accommodates the factors that either must be, or in my opinion should be, reflected in a system used by the BCS. With that in mind, here are the constraints and logical requirements I started with when designing the system:

  1. No score information, only who won and who lost the game. This constraint is required by the BCS.
  2. Road Win > Neutral Win > Home Win > > > > > > > > > Road Loss > Neutral Loss > Home Loss. While the score of the game is not allowed to be used, the location is fair game.
  3. Beating a team rated considerably below you should have no effect on your rating. All six of the systems used by the BCS (I'm only 90% sure of this for Sagarin, though) violate this constraint. In some of them, your rating improves when you win a game, no matter the opponent. In at least one of the systems, your rating can actually decrease if you beat a bad enough team. That should never happen, as a team can do no more than win a game.
  4. Losing to a team rated considerably above you should have no effect on your rating. This is simply the converse of #3. Likewise, losing a game should never increase a team's rating.
  5. A corollary to #3 and #4 is that games against teams of relatively equal strength should have a much greater impact on your rating than games against teams vastly superior or inferior to your team.
  6. Losing to a bad team is worse than losing to a good team. This should be obvious, but most of the current systems treat every game equally to a fault, IMO. And that includes my standard ratings. In fact, you can go to the Colley website and see that if you change the season so that Texas beat Texas Tech, but lost to Florida Atlantic, we become #1 in the ratings.

So, starting with these constraints, I began to design the inputs for the system. As you will see later, the first constraint is extremely important and is largely responsible for the specific results in any season. The best rating systems are iterative ones that, for each game, compare the expected result to the actual outcome. In a system where the actual score of the game is not allowed to be considered, the actual-outcome input for each game is very limited in its possible values. Without game location or date information, in fact, this input must be binary (1 for a win, -1 for a loss, for example). However, as shown in #2 above, I added variation to that input by considering game location.

How did I arrive at specific numbers? Historically speaking (I only looked at the last 4 years, but it's very consistent), home teams win a little over 57% of all college football games. At first, then, it would seem we could use 1.14 for a road win, 1 for a neutral win, and 0.86 for a home win as our game outcomes. Instead I used 1.1, 1, and 0.9. I did this because I believe that in college football better teams play more home games, and that this contributes a little to the 57% number, so I called it 55% for my purposes. This is admittedly a rather arbitrary determination, and later I might revisit this figure and determine a more accurate value. The game outcome values, then, are as follows: 1.1 for a road win, 1 for a neutral win, 0.9 for a home win, -0.9 for a road loss, -1 for a neutral loss, and -1.1 for a home loss.
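The six outcome values above can be collected into a small lookup table. Here is a minimal sketch in Python (the names are mine, not from the post):

```python
# Game outcome values from the post: wins are worth slightly more on the
# road, and losses hurt slightly more at home.
OUTCOMES = {
    ("road", "win"): 1.1,
    ("neutral", "win"): 1.0,
    ("home", "win"): 0.9,
    ("road", "loss"): -0.9,
    ("neutral", "loss"): -1.0,
    ("home", "loss"): -1.1,
}

def outcome_value(location, result):
    """Actual-outcome input for a game, from one team's perspective."""
    return OUTCOMES[(location, result)]

print(outcome_value("road", "win"))   # 1.1
print(outcome_value("home", "loss"))  # -1.1
```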

Using these outcome values, I decided on a system loosely based on the Elo rating methodology. This requires, as discussed above, an expected outcome and an actual outcome for each game. For these calculations, the expected outcome for Team A playing Team B is initially set at Team A's rating minus Team B's rating, divided by 100. This value is then capped depending on the game location: Team A's expected value is capped at 1.1 for a road game, 1 for a neutral game, and 0.9 for a home game (and, symmetrically, it is floored at the worst achievable outcome: -0.9 on the road, -1 at a neutral site, and -1.1 at home, which is what preserves #4 for losses to far stronger teams). This processing step maintains requirements #3 and #4 above. Take a neutral-site game: if a team is rated more than 100 points above its opponent, winning the game will affect neither team's rating. Losing the game, however, would cause an adjustment.

The second step in the Elo-style method is to adjust each team's rating based on the comparison between expected value and actual outcome. So in this system, after each game an adjustment score is tracked for each team, equal to 100 times (the actual outcome minus the expected value). For example, if Team A is rated 150 points above Team B and wins at a neutral site, there is no adjustment: the expected value was capped at 1, and the actual outcome was 1. If instead Team A is rated 50 points below Team B and wins at a neutral site, Team A's adjustment score would be +150 (100 × (1 − (−0.5))) and Team B's would be −150. These adjustment scores for each team are summed, divided by the number of games the team has played, and applied at the end of each iteration over the season.
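The cap-then-compare step can be sketched in a few lines of Python. The post states the upper cap explicitly; clamping the expected value at the worst achievable outcome as well is my reading of requirement #4, and the function and variable names are mine:

```python
# Best and worst achievable outcomes by game location, from one team's
# perspective; the expected value is clamped to this range.
BEST = {"road": 1.1, "neutral": 1.0, "home": 0.9}
WORST = {"road": -0.9, "neutral": -1.0, "home": -1.1}

def adjustment(rating, opp_rating, location, actual):
    """Per-game score: 100 times (actual outcome minus clamped expected)."""
    raw = (rating - opp_rating) / 100.0
    expected = max(WORST[location], min(BEST[location], raw))
    return 100.0 * (actual - expected)

# Rated 150 points above the opponent, winning at a neutral site: no change.
print(adjustment(650.0, 500.0, "neutral", 1.0))   # 0.0
# The same heavy favorite losing that game, however, is penalized:
print(adjustment(650.0, 500.0, "neutral", -1.0))  # -200.0
```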

The algorithm then iterates as many times as necessary until the ratings stabilize to a suitable degree.

With all those things in mind, here are the Top 25 results of the system for the 2008 year-to-date (here's the full list). I have made some comments on these results below the ratings:

 1  Texas            804.23
 2  Texas Tech       800.90
 3  Utah             797.57
 3  Oklahoma         797.57
 5  TCU              707.57
 6  Oklahoma St.     656.13
 7  Boise St.        638.30
 8  BYU              617.57
 9  Florida          604.23
10  Alabama          595.50
11  Southern Cal     579.98
12  Penn St.         575.56
13  Georgia Tech     565.15
14  Cincinnati       558.74
15  Ohio St.         557.04
16  Virginia Tech    554.32
17  Florida St.      553.72
18  Boston College   550.13
19  Missouri         546.13
20  Georgia          540.59
21  North Carolina   540.57
22  Pittsburgh       529.27
23  Nebraska         528.32
24  Oregon           528.30
25  Clemson          527.82

  1. A reminder that margin of victory is ignored per the BCS requirements. Obviously the mid-majors wouldn't be as high if it were included. But it's not.
  2. Losing to a bad team kills you in this system, as I think it should. Essentially, your rating then has to stay in range of the team you lost to: within 110 points if you lost on the road, 100 at a neutral site, or 90 at home (although this isn't a hard-and-fast rule and can be violated with enough data). This is why Florida and Southern Cal are so low: they lost to "bad" teams and the other top teams did not. Ole Miss is #33 with a 509.36 rating and Oregon St. is #30 with a 519.83 rating. (Iowa, at #43 with a 494.21 rating, keeps Penn State down.)
  3. As you can see, an undefeated team's rating is essentially determined by its best win. Boise State beat Oregon in Eugene, so their rating is 110 points higher than the Ducks'.
  4. Also, a group of teams that is undefeated outside the group has its ratings essentially set by the group's best win over an outside opponent. These two aspects show why Utah and Oklahoma are tied. Utah's best win was over TCU at home. Similarly, the best outside win of the Texas/Texas Tech/Oklahoma trio was Oklahoma's win over TCU at home, so they are tied 90 points ahead of the Horned Frogs. The logical conclusion, then, is that if Texas had beaten Oklahoma in Austin, all four teams would be tied at the top. Keeping in mind that we can't consider the scores of the games, this is logically reasonable: all we know is that Utah has beaten everyone they have played, and that they beat TCU at home. Their rating then stabilizes at 90 points above the Horned Frogs.
  5. Remember that the intent of this system is to rank teams for the BCS. It is meant to reflect who has accomplished the most, based on what has already happened, within the logical constraints above; predicting the strengths of teams in future contests is completely ignored. I have my power ratings for that. Before anyone asks: no, I do not think that this list is in order of how good each team is.

Because I'm sure there will still be confusion about what the system is designed to do and what it means, here is the forecasted ratings set after bowl season if all teams favored in Vegas win:

 1  Texas            679.27
 2  Texas Tech       676.77
 3  Oklahoma         659.98
 4  Florida          657.70
 5  Alabama          648.57
 6  Utah             620.12
 7  Oklahoma St.     601.06
 8  Southern Cal     597.76
 9  TCU              593.09
10  Boise St.        575.01
11  Cincinnati       571.97
12  Georgia Tech     564.94
13  Penn St.         562.11
14  Georgia          557.90
15  Florida St.      554.37
16  Ohio St.         552.03
17  Boston College   546.61
18  Virginia Tech    542.53
19  Missouri         541.71
20  North Carolina   539.63
21  Oregon St.       538.69
22  Clemson          534.15
23  Oregon           527.56
24  Wake Forest      521.18
25  California       519.26

BYU falls completely out of the Top 25 for losing to Arizona. The ratings all get closer together because of the "upsets," which leads to things like TCU falling four spots despite beating Boise State. It's interesting that Oklahoma stays ahead of Florida: Oklahoma now has two losses, to #1 and #4, and wins over #2 and #7, while Florida has a loss to #33 and wins over #3 and #5.

I think the system is extremely logically sound; the only screwiness it has is because of the requirement not to use margin of victory. I would be interested in any specific questions about particular teams' ratings, because I haven't seen one that is not defensible based on the logical constraints above. I have posted the Top 10 outputs, both pre- and post-bowl season, for each season since the BCS started here. I would suggest reviewing those lists to get a better feel for the ratings; the 2008 season has been fairly crazy, so looking at only one output won't tell you everything. Two things to remember about the raw values for each season: the linked page's ratings used only Division 1-A teams, which explains why the raw numbers are so much lower across the board, and you can't compare raw numbers from year to year.

Anyway, anyone who actually read this far can now see the steps taken in putting together one particular computer ratings system. I will attempt to answer any reasonable questions in the comments section.