Today we'll look at an adjusted and expanded version of the analysis posted in Part I. If you missed it:
Over the last couple days I've made some significant adjustments to the dataset and the approach.
The new dataset (in xlsx format) with component worksheets can be downloaded:
If you're not keen on the details, skip the blockquoted section that follows:
- First I rescaled the two Talent (Rivals, Scout) and two Quality (F/+, Sagarin) components so that their ranges all go from 40 to 100.
- Then I averaged the components to get team Talent and Quality scores, and adopted the baseline assumption that without coaching effects, the relationship between Talent and Quality should be 1:1. So if your scaled team Talent score is an 80 and your scaled team Quality score is an 85, the raw coaching score will be +5. Easy peasy.
- The Coaching scores are then collectively scaled from 40-100 so that they can be evaluated alongside Talent and Quality.
- The CxT is then compiled as follows: Coaching score times Talent, then rescaled to the 40-100 range, with "100" being the top overall C score (Boise State, 2010) times the top overall T score (USC, 2007), and "40" being the worst C times the worst T. This compresses most of the CxT scores between 55-85 but gives coaches at the top end ample room for improvement (and coaches at the bottom some room to get worse). The whole calculation is sketched in code just below this blockquote.
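For anyone who'd rather read the arithmetic than the bullets, here's a minimal sketch of the pipeline. The team scores and the numbers below are hypothetical stand-ins, not values from the actual worksheets; only the arithmetic follows the steps above.

```python
# A minimal sketch of the scoring pipeline, using made-up numbers.
# Steps: components rescaled to 40-100, raw Coaching = Quality - Talent,
# Coaching collectively rescaled to 40-100, and CxT = Coaching x Talent,
# rescaled against the best-possible and worst-possible C x T products.

def rescale(values, lo=40.0, hi=100.0):
    """Linearly map a list of values so its minimum hits lo and its maximum hits hi."""
    v_min, v_max = min(values), max(values)
    return [lo + (v - v_min) / (v_max - v_min) * (hi - lo) for v in values]

# Hypothetical team scores, already on the 40-100 scale (in the real dataset these
# would be the averaged Rivals/Scout and F/+/Sagarin components).
talent  = [92.0, 80.0, 71.0, 55.0]
quality = [90.0, 85.0, 83.0, 62.0]

# Raw Coaching score: over/underachievement against the 1:1 baseline.
raw_coaching = [q - t for q, t in zip(quality, talent)]

# Coaching scores are then collectively rescaled to 40-100.
coaching = rescale(raw_coaching)

# CxT: Coaching times Talent, rescaled so the best C times the best T maps to 100
# and the worst C times the worst T maps to 40.
best_product, worst_product = max(coaching) * max(talent), min(coaching) * min(talent)
cxt = [40 + (c * t - worst_product) / (best_product - worst_product) * 60
       for c, t in zip(coaching, talent)]

for t, q, c, x in zip(talent, quality, coaching, cxt):
    print(f"T={t:5.1f}  Q={q:5.1f}  C={c:5.1f}  CxT={x:5.1f}")
```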
The dataset has been expanded to include all FBS teams, plus the years 2005 and 2013 (through the games of November 17th). The database now has 1087 observations (school-years) instead of 466. I've also reformatted the indicators from colored arrows to a red-yellow-green color scale, which should make finer gradations easier to see at a glance.
I've also added info for conference affiliation and win/loss record, and I replaced the awkward punctuation-mark-for-early-tenure with a separate column of data for length of tenure. This allows me to do a more felicitous short-run/long-run analysis, which I'll explain below.
I've refined the Talent calculation to account for different recruiting class sizes from year to year. A 24-person, 4-star-average class and a 12-person, 3-star-average class will now cumulatively average to 3.67 stars rather than 3.5. Also, the issue of grade inflation has been handled a little more dexterously, weighting each year's scores so that they all have the same mean, whereas previously I settled for a quick and dirty fix. Not sure if it's made much of a difference...well, actually I'm very sure it didn't. But enthusiasts for rigor should be pleased.
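For the curious, the weighted average in that example works out like this (the two classes are just the hypothetical ones from the sentence above):

```python
# Per-recruit weighted average of two hypothetical classes: a 24-person class
# averaging 4 stars and a 12-person class averaging 3 stars.
classes = [(24, 4.0), (12, 3.0)]  # (class size, average star rating)

total_stars = sum(size * avg for size, avg in classes)      # 24*4 + 12*3 = 132
total_recruits = sum(size for size, _ in classes)           # 36
weighted_avg = total_stars / total_recruits                 # 132 / 36 = 3.67

simple_avg = sum(avg for _, avg in classes) / len(classes)  # (4 + 3) / 2 = 3.5

print(f"weighted: {weighted_avg:.2f}  unweighted: {simple_avg:.2f}")
```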
The methodology has also been simplified. Previously I used a regression to estimate the Coaching contribution to team Quality, given team Talent. In this version I've scrapped the regression in favor of the even simpler approach laid out in the blockquote above.
Why 40-100? Primarily because you get oddball arithmetical outcomes when you're comparing ratios and the bottom end of the denominator's range is close to one. Scaling everything to at least 40 before creating any ratios prevents that from being an issue. Also, these scores are roughly normally distributed, so scaling from 40-100 produces average scores at about 70, solidly above-average scores around 80, and elite scores around 90. So it's easy to read. EA Sports' NCAA Football series (RIP) did it that way too, and for the same reasons, I'm guessing.
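To put hypothetical numbers on that: if components could sit anywhere on a 1-100 scale, a quality-to-talent ratio with a bottom-of-the-range denominator blows up, while the same sort of gap produces a sane ratio once everything is floored at 40.

```python
# Hypothetical illustration of the "oddball arithmetical outcomes" problem.
# On a 1-100 scale, a low-talent overachiever posts an absurd ratio:
quality, talent = 30, 2
print(quality / talent)        # 15.0

# With both components rescaled to 40-100 first, a similar overachiever
# produces a far more reasonable ratio:
quality, talent = 58, 41
print(quality / talent)        # ~1.41
```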
Why get rid of the regression? Because it was adding too many little quirks that I didn't like. Mostly I was trying to give coaches at the top of the rankings a harder time for underachievement, but in the process coaches in the mushy middle were getting a boost I couldn't justify while coaches at the bottom end were getting unduly hammered. Plus, I like transparent analysis and this methodology is much easier to understand. The raw Talent and Quality scores are very similarly distributed - a bit fat-tailed, a bit fat-bottomed, but mostly normally distributed - so there's not much practical harm in making the math massively simpler. And finally - and as a wannabe statistician, this is cheating, and I should be ashamed - the results didn't change much. So I'll follow Occam's Razor and adopt the simpler model.

One consequence of this decision is that while the zero-coaching-effect curve has been flattened, it's also been raised to a 45-degree angle, so the coaching "penalty" for good teams is actually steeper than in the previous analysis. This is a good thing; it accomplishes what I tried to do the first time around without the side effects and unneeded complexity. But if you analyze the Coaching numbers alone, you'll see that coaches who have a lot of talent on their teams will have even lower Coaching scores than in the first version, which were already lower than one's football knowledge would suggest.
If you understand the approach, this makes sense: the "Coaching" score may be our proxy for coaching ability, but it's actually a measure of over/underachievement. A supremely talented team does not need to overachieve to succeed, and when it does overachieve, the denominator (talent) is so large that the gain will seem insignificant. But in reality a talented team that even modestly overachieves is the brass ring of coaching performances.
So yes, Jim Leavitt will have better "Coaching" scores than Nick Saban in this system. This is why I like the CxT Index as our primary indicator, since it effectively negates the penalty put on coaches of talented teams when comparing quality-to-talent ratios. The results should speak for themselves in any case.
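Here's a hypothetical pair of coaches to show the effect (all numbers, including the assumed dataset-wide range of raw Coaching scores, are mine, not from the spreadsheet): the thin-roster overachiever wins the Coaching comparison, but multiplying Talent back in flips the ordering.

```python
# Two hypothetical coaches on the 40-100 Talent/Quality scales described above.
# Coach A: elite talent, modest overachievement (the Saban profile).
# Coach B: thin talent, big overachievement (the Leavitt profile).
coaches = {
    "Coach A": {"talent": 95, "quality": 98},
    "Coach B": {"talent": 55, "quality": 72},
}

# Assume (purely for illustration) that raw Coaching scores across the whole
# dataset run from -30 to +30, so the collective rescale maps that span to 40-100.
RAW_MIN, RAW_MAX = -30, 30

def rescale_coaching(raw):
    return 40 + (raw - RAW_MIN) / (RAW_MAX - RAW_MIN) * 60

for name, s in coaches.items():
    raw = s["quality"] - s["talent"]     # over/underachievement vs. the 1:1 baseline
    c = rescale_coaching(raw)            # Coaching on the 40-100 scale
    product = c * s["talent"]            # the pre-rescale CxT product
    print(f"{name}: raw {raw:+d}, Coaching {c:.0f}, C*T product {product:.0f}")

# Coach B wins the Coaching comparison (87 vs. 73), but Coach A's product
# (~6935) beats Coach B's (~4785) once Talent is multiplied back in,
# which is the penalty-negating effect described above.
```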
Topline Results...on hold
Well, I developed all these faincy graphics, but the Fanpost editor hain't let me load 'em going on 48 hours now. So I'll present the topline results another time and post results for specific coaches in the comments (where I can post photos hosted on Photobucket instead of needing to upload them to the SB Nation servers).
I have small infographics prepared for twenty current FBS head coaches who have been discussed around BC the last couple of weeks as potential Mack Brown replacements. If all goes well, they'll be in the comments below soon enough. After the holiday I'd like to do the same for ten more coaches who may not be experienced enough for the Texas job but are worth keeping an eye on over the next few years if our guy doesn't pan out. Perhaps I'll devote a later edition or two to the "Transcendent QB" question and examine the predictive power of first-year coaching outcomes.