Brief Thoughts on Sports Power Rankings

Discussing sports has always appealed to me more than the actual viewing experience, and thus sports “Power Rankings” are a guilty pleasure of mine. They distill the discussion of the state of a league into a simple and clear list, and avoid the wishy-washy generalities that plague most sports discussion. For basic background context, some example power rankings might be ESPN’s weekly NFL power ranking, or reddit.com/r/NBA’s biweekly community power ranking, or Bleacher Report’s power ranking of 14 distinctive NBA hairstyles. Generally, power rankings are roughly intended as a measure of how strong a team is at that moment in time, something which a team’s simple win-loss record does a very imprecise job of measuring. Of course, the downside of sports power rankings is that they are poorly defined. Any argument over the ranking of teams must necessarily devolve into an argument over the definition of a “power ranking”.

However, reading discussions of power rankings shows that current strength is not the only factor at play. Significant attention is paid to a team’s “body of work”, so the team’s actual accomplishments are given substantial weight even if they aren’t particularly predictive of future results. For example, if the second-place team in basketball beats the first-place team off of a miracle 3-point shot from halfcourt, surely that binary result is not particularly predictive of future matchups. Yet it is very likely that such a win would vault the second-place team over the first-place team in the next power rankings. Another example of the complexity of defining power rankings is the effect of injuries and absent players. In the NFL, when a team loses their franchise quarterback for the season, they do invariably take a hit in the next power rankings, but it is generally a much smaller drop than one would imagine given how much worse their team has become. Indeed, it generally takes a few losses for the full extent of the damage to the team to be realized in the power rankings. Head-to-head matchups also carry particular weight in most power rankings. Obviously, it is impossible to rank all teams ahead of opponents they have beaten, and that is a nonsensical proposition given the random nature of sports. However, head-to-head matchups are given particular weight when teams are otherwise very close to each other, and there is great resistance to ranking a team just above a team it recently lost to (generally, head-to-head results carry less significance when the teams have very different records; they are essentially a form of tiebreaker).

The simplest way to examine these idiosyncrasies is by looking at the Vegas betting line for upcoming games. Vegas undoubtedly has some inefficiencies, but it is at least directly predictive, and the inefficiencies are likely to be small in magnitude (and if one can prove a sizeable and consistent inefficiency, then they should immediately start placing bets based on that insight, as there is good money to be made there). In sum, casual following of power rankings shows without a doubt that they are not nearly as sharply reactive as the Vegas predictions, as they do lend weight to the past accomplishments of a team. Quantifying these effects would make for an interesting project, i.e., showing that power rankings are “stickier” than Vegas odds, and that they give credit to teams’ accomplishments rather than solely predicting future potential. However, this is obvious enough that I will take it for granted (one can simply look at any discussion of a casual power ranking to confirm that the rankers follow these mixed criteria, rather than a purely predictive approach).
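
As a rough idea of what quantifying “stickiness” could look like, here is a minimal sketch that measures how far teams move, on average, from one week’s ranking to the next. It assumes you already have weekly rankings from both sources as team-to-rank mappings; Vegas does not publish a ranking, so the `vegas_implied` list stands in for one that would have to be derived from betting lines (that step is not shown). The function name and all of the data are made up for illustration.

```python
def mean_rank_change(weekly_ranks):
    """Average absolute week-over-week rank movement per team.

    weekly_ranks: list of dicts, one per week, mapping team -> rank (1 = best).
    """
    moves = []
    for prev, curr in zip(weekly_ranks, weekly_ranks[1:]):
        # Only compare teams that appear in both consecutive weeks.
        for team in prev.keys() & curr.keys():
            moves.append(abs(curr[team] - prev[team]))
    return sum(moves) / len(moves) if moves else 0.0

# Hypothetical two-week rankings for three placeholder teams.
community_poll = [{"A": 1, "B": 2, "C": 3}, {"A": 1, "B": 3, "C": 2}]
vegas_implied = [{"A": 1, "B": 2, "C": 3}, {"A": 3, "B": 1, "C": 2}]

poll_movement = mean_rank_change(community_poll)
vegas_movement = mean_rank_change(vegas_implied)
print(poll_movement, vegas_movement)
```

If power rankings really are stickier, the first number should come out consistently smaller than the second over a full season of real data.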

It’s worth noting that there are exceptions to power rankings existing solely as mere blog posts for prompting discussion, as in some leagues they can have serious ramifications. For example, in American College Football (CFB), a poll that is essentially a power ranking is used to determine crucial end of season results. In the BCS era, a mixture of polling and computer rankings was used to determine the best two teams in the country, who would then play for the national championship. As of 2014, CFB has moved to a playoff system, where the Playoff Selection Committee selects the four most deserving teams in the country to play a single elimination playoff for the national championship. While it certainly varies based on who you ask, I think College Football is relatively more open about the fact that a team’s “body of work” has a significant impact on its ranking, rather than the ranking being just a simple prediction of the team’s current strength.

Of course, what makes the issue thorny is that many fans fail to see the crucial distinction between these two. This is an issue fundamental to all sports; the results of an individual game are highly random, and do not provide definitive statements on a team’s skill. While defining what it means to be the “better team” is no simple matter, if the “better team” always won, surely we would not see so many NBA playoff series go to 7 games. What else could make it possible for teams to trade blowouts so frequently, when little else has changed between games? Thus, in College Football, when a game between two top teams is determined by the result of a 45 yard field goal, the result of that kick says very little about the relative strength of these two teams, yet the game result will have long lasting ramifications when it comes to the playoff ranking. Many fans do not make this distinction, sticking to the belief that the result of an individual game is a final verdict on who was (and will be) the better team. I think casually following the Vegas odds will quickly confirm that such a belief is nonsensical. Another such example is the effect of losing to a top team. The #4 ranked team can lose to the #1 ranked team on a coinflip field goal, and due to the significance of an extra loss in their record, they will almost certainly drop in their ranking. However, surely this game result does nothing to disprove the previous ranking; in fact, the result indicates they were very nearly as good as the #1 team.
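
To put a rough number on how random a playoff series can be, here is a minimal sketch. It assumes, purely for illustration, that every game in a best-of-7 series is independent and that the better team wins each game with the same fixed probability `p`; real series have home court, injuries, and adjustments, so treat this as a toy model. The function names are my own.

```python
from math import comb

def prob_series_goes_seven(p: float) -> float:
    """Probability a best-of-7 reaches game 7, i.e. the first six games split 3-3."""
    return comb(6, 3) * p**3 * (1 - p) ** 3

def prob_better_team_wins_series(p: float) -> float:
    """Probability that a team with per-game win probability p wins a best-of-7."""
    # The team clinches in game 4 + k after losing exactly k of the first 3 + k games.
    return sum(comb(3 + k, k) * p**4 * (1 - p) ** k for k in range(4))

for p in (0.5, 0.6, 0.7):
    print(p, round(prob_series_goes_seven(p), 3), round(prob_better_team_wins_series(p), 3))
```

Under this toy model, even a team that wins 60% of individual games loses the series about 29% of the time, and roughly 28% of such series go the full seven games.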

More topically, this issue will come to a head in the next few weeks, as the committee decides which four teams make the playoffs. Ohio State was ranked as the #2 team, and beat the #3 team in the country in their final week (Michigan), capping off a one-loss season against an extremely difficult schedule. However, as that single loss was to Penn State, and Penn State managed to match their conference record, Penn State won the tiebreaker, and will play Wisconsin in the conference championship. Should Penn State win that game, the committee will be forced to choose between Ohio State and the conference champion Penn State. It should be immediately clear that Ohio State is the superior team. They performed much better against a much harder schedule. They were ranked #2, and beat the #3 team. There is little argument to be had that they aren’t the second best team in the country. However, many casual fans are very stuck in the idea that the results of games determine the quality of teams, and that the fact that Penn State “won” the conference due to their head-to-head victory should trump Ohio State’s superior season. Likely, Ohio State will be picked even if Penn State wins, and that will likely anger many fans who fervently believe in the simplistic idea that “if Ohio State was better, they would have won the head to head, and won the conference”.

These are just long-winded ways of discussing how the ranking of college teams is not the most accurate way of predicting future results, as it places emphasis on past results. However, the Playoff Committee is not charged with making the most accurate predictions about future strength, like Vegas is, but rather with selecting the “best” or “most deserving” teams. In that sense, it is clear why past accomplishments are given great weight, as giving credit to the team that came through with the wins “when it mattered” is a perfectly reasonable definition of “deserving”.

This provides a brief introduction to the complexities of sports power rankings (or, more accurately, my rambling thoughts after spending time on break browsing sports discussion sites). I thought an interesting project might be to compile a dataset of NBA power rankings over the course of a season. This would easily lead to a few interesting analyses. For instance, if power rankings are used as a naive measure for predicting the outcomes of games, how does that compare to other simple prediction methods, such as the records or point differential of the teams? The power rankings I will focus on are the /r/NBA community rankings on reddit. These are compiled by a poll of 30 rankers, and thus tend to represent a pretty standard aggregate power ranking, whereas individual sports site rankings tend to take more dramatic stances to promote discussion. You can read some rudimentary analysis of this topic here.
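
For a sense of what that comparison might look like in code, here is a minimal sketch. It assumes each game has been paired with a snapshot of both teams taken just before tip-off (latest power rank, record so far, average point differential so far), and it simply picks whichever team looks stronger under each measure, ignoring home court. The class names, field names, and the three games are all invented to show the shape of the calculation; they are not real NBA results.

```python
from dataclasses import dataclass

@dataclass
class TeamSnapshot:
    power_rank: int     # 1 = best, from the most recent ranking before the game
    win_pct: float      # wins / games played so far
    point_diff: float   # average point differential per game so far

@dataclass
class Game:
    team_a: TeamSnapshot
    team_b: TeamSnapshot
    a_won: bool

def accuracy(games, strength):
    """Fraction of games won by the team with the higher `strength` score.

    Games where both teams score equally are skipped rather than guessed.
    """
    correct = decided = 0
    for g in games:
        sa, sb = strength(g.team_a), strength(g.team_b)
        if sa == sb:
            continue
        decided += 1
        correct += (sa > sb) == g.a_won
    return correct / decided if decided else float("nan")

predictors = {
    "power ranking": lambda t: -t.power_rank,  # lower rank number = stronger
    "record": lambda t: t.win_pct,
    "point differential": lambda t: t.point_diff,
}

# Made-up games purely to show the call shape.
games = [
    Game(TeamSnapshot(1, 0.80, 9.5), TeamSnapshot(12, 0.55, 1.0), True),
    Game(TeamSnapshot(5, 0.60, -0.5), TeamSnapshot(8, 0.65, 3.0), False),
    Game(TeamSnapshot(20, 0.40, -2.0), TeamSnapshot(25, 0.30, -6.5), False),
]

results = {name: accuracy(games, fn) for name, fn in predictors.items()}
for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```

With three invented games the numbers mean nothing; the comparison only becomes interesting once a full season of rankings and results is plugged in.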

Written on November 24, 2016