Introduction: Consider devising a rating system in which matches consisting of an arbitrary number of “players” are “played.” After each match, the rating of each player should reflect the position of that player among all competitors.
Solution: Obviously, this is only an idea; there is no single set "solution" to this. Let the players be $P_1, P_2, \dots$ with corresponding ratings $r_i$ and volatilities $\sigma_i$. We will let the initial value of each $r_i$ be $r_0$ and the initial value of each $\sigma_i$ be $\sigma_0$. Basically, the performance of player $P_i$ will be a random variable $X_i$ with normal distribution with mean $r_i$ and standard deviation $\sigma_i$. The performance distribution will be $$f_i(x) = \frac{1}{\sigma_i\sqrt{2\pi}}\,e^{-(x - r_i)^2/2\sigma_i^2}.$$
We let $\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt$ be the standard normal cumulative distribution function.
Assume players $P_1, P_2, \dots, P_N$ participate in a match. We will recalculate ratings according to the following steps:
(1) Find the weight $W$ of the competition, depending on the total rating of the participants. We would like it to take on values between zero and one, so try something like $$W = \left(\frac{r_1 + r_2 + \cdots + r_N}{R}\right)^{c}$$ for a suitable constant $c > 1$, where $R$ is the total rating of all rated players. Basically, this is small when there are not a lot of players, but increases rapidly as the combined rating of the participants approaches the total rating.
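Step (1) can be sketched in a few lines. The power-law form, the name `match_weight`, and the default $c = 2$ are assumptions; the original only requires a value in $(0, 1]$ that grows rapidly as the participants' combined rating approaches the total rating of all players.

```python
def match_weight(ratings, total_rating, c=2.0):
    """Weight of a match: (sum of participant ratings / total rating)^c.

    Small for matches between few or low-rated players; approaches 1 as
    the participants' combined rating approaches the system-wide total.
    """
    return (sum(ratings) / total_rating) ** c
```

For example, two 1200-rated players out of a 120000-point pool give a tiny weight, while a match containing every rated player gives weight 1.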
(2) For a player $P_i$, we will calculate his expected rank based on the normal distribution. Let $p_{ij}$ be the probability that $P_i$ loses to $P_j$. Since the performances are independent, $X_j - X_i$ is normal with mean $r_j - r_i$ and variance $\sigma_i^2 + \sigma_j^2$, so it turns out to be $$p_{ij} = P(X_i < X_j) = \Phi\!\left(\frac{r_j - r_i}{\sqrt{\sigma_i^2 + \sigma_j^2}}\right).$$
So if we sum up the probabilities that $P_i$ will lose to each player (including himself, for which $p_{ii} = \frac{1}{2}$), we get the expected rank of $P_i$, or $$E_i = \frac{1}{2} + \sum_{j=1}^{N} p_{ij},$$ where the $\frac{1}{2}$ is to shift the best ranking to $1$.
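Step (2) translates directly into code: the loss probability comes from the normal CDF of the performance difference, and the expected rank is one half plus the sum of loss probabilities over all participants, the player himself included. This is a self-contained sketch; the function names are mine.

```python
import math

def phi(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def loss_prob(r_i, s_i, r_j, s_j):
    """P(player i loses to player j).

    Performances are independent normals, so the difference X_j - X_i is
    normal with mean r_j - r_i and variance s_i^2 + s_j^2.
    """
    return phi((r_j - r_i) / math.sqrt(s_i * s_i + s_j * s_j))

def expected_rank(i, ratings, vols):
    """Expected rank of player i: 1/2 plus the probability of losing to
    every participant, including himself (which contributes 1/2)."""
    return 0.5 + sum(loss_prob(ratings[i], vols[i], r, s)
                     for r, s in zip(ratings, vols))
```

Sanity checks: with two identical players each expected rank is 1.5, and a vastly stronger player's expected rank tends to 1, the shift the one half provides.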
(3) The actual rank of $P_i$ is $A_i$, and we want to base rating changes on the difference $E_i - A_i$. If $m_i$ is the number of competitions $P_i$ has participated in, calculate the new rating $r_i'$ as $$r_i' = r_i + \frac{K W (E_i - A_i)}{m_i + 1},$$ where $K$ is a suitable positive constant for equating differences in ranking to differences in rating.
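Step (3) as a sketch. The scale constant `K = 10.0` and the exact damping by one plus the number of past competitions are assumptions; the original only fixes the sign convention (beating the expectation, i.e. finishing above the expected rank, raises the rating) and that experienced players should move less.

```python
def updated_rating(rating, expected, actual, matches, weight, K=10.0):
    """Step 3: shift the rating by the rank surprise (expected - actual),
    scaled by the match weight W and the rank-to-rating constant K, and
    damped by the player's experience (matches played so far)."""
    return rating + weight * K * (expected - actual) / (matches + 1)
```

A newcomer (``matches=0``) in a full-weight match who finishes two places above expectation with $K = 10$ gains 20 points; a veteran with the same surprise gains proportionally less.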
(4) We now update the volatility $\sigma_i$. If the player's performance is relatively close to the expected performance, we want to decrease $\sigma_i$ to imply stability of the player. On the other hand, if the player performs exceptionally well or poorly, we want to increase $\sigma_i$ to reflect inconsistency. We propose the following change: $$\sigma_i' = \sigma_i\bigl(1 + \alpha\bigl(\lvert E_i - A_i\rvert - \beta\bigr)\bigr),$$ where $\alpha, \beta$ are suitable positive constants. $\alpha$ determines the magnitude of volatility change and $\beta$ determines the threshold for which performance is considered relatively constant.
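Step (4) as a sketch. The multiplicative form and the default values for the two constants are assumptions; what the text fixes is only the behavior: a rank surprise below the threshold shrinks the volatility, a larger one grows it. A small floor (my addition) keeps the volatility positive.

```python
def updated_volatility(sigma, expected, actual,
                       alpha=0.05, beta=1.0, floor=1e-6):
    """Step 4: shrink sigma when the rank surprise |E - A| is below the
    threshold beta (stable player), grow it when the surprise exceeds
    beta (inconsistent player). alpha sets the magnitude of the change."""
    surprise = abs(expected - actual)
    return max(floor, sigma * (1.0 + alpha * (surprise - beta)))
```

With the defaults, a player who lands exactly on the expected rank has the volatility multiplied by $1 - \alpha$, while a surprise of four ranks multiplies it by $1 + 3\alpha$.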
All in all, this rating system depends a lot on experimentally tuning the constants $c$, $K$, $\alpha$, and $\beta$ before it can be put into working use. Otherwise, it seems like it would work.
Comment: Rating systems are always controversial, and people are always trying to devise ways to game them. On a lighter note, many restrictions can be added, like a rating-change cap or a volatility cap. Stabilizers for higher ratings are also a possibility (i.e., reducing the weight of each match). Any thoughts?