Ratings v. 1.1

Hey, it looks like you have clicked on the link to my ratings explanation. Well, here’s what goes into the world curling ratings:

If you want the basic nuts and bolts of my ratings philosophy and some of the design choices in v. 1.0, along with the differences between my ratings and the World Curling Tour’s rankings, go here.

Now, let’s talk about the changes in v. 1.1.

Reference team: The 25th-best team is given a rating of 10 and all other teams are rated relative to that. Previously, the 100th-ranked team was the reference team and given a rating of 10. However, the lack of play during the pandemic caused problems with this approach, especially on the women’s side, where at one point fewer than 100 women’s teams met the current criteria for minimum games played (and there would have been even fewer had I not included a bunch of Russian sub-tour events).
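To make the reference-team idea concrete, here is a minimal sketch in Python. It assumes ratings live on an additive scale, so re-anchoring is just a constant shift. That assumption, along with the function and argument names, is mine for illustration and not the actual ratings code.

```python
def rescale_to_reference(ratings, reference_rank=25, reference_rating=10.0):
    """Shift every rating so the team at `reference_rank` lands on `reference_rating`.

    `ratings` maps a team name to a raw rating (higher is better).
    Assumes an additive rating scale, which is an illustration-only assumption.
    """
    ordered = sorted(ratings.values(), reverse=True)
    if len(ordered) < reference_rank:
        # This is the failure mode a 100th-team anchor runs into when
        # fewer than 100 teams meet the minimum-games criteria.
        raise ValueError("not enough rated teams to find the reference team")
    offset = reference_rating - ordered[reference_rank - 1]
    return {team: raw + offset for team, raw in ratings.items()}
```

With only 24 qualifying teams the function raises an error instead of returning ratings, which is the kind of trouble a reference rank of 100 caused when the pool of qualifying teams shrank.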

Minimum games: Teams must have played a minimum of seven games in the last 12 months to qualify for a spot in the ratings. Actually, it’s not that simple – games from the previous season are increasingly discounted as we get farther into the current season. In practice, it works like this: If a team plays seven games in the current season, it will be rated. If it played fewer, it can still be rated if it played games in the previous season (and also within the last 12 months). However, that possibility approaches zero as the current season comes to an end.

In theory, I’d like to increase that minimum. And prior to the pandemic it was higher. But we’ll keep the low bar until things get back to normal, and maybe beyond, mainly because it’s pretty important to rate new teams as soon as possible.
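The minimum-games rule above can be sketched roughly as follows. The source only says that previous-season games are "increasingly discounted", so the straight-line discount here is an assumption, as are the function names and the season dates you would pass in.

```python
from datetime import date

MIN_GAMES = 7  # the current minimum stated above


def previous_season_discount(today, season_start, season_end):
    """Assumed linear discount: 1.0 at the season start, 0.0 at the season end."""
    total_days = (season_end - season_start).days
    elapsed_days = (today - season_start).days
    fraction = min(max(elapsed_days / total_days, 0.0), 1.0)
    return 1.0 - fraction


def qualifies(current_games, previous_games, today, season_start, season_end):
    """Return True if a team has enough (discounted) games to be rated.

    `previous_games` should count only previous-season games that also fall
    within the last 12 months.
    """
    discount = previous_season_discount(today, season_start, season_end)
    effective_games = current_games + discount * previous_games
    return effective_games >= MIN_GAMES
```

Under this sketch, seven current-season games always qualify a team, while a team leaning on last season's games qualifies early in the season and drops out as the season winds down.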

Time-decay weighting of games: In v. 1.0, games from the past two years were included and the influence of each game decayed linearly with time. But now it’s a logistic decay which looks like so:

[Figure: logistic decay of game weight with time]

Data can be included going back indefinitely, since the weight never reaches zero no matter how far back in time we go. In practice, data more than two years old doesn’t have much impact, and even that only applies to teams that have been active since that time.
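For a feel of how a logistic decay behaves, here is a small Python sketch. The midpoint and steepness values are placeholders I picked so the curve is nearly flat for recent games and close to zero by two years; the actual parameters are not given here.

```python
import math


def game_weight(age_days, midpoint_days=365.0, steepness_days=60.0):
    """Logistic decay of a game's weight with its age in days.

    The weight is close to 1 for recent games, exactly 0.5 at `midpoint_days`,
    and approaches (but never reaches) 0 for old games.
    Both parameter defaults are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp((age_days - midpoint_days) / steepness_days))
```

Unlike a linear decay, there is no cutoff date: a ten-year-old game still gets a positive weight, just one far too small to matter.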

Future plans

Measure of game dominance: Only wins and losses matter for now. That said, I plan to add some small measure of game dominance before next season.

Better regression of results for new teams: Undefeated teams really break the Bradley-Terry algorithm. The current ratings handle this problem better than v. 1.0 did, but it still needs work.
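Here is a small Python sketch of why undefeated teams are a problem. It fits a plain Bradley-Terry model with the standard iterative (minorization-maximization) update, and optionally adds "phantom" games against a fixed average opponent. The phantom-game trick is one common way to regularize the model and is not necessarily what these ratings do; all names are illustrative.

```python
def fit_bradley_terry(wins, teams, prior_games=0.0, iterations=200):
    """Estimate Bradley-Terry strengths.

    `wins[(a, b)]` is the number of times team a beat team b.
    `prior_games` phantom games (half wins, half losses) against an imaginary
    opponent of fixed strength 1.0 are added for every team. This is a
    generic regularization, used here only for illustration.
    """
    strength = {team: 1.0 for team in teams}
    for _ in range(iterations):
        updated = {}
        for i in teams:
            total_wins = prior_games / 2.0
            denom = prior_games / (strength[i] + 1.0)
            for j in teams:
                if i == j:
                    continue
                wins_ij = wins.get((i, j), 0)
                games_ij = wins_ij + wins.get((j, i), 0)
                total_wins += wins_ij
                if games_ij:
                    denom += games_ij / (strength[i] + strength[j])
            # Teams with no games at all keep their previous strength.
            updated[i] = total_wins / denom if denom > 0 else strength[i]
        strength = updated
    return strength


# Team A never loses; B and C split their games with each other.
teams = ["A", "B", "C"]
wins = {("A", "B"): 2, ("A", "C"): 2, ("B", "C"): 1, ("C", "B"): 1}
unregularized = fit_bradley_terry(wins, teams)
regularized = fit_bradley_terry(wins, teams, prior_games=2.0)
```

Without the phantom games, A's strength relative to B keeps growing the longer the fit runs, because no finite rating explains a perfect record. With them, the ratio settles at a finite value.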

The off-season discontinuity: Because of the way the time-weighted decay works, once new ratings are cranked out in August, after a break of a few months with no games, teams can get shuffled around a bit as older events lose importance. This isn’t the best look and I hope future work will lead to a solution to add stability between the last ratings of one season and the first ratings of the following season.

Other questions you might have

Why doesn’t (certain team) appear in the ratings? See the minimum games explanation above.

What comprises a team? A team is considered the same if it includes the same skip and one other player, or if the three non-skips are the same. For the purposes of naming the team, the player that has skipped the most games in the current season (or the previous season in the event the team has not played any games in the current season) is listed as the skip of the team.

Thus, at the beginning of the 2021-22 season, Chelsea Carey is listed as the skip of the team normally associated with Tracy Fleury, because Carey skipped most of the games for the team in 2020-21. Once Fleury takes the ice with the team in 2021-22 the listed name will change (though for ratings purposes the team is viewed as the same).
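The team-identity and naming rules above translate fairly directly into code. This is a sketch with a made-up data layout (a dict with a "skip" name and a set of "others"), and it reads "one other player" as "at least one other player". It is not the code behind the ratings.

```python
from collections import Counter


def same_team(a, b):
    """Decide whether two lineups count as the same team.

    Each lineup is a dict with 'skip' (a name) and 'others' (a set of the
    non-skip players' names). This layout is hypothetical.
    """
    shared_others = a["others"] & b["others"]
    # Rule 1: same skip plus at least one other player in common.
    if a["skip"] == b["skip"] and len(shared_others) >= 1:
        return True
    # Rule 2: all three non-skips are the same, whoever is skipping.
    return len(a["others"]) == 3 and a["others"] == b["others"]


def listed_skip(current_season_skips, previous_season_skips):
    """Name the team after whoever skipped the most games.

    Each argument is a list with one skip name per game played. The previous
    season is only consulted when no games have been played this season.
    """
    games = current_season_skips or previous_season_skips
    return Counter(games).most_common(1)[0][0]
```

In the Carey/Fleury example, `listed_skip([], ["Carey", "Carey", "Fleury"])` gives "Carey", and the answer flips to "Fleury" as soon as the current-season list contains a game she skipped.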

(Certain team) hasn’t played in a while. Why are they rated? Teams that have played roughly seven games in the last 12 months are rated. So theoretically, a retired team can hang around in the ratings for a while. At the start of the 2021-22 season, Elena Stern’s team is ranked 15th in the world, and yet Stern announced her retirement before the season. Under the current system, her team would continue to be rated until sometime into the new year. I hope to have a way to automatically remove teams like this once other players on the team have played for other teams.

In fact, a skip can only be rated with one team, so if a skip ends up skipping a different team to start the season, the skip’s old team is removed.

Where do you get data? CurlingZone does the scoring for the vast majority of important events and this project would not be possible without their work. For WCT events, and Canadian and world championships, I get the data from their respective sites. For most other cases, I get the data from CurlingZone.

What events do you use? For the most part, it’s whatever events I can get scores for. However, I do draw a line. The event needs to have something at stake: either a path to a provincial/national/world championship, or WCT points, or, barring those two things, a significant cash prize. I do not include something like Curling Night in America, where I assume teams get paid mostly for participating and there are no WCT points involved. In addition, I don’t include games played in skins format. On a related note, I require that end-by-end scoring is available for an event to be included.
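Written as a filter, the event criteria above look something like this Python sketch. The `Event` fields are hypothetical, and what counts as a "significant" cash prize is a judgment call that the sketch simply takes as a yes/no flag.

```python
from dataclasses import dataclass


@dataclass
class Event:
    """Hypothetical event record; the field names are illustrative only."""
    name: str
    championship_path: bool       # leads to a provincial/national/world championship
    wct_points: bool              # awards WCT points
    significant_purse: bool       # offers a significant cash prize
    skins_format: bool            # played as a skins game
    has_end_by_end_scores: bool   # end-by-end scoring is available


def is_eligible(event):
    """Apply the stakes, format, and scoring requirements described above."""
    has_stakes = (
        event.championship_path or event.wct_points or event.significant_purse
    )
    return has_stakes and not event.skins_format and event.has_end_by_end_scores
```

An exhibition with appearance fees but no points, no championship path, and no real purse fails the stakes test no matter how good the field is.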

One final restriction is that at least one team in an event must have played in another event used in the ratings. Otherwise, there is no way to connect the teams to the current set of rated teams.
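The connectivity requirement can be checked by starting from one event that is already in the ratings and repeatedly admitting any event that shares a team with what has been admitted so far. This is a sketch of that idea; the data layout and the choice of a seed event are assumptions.

```python
def usable_events(events, seed_event):
    """Return the events that connect, directly or indirectly, to `seed_event`.

    `events` maps an event name to the set of teams that played in it.
    An event is admitted once at least one of its teams has appeared in an
    already-admitted event.
    """
    known_teams = set(events[seed_event])
    accepted = [seed_event]
    pending = {name: teams for name, teams in events.items() if name != seed_event}
    changed = True
    while changed:
        changed = False
        for name in sorted(pending):  # sorted() copies, so popping is safe
            if pending[name] & known_teams:
                known_teams |= pending.pop(name)
                accepted.append(name)
                changed = True
    return accepted
```

An event whose teams never play anyone in the rated pool is left out, because its results say nothing about how those teams compare with everyone else.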

OK, there’s actually one more restriction. I don’t include games between men and women. I do include games between teams of the same gender in events where men’s and women’s teams play. It wouldn’t be that difficult to change this and I may do that before the season ends. It’s just that these games happen so rarely, and usually in under-15 junior events, so it’s not a priority at the moment.