Ranking Feature
Behind the Tennis Recruiting Rankings
by Dallas Oliver, 13 July 2016


Since August 2005, Tennis Recruiting has put out weekly rankings of American junior boys and girls. This week marks the 569th consecutive week with Tennis Recruiting rankings.
Rankings are front and center on the TennisRecruiting.net website, and we field questions every week about how our rankings work. Today we describe our ranking system in some detail, providing insight into its interesting mathematical properties and hopefully addressing these questions for everyone.
Ranking System Overview
The basis of our rating and ranking system is the Bradley-Terry model. Bradley-Terry calculates a rating value for each player, and these rating values have interesting mathematical properties:
 If you consider ratings for any pair of players, the ratio of those ratings produces an Expected Win Percentage (EWP).
 Summing up all the EWPs for a player gives you his/her Expected Number of Wins.
 If you consider any player's past results used to calculate a rating, the player's Expected Number of Wins in those matches will be exactly equal to his/her Actual Number of Wins.
That sounds a bit technical, so what does it all mean? Let's take each of these three points in turn.
(1) If you consider ratings for any pair of players, the ratio of those ratings produces an Expected Win Percentage (EWP).
Consider two fictitious players, Jane Doe and Wendy Indigo. Doe is rated 1,000, while Indigo has a 3,000 rating. When these two players meet, Jane Doe's EWP will be:
1,000 ÷ (1,000 + 3,000) = 0.25, or 25%
Likewise, Indigo's EWP will be:
3,000 ÷ (1,000 + 3,000) = 0.75, or 75%
Given two player ratings, we can easily come up with an Expected Win Percentage for each player.
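The arithmetic above can be written as a one-line function. This is a minimal sketch; the function name `expected_win_pct` is ours for illustration and is not taken from the Tennis Recruiting system.

```python
def expected_win_pct(rating: float, opponent_rating: float) -> float:
    """Bradley-Terry Expected Win Percentage against a single opponent."""
    return rating / (rating + opponent_rating)


doe, indigo = 1000, 3000
print(expected_win_pct(doe, indigo))   # 0.25
print(expected_win_pct(indigo, doe))   # 0.75
```

Note that the two EWPs for any matchup always add up to 100%.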
(2) Summing up all the EWPs for a player gives you his/her Expected Number of Wins.
We now consider four matches that Jane Doe has against opponents where she has a 75% EWP. The win percentage for each match is 75%, or 0.75, so the Expected Number of Wins in those four matches will be:
0.75 + 0.75 + 0.75 + 0.75 = 3.0
Even though Jane Doe is a favorite in all four matches, we expect only three total wins.
(3) If you consider any player's past results used to calculate a rating, the player's Expected Number of Wins in those matches will be exactly equal to his/her Actual Number of Wins.
This statement points out the true power of the Bradley-Terry model. If Jane Doe goes 8-6 in 14 matches, then the ratings for all players will be set such that the sum of Jane's EWPs for those 14 matches will be exactly 8.0. Note that this property holds for all players in the system.
The next section walks you through a more complete scenario ...
Detailed Example
Let's take a look at a year's worth of results for fictitious player Jane Doe. Jane has an 8-6 record, and, to keep things simple, let's again assume that her raw rating is 1,000.
The following table shows information about Jane's record against her 14 fictitious opponents over the past year:
W/L  Opponent          Raw Rating   EWP  Expected Wins
W    Bertha Red               818   55%           0.55
W    Dolly Blue               538   65%           0.65
W    Fay Green                538   65%           0.65
W    Hanna Yellow             333   75%           0.75
W    Josephine Orange         333   75%           0.75
W    Laura Purple             333   75%           0.75
W    Nadine Brown             176   85%           0.85
W    Paulette White            53   95%           0.95

L    Sally Black           19,000    5%           0.05
L    Vicky Pink             5,666   15%           0.15
L    Wendy Indigo           3,000   25%           0.25
L    Andrea Gray            1,222   45%           0.45
L    Chantal Magenta        1,222   45%           0.45

L    Erin Cyan                333   75%           0.75

8-6                                                8.0
In the table, the first two columns are self-explanatory: win-loss outcome and opponent name. We shade the wins and losses in the first column based on the EWPs: darker green indicates a stronger win, while darker red indicates a more unexpected loss.
The third column in the table shows raw player ratings, while the fourth and fifth columns show Jane's EWP and Expected Number of Wins, respectively, against each opponent. The first eight rows of the table show Jane's wins (which have the green shadings in the leftmost column), while the next six rows show her losses (which have the red shadings).
The final row of the table illustrates the fact that the Expected Wins equals the Actual Wins. Note in Column 1 that Jane went 8-6. If we sum up her Expected Wins in Column 5, we get:
0.55 + 0.65 + 0.65 + 0.75 + 0.75 + 0.75 + 0.85 + 0.95 + 0.05 + 0.15 + 0.25 + 0.45 + 0.45 + 0.75 = 8.0
Again, the Expected Number of Wins (8.0) is exactly equal to the Actual Number of Wins (8).
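The same check can be run directly from the raw ratings in the table. The sketch below recomputes each EWP from Jane's rating of 1,000 and her opponent's rating, then sums them; the variable names are illustrative. Because the ratings in the table are rounded to whole numbers, the total comes out very close to 8.0 rather than exactly 8.0.

```python
JANE = 1000

# (outcome, opponent, raw rating) as listed in the table
results = [
    ("W", "Bertha Red", 818), ("W", "Dolly Blue", 538), ("W", "Fay Green", 538),
    ("W", "Hanna Yellow", 333), ("W", "Josephine Orange", 333),
    ("W", "Laura Purple", 333), ("W", "Nadine Brown", 176),
    ("W", "Paulette White", 53),
    ("L", "Sally Black", 19000), ("L", "Vicky Pink", 5666),
    ("L", "Wendy Indigo", 3000), ("L", "Andrea Gray", 1222),
    ("L", "Chantal Magenta", 1222), ("L", "Erin Cyan", 333),
]

# Expected Wins = sum of Jane's EWP against every opponent
expected = sum(JANE / (JANE + rating) for _, _, rating in results)
# Actual Wins = number of rows marked "W"
actual = sum(1 for outcome, _, _ in results if outcome == "W")

print(actual, round(expected, 2))
```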
We hope this section provided an overview of how our ratings and rankings work and hinted at how they can be used. The next section shows some of the advantages of our rating system.
Discussion
We spent a lot of time exploring various rating systems to use for the Class Rankings at Tennis Recruiting, and we are very happy with the Bradley-Terry system that we have now. We believe that our ranking system has a number of advantages:
 The ratings are more predictive than any other system we have explored.
 The ratings have interesting mathematical properties with the EWPs.
 The system is straightforward to implement, producing ratings and rankings in a reasonable amount of time that are easy to verify.
Again, let's take these one at a time ...
(1) The ratings are more predictive than any other system we have explored.
As a company, we are happy with the ratings and rankings produced by the Bradley-Terry model. As a predictor of wins and losses, which we believe is a solid measure for any system, it consistently outperforms other models we have evaluated.
While it is impressive that our favorites win 78% of the time, we think it is even more impressive that tournament upsets are in line with our EWPs. As you can see in this analysis of last year's USTA National Championships, the actual win percentages are in line with our expected win percentages.
(2) The ratings have interesting mathematical properties with the EWPs.
This article has spent time discussing the EWPs and how the Expected Wins perfectly match Actual Wins. We have taken advantage of these interesting properties in several ways, including our forecasts for junior tournaments like the Asics Easter Bowl in April.
We also make use of these properties by producing EWP tables (like the one in the example section above for Jane Doe) for every single player in our system. These EWP tables are already available to college coaches, and we are working on making them available to our premium subscribers as well ... stay tuned.
(3) The system is straightforward to implement, producing ratings and rankings in a reasonable amount of time that are easy to verify.
Although the computation of the Bradley-Terry ratings would be hard to do without the help of a computer, the implementation of the algorithm is straightforward. Our system uses an iterative process: each week, the system starts by assigning all players a rating value of 1,000. (Note that when all players have the same rating, the system would expect all players to have equal numbers of wins and losses.) The system then determines which ratings need to be adjusted up or down based on the differences between their Expected and Actual Wins. A player whose Expected Wins is below Actual Wins needs to be adjusted up, and vice versa. After modifying all the player ratings, which modifies the Expected Wins, we check again. This process is repeated over and over until the Expected Wins equals the Actual Wins for all players. Each iteration of the process brings player ratings closer and closer to their correct values.
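For the technically curious, here is a minimal sketch of such an iterative process. It is an assumption on our part for illustration: it uses the textbook Bradley-Terry update (new rating = old rating × Actual Wins ÷ Expected Wins), which may differ from the adjustment rule in the live Tennis Recruiting system, and the names `fit_ratings` and `expected_wins` and the three-player data set are made up. It also assumes every player has at least one win and one loss.

```python
from collections import defaultdict


def fit_ratings(matches, start=1000.0, iterations=200):
    """Fit Bradley-Terry ratings to a list of (winner, loser) pairs."""
    wins = defaultdict(int)
    played = defaultdict(lambda: defaultdict(int))  # played[a][b] = matches between a and b
    for winner, loser in matches:
        wins[winner] += 1
        played[winner][loser] += 1
        played[loser][winner] += 1

    ratings = {p: start for p in played}  # everyone starts level
    for _ in range(iterations):
        updated = {}
        for p, opponents in played.items():
            # ratings[p] * denom is this player's current Expected Wins
            denom = sum(n / (ratings[p] + ratings[q]) for q, n in opponents.items())
            # Equivalent to: old rating * Actual Wins / Expected Wins
            updated[p] = wins[p] / denom
        # Only rating ratios matter, so rescale to keep the average at `start`
        scale = start * len(updated) / sum(updated.values())
        ratings = {p: r * scale for p, r in updated.items()}
    return ratings


def expected_wins(player, matches, ratings):
    """Sum of the player's EWPs over every match he or she played."""
    total = 0.0
    for winner, loser in matches:
        if player in (winner, loser):
            opponent = loser if player == winner else winner
            total += ratings[player] / (ratings[player] + ratings[opponent])
    return total


# Hypothetical results: A goes 4-2, B and C each go 2-3
matches = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C"), ("A", "C"),
           ("C", "A"), ("B", "C"), ("C", "B")]
ratings = fit_ratings(matches)
for p in "ABC":
    actual = sum(1 for w, _ in matches if w == p)
    print(p, round(ratings[p]), actual, round(expected_wins(p, matches, ratings), 3))
```

In this small example the process settles with A rated twice as high as B and C, at which point every player's Expected Wins matches his or her Actual Wins.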
As we mentioned above, it is easy to verify that the answer produced by our system is correct. As we showed for Jane Doe in the previous section, we can add up the EWPs for any player, and the total should equal the player's Actual Wins.
Arguing that a player is rated or ranked "wrong" would be the same thing as arguing that the player's win percentage or USTA PPR total is wrong. All metrics are what they are. One could argue that a player record is incomplete or incorrect (which is easily addressed) or that a certain metric is not a good measure of a player's quality, but those arguments are different from claiming that the answer is incorrect.
Discussion
We close by addressing several questions.
(1) Does your rating system predict the winner for any given match?
The Bradley-Terry model says nothing about which player will win. It expects the higher-rated player to win more often than not, but it also expects the lower-rated opponent to win some percentage of the time. Unless the EWP is 100%, the model does not choose a winner.
Even in the case where the higher-rated player has a 90% EWP, the system makes no definitive claims. Imagine 1,000 matches where the higher-rated player has a 90% EWP. The system expects the lower-rated player to win 100 of those matches. Which 100? The system has no idea.
Predicting which matches the underdog will win would be like predicting a roll of 6 on a standard die. Obviously the 6 will come up one-sixth of the time, but it is impossible to know which rolls.
One interesting thing to note is that every time an underdog with a 10% EWP wins, that result becomes part of the official record, at which point the ratings are recalculated and adjusted based on what actually happened.
Passing judgement on any rating or ranking algorithm simply because a higher-rated player loses is misplaced. There will always be upsets, and there is always some percentage chance the lower-rated player will win.
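The 1,000-match thought experiment can be simulated in a few lines. This is only an illustration using Python's standard random number generator; the seed value is arbitrary and the variable names are ours.

```python
import random

rng = random.Random(2016)  # fixed seed so the run is repeatable
favorite_ewp = 0.90

# True means the favorite won that match, False means an upset
outcomes = [rng.random() < favorite_ewp for _ in range(1000)]
upsets = outcomes.count(False)

print(upsets)  # somewhere near 100; the exact count depends on the seed
print([i for i, won in enumerate(outcomes) if not won][:5])  # which matches: unknowable in advance
```

Change the seed and the number of upsets stays near 100, but the matches in which they occur move around.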
(2) Can I independently implement Bradley-Terry and reproduce your ratings?
The Bradley-Terry model was first developed in 1952, and variants of it have been used in many different sports. Almost every implementation includes variations on the "vanilla" model, in particular because the vanilla implementation has issues with undefeated players, winless players, and players with very short records.
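To see why an undefeated player is a problem for the vanilla model, consider the sketch below. It applies the textbook Bradley-Terry update (an assumption for illustration, not necessarily our production code) to a made-up set of results in which player A never loses. No finite rating can make A's Expected Wins equal her Actual Wins, so her rating relative to the field keeps climbing with every pass.

```python
# Hypothetical results: A beats B and C; B and C split two matches.
matches = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "B")]


def update(ratings, matches):
    """One pass of the textbook update: new rating = Actual Wins / sum(1 / (r_p + r_opp))."""
    new = {}
    for p in ratings:
        wins = sum(1 for winner, _ in matches if winner == p)
        denom = sum(1 / (ratings[w] + ratings[l]) for w, l in matches if p in (w, l))
        new[p] = wins / denom
    return new


ratings = {"A": 1.0, "B": 1.0, "C": 1.0}
history = []
for _ in range(50):
    ratings = update(ratings, matches)
    history.append(ratings["A"] / ratings["B"])

print(round(history[9]), round(history[49]))  # the A-to-B ratio keeps growing and never settles
```

A winless player has the mirror-image problem: with zero Actual Wins, the same update drives the rating to zero.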
We have experimented with a variety of techniques to address these problems, and we are satisfied with the implementation that is live on our website today.
As we mentioned above, it is difficult to produce the rankings without the aid of a computer. But there are many websites that discuss implementations of Bradley-Terry for the technically savvy.
(3) The rating numbers you list for the Jane Doe example do not seem to match up with the numbers you list in your forecasts. Why the difference?
We list raw ratings in those tables, and the numbers can fluctuate wildly. Note in that table that the lowest rating is 53 while the largest rating is 19,000. That difference is due to the EWP properties. There are players who have a 99.99% chance of beating an opponent, and those raw rating differences are astronomical (and kind of depressing for the underdog).
For these reasons, we transform the raw ratings using a logarithmic scale to more palatable Power Ratings. You can see examples of Power Ratings for players in our analysis of 2016 Wimbledon.
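The sketch below shows how a logarithmic scale tames these numbers. It is purely illustrative: it uses a plain base-10 logarithm, and the function names are hypothetical. The actual scale and offset behind our Power Ratings are not described in this article.

```python
import math


def log_rating(raw: float) -> float:
    """Hypothetical transform: plain base-10 log (not the actual Power Rating formula)."""
    return math.log10(raw)


def ewp_from_logs(a: float, b: float) -> float:
    """EWP recovered from two log ratings; equals raw_a / (raw_a + raw_b)."""
    return 1 / (1 + 10 ** (b - a))


# The extremes of Jane Doe's table shrink to a narrow band
print(round(log_rating(53), 2), round(log_rating(19000), 2))

# A 99.99% EWP means the raw ratings differ by a factor of about 9,999
print(round(0.9999 / (1 - 0.9999)))

# No information is lost: Doe (1,000) vs. Indigo (3,000) is still 25%
print(round(ewp_from_logs(log_rating(1000), log_rating(3000)), 4))
```

On a log scale, the EWP depends only on the difference between two ratings rather than their ratio.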
(4) What data do you use to come up with these numbers?
The Tennis Recruiting database has many years of data from USTA and ITF junior tournaments. The current rankings we display on the TennisRecruiting.net website use the past twelve months of data from tournaments outlined in our FAQ.
Why twelve months? We use that time frame because most tournaments are annual events, and the twelve-month window allows players to replace matches that fall off their records with matches of similar quality. For example, the twelve-month window ensures that highly ranked American high school players will always have results from the most recent USTA National Championships from Kalamazoo or San Diego counting towards their ratings and rankings.
(5) This whole article seems to be about ratings. What are the rankings?
As we discuss in this ratings and rankings article, rankings are simply orderings of players by ratings. At Tennis Recruiting, we rank by graduation year, so we create one rank list for each class for the boys and girls.
Questions? Comments?
We have answered many questions about our rankings over the years via email, but this article is our attempt at a first-class description of what we do and how we do it. We welcome your comments below ...