Order Statistics and Tournaments

Sequential examination is not the only way to find the maximum element on a list. Another way to do this is to have the elements play in a tournament much like is done in athletic events. This is accomplished by comparing each pair of elements on the list and moving the largest on to the next round. An example appears below as figure 1.

Figure 1 - A Tournament of Integers

Note that round two featured the elements 17, 22, 87, and 37 since they were larger than their partners in the first round. Eventually 87 won the tournament since it was the largest of all the numbers and thus won in every round of the tournament.

Suppose that we were to design an algorithm to run the tournament and find the maximum. We begin with a list of n numbers and enter them in the tournament by placing them on the list of players. Then we have them play matches (by comparing pairs of them), and retain the winners on the list of players for to the next round. We continue doing this until only one element remains. The complete algorithm is presented in figure 2. For convenience we require the number of elements on the list to be a power of two.

Figure 2 - Tournament Algorithm

This looks good, but we need to argue that it indeed finds the maximum element on a list. We shall do this by arguing that at each round a certain number of elements are smaller than one of the participants in that round, and that in the final round, where the winner emerges, all other elements are smaller

Let us examine the while loop in the context of the picture in figure 1. The player lists of the program are just the elements in each round or vertical column of the picture. At the beginning we have 8 elements and we know nothing about them. After one round, 4 remain in competition and we know that each of them is larger than one other element since it had beat it to reach the second round. After the second round, 2 elements remain and each of them is larger than three other elements. (This is because each of the winners beat two elements and one of these beat another element in the first round.) For example: 22 is larger than three elements since it beat 17 and 14, and 17 beat 3.

The relationships between the round, number of elements which emerged from (or won in) that round (k), and the minimum number of elements smaller than those which emerged from that round are shown in the chart of figure 3.

 round k smaller 0 n 0 1 n/2 1 2 n/4 3 3 n/8 7 m n/2m 2m-1

Figure 3 - Tournament Parameter Relationships

Since the number of elements in the round is halved each time, the number of rounds is O(logn) and thus n/2logn or one element emerges from the final round, and this is larger than 2logn-1 or n-1 elements. For that reason it is the maximum on the list. Now, we know exactly why a tournament will select the maximum element. We still need to show that the algorithm does the same.

With this in mind, we can now form the loop invariant for the while loop of our tournament algorithm. It merely states that the elements about to compete are larger than at least a certain number of elements. Here it is in formal, precise terms.

INVARIANT: The elements on the player list are greater than at least (n/k)-1 of the elements on the original list.

All that remains is to show that the algorithm is correct and complete the implementation details about playing and posting the winners to the new player list. Here are the arguments for verifying the three conditions needed to show that the while loop is correct.

1. ENTRY: PRE Þ INV. Since k = n, then obviously all of these elements are larger than at least (n/k - 1) = (n/n – 1) or 0 elements.
2. EXECUTION: INV and (k > 1) and do(set(k) & play & post) Þ INV. Since k > 1 then at least one match is about to be played. Let us examine a match. Because of the invariant, both elements in the match are greater than at least (n/k - 1) others. This means that the winner must be greater than:
1. those it was greater than before (n/k -1),

2. the loser (1), and

3. those smaller than the loser (n/k - 1).

Adding this up, we find that the winner is larger than

2(n/k - 1) + 1 = 2n/k - 1 = n/(k/2) - 1.

This is exactly the number of elements it must be greater than for the invariant to be true for the next round, which is played between half as many elements as this round.

3. EXIT: INV and (k=1) Þ POST. If k equals 1 then by the invariant, the element remaining on the match list is greater than at least (n/k – 1) = (n/1 – 1) = n-1 elements on the list. This is the maximum.

All that remains is to fill in the details about playing and posting. Playing is just comparing the numbers and posting can be implemented by writing the winners to the top of the tournament list. This is done by using a new list (which we name p[]) as the player list and post the winners to the front of it. Examine the algorithm presented in figure 4.

Figure 4 - The Final Tournament Algorithm

Showing that the inner for-loop does indeed perform the playing and posting required for the tournament is left as an exercise.

Let us now turn to complexity. In each round, k compares are made. Adding them up produces the sum:

n/2 + n/4 + … + 2 + 1 = n - 1.

Thus the complexity is O(n) or linear.

Our next question concerns finding both the maximum and minimum. If we note that the first round separated the maximum candidates from the minimum candidates, we see that we need only to:

1. Finish the maximum tournament.

2. Play a minimum tournament between the n/2 first round losers.

This adds up to:

(n - 1) + (n/2 - 1) = 3n/2 - 2 compares.

Finding the maximum and second largest is a little more interesting. At first glance, we might award second place to the loser in the finals. After all, this is done in sports tournaments. One of the first people to protest this practice in writing was the Reverend Dodgson [Do83], an Oxford Don better known under his penname: Lewis Carroll.

So, after a little thought, we find that we might have to play two tournaments, one for the maximum and one for the second largest elements since the second best could be eliminated early on in the tournament. This would require (n - 1) + ((n - 1) - 1) or 2n -3 compares. But let us examine the size of the second tournament. First we shall recall the precise definition of the second largest element on a list.

Definition. The second largest element on a list is the element that is larger than every element except the maximum.

In terms of tournaments this means that the second largest can only be beaten by the maximum. Thus, there must have been a match between the largest and second largest during the tournament. Look back at figure 1. The second largest element is 56 and it was indeed knocked out by 87; in fact in the first round!

This means that the second place tournament can be played between the logn elements beaten by the maximum. Thus the number of compares needed to find the maximum and second largest is n + logn - 2.

[Do83] Dodgson, Rev. Charles Lutwidge. 'Lawn Tennis Tournaments'. 1883. Reprinted in The Works of Lewis Carroll, ed. Roger Lancelyn Green, Hamlyn Publishing Group Ltd., Middlesex, 1968.