Pity the prisoners

The Locker Puzzle

Background

I haven’t had all that much spare time recently and I’ve been using most of it thinking about/working on a medium sized project so I thought I’d take a break and write a simple little post about puzzles. It’s pretty light material but I hope it might be interesting to somebody.


I had the idea to write another puzzle post because one of my previous posts was discussed briefly in a google groups thread. The post discusses the strategy that some prisoners trying to escape from an eccentric jailer should adopt and in the course of the discussion in the google groups thread, another puzzle about some prisoners was mentioned. This second puzzle asks for the strategy which some prisoners should adopt in order to escape from an improbable situation involving numbered lockers. I noticed that although the solution was discussed in the thread, the proof that the solution is optimal was not given.


I first came across the locker puzzle a few years ago and wondered how to prove that the solution was optimal. I discovered that a proof of optimality appeared in the Mathematical Intelligencer, 28(1):28-31, 2006. Since the argument for optimality is quite nice and doesn’t seem to be obviously available on the web at the moment, I thought it might be worth writing about.

The puzzle

A group of 100 prisoners, numbered 1, 2, …, 100, are told that they will have to play a game with a jailer for their freedom. In the game, the jailer will take 100 tickets, numbered 1, 2, .., 100 and place exactly one of these tickets in each of 100 identical lockers, also numbered 1, 2, …, 100. The jailer will choose the locker for each ticket randomly (and uniformly). Once the jailer has distributed the tickets, he will select a prisoner and bring him into the locker room where he will invite him to open and inspect the contents of 50 lockers of his (the prisoner’s) choice, one by one. After this, the prisoner will be sent away and allowed no further communication with the other prisoners for the duration of the game. Following this, the jailer will select a new prisoner and offer him the same choice (the lockers that the first prisoner opened will all be closed again). The jailer will continue like this till each of the 100 prisoners has been allowed to inspect the contents of 50 lockers of their choice. If, at the end, each prisoner has succeeded in opening the locker containing the ticket with his number on it, all the prisoners will be granted their freedom. (Otherwise they will be forced to work out an excessively long notice period at Lee Overlay Partners, not an inviting prospect).


All of this is explained to the prisoners before they start the game. The question is, what strategy should they adopt and what is its probability of success?

Obedience is the solution

Naturally I encourage you to have a go at the puzzle before reading the solution. Read on only when you are sure you want to!


At first it sounds like the prisoners’ chances of success are pretty slim. After all, each prisoner’s chance of finding the ticket with his number is only 1/2 and so if the prisoners adopt a strategy in which the events for each prisoner finding his ticket are independent then their total probability of success is a miniscule 1/2^{100}. From this it is clear that the prisoners must adopt a strategy in which their individual successes are as far from independent as possible. The only trouble is, it seems difficult to arrange any significant deviation from independence given that the prisoners have no prior knowledge of the lockers’ contents and are allowed no communication once they enter the locker room.


The key to this problem is to realise that the space of strategies is larger than it appears at first (to many people anyway). Recall that the prisoners are allowed to open the 50 lockers one by one. As a result a prisoner may choose which locker to open based on the contents (i.e. the ticket numbers) of any previous lockers he may have opened. Once we have the idea of using the contents of one locker to decide which locker to open next, a simple strategy suggests itself which we call the obedient strategy. In the obedient strategy a prisoner starts by opening the locker with his number on it; he examines the number of the ticket in this locker, if it is his number then he has succeeded so he can open any other 49 lockers, if not then he opens the locker bearing the number of the ticket he has just found. He continues like this until either he has found his own number or he has opened 50 lockers.


The question is, what chance of freedom do the prisoners have if they adopt this strategy?

Success probability of obedient strategy

The jailer’s distribution of tickets corresponds to a permutation. To work out the probability of the prisoners’ success we must consider which permutations will defeat the obedient strategy. Recall that any permutation can be written uniquely (up to order) as a product of disjoint cycles. If there are n prisoners, then those permutations which will defeat the obedient strategy are those which contain a cycle of length > n/2 in their cycle decomposition. Let us count such permutations. If r > n/2 then there are \binom{n}{r} (r-1)! (n-r)! permutations containing a cycle of length r. Thus the probability of a uniformly random permutation containing a cycle of length > n/2 is:
q = \frac{1}{n!}\sum_{r=1+[n/2]}^n \binom{n}{r} (r-1)! (n-r)! = \sum_{r=1+[n/2]}^n 1/r
If we recall that \sum_{r=1}^n 1/r \to \ln(n) + \gamma as n \to \infty where \gamma is Euler’s constant, we then have:
q = \sum_{r=1}^n 1/r - \sum_{r=1}^{[n/2]} 1/r \to \ln(n) + \gamma - \ln(n/2) - \gamma = \ln 2 as n \to \infty
from which we see that the prisoners’ probability of success is p = 1 - q \simeq 1 - \ln 2 = 0.30685...


The good news is that the prisoners have greatly increased their chances of success from 1/2^{100} to about 30%. Unfortunately this is still not great and the bad news is that unfortunately this is the best they can do as we shall now see.

Optimality of obedient strategy

We have finally reached the section of this post which constitutes my reason for all this waffle! Before I give the argument for optimality of the obedient solution, I should repeat that it is taken from Eugene Curtin and Max Warshauer’s article in the Mathematical Intelligencer.

A modification of the game

We consider a modification of the jailer’s game in which a prisoner is required to continue opening lockers until he has found the ticket with his number on it. In this modified version of the game, the prisoners win iff no prisoner opened more than 50 lockers. Clearly this does not change the prisoners’ chances. Since it is convenient for the argument below we shall analyse this modified game in place of the original.

An alternative locker game

Now suppose that the prisoners had been asked to play an alternative game in which the rules were as follows. Prisoner number 1 is invited to start the game by opening lockers (according to any strategy he likes) until he finds ticket number 1. If there are any lockers still closed when he is finished, the lowest numbered prisoner whose ticket has not yet been revealed begins opening lockers until he finds the ticket with his number on it (note that the lockers which were opened by prisoner 1 are left open). This continues until all the lockers have been opened. In this game the jailer declares that the prisoners win provided no prisoner opened more than n/2 lockers.

The alternative game is strategy-neutral

A key observation regarding the alternative game is that it doesn’t matter what strategy the prisoners adopt; all strategies have the same chance of success. To see this suppose that the jailer writes down the ticket numbers in the order than they are discovered in the lockers as the prisoners play the game according to their chosen strategy. He will end up with a permutation of 1, 2, …, n and from this permutation he can see whether or not the prisoners succeeded. Furthermore all permutations are equally likely since each remaining ticket number is always equally likely at each point in the game (i.e. the prisoners have no way of influencing what the ticket number the next locker they open will contain). This means that those permutations which correspond to victory for the prisoners always occur with the same probability nomatter what strategy they adopt, i.e. their strategy is irrelevant.

Success probability for the alternative game

Let us consider the probability of the prisoners’ success for the alternative game. Since all permutations are equally likely, we just need to count those permutations which correspond to wins for the prisoners. In addition to the jailer’s method for representing the course of the game as a permutation as above, there is another natural way of associating a unique permutation to the course of a game. If prisoner 1 opened lockers containing tickets with numbers a_1, a_2, \ldots, a_k = 1 (in order) then we associate the k-cycle (a_1 a_2 \ldots a_k) to prisoner 1 and similarly for all prisoners who open lockers. We thus obtain a set of disjoint cycles, i.e. a permutation. Associating permutations to instances of the game this way, it is clear that the prisoners succeed iff the permutation has no cycles of length > n/2. The chance that the prisoners succeed is thus just the chance pn that a random permutation has no cycle of length > n/2. We have already noted that this is the chance that the obedient strategy succeeds in the original game (we also calculated that it is asymptotically 1 - \ln 2).

The alternative game dominates the original game

To finish, first note that any strategy for the modified original game can be employed in the alternative game. A prisoner implements his strategy for the modified original game in the alternative game by opening lockers as he would in the modified original game, except that he simply observes the contents of any lockers he encounters which are already open. Furthermore if a strategy succeeds for a particular distribution of tickets in the modified original game then it will succeed for that same distribution of tickets in the alternative game. This means that the probability of success for any strategy in the modified original game is bounded above by its probability of success in the alternative game, pn. This shows that the modified original game (and so also the original game) cannot have any strategy with a better chance than pn of success and since the obedient strategy attains this, it must be optimal.

Opening more than half the lockers

I must say I can’t help pitying the poor prisoners given that their best chance of success is a mere 30%. It would be fairer if each prisoner were allowed to open more than half the lockers. All of the above (except the calculation of pn) goes through if they are allowed to open a proportion x of the lockers. Also, if x > 1/2 then the calculation for pn also goes through and we have that p_n \to -\ln x as n \to \infty. For exmaple if the prisoners were allowed to open 60 lockers instead of 50 then their chances would be almost even at least.

Two ant puzzles



Now that I’ve got thinking about puzzles I can’t resist mentioning two other simple puzzles I like.

Ants on a metre stick

The puzzle

Suppose 100 ants, each of which walk at exactly 1cm/s, are to be placed on a metre stick and set walking in a given set of directions (imagine the the metre stick is in front of us running left/right so that each ant walks left or right). If an ant reaches the end of the metre stick he will fall off and if two ants meet, they both immediately turn around and continue walking. Note that eventually all the ants will fall off the metre stick. You are handed these (very well behaved) 100 ants and are required to distribute them along the metre stick (both in terms of position and initial direction of walk) so that it takes as long as possible till the last ant has fallen off the metre stick. The question is, how do you do this and how long is the maximum time?

Solution

I guess the reason I like this puzzle is that it’s one of those puzzles which becomes utterly trivial once you look at it the right way. Ask yourself this question: how would the puzzle be different if the ants simply passed through each other when they met instead of immediately turning around? A moment’s thought reveals that the answer to the puzzle would be unchanged since the distribution of ants on the metre stick very shortly after the meeting would be the same in each case (we regard all ants as identical). This means that we just need to answer the question in the case where the ants pass through each other and this is easy! All we do is to make sure we place one ant at an end of the metre stick and facing in and do what we like with the other 99. That’s it! and the maximum time is obviously 100s.


Another way to think about this is to associate a time to each ant which is how long it would take him to reach the end of the stick if he met no other ant. Then observe that when two ants meet (in the original version where they turn around) all that happens is that they swap the times. The solution is then clear.

An ant on a rubber band

The puzzle

Suppose a perfectly elastic, homogenous, infinitely extensible rubber band connects a Ferrari to a peg on a wall. Initially the rubber band is one metre long, the Ferrari is at rest and there is an ant on the peg (thus one metre away from the Ferrari) about to start walking towards the Ferrari at his maximum speed of 1cm/s. However as soon as the ant begins to walk the Ferrari accelerates (instantly) to 100km/h (which, being a Ferrari, it is able to maintain no matter how hard the rubber band pulls). The question is, can the ant ever reach the Ferrari and if so, how long does it take?

The solution

This isn’t so much a puzzle as a simple calculation but I still like it. The solution is as follows. Let’s denote the ant’s speed by w=1cm/s and the Ferrari’s speed by v=100km/h. Let x denote the fraction of the total length of the rubber band that the ant’s position represents and let l denote the length of the rubber band. Working in the frame of reference of the point of the rubber band that the ant is currently at we have:
w = l\dot x
(We are representing differentiation with respect to time by a dot.) If l0 = 1m is the initial length of the rubber band then we have l(t) = l_o + vt and so:
\dot x = \frac{w}{l_0 + vt}
which, recalling that x(0) = 0, has solution:
x(t) = wv^{-1}\ln(1 + vt/l_0)
From this we can calculate how long till x(t) = 1 as
t = v^{-1}l_0(e^{v/w}-1)
Using our values for v, w, l0, x we find that the ant does indeed reach the Ferrari but not for some time, i.e. about 2.7 x 101197 years.


As I finish writing this, I can’t help wondering how the relativistically correct version of this puzzle might work out. Unfortunately I’m already running late and must head out but maybe I’ll include a discussion in a future post!

About these ads

4 thoughts on “Pity the prisoners

  1. Pingback: Eugene Curtain and Max Washauer « The Lumber Room

  2. These are some very entertaining posts. I think you have a small typo in the “success probability of obedient solution” section. Specifically the final bit:

    q = \Sigma_{r=1}^{n} 1/r – \Sigma_{r=1}^{[n/2]}

    the second summation is missing quantity intended to be summed (1/r). Either that or I’m lost.

  3. Pingback: Cool things to spend time on | umjkh

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s