In front of you is a locked door with a code panel. There are buttons marked and , and the code to unlock the door is a combination of at most letters. The code panel remembers the last few buttons pressed, and if they match the correct code, the door will open. What is the shortest sequences of letters you can use to guarantee you have included the correct code?
This problem is shared by Arthur M. , who obtained it from Pål Hermunn Johansen in an IMO training camp.
Details and assumptions
You do not know whether the code length is or , but the code panel does, and it remembers only the last buttons pressed.
As an explicit example, if the code is the 3 letters , then the door will be unlocked right at the end of typing in the following sequence: .
This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try
refreshing the page, (b) enabling javascript if it is disabled on your browser and,
finally, (c)
loading the
non-javascript version of this page
. We're sorry about the hassle.
It is easy to see that each code of length 1 , 2 or 3 is a subsequence of (at least) one code of length 4 , so we really only need to try out all the distinct codes of length 4 . There are altogether 5 4 = 6 2 5 possible codes of length 4 . Since a sequence of n letters ( n > 3 ) contains n − 3 (not necessarily distinct) codes of length 4 , we need a minimum of 6 2 5 + 3 = 6 2 8 letters.
Now we shall prove that it is possible to find a sequence of 6 2 8 letters which contains 6 2 5 distinct codes of length 4 . We shall construct a directed graph (with loops and multiple arcs) whose vertices are all the possible 3 -letter sequences, and there is a directed edge from vertex u to vertex v if the second and third letters of u are the first and second letters of v respectively. Thus, each vertex would have 5 edges directed outwards (corresponding to different third letters of its direct successor) and 5 edges directed inwards (corresponding to different first letters of its direct predecessor). We can represent any sequence of n > 3 letters as a walk on the digraph as follows: the first three letters determines which vertex the walk begins with, the second to fourth letter determines the next vertex visited, and so on. From the way the digraph is defined, we will always be able to trace a walk on the digraph. Now there is a bijection between the 4 -letter codes and the edges in the digraph - each 4 -letter code P Q R S can be associated with the edge from P Q R to Q R S and vice versa. So there are 6 2 5 edges in the digraph, and we want to find a walk which traverses every single edge of the graph. For this, we shall make use of the well-known fact that a directed graph has an Eulerian cycle if and only if every vertex has equal in-degree and out-degree, and all of its vertices with nonzero degree belong to a single strongly connected component. We know that every vertex has both in-degree and out-degree of 5 , and that the entire graph is a strong connected component since it is always possible to find a path from any vertex P Q R to any other vertex T U V : P Q R → Q R T → R T U → T U V . So the digraph has a Eulerian cycle, i.e. we can find a trail that visits each of the 6 2 5 edges exactly once (and returns to the original vertex). By following this trail, we will be able to construct the sequence of 6 2 8 letters which contains the 6 2 5 distinct codes of length 4 .
Therefore, our answer is 6 2 8 .
Note that this problem is related to the De Bruijn sequence , and we're actually looking at B ( 5 , 4 ) (but we need to add 3 because the De Bruijin sequence is cyclic).