Arthur's code

In front of you is a locked door with a code panel. There are 5 5 buttons marked A , B , C , D A, B, C, D and E E , and the code to unlock the door is a combination of at most 4 4 letters. The code panel remembers the last few buttons pressed, and if they match the correct code, the door will open. What is the shortest sequences of letters you can use to guarantee you have included the correct code?

This problem is shared by Arthur M. , who obtained it from Pål Hermunn Johansen in an IMO training camp.

Details and assumptions

You do not know whether the code length n n is 1 , 2 , 3 1, 2, 3 or 4 4 , but the code panel does, and it remembers only the last n n buttons pressed.

As an explicit example, if the code is the 3 letters A A B AAB , then the door will be unlocked right at the end of typing in the following sequence: B A B B A B A A B BABBABAAB .


The answer is 628.

This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try refreshing the page, (b) enabling javascript if it is disabled on your browser and, finally, (c) loading the non-javascript version of this page . We're sorry about the hassle.

3 solutions

Derek Khu
Aug 5, 2013

It is easy to see that each code of length 1 , 2 1, 2 or 3 3 is a subsequence of (at least) one code of length 4 4 , so we really only need to try out all the distinct codes of length 4 4 . There are altogether 5 4 = 625 5^4 = 625 possible codes of length 4 4 . Since a sequence of n n letters ( n > 3 n > 3 ) contains n 3 n-3 (not necessarily distinct) codes of length 4 4 , we need a minimum of 625 + 3 = 628 625+3=628 letters.

Now we shall prove that it is possible to find a sequence of 628 628 letters which contains 625 625 distinct codes of length 4 4 . We shall construct a directed graph (with loops and multiple arcs) whose vertices are all the possible 3 3 -letter sequences, and there is a directed edge from vertex u u to vertex v v if the second and third letters of u u are the first and second letters of v v respectively. Thus, each vertex would have 5 5 edges directed outwards (corresponding to different third letters of its direct successor) and 5 5 edges directed inwards (corresponding to different first letters of its direct predecessor). We can represent any sequence of n > 3 n>3 letters as a walk on the digraph as follows: the first three letters determines which vertex the walk begins with, the second to fourth letter determines the next vertex visited, and so on. From the way the digraph is defined, we will always be able to trace a walk on the digraph. Now there is a bijection between the 4 4 -letter codes and the edges in the digraph - each 4 4 -letter code P Q R S PQRS can be associated with the edge from P Q R PQR to Q R S QRS and vice versa. So there are 625 625 edges in the digraph, and we want to find a walk which traverses every single edge of the graph. For this, we shall make use of the well-known fact that a directed graph has an Eulerian cycle if and only if every vertex has equal in-degree and out-degree, and all of its vertices with nonzero degree belong to a single strongly connected component. We know that every vertex has both in-degree and out-degree of 5 5 , and that the entire graph is a strong connected component since it is always possible to find a path from any vertex P Q R PQR to any other vertex T U V TUV : P Q R Q R T R T U T U V PQR \rightarrow QRT \rightarrow RTU \rightarrow TUV . So the digraph has a Eulerian cycle, i.e. we can find a trail that visits each of the 625 625 edges exactly once (and returns to the original vertex). By following this trail, we will be able to construct the sequence of 628 628 letters which contains the 625 625 distinct codes of length 4 4 .

Therefore, our answer is 628 628 .

Note that this problem is related to the De Bruijn sequence , and we're actually looking at B ( 5 , 4 ) B(5,4) (but we need to add 3 3 because the De Bruijin sequence is cyclic).

Moderator note:

This proof gives an excellent example of how we can use graph theory to solve problems in other areas of combinatorics.

That's a wonderful solution, I learned a lot by reading it! Thanks!

Tim Vermeulen - 7 years, 10 months ago
Michael Tong
Aug 5, 2013

If all of the length-four codes are included, then necessarily all of the length-one, two, and three codes are also included. The codes can be written in such a way that each consecutive quadruplet of 4 letters correspond to a unique code. Thus the minimum number of letters is 3 + 5 4 3 + 5^4 , or 628 628 .

Didn't think that I could publish this since I already published another solution so it was a bit incomplete. Let me elaborate a little bit.

The claim ("The codes can be ... to a unique code") is substantiated with knowledge of De Brujin sequences which show that it is possible for each substring of a length k k in a sequence with an alphabet of n n to be unique under any ( k , n ) (k, n) .

In addition, we know that 628 628 is minimized because there are only a total of 625 625 substrings of length 4 4 counted with multiplicity in this sequence. Stated differently, if there were only 627 627 letters included, then we could only have 624 624 substrings of length 4 4 , thus it is impossible that we could have all 625 625 distinct substrings of length 4 4 .

Michael Tong - 7 years, 10 months ago
Mark Gil Mangao
Aug 6, 2013

This problem is the same as finding the length of the shortest string which contains all the 1-letter, 2-letter, 3-letter and 4-letter permutations of {A, B, C, D, E} , with repetition. Now, let's begin by exhausting all the 1-letter permutation of {A, B, C, D, E} . We could simply do that by making a string of length 5 which contains all the letters in the set. Note that in this string, we have already made 4 2-letter permutations of the set. So we only need to exhaust the remaining 21 2-letter permutations and we can do that by adding more letters in our beginning string. Since we are looking for the shortest string, we add the extra letters such that the 4 2-letter permutations prior to the beginning string would not appear again. For example, if we begin by the string ABCDE , aside from the fact that it exhausts all possible 1-letter code, it also covers the 2-letter codes AB , BC , CD and DE . So the remaining 21 2-letter codes can be exhausted by adding 21 more letters in the 5-letter string, which in our example is ABCDE , without repeating the consecutive AB , BC , CD , DE . By now, we already have a string of length 26 . This string already covers 24 3-letter codes which means that we need to add more letters to the 26-letter string to exhaust the remaining 101 3-letter codes and this could be done by adding 101 more letters. This gives us a string of length 127 which already covers 124 4-letter codes, which leaves 501 4-letter codes. Thus, the length of the shortest string which guarantees to have included the correct code is 127 + 501 = 628.

0 pending reports

×

Problem Loading...

Note Loading...

Set Loading...