Information Theory is the study of how to quantify information. First proposed by Claude Shannon in 1948, in the context of exploring the limitations of data compression, Information Theory plays an important role in many modern technologies, in terms of both what can and cannot be done with data. Information Theory also forms the basis of many purely recreational activities such as magic tricks (one popular example is here though there are many others. Maybe someday I will write a note about some others) and games. (Ever play MasterMind? Twenty Questions? P.I.? Hanabi?)
In the following sequence of problems I have tried to illustrate some of the simplest principles of Information Theory through a sequence of guessing games.
The fundamental unit of information is a bit. A bit (as you probably know) is something than can take one of two values, 0 or 1. (Or, if your want to look at it from the signal process/electrical engineering angle, On or Off. Or True or False, to take a Boolean Logic perspective.) If you are trying to distinguish between different things, then you require bits to describe the distinction.
58% of people got this right.
Alice thinks of a natural number between 1 and 10,000 (inclusive). Bob must guess it. Each time Bob makes a guess, Alice tells him whether it is too small, too big, or correct. If it is not correct, Bob guesses again.
How many guesses does Bob need in order to correctly identify Alice's number?
Clarification: It is not enough for Bob to figure out Alice's number. He must actually say it. That is, Bob's last guess must be the right answer.
From Guessing games
But wait a minute! Alice's answers above could be one of three things, "too small", "too big" or correct. This is true. However, the only advantage you gain from a response of "correct" is that you can stop asking questions. So effectively, for all but the last question, the answers have only possibilities, "too small" and "too big", and Bob's best strategy is to make guesses that divide the space of possible answers into two intervals as equally as possible. Indeed if Alice's answers took the form "guess < hidden number" or "guess hidden number", the choice is truly binary, and the resulting game is nearly (but not exactly) the same.
38% of people got this right.
Bob: Hey Alice, lets play "Guess a number".
Alice: Oh I couldn't be bothered...
Bob: Oh come on, please?
Alice: Fine, I'll play. But you must submit your guesses in batches of 4. Except at the end. On your last attempt you can guess just one number, but if you are wrong you lose.
Bob: Oh, OK. But what will you tell me about my guesses when I make 4 at once?
Alice: The usual. Whether each one is too small, too big, or correct. Tell you what, I'll make it a bit easier. I'll choose a number between 1 and 6249.
%%%
So Alice thinks of a number between 1 and 6249.
Bob may play as many rounds of guessing four numbers as he wishes. [Note: the four guesses need not be distinct, but they will count for four guesses]. On the final round he must commit to a single number that he believes is Alice's number.
How many guesses does Bob need in order to correctly identify Alice's number and guarantee a win?
Remember, your answer must be 1 mod 4.
Follow up to this problem
From Guessing games
In both the above problems, some of the power comes from adaptivity. Bob gets to tailor his next guess based on Alice's answer to the previous one. If Bob cannot do that, then it turns out the only way he can "guess" Alice's number is by guessing all the numbers. However, guessing a number that splits the interval into two intervals is only one kind of binary question Bob could ask. If Bob were allowed to ask a different kind of binary question, things could be different.
65% of people got this right.
Bob: Hey Alice, lets play "Guess a number"
Alice: Not again! Fine. This time why don't you tell me all your guesses at once.
Bob: Come on, where's the fun in that? The only way I can be sure to guess your number is if I guess all 10,000 numbers.
Alice: OK fine, two rounds and we'll only play to 500. You get one chance to submit a bunch of guesses, and I will respond to them. Then you must commit to a single number and if it is not my number you lose.
Bob: Hmm. That's still a bit absurd, but I'll play if you agree to a slight change of rules. Instead of guesses, I will give you a list of questions in the style of "Twenty Questions". You return my list with all my questions answered with "yes" or "no". Then I will commit to a single number. Deal?
Alice: Deal.
%%%
So Alice thinks of a number between 1 and 500.
Bob makes up a list of questions.
Alice receives the list, answers all the questions with yes or no and returns the list to Bob.
Now Bob must guess Alice's number, and if he is wrong he loses.
How many questions does Bob need to ask (non-adaptively) in order to correctly identify Alice's number and guarantee a win?
Full disclosure: This problem was edited to improve correctness. Thank you to @Brian Moehring
Follow up to this problem
From Guessing games
What about missing information? How can we deal with that?
47% of people got this right.
So, Alice thinks of a number between 1 and 1000, and Bob makes up a list of questions.
Alice then receives the list, selects one question to leave blank, answers all of the other questions with yes or no, and returns the list to Bob.
Now Bob must guess Alice's number and, if he is wrong, he loses.
How many questions does Bob need to ask (non-adaptively) in order to correctly identify Alice's number and guarantee a win?
In the above problem, Bob can deal with one missing bit. Erasure codes (about which I will write a note someday...) study how to encode messages so as to be able to recover them even when some of the bits may be "erased".
But what if some of the information were to be "corrupted" rather than "erased"? That is, instead of seeing a blank for some piece of information, you see an incorrect answer. The problem is that when your answer choices are binary, an incorrect answer looks just like a correct answer, so you don't even have a helpful clue as to where the problem has occurred. Perhaps surprisingly, there are still ways to add redundancy to the information so that one can recover from a small number of corruptions. The next two problems ask you to come up with a scheme for recovering from one corruption. The general design of such redundancy and the its limitations comes under the topic of Error Correcting Codes.
40% of people got this right.
So, Alice thinks of a number between 1 and 1000, and Bob makes up a list of 15 questions.
Alice then receives the list, answers all of the questions with yes or no, with the guarantee that at least 14 are answered truthfully, and returns the list to Bob.
Now, Bob must guess Alice's number.
What is the maximum probability that Bob correctly guesses Alice's number? (Assume that he chose an optimal set of 15 questions.)
Clarification: Bob may not ask self-referential questions (e.g. "Is your answer to this question a lie?"). No logical paradoxes, please! Questions asking about the truthfulness of answers to other questions are allowed.
71% of people got this right.
Alice and Bob play the 1-lie version of the numbers guessing game again. But this time, although Bob still has 15 questions, Alice's number is between 1 and 2000.
For those who missed the previous problems in this series, the details:
Alice thinks of a number between 1 and 2000, and Bob makes up a list of 15 questions.
Alice then receives the list, answers all of the questions with yes or no, with the guarantee that at least 14 are answered truthfully, and returns the list to Bob.
Now, Bob must guess Alice's number.
Is it possible for Bob to uniquely identify Alice's number with 15 questions?
Clarification: Bob may not ask self-referential questions (e.g. "Is your answer to this question a lie?"). No logical paradoxes, please! Questions asking about the truthfulness of answers to other questions are allowed.
Easy Math Editor
This discussion board is a place to discuss our Daily Challenges and the math and science related to those challenges. Explanations are more than just a solution — they should explain the steps and thinking strategies that you used to obtain the solution. Comments should further the discussion of math and science.
When posting on Brilliant:
*italics*
or_italics_
**bold**
or__bold__
paragraph 1
paragraph 2
[example link](https://brilliant.org)
> This is a quote
\(
...\)
or\[
...\]
to ensure proper formatting.2 \times 3
2^{34}
a_{i-1}
\frac{2}{3}
\sqrt{2}
\sum_{i=1}^3
\sin \theta
\boxed{123}
Comments
There are no comments in this discussion.