Coin Flipping Game

Alice and Bob are playing a coin flipping game. Each of them will flip their own quarter 100 times and note the outcome of each flip on a line, in their own scoresheet: H H for heads and T T for tails.

If on any line Alice marks H H HH , she will give herself 1 point, begin a new line, and continue flipping.

If on any line Bob marks H T HT , he will give himself 1 point, begin a new line, and continue flipping.

After 100 flips each, who is more likely to have more points?

Clarification: Here is an example of a possible scoresheet for Alice after 25 flips, where she scores 3 points.

  • H T T H H HTT \boxed{HH}
  • H T T T T H T H T H T T H H HTTTTHTHTHTT \boxed{HH}
  • T T T H H TTT\boxed{HH}
  • T T

Taken from a numberphile video

Alice Bob Neither Alice nor Bob

This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try refreshing the page, (b) enabling javascript if it is disabled on your browser and, finally, (c) loading the non-javascript version of this page . We're sorry about the hassle.

3 solutions

Brandon Monsen
Feb 11, 2017

A Qualitative approach:

For Bob, as soon as he hits an H H , he will earn a point on the next T T . He has a 50 % 50 \% chance of winning every throw he doesn't win after the first H H in a given line.

However for Alice, her odds "reset" if she doesn't hit a second H H after her first. This means she must land another heads before her next 50 % 50 \% chance of winning.

Hence, If we view condition 1 1 as either person flipping H H , and condition 2 2 as scoring a point after condition 1 1 is met, then Bob's condition 2 2 failure IS condition 1 , whereas Alice's condition 2 2 failure is NOT condition 1 .

Each condition is equally likely to be satisfied from either player, so this makes Bob \boxed{\text{Bob}} more likely to score more points.

As @Calvin Lin said below, the likelihood of seeing H H HH or H T HT is exactly identical, but it is only when looking at probability density in the long run, and the fact that we are using 2 different strings of flips, that we see the odds tip towards Bob

Nice solution.
What are then the expected points for Alice and Bob?

Abdelhamid Saadi - 4 years, 3 months ago

Log in to reply

Since we expect Bob to score a point every 4 4 flips, we can then say that we expect him to score 25 25 points, since his point total doesn't affect his probability.

Same goes for Alice, since her EV is 6 6 flips per point, so we would expect she gets 100 6 = 16.6667 \frac{100}{6}=16.6667 points.

Brandon Monsen - 4 years, 3 months ago

Log in to reply

But I think we should take in account the fact that the last point doesn't follow the same rule.

Abdelhamid Saadi - 4 years, 3 months ago

Log in to reply

@Abdelhamid Saadi Which rule are you referring to? The one where they start a new line?

If so, perhaps a better question would be if I have 100 beads and I expect to put n n in a single line. how many lines would I expect to make?

If you meant "and continue flipping" being broken, since they would stop on the 100th flip, they would still get the point. I just wanted to show that they don't stop after recording a point. Perhaps I should reword that?

Brandon Monsen - 4 years, 3 months ago

Log in to reply

@Brandon Monsen What I mean the last line don't need be end by 'HH' or 'HT', while the other lines need to.

Abdelhamid Saadi - 4 years, 3 months ago

Log in to reply

@Abdelhamid Saadi Ah I see. The nice thing about Bob is that his expected number of flips is 4 4 , meaning every four flips we would expect him to score one point. This is nicely divisible by 100 100 so we should actually be able to expect that his last group of four results in a point.

I believe the same thing goes for Alice, but probability has always been one of my weaker branches of math.

Sorry I can't give a good answer.

Brandon Monsen - 4 years, 3 months ago

I think there is a typo. Bob is unlikely to score 100 points.

Bob Kadylo - 4 years, 3 months ago

Log in to reply

I don't believe I wrote he was likely to score 100 points, just that he was more likely to score more points after 100 flips?

Brandon Monsen - 4 years, 3 months ago

Look lower - I quote: "@Abdelhamid Saadi – Since we expect Bob to score a point every 4 flips, we can then say that we expect him to score 100 points." I think you meant to say 25 points.

Bob Kadylo - 4 years, 3 months ago

Log in to reply

Yes you're right. Thanks for pointing that out!

Brandon Monsen - 4 years, 3 months ago

I have some concerns reading through this.

  1. What does "more likely to have more points" mean? My interpretation is to look at P ( B > A ) P ( B > A ) . Note that being more likely does not imply that (on average) E [ B ] > E [ A ] E [ B ] > E[A] . For example, if we had P ( B = 0 ) = 0.99 , P ( B = 100000 ) = 0.01 , P ( A = 1 ) = 1 P( B = 0) = 0.99, P (B = 100000) = 0.01, P(A = 1 ) = 1 , then it is more likely for A to have more points, but expected value of B is still much larger than A.

  2. Calculating the expected number of flips as a proxy for the expected number of points (in the long run) doesn't always work, especially for dependent events. For example, even though the likelihood of getting HHHHH is 1 32 \frac{1}{32} , the expected number of coin tosses before we get 5 consecutive heads is not close to 32, even after accounting for the initial flips) . In general, for positively correlated indicator events, the expected number of observations before a positive case is observed, is going to be larger than 1 / probability of it occurring. (It is equal for independent indicator events, which is what led to this false belief that it is equal for all events.)

The problem and ideas need to be clarified further.

Calvin Lin Staff - 4 years, 3 months ago

Log in to reply

If by looking at P ( B > A ) P(B>A) , you meant the probability that Bob scores more points than Alice, that is what I meant. I'm not sure how to word it differently other than maybe explicitly say "after 100 flips, is Alice or Bob more likely to have the higher score?"

Also, I see what you mean about EV not correlating to likely winner. Just because it's expected, doesn't mean A) that the probabilities of having it occur on that flip are the same and B) that the "decay" or "increase" in probability of the event to occur increase/decrease in the same way.

However, is that not what I did for the qualitative approach? When I broke the point scoring down into conditions all I was doing was showing that one an H is landed, Alice's odds decrease on average over time and Bob's stay the same until they score a point for themselves.

At this point I'm not sure whether or not half of my solution or my entire solution is wrong/incomplete.

Brandon Monsen - 4 years, 3 months ago

Log in to reply

  1. If you're going for P ( B > A ) P ( B > A ) , then what you have to do is look at the probability density function for P ( X = x ) P(X = x ) . As listed in the comment above, it is possible for P ( A > B ) > 1 2 P ( A > B ) > \frac{1}{2} while E [ B ] > E [ A ] E[B] > E[A] .

  • "However, is that not what I did for the qualitative approach?" Unfortunately no, that is not what you did.

  • Consider the following question. We flip a coin repeatedly till we get HH or HT. Which are we more likely to get?

    Calvin Lin Staff - 4 years, 3 months ago
    Dmitry Nikolaev
    Jun 26, 2017

    The solution becomes evident if we look at expected values of the lengths of “dry runs” for Alice and Bob before we score. For Alice we have the following system of equations:

    E 0 = 1 2 ( 1 + E H ) + 1 2 ( 1 + E T ) E T = 1 2 ( 1 + E H ) + 1 2 ( 1 + E T ) E H = 1 2 ( 1 + E T ) \displaystyle E_0 = \frac{1}{2} (1 + E_{H}) + \frac{1}{2} (1 + E_{T}) \\ \displaystyle E_T = \frac{1}{2} (1 + E_{H}) + \frac{1}{2} (1 + E_{T}) \\ \displaystyle E_H = \frac{1}{2} (1 + E_{T})

    For Bob, however, it looks as follows:

    E 0 = 1 2 ( 1 + E H ) + 1 2 ( 1 + E T ) E T = 1 2 ( 1 + E H ) + 1 2 ( 1 + E T ) E H = 1 2 ( 1 + E H ) \displaystyle E_0 = \frac{1}{2} (1 + E_{H}) + \frac{1}{2} (1 + E_{T}) \\ \displaystyle E_T = \frac{1}{2} (1 + E_{H}) + \frac{1}{2} (1 + E_{T}) \\ \displaystyle E_H = \frac{1}{2} (1 + E_{H})

    Solving for E 0 E_{0} in both cases we see that in Alice’s case it’s 5, but in Bob’s case it’s 3, so Alice will on average score on every 6th flip and have an expected score of 16.7, while Bob will score on every 4th flip and have an expected score of 25 (which is very easy to verify by a simulation).

    Calvin Lin Staff
    Feb 28, 2017

    [This solution is not complete.]

    Lemma: (Alice) Consider the game where we flip coins until we first get a HH. What is the probability density function for N H H N_{HH} , the number of coin flips?

    In order for there to be exactly n n flips, we need:

    • For the last 2 flips, it is a HH
    • For the n 2 n-2 flip, it is a T (otherwise we just need n-1 flips)
    • For the first n 3 n-3 flips, there are no two consecutive heads. Recall (not proving here) that the number of ways to get no two consecutive heads is F n 1 F_{n-1} , the Fibonacci number.

    Thus, P ( N = n ) = F n 1 2 n P ( N = n ) = \frac{ F_{n-1}} { 2^n} . A quick sanity check shows that the sum of these probabilites is 1, since n = 2 F n 1 2 n = 1 \sum_{n=2}^{\infty} \frac{ F_{n-1}} { 2^n} = 1 . _\square

    Lemma: (Bob) Consider the game where we flip coins until we first get a HT. What is the probability density function for N H T N_{HT} , the number of coin flips.

    In order for there to be exactly n n flips, we need:

    • For the last 2 flips, it is a HT
    • For the first n 2 n-2 flips, there is no HT. Once we get a H, it has to be H all the way. There are n 1 n-1 ways for the first H to appear.

    Thus, P ( N = k ) = n 1 2 n P (N=k) = \frac{n-1} { 2^n} . A quick sanity check shows that the sum of these probabilites is 1, since n = 2 n 1 2 n = 1 \sum_{n=2}^{\infty} \frac{ n-1 } { 2^n} = 1 . _\square

    Let's summarize these results

    2 3 4 5 6 7 \ldots
    HH 1 4 \frac{1}{4} 1 8 \frac{1}{8} 2 16 \frac{2}{16} 3 32 \frac{3}{32} 5 64 \frac{5}{64} 8 128 \frac{8}{128} \ldots
    HT 1 4 \frac{1}{4} 2 8 \frac{2}{8} 3 16 \frac{3}{16} 4 32 \frac{4}{32} 5 64 \frac{5}{64} 6 128 \frac{6}{128} \ldots

    From here, (not proven) since P ( N H T k ) P ( N H H k ) P ( N_{HT} \leq k ) \geq P(N_{HH} \leq k ) shows that Bob is more likely to have a shorter cycle, hence he is more likely to have more points than Alice.

    (This is where the solution is incomplete) To properly do this, we have to calculate the PDF for the number of points that they get. For example, note that the probability they would get 50 points (all lines have length 2) is the same, but the probability that they will get 49 points favors Bob (2 lines of 3 or 1 line of 4). We then calculate P ( B > A ) P ( B > A ) .

    @Brandon Monsen This is how I would approach the problem if asked about P ( B > A ) P(B>A) .

    In particular, I think what that a sufficient condition is P ( N H T k ) P ( N H H k ) P ( N_{HT} \leq k ) \geq P ( N_{HH} \leq k ) (but clearly not a necessary condition).

    Calvin Lin Staff - 4 years, 3 months ago

    I looked at your question you posted and what you showed me so far makes sense except for one thing.

    In your solution, you said for Alice that she had to flip n n coins without a H H HH appearing, and for bob he had to flip n n coins, and after his first head have all the rest be heads before the final flip.

    That's what I was trying to show in the qualitative solution. That after the first H H , bob will have an equal to or greater than probability compared to Alice of landing a winning flip, which was why I broke it down into conditions.

    What in the wording/concepts am I missing (besides actual numbers) that separates what you and I did in this case?

    Brandon Monsen - 4 years, 3 months ago

    Log in to reply

    (Related to my comment 1. from before) The entire expected value calculation is irrelevant. The goal isn't to show that E [ B ] > E [ A ] E[B] > E[A] . The goal is to calculate P ( B A ) P ( B \geq A ) . This to me, is the biggest distinction to make. In probability, it can be tricky to determine what you should be measuring / looking at / using as a proxy.

    Note that this solution still has a big gap. In particular, we have not determined P ( B A ) P ( B \geq A ) . My guess is that for dominating probability sequences P ( N H T k ) P ( N H H k ) P ( N_{HT} \leq k ) \geq P ( N_{HH} \leq k ) , it is true that P ( B A ) 1 2 P ( B \geq A ) \geq \frac{1}{2} .

    This seems to be the idea that you're going for when you argue that "after the first H, Bob will have an equal to or greater than probability comapared to Alice". However, that isn't clearly expressed in your solution, nor is the reasoning provided.

    Calvin Lin Staff - 4 years, 3 months ago

    This is completely besides the point, but how can n = 2 F n 1 2 n = 1 = n = 2 n 1 2 n \sum_{n=2}^\infty \frac {F_{n-1}}{2^n} = 1 = \sum_{n=2}^\infty \frac {n-1}{2^n} (From the lemmas)??

    How can exponential exponential = linear exponential \sum \frac {\text{exponential}}{\text{exponential}} = \sum \frac {\text{linear}}{\text{exponential}} ?

    Henry U - 2 years, 3 months ago

    1 pending report

    Vote up reports you agree with

    ×

    Problem Loading...

    Note Loading...

    Set Loading...