Prediction Streak

Probability Level 4

In a tournament, ten games of cricket would be played. Each game would necessarily have a winner (i.e., there are no draw or tied games). None of the games are played simultaneously. The outcome of each game is independent of the outcomes of any of the other games. Further, both the teams in a game have an equal chance of winning.

The organiser of the tournament announces a 'prediction contest' where a contestant has to predict the winner of each of the games. The first correct prediction would win 1 point and the second consecutive correct prediction will win 2 point and so on.

A wrong prediction will earn no points and the next correct prediction will earn only 1 point, irrespective of the number of correct predictions done earlier.

For the first four matches, predicting all results correctly will earn $1+2+3+4=10$ points. However, if only the third prediction proves incorrect the number of points would become $1+2+0+1=4$ .

What would be the most common score of a contestant for the '10 match prediction contest'?

10 28 5 27 20 33

2 solutions

Brian Charlesworth
Mar 21, 2015

This is just a cut and paste from my comment to Brock's solution.

it will be quite uncommon to get a high score, e.g., we can only get a score of $20$ three ways, namely

$WWWWLLWWWW, LWWWWLWWWW, WWWWLWWWWL.$

So we need only focus on scores of $10$ and $5$ . To get a score of $10$ we have the following options:

a subsequence of $4$ wins and the rest losses $\Longrightarrow 7$ sequences;
separate subsequences of $3$ wins, $2$ wins and $1$ win and the rest losses $\Longrightarrow 60$ sequences;
threee separate subsequences of $2$ wins each and one subsequence of $1$ win and the rest losses $\Longrightarrow 1$ sequence.

This gives us the number of sequences yielding a score of $10$ as $S(10) = 68.$

Next, to get a score of $5$ we have the following options:

alternating wins and losses $\Longrightarrow 2$ sequences;
one subsequence of $2$ wins and two subsequences of $1$ win each and the rest losses $\Longrightarrow 105$ sequences.

This gives us $S(5) = 107$ , which exceed $S(10) = 68$ . A few quick calculations reveals that

$S(0) = 1, S(1) = 10, S(2) = 36, S(3) = 65, S(4) = 91, S(5) = 105,$

$S(6) = 89, S(7) = 85, S(8) = 84, S(9) = 75, S(10) = 68.$

(I think I have these right; Janardhanan gave slightly different values in the report section so I may be off by just a few, but $S(5)$ is clearly the maximum. The calculations rely heavily on the 'stars and bars' method.)

We see from this that $\boxed{5}$ is indeed the mode of this distribution.

Brock Brown
Mar 18, 2015

I think that the answer is $\boxed{9}$ , not 10, but I picked 10 because it was the closest option.

Here is my code for my reasoning:

Python 2.7:

from time import time
import random
coin = (True, False)
def points():
    streak = 0
    output = 0
    for game in xrange(10):
        winner = random.choice(coin)
        prediction = random.choice(coin)
        if prediction == winner:
            streak += 1
            output += streak
        else:
            streak = 0
    return output
total_points = 0.0
trials = 0.0
seconds = 60 * 5
end = time() + seconds
while time() <= end:
    total_points += points()
    trials += 1
print total_points/trials

After running three 5 minute simulations, I found results such as 9.00318042421, 9.00188525506, and 8.99755225833 which all tend to $\boxed{9}$ .

EDIT: After we determined that we're looking for the mode instead of the mean, here's my revised code:

Python 2.7:

from time import time
import random
coin = (True, False)
count = {}
for i in xrange(56):
    count[i] = 0
def points():
    streak = 0
    output = 0
    for game in xrange(10):
        winner = random.choice(coin)
        prediction = random.choice(coin)
        if prediction == winner:
            streak += 1
            output += streak
        else:
            streak = 0
    return output
total_points = 0.0
trials = 0.0
seconds = 60*5
end = time() + seconds
while time() <= end:
    count[points()] += 1
biggest, answer = 0, 0
for key in count:
    if count[key] > biggest:
        biggest = count[key]
        answer = key
print "Mode:", answer

Output:

Mode: $\boxed{5}$

EDIT: this is soooooo wrong, you probably don't want to read this cuz it's so dumb your iq will drop if you do :p

Well, the only difference that I could possibly think of is that the question asks for the most common score, not the average score. The most likely outcome is of course getting half right and half wrong since $\dbinom{10}{n}$ is maximized at $n=5$ This means there are the most possible ways to get 5 right and 5 wrong.

Taking the smallest possible values, getting every other game wrong vs. getting the first 5 wrong we get the scores of 5 and 15 respectively. If we take the mean of the extrema in a (uniform symmetrical, dunno what word to use) distribution of points we will end up with the mode (or at least that's what I think). Thus the MODE here is not the average of all the scores (it might be but I don't know how to calculate that) and so it must be an integer

Some one please verify there is only 1 MODE.

Trevor Arashiro - 6 years, 2 months ago

You're right; it's the mode that we're looking for, and modes and medians can differ by quite a bit depending on the distribution under analysis, and I think that this is such a distribution. I sense that the most economical way of calculating the mean is via a computer program such as that provided by Brock Brown the Python Wizard, but the mode is within our reach to determine by hand, since it is reasonable to assume that the mode will be somewhere between $1$ and $10$ inclusive.

I have concerns about the phrasing of and the answer to this question. First, it appears that we have to assume that there is a 50/50 chance of predicting the outcome of any match. Given this, there are then $2^{10} = 1024$ possible prediction sequences, each of which is equally likely.

It will be quite uncommon to get a high score, e.g., we can only get a score of $20$ three ways, namely

$WWWWLLWWWW, LWWWWLWWWW, WWWWLWWWWL.$

So we need only focus on scores of $10$ and $5.$ To get a score of $10$ , we have the following options:

a subsequence of $4$ wins and the rest losses $\Longrightarrow 7$ sequences;
separate subsequences of $3$ wins, $2$ wins and $1$ win and the rest losses $\Longrightarrow 60$ sequences;
three separate subsequences of $2$ wins each and one subsequence of $1$ win and the rest losses $\Longrightarrow 1$ sequence.

This gives us the number of sequences yielding a score of $10$ as $S(10) = 68.$

Next, to get a score of $5$ , we have the following options:

alternating wins and losses $\Longrightarrow 2$ sequences;
one subsequence of $2$ wins and two subsequences of $1$ win each and the rest losses $\Longrightarrow 105$ sequences.

This gives us $S(5) = 107$ , which exceeds $S(10) = 68$ , which is why I chose $5$ as the answer. I haven't determined yet if $5$ is the actual mode of this distribution, but of the options given, $5$ is the most likely score.

@Janardhanan Sivaramakrishnan This is a great problem, but I have some concerns about the answer to this problem which I've outlined above. When you get a chance, could you tell me where I may have erred in my reasoning? Thanks.

@Trevor Arashiro Some unchecked calculations yield the following results:

$S(0) = 1, S(1) = 10, S(2) = 36, S(3) = 65, S(4) = 91, S(5) = 105,$

$S(6) = 89, S(7) = 85, S(8) = 84, S(9) = 75, S(10) = 68.$

So it would appear that there is a singular mode, which is $5$ . These ten values sum to $709$ , so there are $315$ sequences that yield scores from $11$ to $55$ which are progressively less common to achieve. (There would be another question; find the sum of all scores that cannot be achieved in any sequence. hehe) So Brock's mean of $~9$ seems reasonable.

Brian Charlesworth - 6 years, 2 months ago

Oh my gosh, I missed so many things....

Can I blame it on the chemistry test I just took that destroyed my mind? I think I'll do that. :)

My solution is without a doubt flawed, it's like In the first half im looking for the expected amount of points, in the second half im looking for the mode of most likely out comes. And I have no idea what im doing at the end.

Also, logically, the Mode should be below then averge, and Brock's Python estimates it at $\approx 9$

Trevor Arashiro - 6 years, 2 months ago

@Trevor Arashiro – Haha. Yeah, chemistry can really mess with the neurons. :) This distribution has a long tail to the upside, so we can be certain that the mode will be less than the mean. With Brock's calculation of $\mu = 9$ ,give or take, it's almost a sure bet that the mode will be less than $9$ which leaves $5$ as the only possible answer. I think that it would have been better to just ask for the mode of the distribution without multiple-choice; that would have made it a level 5 problem, methinks. (Sorry, inadvertent numerical pun there. :P)

@Trevor Arashiro P.S.. I just tried your psychedelic eye question; are you sure it's not "just another sin integral"? It would appear to be just twice the standard integral of $\sin(x)$ over half a cycle. Am I missing something?

Brian Charlesworth - 6 years, 2 months ago

@Brian Charlesworth – haha. I didn't mean to take it that far, but yes, it just is the integral of sin. What I meant is that there are a bunch of questions asking for the integral from 0 to $2\pi$ of sin(x). The only difference is that mine is in degrees :3.

Trevor Arashiro - 6 years, 2 months ago

Is there anything you can't solve with some great code? Really impressive. However, I think that what we are being asked to find here is the mode and not the mean as I believe you have found. I've outlined my mode calculations in a comment above, and have found that $5$ is the (likely) mode.

@Brock Brown

Brian Charlesworth - 6 years, 2 months ago

Yup, $\boxed{5}$ is the mode.

Here are some more random simulations that count for each outcome of points to demonstrate:

Python 2.7:

from time import time
import random
coin = (True, False)
count = {}
for i in xrange(56):
    count[i] = 0
def points():
    streak = 0
    output = 0
    for game in xrange(10):
        winner = random.choice(coin)
        prediction = random.choice(coin)
        if prediction == winner:
            streak += 1
            output += streak
        else:
            streak = 0
    return output
total_points = 0.0
trials = 0.0
seconds = 60*5
end = time() + seconds
while time() <= end:
    count[points()] += 1
biggest, answer = 0, 0
for key in count:
    if count[key] > biggest:
        biggest = count[key]
        answer = key
print "Mode:", answer

Brock Brown - 6 years, 2 months ago

Great. Thanks for the confirmation. Janardhanan meant to have 5 as the answer but made an error when posting the question, so once Calvin get to the report the answer will be changed to 5. :)

Brian Charlesworth - 6 years, 2 months ago

Prediction Streak

2 solutions

0 pending reports