The Rich Get Richer

A bag initially contains one red ball and one blue ball. You select one of these balls at random and then place two balls of that same color back in the bag. For instance, if you drew a red ball, then you would put two red balls back into the bag so that the bag would contain two red balls and one blue bag. You repeat this until the bag contains 2016 balls. For the color that has the most balls present in the bag, what is the expected number of balls of this color? That is, after the bag contains 2016 balls, you open it and choose the color that has more balls (or choose either color in case of a tie); what is the expected number of balls of this color?

If the answer can be expressed as p q \dfrac{p}{q} , where p p and q q are coprime positive integers, enter p + q p+q as your answer.


The answer is 3048191.

This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try refreshing the page, (b) enabling javascript if it is disabled on your browser and, finally, (c) loading the non-javascript version of this page . We're sorry about the hassle.

3 solutions

Ivan Koswara
Jan 14, 2016

Relevant wiki: Expected Value - Problem Solving

The answer is 3046176 2015 \frac{3046176}{2015} , giving p + q = 3048191 p+q = 3048191 .

We will prove the following surprising claim:

Claim : If the bag has n n balls, then the probability that it has r r red balls (and thus n r n-r blue balls) is exactly 1 n 1 \frac{1}{n-1} , for all n 2 , 1 r n 1 n \ge 2, 1 \le r \le n-1 .

We will prove this claim by induction on n n . For n = 2 n = 2 , this is trivial (the bag starts with 1 red ball and 1 blue ball, so the probability that it has 1 red ball is 1 = 1 2 1 1 = \frac{1}{2-1} ).

Now, suppose we have proven the claim for n = n 0 n = n_0 , and we'd like to prove the claim for n = n 0 + 1 n = n_0 + 1 .

Fix r r with 1 r n 0 1 \le r \le n_0 . We want to compute the probability that when we have n 0 + 1 n_0 + 1 balls, we have r r red balls. Consider the last ball added into the bag. There are two cases:

Case 1 : This ball was red, so the previous bag has r 1 r-1 red balls out of n 0 n_0 balls. If r 2 r \ge 2 , the probability of this happening is 1 n 0 1 r 1 n 0 \frac{1}{n_0 - 1} \cdot \frac{r-1}{n_0} : the probability that we do have r 1 r-1 red balls when we have n 0 n_0 balls (inductive hypothesis), multiplied by the actual probability of taking a red ball in this case. If r = 1 r = 1 , the probability is zero since we cannot have no red balls in the bag (the original red ball must always remain in), but this can also be represented as 1 n 0 1 r 1 n 0 \frac{1}{n_0-1} \cdot \frac{r-1}{n_0} because the second term is zero. So the probability that a red ball is taken to make our bag to have r r red balls out of n 0 + 1 n_0 + 1 balls is 1 n 0 1 r 1 n 0 \frac{1}{n_0 - 1} \cdot \frac{r - 1}{n_0} .

Case 2 : This ball was blue, so the previous bag has r r red balls (and thus n 0 r n_0 - r blue balls) out of n 0 n_0 balls. This case is completely analogous with the above; the probability of this case happening is 1 n 0 1 n 0 r n 0 \frac{1}{n_0 - 1} \cdot \frac{n_0 - r}{n_0} .

In total, the probability that we have r r red balls out of n 0 + 1 n_0 + 1 balls is simply the sum of the probabilities contributed by the two cases above; that is, 1 n 0 1 r 1 n 0 + 1 n 0 1 n 0 r n 0 = 1 n 0 \frac{1}{n_0 - 1} \cdot \frac{r - 1}{n_0} + \frac{1}{n_0 - 1} \cdot \frac{n_0 - r}{n_0} = \frac{1}{n_0} . This completes the proof.

Now that we have proven this, we can compute the expected value easily:

E ( number of majority balls ) = r = 1 2015 P ( $r$ red balls ) number of majority balls = r = 1 2015 1 2015 max { r , 2016 r } = 1 2015 ( r = 1 2015 max { r , 2016 r } ) = 1 2015 ( r = 1 1007 ( 2016 r ) + r = 1008 2015 r ) = 1 2015 ( r = 1 1007 ( 2016 r ) + r = 1 1008 ( 2016 r ) ) = 1 2015 ( 2016 2015 ( 1 + 2 + + 1007 + 1008 + 1007 + + 1 ) ) = 1 2015 ( 2016 2015 100 8 2 ) = 3046176 2015 \displaystyle\begin{aligned} E(\text{number of majority balls}) &= \sum_{r=1}^{2015} P(\text{\$r\$ red balls}) \cdot \text{number of majority balls} \\ &= \sum_{r=1}^{2015} \frac{1}{2015} \cdot \max \{r, 2016-r\} \\ &= \frac{1}{2015} \cdot \left( \sum_{r=1}^{2015} \max \{r, 2016-r\} \right) \\ &= \frac{1}{2015} \cdot \left( \sum_{r=1}^{1007} (2016-r) + \sum_{r=1008}^{2015} r \right) \\ &= \frac{1}{2015} \cdot \left( \sum_{r=1}^{1007} (2016-r) + \sum_{r=1}^{1008} (2016-r) \right) \\ &= \frac{1}{2015} \cdot ( 2016 \cdot 2015 - (1+2+\ldots+1007+1008+1007+\ldots+1) ) \\ &= \frac{1}{2015} \cdot (2016 \cdot 2015 - 1008^2) \\ &= \boxed{ \dfrac{3046176}{2015} } \end{aligned}

The second equality uses our result, together with the fact that the number of majority balls is simply the larger of r r (number of red balls) and 2016 r 2016-r (number of blue balls). The fourth equality uses the result that when r < 1008 r < 1008 , the bigger one is 2016 r 2016-r , and when r 1008 r \ge 1008 , the bigger one is r r . (Okay, they are equal when r = 1008 r = 1008 , but we can choose either one.) The fifth equality performs the replacement r 2016 r r \to 2016-r (this also reverses the order of summation). The sixth equality takes everything apart, setting the 2016's to the side. The seventh equality uses the (easily proven) fact 1 + 2 + + n + ( n 1 ) + + 1 = n 2 1+2+\ldots+n+(n-1)+\ldots+1 = n^2 .

Moderator note:

Knowing that this problem is Polya's Urn allows us to start out with the surprising claim.

A more simple consideration via tree diagram resulted in the expression 3 4 n + 1 \frac{3}{4}n+1 where n denotes the count of even samples. In our case n equals 2014 leading to an expected number 1511.5 = 3023 2 1511.5 = \frac{3023}{2} which means the solution is 3025.

Rem.: Indeed the difference between our expected numbers is approximately 0.25 which is less than one ball and therefore meaningless!

Andreas Wendler - 5 years, 4 months ago

Log in to reply

If you were running a lottery paying out £1,000,000 to the lucky winner, but had only sold £1,200,000 of tickets, you would not regard an expected number of 1.25 winners as meaningless compared to an expected number of 1 winner. In the second case, you would probably make a profit. In the first case, you would probably go bankrupt. Certainly so if you did this every week!

Just because you are describing something that comes in integers, that does not mean that fractional parts of expectations aren't still very important! 2012 stats have the average number of children per family in the UK as 1.7; any demographic analysis (for example, projection of the need for primary school places in 2017) would be seriously adrift if we rounded 1.7 up to 2, and even worse if we rounded down to 1.

Mark Hennings - 5 years, 4 months ago

Log in to reply

Indeed you have always to consider the RELATIVE deviation that is in my case immensely less than in your interesting constructive far-fetched funny examples!

Andreas Wendler - 5 years, 4 months ago

Log in to reply

@Andreas Wendler You have said that a deviation of approximately 0.25 in the expectation of an integer-valued random variable is meaningless. I have given you two examples where such a deviation is anything but meaningless. Funny? Thank you. Far-fetched? I am not so sure.

You are claiming that the relative size of the error is relevant. I disagree. What we are talking about here is the fact that fractional factors in expectations of integer-valued random variables (which have a precise mathematical meaning) are important, as a matter of precision. If we choose to round those answers later, when interpreting them for Joe Public, that is another matter. Your calculation leading to 3023 2 \tfrac{3023}{2} is approximate, but inaccurate.

Mark Hennings - 5 years, 4 months ago

Log in to reply

@Mark Hennings When then for our problem surely sufficient because I think you are not able to deliver an empirical proof of your result! Have not heard anything of a tree diagram? Then you would realize that there is an equal distribution of the probabilities to get 1, 2, ..., 2015 balls of one color at the end fixing my result!

Andreas Wendler - 5 years, 4 months ago

Log in to reply

@Andreas Wendler Please read both my and Ivan's proof. We have both used the fact that the distribution of balls is uniform, and both obtained the correct answer. I suspect that you are calculating the expectation of the wring quantity. If you post yiur calculations, I can point out where you went wrong.

Mark Hennings - 5 years, 4 months ago
Kerry Soderdahl
Jan 30, 2016

It's easy to notice that the probability of any particular sequence of draws resulting in r r red balls and b b blue balls in the bag is ( r 1 ) ! ( b 1 ) ! ( n 1 ) ! \displaystyle\frac{(r-1)!(b-1)!}{(n-1)!} (where n n is the total number of balls) regardless of the order of the draws.

The number of paths leading to r r red balls and b b blue balls, which is the same as the total number of possible sequences of r 1 r-1 red ball draws and b 1 b-1 blue ball draws, is simply ( n 2 ) ! ( r 1 ) ! ( b 1 ) ! \displaystyle\frac{(n-2)!}{(r-1)!(b-1)!} .

It follows that the total probability of every sequence of draws leading to r r red balls and b b blue balls is the product of these:

( r 1 ) ! ( b 1 ) ! ( n 1 ) ! × ( n 2 ) ! ( r 1 ) ! ( b 1 ) ! = ( r 1 ) ! ( b 1 ) ! ( n 2 ) ! ( r 1 ) ! ( b 1 ) ! ( n 1 ) ! = 1 n 1 \frac{(r-1)!(b-1)!}{(n-1)!}\times\frac{(n-2)!}{(r-1)!(b-1)!}=\frac{(r-1)!(b-1)!(n-2)!}{(r-1)!(b-1)!(n-1)!}=\frac{1}{n-1}

Knowing that the probability of each of the the 2015 outcomes resulting in 2016 total balls is uniformly distributed, calculating the expected value of the majority color is now very easy.

To account for the cases where the red ball is the majority: k = 1009 2016 k 2015 \displaystyle\sum_{k=1009}^{2016}\frac{k}{2015}

Similarly, to account for the blue ball majority cases: k = 1009 2016 k 2015 \displaystyle\sum_{k=1009}^{2016}\frac{k}{2015}

In the event of a tie: 1008 2015 \displaystyle\frac{1008}{2015}

Summing these:

1008 2015 + 2 k = 1009 2016 k 2015 = 1008 + 2 k = 1009 2016 k 2015 = 3046176 2015 \frac{1008}{2015}+2\sum_{k=1009}^{2016}\frac{k}{2015}=\frac{1008+2\sum_{k=1009}^{2016}{k}}{2015}=\frac{3046176}{2015}

3046176 + 2015 = 3048191 3046176+2015=\boxed{3048191}

Mark Hennings
Jan 16, 2016

This is another case of Polya's urn; the probability that there are b + 1 b+1 blue balls after 2 n 2n ball choices is 1 2 n + 1 \tfrac{1}{2n+1} for 0 b 2 n 0 \le b \le 2n . If there are b + 1 b+1 blue balls after 2 n 2n choices, there are m a x ( b + 1 , 2 n b + 1 ) \mathrm{max}(b+1,2n-b+1) balls of whichever is the greater colour. Thus the expected number of the number of balls of the larger number after 2 n 2n ball choices is M n = 1 2 n + 1 b = 0 2 n m a x ( b + 1 , 2 n b + 1 ) = 1 2 n + 1 ( b = 0 n 1 ( 2 n b + 1 ) + ( n + 1 ) + b = n + 1 2 n ( b + 1 ) ) = 1 2 n + 1 ( n + 1 + 2 b = n + 1 2 n ( b + 1 ) ) = 1 2 n + 1 ( n + 1 + 2 n ( n + 1 ) + 2 b = 1 n b ) = 1 2 n + 1 ( n + 1 + 3 n ( n + 1 ) ) = ( n + 1 ) ( 3 n + 1 ) 2 n + 1 \begin{array}{rcl} M_n & = & \displaystyle \tfrac{1}{2n+1} \sum_{b=0}^{2n} \mathrm{max}(b+1,2n-b+1)\\ & = &\displaystyle \tfrac{1}{2n+1}\left(\sum_{b=0}^{n-1} (2n- b+1) + (n+1) + \sum_{b=n+1}^{2n} (b+1)\right) \\ & = & \displaystyle \tfrac{1}{2n+1}\left(n+1 + 2\sum_{b=n+1}^{2n} (b+1)\right) \; = \; \tfrac{1}{2n+1}\left(n+1 + 2n(n+1) + 2\sum_{b=1}^n b\right) \\ & = & \tfrac{1}{2n+1} \left( n+1 + 3n(n+1)\right) \; = \; \tfrac{(n+1)(3n+1)}{2n+1} \end{array} With n = 1007 n = 1007 we have M n = 3046176 2015 M_n \,=\, \tfrac{3046176}{2015} , and so the answer is 3048191 \boxed{3048191} .

Ah, yes, it's called Polya's urn. I didn't know its name.

Ivan Koswara - 5 years, 5 months ago

0 pending reports

×

Problem Loading...

Note Loading...

Set Loading...