Expected Random Sum Above 1

Starting with $S = 0$, you choose a number between $0$ and $1$ uniformly at random and add it to $S$. If $S < 1$, you repeat: choose another number between $0$ and $1$ at random and add it to $S$. If $S \geq 1$, you stop.

If the expected value of $S$ when you stop is denoted by $E[S]$, what is the integer closest to $\frac{200 \times E[S]}{e^2}$?


The answer is 37.


5 solutions

Paul Sherman
Apr 27, 2014

Here is a solution that invokes a bit more formality, but in doing so I hope it also resolves some of the confusion in the earlier posts. I believe it is Beakal's original solution stated more formally.

Imagine you have an infinite sequence $x_1, x_2, x_3, \ldots$ of independently chosen random numbers, each between zero and one. Of course, in any given instance of the process described in the problem, only a finite number of the $x_i$ will contribute to the sum, but it doesn't hurt to imagine that one goes on choosing random numbers that will not be used. For $n \ge 1$, let $I_n$ be an "indicator variable" defined by $I_n = 1$ if $\sum_{i=1}^{n} x_i < 1$ and $I_n = 0$ if $\sum_{i=1}^{n} x_i \ge 1$. In other words, $I_n$ indicates that the game has not yet terminated by the $n^{th}$ move. For convenience, I'll also define $I_0 = 1$, since certainly the game hasn't terminated before I've even chosen the first random number.

Let $S$ be the sum of the random numbers when the process terminates (the original problem was to find the expectation $E[S]$). Then $S = \sum_{n=1}^{\infty} I_{n-1} x_n$. Since $I_{n-1}$ depends only on $x_1, x_2, \ldots, x_{n-1}$, it is independent of $x_n$. Therefore $E[S] = \sum_{n=1}^{\infty} E[I_{n-1}]\, E[x_n] = \frac{1}{2} \sum_{n=1}^{\infty} E[I_{n-1}]$. Now, $E[I_{n-1}]$ is the probability that $\sum_{i=1}^{n-1} x_i < 1$, which is equal to $\frac{1}{(n-1)!}$ (I have not provided the proof of this here, but you can find it by Googling "volume of standard simplex"). So $E[S] = \frac{1}{2} \sum_{n=1}^{\infty} \frac{1}{(n-1)!} = \frac{e}{2}$. Multiplying by $\frac{200}{e^2}$ gives $\frac{100}{e} \approx 36.7879$, and the closest integer is 37, which is the answer to the problem.

Interpreting the above solution in light of the concerns in Calvin's posts: the quantity $\sum_{n=1}^{\infty} E[I_{n-1}]$ is the average number of random numbers selected before the process ends. The independence argument in the proof shows that indeed one can simply multiply this sum by $\frac{1}{2}$ to get the answer.
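Both series are easy to check numerically; a minimal Python sketch (a verification aid, not part of the original argument), truncating at 30 terms, which is far more than double precision needs:

```python
# Partial sums of the two series above: sum E[I_{n-1}] = e is the expected
# number of picks, and E[S] = (1/2) * sum 1/(n-1)! = e/2.
import math

indicator_means = [1 / math.factorial(n - 1) for n in range(1, 31)]
expected_picks = sum(indicator_means)   # ~2.71828, i.e. e
expected_S = 0.5 * expected_picks       # ~1.35914, i.e. e/2

print(expected_picks, math.e)
print(expected_S, math.e / 2)
print(200 * expected_S / math.e**2)     # ~36.7879, closest integer 37
```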

Suppose the process stops at the $(n+1)^{\text{th}}$ pick.

The condition for this is that we pick $x_{1}, x_{2}, \ldots, x_{n}$ and $x_{n+1}$ such that $x_{1}+x_{2}+\cdots+x_{n}<1$ and $x_{1}+x_{2}+x_{3}+\cdots+x_{n}+x_{n+1}>1$.

Let $A$ denote the event that $x_{1}+x_{2}+\cdots+x_{n}<1$ and $B$ denote the event that $x_{1}+x_{2}+x_{3}+\cdots+x_{n}+x_{n+1}>1$.

We need $P(A\cap B)$ in order to calculate the expectation.

$P(A\cap B) = P(A)\cdot P(B\mid A)$. [Here $P(B\mid A)$ denotes the probability of event $B$ given that event $A$ has happened.]

So let us calculate $P(A)$.

The probability of this event is given by $\dfrac{\text{volume of the standard } n\text{-simplex}}{\text{volume of the unit hypercube in } n \text{ dimensions}}$.

The volume of the standard $n$-simplex can be found using calculus:

$$\int_{0}^{1}\int_{0}^{1-x_{n}}\int_{0}^{1-x_{n-1}-x_{n}}\cdots\int_{0}^{1-x_{3}-\cdots-x_{n-1}-x_{n}}\int_{0}^{1-x_{2}-x_{3}-\cdots-x_{n-1}-x_{n}} dx_{1}\,dx_{2}\,dx_{3}\cdots dx_{n-1}\,dx_{n} = \frac{1^{n}}{n!}=\frac{1}{n!}.$$

Hence $P(A) = \dfrac{1}{n!}$.
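As a quick sanity check of this value, here is a Monte Carlo sketch (independent of the proof itself):

```python
# Monte Carlo estimate of P(x_1 + ... + x_n < 1) for uniform x_i,
# compared against the simplex-volume value 1/n!.
import math
import random

def estimate(n, trials=200_000):
    hits = sum(sum(random.random() for _ in range(n)) < 1
               for _ in range(trials))
    return hits / trials

for n in range(1, 5):
    print(n, estimate(n), 1 / math.factorial(n))
```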

Now, to find $P(B\mid A)$, let us call $\left(x_{1}+x_{2}+\cdots+x_{n}\right) = z$.

So finding $P(B\mid A)$ amounts to finding the probability that $z + x_{n+1} > 1$, where $z$ and $x_{n+1}$ are uniformly and randomly selected from $(0,1)$, which is an easy two-dimensional problem. We can even apply basic geometry to solve it: the complementary event $z + x_{n+1} < 1$ is the triangle formed by the line $x+y=1$ and the coordinate axes, which is exactly the standard $2$-simplex from the concept above.

$P(B\mid A) = 1 - \int_{0}^{1}\int_{0}^{1-x_{n+1}} dz\,dx_{n+1} = 1 - \frac{1}{2} = \frac{1}{2}$.

Hence $P(A\cap B) = \dfrac{1}{2\cdot n!}$.

Hence the expectation $\text{E}(S)$ is given by $\text{E}(S)=\sum_{n=1}^{\infty} n\cdot P(A\cap B) = \sum_{n=1}^{\infty}\frac{n}{2\cdot n!} = \frac{1}{2}\sum_{n=1}^{\infty}\frac{1}{(n-1)!} = \frac{e}{2}$.

Hence our answer is $\dfrac{200\cdot e/2}{e^2} = \dfrac{100}{e} \approx 36.7879$, and the closest integer is $37$.

Useful wiki: Simplex.

Beakal Tiliksew
Apr 22, 2014

Initially I assumed the expected value of each pick to be $\frac{1}{2}$. If that isn't the case, another way that I found (not exactly mine) is the following.

We start with a general case.

Given $N$ random numbers between $0$ and $1$, the probability $P_N(s)$ that their sum does not exceed $s$ is given by $\frac{s^N}{N!}$ for $s \le 1$.

I won't go into the proof for now, but if we assume that it is true, let $D_N(s)\,ds$ be the probability that the sum is between $s$ and $s+ds$; then $D_N(s)$ is basically the derivative of $P_N(s)$ with respect to $s$. Thus we get

$D_N(s)\,ds = \dfrac{s^{N-1}}{(N-1)!}\,ds, \qquad s \le 1.$

Now there is a chance that the $(N+1)^{\text{th}}$ number pushes the sum above $1$. For that to happen, the last number must lie between $1-s$ and $1$, hence the average final sum will equal $s+\frac{(1-s)+1}{2} = 1+\frac{s}{2}$.

Hence the expected sum equals
$$S=\sum_{N=1}^{\infty} \int_{0}^{1} \left(1+\frac{s}{2}\right)(s)\left(\frac{s^{N-1}}{(N-1)!}\right) ds = \sum_{N=1}^{\infty} \frac{3N^{2}+5N}{2(N+2)!} = \sum_{N=1}^{\infty} \left(\frac{3}{2\,N!}-\frac{2}{(N+1)!}+\frac{1}{(N+2)!}\right) = \frac{3}{2}(e-1)-2(e-2)+\left(e-\frac{5}{2}\right) = \frac{e}{2},$$
where the factor $(s)$ is the probability that the last number exceeds $1-s$.

I feel this method works; it can also be shown, as Calvin stated, that the expected number of picks required is $e$, by the same method.
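The series manipulation above is easy to verify numerically; a small sketch comparing the raw terms, their partial-fraction split, and $\frac{e}{2}$:

```python
# Partial sums of sum (3N^2 + 5N)/(2 (N+2)!) and of its partial-fraction
# split; both should approach e/2 ~ 1.35914.
import math

raw = sum((3*N**2 + 5*N) / (2 * math.factorial(N + 2)) for N in range(1, 25))
split = sum(3 / (2 * math.factorial(N)) - 2 / math.factorial(N + 1)
            + 1 / math.factorial(N + 2) for N in range(1, 25))
print(raw, split, math.e / 2)
```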

Thanks. I have updated the wording of the problem, and the answer is now 37.

Calvin Lin Staff - 7 years, 1 month ago
Calvin Lin Staff
Apr 21, 2014

(This is not a solution, but an area of discussion because the validity of the problem is being questioned.)

The answer should have been $\frac{200\, E[S]}{e^2} = \frac{100}{e} \approx 36.7879$. Those who entered 36 or 37 were marked correct. The correct answer is now 37.


(The following comments are flawed.)

It can be shown that the probability that we need $n$ values is $\frac{1}{n(n-2)!}$.
This gives us that the expected number of values needed is $\sum \frac{n}{n(n-2)!} = e$.
A common claim is that the expected value of the sum is thus $\frac{e}{2}$. This would be true if the expected value of each of the values in the sum were $\frac{1}{2}$.
However, the expected value of each of these individual values is not $\frac{1}{2}$.


Clearly, the expected sum of $n$ values should be increasing in $n$.
Let us find the expected value when exactly $n=2$ values are needed, which should give us a lower bound for $E[S]$.
From calculations similar to the above, we can show that the pdf has the form $p(x) = 4-2x$ on the range $x \in [1,2]$, and hence $E[S \mid n=2] = \int_{1}^{2} x(4-2x)\,dx = \frac{4}{3} = 1.333\ldots$.
As such, it is easy to believe that $E[S] > \frac{e}{2} = 1.359\ldots$.


Note: Continuing in this manner, you can eventually find $E[S] = \sum \frac{1}{n(n-2)!} \times E[S \mid n \text{ values}]$.
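Although the inference above is flagged as flawed, the individual pieces can be checked by simulation; a sketch that estimates each $E[S \mid n]$ empirically and recombines it with $P(n) = \frac{1}{n(n-2)!}$:

```python
# Simulate the process, grouping runs by the number of picks n.
# Check 1: runs with n = 2 should average E[S | n=2] = 4/3.
# Check 2: recombining the conditional means with P(n) = 1/(n (n-2)!)
#          should reproduce E[S] ~ e/2, per the formula above.
import math
import random
from collections import defaultdict

totals, counts = defaultdict(float), defaultdict(int)
for _ in range(500_000):
    s, n = 0.0, 0
    while s < 1:
        s += random.random()
        n += 1
    totals[n] += s
    counts[n] += 1

print(totals[2] / counts[2], 4 / 3)       # conditional mean for n = 2
recombined = sum(totals[n] / counts[n] / (n * math.factorial(n - 2))
                 for n in counts)
print(recombined, math.e / 2)             # ~1.35914
```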

I think the question should have been 100 E[S] / e. That would be 50.

Srinath Sridhar - 7 years, 1 month ago

What is wrong with applying Wald's lemma?

Abhishek Sinha - 7 years, 1 month ago


And for a more dependable source for the statement of Wald's lemma, see Theorem 1.1 of this note.

Abhishek Sinha - 7 years, 1 month ago

Wald's lemma doesn't apply, because the distribution doesn't satisfy the condition that "the number of terms in the sum is independent of the summands." (end of first paragraph, or assumption 2). Always remember to check the conditions before applying a theorem.

Calvin Lin Staff - 7 years, 1 month ago


It does not need to. These are called general stopping-time problems, and the r.v. $N$ might as well be dependent on the sequence $(X_i)$. We just need it to be measurable w.r.t. the natural filtration and to satisfy some finite expectation property. See the link I posted in my previous comment.

Abhishek Sinha - 7 years, 1 month ago


@Abhishek Sinha Hmm, I'm not too certain now. I might have a mistake in my original approach.

Calvin Lin Staff - 7 years, 1 month ago


@Calvin Lin If you are not too sure about the relevant advanced probability theory, my suggestion would be to just simulate the situation and see what average S you get.

Abhishek Sinha - 7 years, 1 month ago
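Following this suggestion, a minimal simulation sketch (assuming uniform picks on $(0,1)$, as in the problem):

```python
# Simulate the process directly: add Uniform(0,1) numbers until the
# running sum reaches 1, then average the final sums and pick counts.
import math
import random

def run():
    s, n = 0.0, 0
    while s < 1:
        s += random.random()
        n += 1
    return s, n

trials = 1_000_000
results = [run() for _ in range(trials)]
print(sum(s for s, _ in results) / trials, math.e / 2)  # ~1.35914 = e/2
print(sum(n for _, n in results) / trials, math.e)      # ~2.71828 = e
```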


@Abhishek Sinha Thanks. I have updated the correct answer to 37.

Calvin Lin Staff - 7 years, 1 month ago

@Abhishek Sinha I am aware of the theory. I convinced myself that this stopping time didn't satisfy the assumptions, because I felt strongly that $E[S] > \frac{e}{2}$, as outlined above. I agree that $E[S] = \frac{e}{2}$ is likely the answer.

Let's give @Beakal Tiliksew a day to respond with his solution. I've added the flag that "the answer is currently under dispute".

Calvin Lin Staff - 7 years, 1 month ago

@Beakal Tiliksew Care to comment?

Calvin Lin Staff - 7 years, 1 month ago


I can't really say much about @Abhishek Sinha's statement; I couldn't follow much of Wald's equation. I have to study it.

Beakal Tiliksew - 7 years, 1 month ago


Wait, when I first posted the question, I am pretty sure I made $\frac{e}{2}$ give the answer of 50; this wasn't the form in which I initially wrote the question.

Beakal Tiliksew - 7 years, 1 month ago


@Beakal Tiliksew You initially had it as "If the average or the expected value of the sum can be written as $p$, what is $50 \left\lfloor p \right\rfloor$?"

15 minutes after posting the question, you changed the statement to "If the average or the expected value of the sum can be written as $p$, what is $(200)\frac{p^2}{e^2}$?"

Because of that, I felt strongly that you had a reason for introducing $e^2$.

(I made slight edits to the phrasing for clarity.)

Calvin Lin Staff - 7 years, 1 month ago


@Calvin Lin I guess that was an error on my part. When I finished writing the question the first time, the answer and the question didn't agree, and since I couldn't change the answer, I was forced to make the question fit, in a rather peculiar way perhaps. But anyway, thanks; I shall improve on that.

Beakal Tiliksew - 7 years, 1 month ago

Wald's equation is just a fancy way of justifying Fubini's Theorem in the context of probability theory. Fubini's Theorem is a fancy way of justifying the interchange of summation signs, in the context of analysis.

I'd likely create a set that looks into when $X = \sum X_i$ but $E[X] \neq \sum E[X_i]$.

Calvin Lin Staff - 7 years, 1 month ago

There's something I don't quite follow. You say that the pdf is $p(x) = 4 - 2x$ for $1 \le x \le 2$. I think you mean the pdf conditioned on $n = 2$; then it makes sense, because $\int_1^2 (4 - 2x)\,dx = 1$.

So you compute $E[S \mid n = 2] = \frac{4}{3}$. But then by your own formula $E[S] = \sum_n \frac{1}{n(n-2)!}\, E[S \mid n]$, the contribution from the case $n = 2$ is actually $\frac{1}{2} \cdot \frac{4}{3} = \frac{2}{3}$, which is well below $e/2$.

Jon Haussmann - 7 years, 1 month ago


I made the assumption that "clearly, the expected sum of $n$ values should be increasing in $n$" to justify that $E[S] > E[S \mid n = 2]$, and I believed that the increase is somewhat more significant than $\frac{e}{2} - \frac{4}{3} = 0.026\ldots$.

I haven't figured out what's wrong with my logic as yet, hence the disclaimer "the following comments are flawed."

Calvin Lin Staff - 7 years, 1 month ago

By the way, why can't we say the average value is $\frac{1}{2}$?

Beakal Tiliksew - 7 years, 1 month ago
David Vaccaro
Jul 21, 2014

Define $f_{n}(x)$ as the probability density function of $S$ after $n$ rounds, defined on $0\leq x \leq 2$. As each term is uniformly distributed, we have $f_{1}(x)=1$ on $0 \leq x \leq 1$ and $0$ otherwise.

For $x \leq 1$ we have $f_{n+1}(x)=\int_{0}^{x}f_{n}(t)\,\textrm{d}t$.

For $x \geq 1$ we have $f_{n+1}(x)=f_{n}(x)+\int_{x-1}^{1}f_{n}(t)\,\textrm{d}t$. Solving these recursions gives:

$f_{n+1}(x)=\frac{1}{n!}x^{n}$ on $0\leq x < 1$,

$f_{n+1}(x)=\sum_{k=0}^{n}\frac{1-(x-1)^{k}}{k!}$ on $1 \leq x \leq 2$.

Integrating to get the expectation and taking the limit (using Monotone Convergence) gives $E(S)=\int_{1}^{2}x\left(e-e^{x-1}\right)\textrm{d}x$. The result follows via integration by parts.
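The closed form is easy to confirm numerically; a quick Simpson's-rule sketch (any quadrature would do):

```python
# Simpson's rule on E(S) = integral_1^2 x (e - e^(x-1)) dx, which
# integration by parts shows equals e/2 ~ 1.35914.
import math

def f(x):
    return x * (math.e - math.exp(x - 1))

n = 1000                      # even number of subintervals on [1, 2]
h = 1.0 / n
total = f(1.0) + f(2.0)
for i in range(1, n):
    total += (4 if i % 2 else 2) * f(1.0 + i * h)
print(total * h / 3, math.e / 2)
```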
