Uniques out of bag

You have a bag of n n numbers from 1 to n n . You take a number out and record its value and then put it back to the bag. You repeat it n n times so that by the end of this process you have recorded n n values in total. What is the expected fraction of values that appeared exactly once in the sequence of n n recorded values when n n \to \infty ? Enter your answer up to 6th decimal.

Bonus: What is the fraction of not observed numbers out of all numbers?


The answer is 0.367879.

This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try refreshing the page, (b) enabling javascript if it is disabled on your browser and, finally, (c) loading the non-javascript version of this page . We're sorry about the hassle.

1 solution

Ilya Prokin
Jul 27, 2018

Let X to be the number of unique values in the sequence. We want to compute E [ X / n ] = E [ X ] / n E[X/n] = E[X]/n .

We can write X as a sum of indicator functions: X = i = 1 n I ( i occurred once ) X = \sum_{i=1}^{n} \mathbb{I}(i \text{ occurred once}) . The expectation then:

E [ X ] / n = 1 n i = 1 n E [ I ( i occurred once ) ] = 1 n i = 1 n P ( i occurred once ) = E[X]/n = \frac{1}{n}\sum_{i=1}^nE[\mathbb{I}(i \text{ occurred once})] = \frac{1}{n}\sum_{i=1}^nP(i \text{ occurred once}) = = 1 n i = 1 n P ( 1 occurred once ) = P ( 1 occurred once ) = \frac{1}{n}\sum_{i=1}^nP(1 \text{ occurred once}) = P(1 \text{ occurred once}) .

If value 1 has occurred only at one place in the sequence of recorded numbers, there are n n possible slots for it. There are n 1 n-1 slots remaining to be filled with any of the n 1 n-1 values (excluding i i ). So we have n ( n 1 ) ( n 1 ) n(n-1)^{(n-1)} possibilities when i i is unique for any i i . The total number of ways to record n n numbers is n n n^n . Therefore, P ( 1 occurred once ) = n ( n 1 ) ( n 1 ) n n = ( 1 1 / n ) n 1 = E [ X ] / n P(1 \text{ occurred once}) = \frac{n(n-1)^{(n-1)}}{n^n} = (1-1/n)^{n-1} = E[X]/n .

Take limit: lim n ( E [ X ] / n ) = e 1 \lim_{n \to \infty}(E[X]/n) = e^{-1} .

Bonus problem: Same approach: X = i = 1 n I ( i not observed ) X = \sum_{i=1}^{n} \mathbb{I}(i \text{ not observed}) E [ X ] / n = 1 n i = 1 n E [ I ( i not observed ) ] = P ( 1 not observed ) = ( 1 1 / n ) n E[X]/n = \frac{1}{n}\sum_{i=1}^nE[\mathbb{I}(i \text{ not observed})] = P(1 \text{ not observed}) = (1-1/n)^n . The limit of this is again e 1 e^{-1} (about 36% of values are not observed).

Basically used the same method.

As a slight alternative to computing P ( 1 occurred once ) P(1 \text{ occurred once}) , you can just recognize P ( 1 occurred k times ) P(1 \text{ occurred } k \text{ times}) as a binomial distribution with p = 1 n p = \frac{1}{n} and q = 1 1 n q = 1-\frac{1}{n} , which immediately gives P ( 1 occurred once ) = ( n 1 ) ( 1 n ) 1 ( 1 1 n ) n 1 = ( 1 1 n ) n 1 P(1 \text{ occurred once}) = \binom{n}{1}\left(\frac{1}{n}\right)^1\left(1-\frac{1}{n}\right)^{n-1} = \left(1-\frac{1}{n}\right)^{n-1}

Brian Moehring - 2 years, 10 months ago

0 pending reports

×

Problem Loading...

Note Loading...

Set Loading...