You have a bag of n numbers from 1 to n . You take a number out and record its value and then put it back to the bag. You repeat it n times so that by the end of this process you have recorded n values in total. What is the expected fraction of values that appeared exactly once in the sequence of n recorded values when n → ∞ ? Enter your answer up to 6th decimal.
Bonus: What is the fraction of not observed numbers out of all numbers?
This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try
refreshing the page, (b) enabling javascript if it is disabled on your browser and,
finally, (c)
loading the
non-javascript version of this page
. We're sorry about the hassle.
Basically used the same method.
As a slight alternative to computing P ( 1 occurred once ) , you can just recognize P ( 1 occurred k times ) as a binomial distribution with p = n 1 and q = 1 − n 1 , which immediately gives P ( 1 occurred once ) = ( 1 n ) ( n 1 ) 1 ( 1 − n 1 ) n − 1 = ( 1 − n 1 ) n − 1
Problem Loading...
Note Loading...
Set Loading...
Let X to be the number of unique values in the sequence. We want to compute E [ X / n ] = E [ X ] / n .
We can write X as a sum of indicator functions: X = ∑ i = 1 n I ( i occurred once ) . The expectation then:
E [ X ] / n = n 1 ∑ i = 1 n E [ I ( i occurred once ) ] = n 1 ∑ i = 1 n P ( i occurred once ) = = n 1 ∑ i = 1 n P ( 1 occurred once ) = P ( 1 occurred once ) .
If value 1 has occurred only at one place in the sequence of recorded numbers, there are n possible slots for it. There are n − 1 slots remaining to be filled with any of the n − 1 values (excluding i ). So we have n ( n − 1 ) ( n − 1 ) possibilities when i is unique for any i . The total number of ways to record n numbers is n n . Therefore, P ( 1 occurred once ) = n n n ( n − 1 ) ( n − 1 ) = ( 1 − 1 / n ) n − 1 = E [ X ] / n .
Take limit: lim n → ∞ ( E [ X ] / n ) = e − 1 .
Bonus problem: Same approach: X = ∑ i = 1 n I ( i not observed ) E [ X ] / n = n 1 ∑ i = 1 n E [ I ( i not observed ) ] = P ( 1 not observed ) = ( 1 − 1 / n ) n . The limit of this is again e − 1 (about 36% of values are not observed).