You have an infinite stream of digits from 0-9, each digit produced uniformly at random. You want to produce a random integer in the following, unusual manner:
First, take the first digit of the stream. Then, for each of the next digits, if it's larger than the last digit you took, ignore it. Otherwise, take it as well. This process ends when you take a 0 (since everything else will be 0). Your number is the sum of all the numbers taken.
For example, consider the following stream: 573493205714390...
This problem has two parts, both relating to this generator; you need to solve both of them.
Find 1 0 6 a + 1 0 3 b + N .
This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try
refreshing the page, (b) enabling javascript if it is disabled on your browser and,
finally, (c)
loading the
non-javascript version of this page
. We're sorry about the hassle.
That's weird, are you sure you didn't mistype F ( 5 2 ) = 0 . 9 9 9 8 7 instead?
Your approach works, but I don't think it's efficient. If I analyzed it correctly, your algorithm would take roughly O ( n d ) time (I think there are around O ( n d − 1 ) partitions, and O ( n ) time to evaluate the fraction for each). My algorithm takes roughly O ( n d 2 ) time. The good thing is your runtime has a very small coefficient and the answer is also small, so it works out; but if I had asked something like 1 0 − 1 0 0 chance of going over, your algorithm might not finish quickly.
Log in to reply
Typo corrected.
Perhaps not the most efficient to calculate, but given the nature of your answer template, the value of N was not likely to be large, and the coding to solve the problem only took me a couple of minutes.
You can make your method even faster by writing your recurrence relation as ( d + 1 ) P [ n , d ] = d P [ n , d − 1 ] + P [ n − d , d ] That cuts us down to O ( 3 n d ) operations. More intriguingly, this is a nonhomogeneous recurrence relation ( d + 1 ) P [ n , d ] − P [ n − d , d ] = d P [ n , d − 1 ] and hence it is clear that P [ n , d ] can be written as a linear combination of { ζ j , k : 1 ≤ k ≤ j ≤ d } , where ζ j , k ( 1 ≤ k ≤ j ) are the j complex j th roots of j + 1 1 . This raises the tantalising, but notationally complex, possibility of there being a closed form for P [ n , d ]
Can somebody explain why the solution for N is not N=53 given above probabilities, as F(53)=0.99991>99.99%, thus setting N=54 means that 0 to 53 will come slightly more than 99.99% of the time?
Btw. I got F(52)=0.99988719 and F(53)=0.99990995, thus shifted by 1 compared to Ivan's solution, but similar to Mark's solution, and I've used a dynamic programming approach similar to Ivan's solution.
Log in to reply
Because if N = 5 3 , you're saying that the sum being strictly less than 53 has 99.99% probability. As you see, that's not right (99.988%). You need N = 5 4 , so that the sum being strictly less than 54 (that is, 0 to 53) has 99.99% probability.
The biggest observation that will tremendously simplify everything is the following: if your last taken digit is d , then you can effectively treat the stream as a stream of digits from 0 to d , because you're going to skip all digits greater than d anyway. (More over, each digit still appears with equal probability.) This means we can generalize the problem: instead of a stream of digits from 0 to 9, let's say we have a stream of digits from 0 to d , and once we have a formula (or algorithm) to compute the answers to both problems for general d , we'll just plug in d = 9 .
Despite being a computer science problem, the expected number can in fact be computed without resorting to computer. Let E [ d ] be the expected value of the obtained number from a stream of 0 to d . Clearly E [ 0 ] = 0 ; our stream is simply endless zeroes. What about the other d ?
Suppose the first digit of the stream is k . Then the expected value we'll get in this case is simply k + E [ k ] : take the first digit, and by the important observation above, the rest of the stream can be simplified to a stream of 0 to k , which has expected value E [ k ] . Since our number is the sum of the two, by linearity of expectation we get the claim.
Since k can be any of 0 to d uniformly, we simply add them together:
E [ d ] = k = 0 ∑ d d + 1 1 ⋅ ( k + E [ k ] )
We can try to compute E [ 1 ] , E [ 2 ] , E [ 3 ] , … , since each E [ d ] only depends on the values of E [ k ] for 0 ≤ k ≤ d ; the values of E [ k ] for 0 ≤ k ≤ d − 1 are known, so the resulting equation is just a linear equation in E [ d ] , which can be solved easily. After trying a few values, you'll notice that E [ d ] = d ; now we can use strong induction to prove that this holds for all d . Thus E [ 9 ] = 9 , and so a = 9 , b = 1 .
The second part of the problem is more difficult and definitely requires computer. The trick is to flip the problem: instead of asking "given a probability p , what is the smallest N such that the probability that the generated integer is smaller than N ?", we instead ask "given a number N , what is the probability p N that the generated integer is smaller than N ?" Clearly p N increases as N increases, so we simply need to find some N where p N ≥ 0 . 9 9 9 9 to solve the problem.
For convenience, we'll instead define P [ n , d ] to be the probability that the number is greater than or equal to n , from a stream of 0 to d . Then N will be the number satisfying P [ N , 9 ] ≤ 0 . 0 0 0 1 < P [ N − 1 , 9 ] . We still need to figure out how to compute P [ n , d ] , though. It's easy to make some base cases: if n ≤ 0 , then P [ n , d ] = 1 (the resulting number must be at least 0); and if n > 0 and d = 0 , then P [ n , d ] = 0 (when the stream is all zeroes, you can't do anything).
The trick is to compute P [ n , d ] from simpler values of n , d , similar to above. Let's say the first number in the stream is k . Then the probability that the desired number is at least n is easy to compute: we already have k , so we need n − k more, and the stream is cut to simply 0 to k . This is exactly P [ n − k , k ] . Since k can be any of 0 to d , and they all are mutually independent, we obtain
P [ n , d ] = k = 0 ∑ d d + 1 1 ⋅ P [ n − k , k ]
This is easy to compute. The only problem is that we don't know how high n can be (in order to find the right P [ n , 9 ] ), so we might potentially do this for a very long time. You can either just try it anyway (and be delighted that the answer is pretty small), or we can put a very rough upper bound at it:
Let n = 9 m . In order to have our number to be at least n = 9 m , we need to see at least m nonzero digits (since each of them is at most 9). Even if we skip numbers, those must also be nonzero (since we never skip zeroes). In other words, the first m digits of the stream must be nonzero. The probability that this is the case is simply ( 9 / 1 0 ) m ; each digit has probability 9 / 1 0 to be nonzero, and there are m independent events that must all succeed at the same time. Computing m such that ( 9 / 1 0 ) m < 0 . 0 0 0 1 , we obtain m ≥ 8 8 . Thus it's enough to test up to n = 9 ⋅ 8 8 = 7 9 2 ; we must have P [ 7 9 2 , 9 ] < 0 . 0 0 0 1 . Hey, this is pretty small! We can indeed simulate it after all.
After we try it out, we find that P [ 5 3 , 9 ] ≈ 0 . 0 0 0 1 1 2 8 0 7 3 , P [ 5 4 , 9 ] ≈ 0 . 0 0 0 0 9 0 0 4 2 6 . Thus N = 5 4 . The final answer is 9 0 0 1 0 5 4 .
For the first part, let E ( k ) be the expected sum if we only take digits i ≤ k (the "ceiling").
Lemma : E ( k ) = k
Proof : By induction. Clearly, E ( 0 ) = 0 . Now assume that E ( i ) = i for i < k . Let k be the "ceiling". Any digits greater than k we ignore; the first digit we "take" is uniformly distributed over { 0 , … , k } ; each digit in this set has a probability 1 / ( k + 1 ) . The number we find will be i + whatever we can expect with this new digit i as the "ceiling". Thus E ( k ) = i = 0 ∑ k k + 1 1 ( i + E ( i ) ) = k + 1 1 i = 0 ∑ k ( 2 i ) = k + 1 1 2 ⋅ 2 1 k ( k + 1 ) = k Q.E.D.
Thus the first answer is E ( 9 ) = 9 , or a = 9 and b = 1 .
For the second part, we must study the probabilities P ( k , n ) of reaching a total of n with "ceiling" k . A recursion relation is easily established: { P ( 0 , 0 ) = 1 P ( k , n ) = k + 1 1 ∑ i = 0 k P ( i , n − i ) k ≥ 0 , n ≥ 0 , n − i ≥ 0
There is no simple expression for P ( k , n ) , so we use a computer to generate this. To find the solution, we determine the cumulative distribution and find the smallest N for which n = 0 ∑ N − 1 P ( 9 , n ) ≥ 0 . 9 9 9 9 .
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
|
The output of this program is N = 5 4 .
Thus we post the solution a ⋅ 1 0 6 + b ⋅ 1 0 3 + N = 9 0 0 1 0 5 4 .
Problem Loading...
Note Loading...
Set Loading...
I used exactly the same method as Ivan to develop E [ d ] = d , and hence a = 9 , b = 1 .
Any sequence that obtains the sum n ≥ 1 is a partition of the integer n , and there is one valid sequence that yields the number n for each partition of n that has all digits less than 1 0 . The probability of obtaining a particular sequence a 1 ≥ a 2 ≥ ⋯ ≥ a m > 0 is 1 0 ( a 1 + 1 ) ( a 2 + 1 ) ⋯ ( a m + 1 ) 1 since we roll a 1 with probability 1 0 1 , each a j with probability a j − 1 + 1 1 , and the final 0 to terminate the sequence with probability a m + 1 1 . Thus, if M is the total scored, then P [ M = n ] = ( a 1 , a 2 , . . . , a m ) ∈ P 1 0 ( n ) ∑ 1 0 ( a 1 + 1 ) ( a 2 + 1 ) ⋯ ( a m + 1 ) 1 where P 1 0 ( n ) is the set of partitions of n for which all digits are less than 1 0 . Note that P [ M = 0 ] = 1 0 1 . Thus the cumulative distribution function of M is F ( n ) = P [ M ≤ n ] = 1 0 1 + m = 1 ∑ n P [ M = m ] n ≥ 1 Since Mathematica has an IntegerPartitions command, it is particularly easy to programme to calculate that F ( 5 2 ) = 0 . 9 9 9 8 7 and that F ( 5 3 ) = 0 . 9 9 9 9 1 , so that N = 5 4 , and the answer is 9 0 0 1 0 5 4 .