You! You're the one who helped me build
my previous theater addition
, right? I hope you still have all that analysis mumbo jumbo you worked on because I'm building another theater and I need your help to size it again.
Little has changed since the last time, but I'll give recap of the facts you need to know anyway. Just a few things the PR people are making me brush up on... ugh. Can't stand those "fair-trade this", "fair-wages that" - you can give me numbers... I love numbers!
-
At each intermission in my theater, every theater-goer has a 1% chance of needing to get up for something. These people will all stand up in their densely-packed rows of seats.
-
Every one of these 1%-ers, which we'll call the 'givens', will need to get to the aisle. Unlike last time, though, we're going to have
2
aisles in this theater instead of 1, one aisle on each end of all the rows!
-
To get to the aisle, other customers will need to get up to let them out. We call these the 'forced'. Now, since each 'given' could choose to go to one of two aisles, the 'forced' aren't immediately obvious when the givens stand up. However, it turns out that all of my customers are really smart. The givens will direct themselves in such a way so that
the smallest possible number of people in the row are 'forced'
. For example, in a row of 4 people, if the person in the 2nd seat from the left has to get up (and no one else), he will go to his left, making for only 1 'forced', instead of to his right, which would make 2 customers 'forced'.
-
Corporate has changed the customer-happiness requirements. We're now considered 'doing well' if at most 10% of our customers are unhappy at intermissions - that is, if they are 'forced'. We're no longer counting givens towards this statistic.
-
So, I want to ensure, that at any given intermission, the probability that more than 10% of our customers are unhappy is, at most, 50%.
I have 25 rows to make in this new theater. I want to make these rows as many seats long as possible, whilst still obeying the customer happiness requirements set out by corporate above. How many seats long can I make all of these rows, and still obey the requirements?
Details:
-
It is highly reccommended you solve, or at least understand the solution to,
the previous problem
before trying to tackle this one.
-
All rows must be the same size.
-
Assume all seats are always filled. Don't worry about empty seats.
-
When computing 10% of customers, round down. So, for 5 columns of seats in the theater, we need to consider the probability that at most 12, not 13 customers are unhappy.
We already know how to solve the second half of this problem using simple Dynamic Programming. If we can compute, for a given number of columns C , P ( x ) for all x ∈ [ 0 , C ] , where P ( x ) is the probability that x people in a given row are 'forced', then we can easily compute the probability that at most 10% of customers are 'forced' across 25 rows. For details on this part of the solution, read my solution of the previous problem.
The challenge, thus, in this problem, is to compute P ( x ) for arbitrary x . There is no simple closed form for this value, and, meta-gaming a bit, since we can see that the answer is 4 7 , solutions which are exponential in the number of columns are not feasible! Below, I will present a way to pre-compute all P ( x ) for a given C , in time O ( C 3 ) , which is tractible.
Observation 1: The number of 'forced' in a given row is equal to C − k − q , where C is the number of columns in the row, q is the number of givens, and k is the maximum length of any segment in the row of consecutive non-givens.
It's simple to show this is the case.
Now, let's define a function, P ( N , k , q ) . We'll define this function to represent the probability that:
It turns out this function is actually relatively simple to define recursively. If you start changing the ≥ , ≤ semantics in the restrictions, or try to get exact requirements, you'll find the formula gets quite ugly. There are numerous nuanced base cases for this P ( N , k , q ) function which we'll gloss over here - if you're curious, look at the code sample at the end of the solution. The recursive definition for all other cases is thus:
P ( N , k , q ) = 1 0 0 1 P ( N − 1 , k , q − 1 ) + 1 0 0 9 9 P ( N − 1 , k , q ) + ( P ( N − k − 1 , 0 , q − 1 ) − P ( N − k − 1 , k , q − 1 ) ) ( 1 0 0 1 ) ( 1 0 0 9 9 ) k
Broken down into English, we're adding up:
The probability that the N -th customer is a given, times the probability that the previous N − 1 customers yielded at least one segment of k or more continuous non-givens, as well as at most q − 1 givens,
the probability that the N -th customer is not a given, times the probability that the previous N − 1 customers yielded at least one segment of k or more continuous non-givens, as well as at most q givens,
the probability that this particular row yielded a continuous segment of k or more non-givens only at the very end, with at most q givens elsewhere in the row, expressed as:
the probability that, for the first N − k − 1 customers, the longest contiguous segment of givens had length at most k − 1 , and after those N − k − 1 customers, we find a single given, and then k non-givens.
Meaningful values for P ( N , k , q ) occur in 0 ≤ k , q ≤ k + q ≤ N . For a given C , this space of ( N , k , q ) tuples has size O ( C 3 ) , and so all these values can be computed in O ( C 3 ) time. Additionally, following the exclusion principle, we can now easily compute the more useful function, E ( N , k , q ) , which is the probability that, in a row of N customers, the longest segment of contiguous non-givens has length exactly k , and there are exactly q givens in the row:
E ( N , k , 0 ) = P ( N , k , 0 ) − P ( N , k + 1 , 0 )
E ( N , k , q > 0 ) = ( P ( N , k , q ) − P ( N , k + 1 , q ) ) − ( P ( N , k , q − 1 ) − P ( N , k + 1 , q − 1 ) )
Now, we're ready to compute the function we need for the last step of the problem: P ( x ) , the probability that x customers are 'forced' in an arbitrary row of length C . To do this, first initialize all P ( x ) to 0 , then, loop over all ( k , q ) tuples that satisfy the inequality 0 ≤ k , q ≤ k + q ≤ C . Compute E ( C , k , q ) for each tuple, and since C − k − q customers in the row are forced, add that value to the running total for P ( C − k − q ) .
When we're done, we'll have a fully defined P ( x ) function, and we're ready to run the final dynamic programming step. Doing so for each possible C incrementally will eventually yield:
f ( 4 7 ) ≈ 0 . 4 9 5 2 3 2
f ( 4 8 ) ≈ 0 . 5 1 4 8 3 8
Where f ( c ) is the probability that, given c columns of seats in the theater, more than 10% of customers will be 'forced'. Thus, since the function is, again, clearly increasing, we provide 4 7 as our answer.
Here is the Java program I wrote to compute the answer. I like to use Guava and AutoValue, sadly, so if you don't, you won't be able to paste-and-run this as-is.