A restaurant has N ≥ 4 dishes on its menu that are each distinct in quality, rated from 1 (the worst) to N (the best). You do not know the quality rating of each dish, although you can judge a better dish after eating two different dishes.
You plan to visit the restaurant 4 times this week, and your goal is to maximize the average quality rating of the 4 dishes that you order. For the first n days, you order a different random dish each day. Then, for the next 4 − n days, you order the best dish out of the dishes that you have already tried.
What value of n maximizes the average expected quality rating of the dishes that you order?
This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try
refreshing the page, (b) enabling javascript if it is disabled on your browser and,
finally, (c)
loading the
non-javascript version of this page
. We're sorry about the hassle.
I followed along until you wrote the expected value for the ratings. Could you please break down the formula? Thanks! I appreciate it. :-)
Log in to reply
I just took k / 4 times the EV of the first k meals plus ( 4 − k ) / 4 times the EV of the max, then I factored out 2 N + 1 .
These algebraic solutions are very impressive, but aren't they using a sledgehammer to crack a nut? If we give ranks of 4 (best) to 1 (worst) of the four dishes we may or may not sample, then the mean ranks if we sample all four or just one are clearly 2.5. If we sample just two there are six possibilities: three including dish 4, two including dish 3 but not dish 4, and one with dishes 1 and 2. The mean for the first group is that of 4, 2, 4, 4, i.e. 3.5, for the second group it's that of 3, 1.5, 3, 3, i.e. 21/8, and for the last group it's that of 2, 1, 2, 2, i.e. 7/4. So overall mean is (3 x 3.5 + 2 x 21/8 + 7/4)/6 = 35/12. If we sample just three dishes then there are four possibilities, three including dish 4 with mean of 4 for the other 2 dishes, and one possibility with dishes 3, 2 and 1. The mean for the first group is that of 4, 2, 2 and 4, i.e. 3, and of second group is that of 3, 2, 1, 3, i.e. 2.25. So overall mean is (3 x 3 + 2.25)/4 = 45/16, which is less than 35/12. So sample of 2 is best. If N is greater than 4 we know nothing about the dishes we don't sample (they may be delicious!), so this makes no difference to our expectation. Unsubtle, but Occam would approve. For a general result, if we sample n out of N dishes and then stick to the highest ranking for the remaining (N-n) meals, then the mean rank equals {n(N+1)(2N-n+1)}/{2N(n+1)}. From this we can deduce that the optimal value of n is when SQRT(2N+2.25) - 1.5 < n < SQRT(2N+2.25) - 0.5. Certain values of N (2, 5, 9, 14, …, i.e. of type (k^2 + 3k)/2) give two co-optimal values, e.g. when N = 5, n =2 and n = 3 both give 3.6 as highest mean rank.
The first line says there is a number of dishes greater than or equal to 4, you assume there are exactly 4 different dishes
Log in to reply
If we are sampling on only 4 days, then we can sample no more than 4 dishes. If there are more than 4 dishes we have no idea about those we don't sample (I made this point later on)
If R j is the expected average rating after four meals, given that the first j choices have been made randomly, and the best of these options chosen thereafter, then R 1 = N 1 a = 1 ∑ N 4 a + a + a + a = 2 1 ( N + 1 ) Let D 2 = { ( a , b ) : 1 ≤ a , b ≤ N , a = b } . Then R 2 = N ( N − 1 ) 1 ( a , b ) ∈ D 2 ∑ 4 a + b + 2 m a x ( a , b ) Now ( a , b ) ∈ D 2 ∑ a = a = 1 ∑ N a ( N − 1 ) = 2 1 ( N − 1 ) N ( N + 1 ) while ( a , b ) ∈ D 2 ∑ m a x ( a , b ) = m = 2 ∑ N m ∣ ∣ { ( a , b ) ∈ D 2 : m a x ( a , b ) = m } ∣ ∣ = m = 2 ∑ N m × 2 ( m − 1 ) = 3 2 ( N − 1 ) N ( N + 1 ) so that R 2 = 4 N ( N − 1 ) 1 [ 2 1 ( N − 1 ) N ( N + 1 ) + 2 1 ( N − 1 ) N ( N + 1 ) + 3 4 ( N − 1 ) N ( N + 1 ) ] = 1 2 7 ( N + 1 ) Similarly, let D 3 = { ( a , b , c ) : 1 ≤ a , b , c ≤ N , a , b , c d i s t i n c t } , so that R 3 = N ( N − 1 ) ( N − 2 ) 1 ( a , b , c ) ∈ D 3 ∑ 4 a + b + c + m a x ( a , b , c ) Now ( a , b , c ) ∈ R 3 ∑ a = a = 1 ∑ N a ( N − 1 ) ( N − 2 ) = 2 1 ( N + 1 ) N ( N − 1 ) ( N − 2 ) while ( a , b , c ) ∈ D 3 ∑ m a x ( a , b , c ) = m = 3 ∑ N m ∣ ∣ { ( a , b , c ) ∈ D 3 : m a x ( a , b , c ) = m } ∣ ∣ = m = 3 ∑ N m × 3 ( m − 1 ) ( m − 2 ) = 4 3 ( N + 1 ) N ( N − 1 ) ( N − 2 ) and hence R 3 = 4 N ( N − 1 ) ( N − 2 ) 1 [ 2 3 ( N + 1 ) N ( N − 1 ) ( N − 2 ) + 4 3 ( N + 1 ) N ( N − 1 ) ( N − 2 ) ] = 1 6 9 ( N + 1 ) Finally. if D 4 = { ( a , b , c , d ) : 1 ≤ a , b , c , d ≤ N , a , b , c , d d i s t i n c t } , then ( a , b , c , d ) ∈ D 4 ∑ a = a = 1 ∑ N a ( N − 1 ) ( N − 2 ) ( N − 3 ) = 2 1 ( N + 1 ) N ( N − 1 ) ( N − 2 ) ( N − 3 ) and so R 4 = N ( N − 1 ) ( N − 2 ) ( N − 3 ) 1 ( a , b , c , d ) ∈ D 4 ∑ 4 a + b + c + d = 2 1 ( N + 1 ) The largest of R 1 = R 4 = 2 1 ( N + 1 ) , R 2 = 1 2 7 ( N + 1 ) and R 3 = 1 6 9 ( N + 1 ) is R 2 for all N , and so the answer is 2 .
very painful calculation.
How about this: the expected value of the max of k ratings is ( N + 1 ) k + 1 k , and the average rating of any of the first k meals is just 2 N + 1 , so the expected average of the ratings is 4 1 2 N + 1 ( k + ( 4 − k ) k + 1 2 k ) = 8 k + 8 9 k − k 2 ( N + 1 ) . This recovers your calculations for k = 1 , 2 , 3 , 4 .
Also, it generalizes: if we're eating m total meals instead of 4 , the formula is 2 m k + 2 m ( 2 m + 1 ) k − k 2 ( N + 1 ) . This is maximized at k = 2 m + 2 − 1 , so the solution will be either the floor or the ceiling of this number in general.
Log in to reply
Indeed, but you have a bit of work to do to prove the result for the expected value of the maximum of k ratings. I thought about whether to do the general case, but decided that the calculations would take too long to type out!
What sort of course covers this?
I couldn't find an online source for this, but in Matt Parker's book Things to Make and Do in the Fourth Dimension , he talks about an optimal dating algorithm where you use the first n potential matches to compare against all future dates. It's an adaptation of the "secretary problem", but looking that up online brings up the n / e method, which, as I understand it, is more useful when you want the absolute best possible choice, but when you allow wiggle room with how good the final choice is, it is lacking.
Anyways, I just plugged 4 into the square root to get 2 as the number of trial choices.
(I admit, this was lazy and not entirely justifiable)
Yes, I have posted Secretary Problem weeks ago.
I took an educated guess using my intuition. Firstly, when n is 1, you will get a completely average restaurant experience as you are sampling the same random dish 4 times. Similarly, when n is 4, you are ordering 4 random dishes, which again will give you an average experience. So the question comes down to which gives a better outlook, ordering 2 random dishes and the best of the 2 for the two following days (n =2), or 3 random dishes and the best of the 3 for the fourth day (n = 3). When n = 2 on average your repeated meals will be above average. When n = 3 your repeated meal will be above average, and more so than the two repeated meals of n = 2, as it is the best of 3 instead of the best of 2. However, due to the shape of the normal distribution, there would be diminishing returns when sampling more meals. Thus I intuited that 2 random meals and the best repeated twice would yield a higher average rating than 3 random reals and the best repeated once, and my answer was n = 2 .
I also have a common sense approach, but different from yours.
As a starting point, let's consider this alternative plan: You eat a different random dish each of the first 3 days; then on the fourth day eat the best dish from the first two days. Combined quality rating will be:
a + b + c + m a x ( a , b )
where a , b , c are the quality ratings from the first three days respectively.
This is better than choosing n = 1 or n = 4 , but not as good as choosing n = 2 or n = 3 . More precisely, the n = 2 plan would change the c to m a x ( a , b ) , whereas the n = 3 plan would change the m a x ( a , b ) to m a x ( a , b , c ) = m a x ( m a x ( a , b ) , c ) .
A little thought should convince you that the former is, on average, the better choice.
Log in to reply
To make that last statement rigorous:
m a x ( m a x ( a , b ) , c ) − m a x ( a , b ) = m i n ( m a x ( a , c ) − a , m a x ( b , c ) − b ) ≤ m a x ( a , c ) − a .
Log in to reply
I'm really interested by your approach. So if I understand right, when you change to n = 3 you increase by <= max(a, c) - a, but when you change to n = 2 you increase by max(a, b) - c, and on average the latter is larger?
I will investigate the cases N = 4 , N = 5 and N = ∞ and then do an educated guess. Note
Case N = 4 : We simply write out the cases. Let Q be the average expected quality of a meal over all days: n=1, n=4: these both reflect a random choice and the expected quality Qis that of the average dish. Q ( 1 ) = Q ( 4 ) = 4 1 + 2 + 3 + 4 = 2 . 5
n=2: the pair 1,2 gives best result 2, and overall sum 1×1+3×2=7; the pair 1,3 gives best result 3 and overall sum 1×1+3×3=10, etc. Thus we find Q ( 2 ) = 6 × 4 7 + 1 0 + 1 3 + 1 1 + 1 4 + 1 5 = 2 . 9 1 6 . .
n=3: the triplet 1,2,3 gives best result 3 and overall sum 1+2+3+3=9, etc. Thus we find: Q ( 3 ) = 4 × 4 9 + 1 1 + 1 2 + 1 3 = 2 . 8 1 2 . . So for the case N=4, the optimum is at n=2.
Case N = 5 : Similarly we find
n=1, n=4: Q ( 1 ) = Q ( 4 ) = 5 1 5 = 2 . 5
n=2: Q ( 2 ) = 4 0 1 4 0 = 3 . 5
n=3: Q ( 3 ) = 4 0 1 3 5 = 3 . 3 7 5
So for the case N=5, the optimum is at n=2.
Case N = ∞ : The quality of the best meal after n trial days can be modeled by having the dish quality vary from 0 to 1 on each dimension of an n-dimensional cube, and integrate the maximum of the coordinates over the generalized volume.
n=1, n=4:
∫ 0 1 x d x = 2 1
Q ( 1 ) = Q ( 4 ) = 4 1 ( 4 × 2 1 ) = 2 1
n=2:
∫ 0 1 ∫ 0 1 m a x ( x , y ) d y d x = ∫ 0 1 ( ∫ 0 x x d y + ∫ x 1 y d y ) d x = ∫ 0 1 ( x 2 + 2 1 ( 1 2 − x 2 ) ) d x = 2 1 ∫ 0 1 ( x 2 + 1 ) d x = 2 1 ( 3 1 + 1 ) d x = 3 2
Q ( 2 ) = 4 1 ( 2 × 2 1 + 2 × 3 2 ) = 1 2 7
n=3:
∫ 0 1 ∫ 0 1 ∫ 0 1 m a x ( x , y , z ) d z d y d x = ∫ 0 1 ∫ 0 1 ( ∫ 0 m a x ( x , y ) m a x ( x , y ) d z + ∫ m a x ( x , y ) 1 z d z ) d y d x = 2 1 ∫ 0 1 ∫ 0 1 ( m a x 2 ( x , y ) + 1 ) d y d x = 2 1 ∫ 0 1 ( ∫ 0 x x 2 + 1 d y + ∫ x 1 y 2 + 1 d y ) d x = 2 1 ∫ 0 1 x ( x 2 + 1 ) + 3 1 ( 1 3 − x 3 ) + 1 − x d x = 2 1 ∫ 0 1 3 2 x 3 + 3 4 d x = 2 1 ( 6 1 + 3 4 ) = 4 3
Q ( 3 ) = 4 1 ( 3 × 2 1 + 1 × 4 3 ) = 1 6 9
So for cases N = 4 , N = 5 and N = ∞ the optimum is at n = 2 and my educated guess is that that is the right answer.
Problem Loading...
Note Loading...
Set Loading...
The expected value of the max of k dishes chosen from N without replacement is ( N + 1 ) k + 1 k (see this discussion for a proof), and the average rating of any of the first k meals is just 2 N + 1 , so the expected average of the ratings is 2 N + 1 ( 4 k + 4 4 − k k + 1 2 k ) = 8 k + 8 9 k − k 2 ( N + 1 ) . This recovers Mark's explicit calculations for k = 1 , 2 , 3 , 4 . A quick check shows that this is maximized when k = 2 .
Also, it generalizes: if we're eating m total meals instead of 4 , the formula is 2 m k + 2 m ( 2 m + 1 ) k − k 2 ( N + 1 ) . This is maximized at k = 2 m + 2 − 1 , so the solution will be either the floor or the ceiling of this number in general.