Probability of Flowers

Probability Level 5

This used to be a philosophy question, but let's calculate!

One day, a teacher told his student to go to the garden behind the school. The garden is a straight corridor with several kinds of flowers beside it. The teacher told the student, "There are some roses in the garden. You must walk in the corridor from beginning to end and pick the rose that you think is the most beautiful. Be careful, you cannot walk back, you can only move forward: you cannot regret once you pass a flower not picking it."

The student then figures out the following plan, knowing the garden has exactly $11$ roses:

He first observes the first $5$ roses but doesn't pick among them.
After the fifth rose, he just picks the rose that he thinks is the most beautiful compared to the first $5$ roses.
Then he won't look at the rest and straight away exits the garden.

What is the probability that the student actually picks the most beautiful rose?

If the probability can be expressed as $\frac ab,$ where $a$ and $b$ are coprime positive integers, find the value of $a+b.$

Details and Assumptions:

The roses are uniformly distributed in a straight line in the garden, that is, the most beautiful rose will have an equal chance to be planted on any spot in that line of roses. The same goes for the second most beautiful rose, the third most beautiful rose, and so on. No two roses can be planted at the same spot.
The student cannot pick more than one rose.
No two roses are equally beautiful.

Bonus: Generalize the expression of the probability $P(n,k)$ with $n-k\geq1$ for arbitrary values of $n$ and $k,$ where $n$ denotes the total number of roses in the garden, and $k$ the number of roses the student first observed but didn't pick. For instance, this question asks for $P(11,5)=\frac ab.$ In addition, what relationship do $n$ and $k$ have, to maximize the probability?

The answer is 7675.

2 solutions

Kelvin Hong
Jun 11, 2018

Let's start the generalization. Assume that the total number of roses be $n$ and the number of roses to be observed be $k$ . Then, arrange the flower according to their degree of beautiful (I know beautiful is different to everyone, but just focus on the mathematical side!). We will have the set $S=\{a_1,a_2,a_3,\dots,a_n\}$ where $a_1$ is the most beautiful flower, $a_2$ is the second most beautiful flower, $a_n$ is the ugliest flower.

Note that, $a_1$ cannot be in the first $k$ roses because the student will not pick them, so we only consider the case $a_1$ on the position $d$ where $k+1\leq d\leq n$ . Furthermore, we need to know the chance $a_1$ appear on every position is $\dfrac1n$ , so every case with a value $d$ will have a base probability $B=\dfrac1n$ .

Let's discuss if $a_1$ is on the position $d$ , then consider if we put the first $k$ ugliest roses on the first $k$ position which are the roses in the set $\{a_{n-k+1},a_{n-k+2},\dots,a_n\}$ , then the student must pick up the $k+1^{th}$ roses because it is the most beautiful compared to the first $k$ . It isn't what we want. So there is a bound where we need to decide to put the roses in the first $k$ position.

There are exactly $d-1$ roses in front of $a_1$ , so the first $d-1$ ugliest roses must at front of $a_1$ as the minimum condition. To calculate this, we need to know the $d-1^{th}$ ugliest rose is $a_{n-d+2}$ . At the minimum case, $a_{n-d+2}$ must be the most beautiful rose in the first $d-1$ roses and must place at position inside the first $k$ roses, to ensure the student will pick the rose at the position $d$ , which is what we want.

Before calculating these messy things, we first define the set $A_i$ to be the valid permutation of roses when the most beautiful rose in the first $k$ roses is $a_i$ . To make it clear, every element in the set $A_i$ is a string contains all of the flowers $a_1$ to $a_n$ , but with different permutations. Then, the student must pick up the actual most beautiful rose by the permutation of these elements, which is called "valid permutation".

Then, let the sample set $SP$ be all of the permutation when $a_1$ fixed at the position $d$ , so $n(SP)=(n-1)!$

After the setting process, we now count the number of elements aka the valid permutations of roses in the set $A_{n-d+2}$ . $a_{n-d+2}$ must in the first $k$ position, there are $k$ choices for it to be planted. then all of the roses uglier then $a_{n-d+2}$ must place in front of position $d$ , that is $(d-2)!$ . Then, all of the roses more beautiful than $a_{n-d+2}$ must place after position $d$ , that is $(n-d)!$ . So $n(A_{n-d+2})=k[(d-2)!][(n-d)!]$

Then, we consider the case for the set $A_{n-d+1}$ . We put $a_{n-d+1}$ in first $k$ position, which is $k$ choices. Then there are $n-d-1$ roses more beautiful than it (not $n-d$ , already excluded $a_1$ ), and we gonna put it after position $d$ , which is $n-d$ slots, so it is $\dfrac{(n-d)!}{1!}=(n-d)!$ . After that, there are $d-1$ roses left, all of them are uglier than $a_{n-d+1}$ , so the number of permutation of them will be $(d-1)!$ . Combining all of these, we get $n(A_{n-d+1})=k[(d-1)!][\dfrac{(n-d)!}{1!}]$

Then, the case $A_{n-d}$ will lead to the result $n(A_{n-d})=k[d!][\dfrac{(n-d)!}{2!}]$ .

We just need to loop through the case until $A_2$ , not $A_1$ , because the most beautiful rose will not place on the first $k$ position, so $n(A_2)=k[(n-2)!][\dfrac{(n-d)!}{(n-d)!}]$

Let the probability the student successfully pick up $a_1$ when it is on the $d$ position be $P(d)$ , then it can be expressed as $\begin{aligned}P(d)&=\frac{n(A_{n-d+2})}{n(SP)}+\frac{n(A_{n-d+1})}{n(SP)}+\frac{n(A_{n-d})}{n(SP)}+\dots+\frac{n(A_2)}{n(SP)}\\&=\frac{k[(n-d)!]}{(n-1)!}\bigg[(d-2)!+\frac{(d-1)!}{1!}+\frac{d!}{2!}+\dots+\frac{(n-2)!}{(n-d)!}\bigg]\end{aligned}$

By using the identity (which will be prove at the bottom of the solution) $r!+\frac{(r+1)!}{1!}+\frac{(r+2)!}{2!}+\dots+\frac{n!}{(n-r)!}=r!\binom{n+1}{r+1}$ We can simplify $P(d)$ to $\begin{aligned}P(d)&=\frac{k[(n-d)!]}{(n-1)!}[(d-2)!]\binom{n-1}{d-1}\\&=\frac{k[(n-d)!]}{(n-1)!}[(d-2)!]\frac{(n-1)!}{[(d-1)!][(n-d)!]}\\&=\frac k{d-1}\end{aligned}$

Overall, the student pick up $a_1$ no matter what position it is $k+1\leq d\leq n$ will be $\begin{aligned}P&=B\sum_{d=k+1}^nP(d)\\&=\frac kn\bigg[\frac1k+\frac1{k+1}+\frac1{k+2}+\dots+\frac1{n-1}\bigg]\\&=\frac kn\sum_{d=k+1}^n\frac1{d-1}\end{aligned}$ Since the harmonic series can't be simplify anymore, this is the generalized form of the probability. Let me write down here: $P(n,k)=\frac kn\sum_{d=k+1}^n\frac1{d-1}$ Substituting $n=11$ and $k=5$ , we will get $P(11,5)=\dfrac5{11}\bigg[\dfrac15+\dfrac16+\dfrac17+\dfrac18+\dfrac19+\dfrac1{10}\bigg]=\dfrac{2131}{5544}$ , so the sum $a+b$ will equal to $2131+5544=\boxed{7675}$ .

You can see the calculation about finding the optimum ratio of $n$ and $k$ done by Michael Mendrin in the comment section below.

Below is the proof of the identity.

Prove that $r!+\frac{(r+1)!}{1!}+\frac{(r+2)!}{2!}+\dots+\frac{n!}{(n-r)!}=r!\binom{n+1}{r+1}$

Proved by Induction:

When $n=r$ , $LHS=r!$ , $RHS=r!\binom{r+1}{r+1}=r!$ , the equality holds.

Assume that the identity holds when $n=r+k$ , that is $\displaystyle r!+\dfrac{(r+1)!}{1!}+\dfrac{(r+2)!}{2!}+\dots+\dfrac{(r+k)!}{k!}=r!\binom{r+k+1}{r+1}$ . Then when $n=r+k+1$ , we have $\begin{aligned}LHS&=r!\binom{r+k+1}{r+1}+\frac{(r+k+1)!}{(k+1)!}\\&=\frac{r!(r+k+1)!}{k!(r+1)!}+\frac{(r+k+1)!}{(k+1)!}\\&=\frac{(r+k+1)!}{k!}\bigg[\frac1{r+1}+\frac1{k+1}\bigg]\\&=\frac{(r+k+1)!}{k!}\frac{r+k+2}{(r+1)(k+1)}\\&=\frac{(r+k+2)!}{(r+1)[(k+1)!]}\\&=r!\frac{(r+k+2)!}{(r+1)!(k+1)!}\\&=r!\binom{r+k+2}{r+1}\\&=RHS\end{aligned}$ This completes the proof.

For very large $n$ , the optimum number of $k$ roses to inspect before making a decision thereafter is

$k=\dfrac{n}{e}$

where $e$ is Euler's number.

so, for example, if only the first $4$ flowers were checked instead of $5$ , then the probability that the prettiest rose will be picked is $\frac{251}{630}= 0.398413 > 0.38438$ , using your own formula.

We can integrate instead of summing $\frac{1}{d-1}$ to work out this limiting value $k=\frac{n}{e}$ , since this is for large $n$ .

This does have a real-life practical application. If you decide that you're only going to look at $k$ items out of $n$ , and then pick the first that is better than any of the first $k$ , then first look at $\frac{n}{e}$ items and afterwards make your pick.

Maybe you can post "the next..." of this problem by asking for this optimum ratio?

Michael Mendrin - 2 years, 12 months ago

Yes, I recently wanted to put the optimum ratio in the bonus section, but I will not do it, since you have discover the answer of it. Thanks for answer the question!

Kelvin Hong - 2 years, 12 months ago

Why can't you add that to the bonus section? I had only wondered about the relationship between "5" and "11", and thought maybe it had something to do with maximum probability of picking the best flower.

As a philosophical question, if I have $n$ things to look at, it makes sense to adopt this strategy only if you're not allowed to return anywhere. After all, in adopting this strategy, you're likely to find the best flower after looking at about half of the flowers, but you only have a probability of 37% success in finding it, when the probability of the best flower being somewhere in the first half is about 50%. But sometimes in real life, you can't go back. For example, if you know that there are 10 homes in the market, and knowing that after you've passed up a home for sale it is very likely to sell very quickly afterwards, first check about 3 or 4 homes, and then afterwards pick the first home better than any of them.

Michael Mendrin - 2 years, 12 months ago

Well, I just not care about the optimum ratio, 11 and 5 just two random numbers! Oh, I'm sorry, I mean I won't put a new question about it, I will put it in the bonus section. Thanks for suggestion!

Kelvin Hong - 2 years, 12 months ago

@Kelvin Hong – I think it moves your problem from "philosophy" to a real-life application. I'm still thinking about the consequences of this. Thanks for adding to the bonus, and your very interesting problem. Difficult to analyze.

Michael Mendrin - 2 years, 12 months ago

@Michael Mendrin – Ya thanks! Also, note that this question is not original, I took it from my friend math course problem. Have fun!

Kelvin Hong - 2 years, 12 months ago

I wonder how to calculate the optimum ratio? Can you explore here?

Kelvin Hong - 2 years, 12 months ago

For large $n$ , we can integrate instead of summing, so

$\dfrac { k }{ n } \displaystyle \int _{ k+1 }^{ n }{ \frac { 1 }{ d-1 } }$

$\dfrac{k}{n} \left(Log(n-1)-Log(k) \right)$

Differentiating this with respect to $n$ gets us, to find the maximum probability

$\dfrac{k}{n(n-1)} - \dfrac{k}{n^2} \left(Log(n-1)-Log(k) \right) = 0$

which we can rearrange to

$kn-k(n-1)Log\left(\dfrac{n-1}{k} \right) = 0$ , or

$\dfrac{n-1}{n}Log \left( \dfrac{n-1}{k} \right) = 1$

Let $k = \dfrac{n}{x}$ where $x$ is some constant. then we have

$\dfrac{n-1}{n}Log \left( \dfrac{n-1}{n} x\right) = 1$

so that as $n \rightarrow \infty$ , $x \rightarrow e$

This is easy compared to the work you've done in deriving the sum.

Michael Mendrin - 2 years, 12 months ago

Thanks for your calculation Mr. Mendrin, I will add this to the bonus section.

Kelvin Hong - 2 years, 12 months ago

Julian Poon
Jun 19, 2018

$\begin{aligned} P(n,k) &= \sum_{i=1}^{n} P(\text{The }i\text{-th rose is picked})\times P(\text{The }i\text{-th rose is the best}) \\ &= \frac{1}{n} \left(\sum_{i=1}^{k}0 + \sum_{i=k+1}^{n}P(\text{The most beautiful rose in the first } (i-1)\text{ roses is in the first } k\text{ roses})\right) \\ &= \frac{1}{n} \sum_{i=k+1}^{n} \frac{k}{i-1} \\ &= \frac{k}{n} \sum_{i = k+1}^{n} \frac{1}{i-1} \\ P(11,5) &= \frac{2131}{5544} \end{aligned}$

See Michael's comment for the optimum ratio for large $n$

I will present a solution which may be similar to @Julian Poon despite I did not get this correctly.

The number of permutations satisfying our requirements $\displaystyle =\sum_{d=k+1}^{n} 1\cdot \dbinom{n-1}{d-1}\cdot \dbinom{k}{1}\cdot (d-2)!\cdot (n-d)!$

The total number of all possible permutations $=n!$

Let me explain the details in deriving this crucial first line.

I can't attach an image here, so please refer to @Kelvin Hong image.

Let $S=\{a_{1},a_{2},...,a_{n}\}$ where $a_{1}$ is the most beautiful flower and $a_{n}$ is the ugliest flower.

We assign $a_{1}$ , the prettiest to the $d$ th spot from left for $k+1\leq d\leq n$ . There is $1$ way to do so.

After the prettiest flower has been assigned, there are still $n-1$ flowers left to be assigned.

Of $n-1$ flowers, choose $d-1$ flowers to be assigned to the first $d-1$ spots, giving us $\dbinom{n-1}{d-1}$ .

$\textbf{But}$ the prettiest of the $d-1$ chosen flowers should be assigned to the first $k$ spots. Otherwise the student will pick it instead of $a_{1}$ and walk away, which is not desired. So, there are $\dbinom{k}{1}$ to do so.

The remaining $d-2$ chosen flowers can be arranged randomly in the remaining spots before the $dth$ spot. So, there are $(d-2)!$ possible number of ways to do so.

Then, the remaining flowers that are unassigned can be arranged at random. Since the first $d$ flowers has been assigned, the number of them $=n-d$ . So, there are $(n-d)!$ possible number of ways to do so.

With all being explained, we are ready to move on to the computational process.

The number of permutations satisfying our requirements

$\begin{aligned} &= \displaystyle\sum_{d=k+1}^{n} 1\cdot \dbinom{n-1}{d-1}\cdot \dbinom{k}{1}\cdot (d-2)!\cdot (n-d)! \\ &= \displaystyle\sum_{d=k+1}^{n} \cfrac{(n-1)!}{(d-1)!(n-d)!}\cdot k\cdot (d-2)!\cdot (n-d)!\\ &= \displaystyle\sum_{d=k+1}^{n} \cfrac{(n-1)!}{(d-1)!}\cdot k\cdot (d-2)!\\ &= \displaystyle\sum_{d=k+1}^{n} \cfrac{(n-1)!}{(d-1)(d-2)!}\cdot k\cdot (d-2)!\\ &= \displaystyle\sum_{d=k+1}^{n} \cfrac{(n-1)!}{d-1}\cdot k\\ \end{aligned}$

We see that $n,k$ are constants in the original question.

$\therefore \sum_{d=k+1}^{n} \cfrac{(n-1)!}{d-1}\cdot k=k(n-1)!\sum_{d=k+1}^{n}\cfrac{1}{d-1}$

$\displaystyle P(n,k)=\cfrac{k(n-1)!\sum_{d=k+1}^{n}\cfrac{1}{d-1}}{n!}=\cfrac{k(n-1)!\sum_{d=k+1}^{n}\cfrac{1}{d-1}}{n(n-1)!}=\cfrac{k\sum_{d=k+1}^{n}\cfrac{1}{d-1}}{n}$

$\therefore P(n,k)=\cfrac{k\sum_{d=k+1}^{n}\cfrac{1}{d-1}}{n}$

Returning to the question, $n=11,k=5$

$P(11,5)=\cfrac{5\sum_{d=6}^{11}\cfrac{1}{d-1}}{11}=\cfrac{5}{11}\left(\cfrac{1}{5}+\cfrac{1}{6}+\cfrac{1}{7}+\cfrac{1}{8}+\cfrac{1}{9}+\cfrac{1}{10}\right)=\cfrac{2131}{5544}$

$a+b=2131+5544=7675$

donglin loo - 2 years, 11 months ago

Thanks for your detail solution.

Kelvin Hong - 2 years, 11 months ago

Probability of Flowers

The answer is 7675.

2 solutions

1 pending report

Vote up reports you agree with