To prevent an epidemic outbreak, all the inhabitants of a city are tested for a virus.
The whole population can be divided into the subsets
$\begin{aligned}
I &= \{x| x \text{ is infected} \} \\
T &= \{x| x \text{ is tested positive} \}
\end{aligned}$
and the corresponding complements $\overline{I}$ and $\overline{T}$. The test has **sensitivity** (i.e. test positive if infected) $P(T|I) = 99\%$ and **specificity** (i.e. test negative if healthy) $P(\overline{T}|\overline{I}) = 98\%$. Of the whole population, only a percentage of $P(I) = 0.1\%$ is sick. Paul has a positive test result. To the nearest integer percentage, what is the probability $P(I|T)$ that he is actually infected?

**Bonus question:** Paul tests positive again in a second test. How high is the probability that he is infected? (Both tests are statistically independent and have the same sensitivity and specificity.)

5%
10%
33%
45%
71%
98%

In a city with $N = 100{,}000$ inhabitants, $N P(I) = 100$ people are infected. Of these, $N P(T|I) P(I) = 99$ test positive. Of the healthy population of $N P(\overline{I}) = 99{,}900$ people, a percentage of $P(T|\overline{I}) = 1 - P(\overline{T}|\overline{I}) = 2\,\text{\%}$ also test positive, so that we have $N P(T|\overline{I}) P(\overline{I}) = 1998$ false positive test results. By Bayes' theorem, the probability that a person who tests positive is infected is $P(I|T) = \frac{P(T|I) P(I)}{P(T|I) P(I) + P(T|\overline{I}) P(\overline{I})} = \frac{99}{99 + 1998} \approx 4.721\,\text{\%} \approx 5 \,\text{\%}.$ Although the test has fairly high sensitivity and specificity, only about $5 \,\text{\%}$ of the positive tests are actually correct, because infections are so rare in the whole population.
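
The same calculation can be checked numerically. The following is a minimal sketch; the function name `posterior` and its parameter names are chosen here for illustration and are not part of the original problem.

```python
def posterior(prior, sensitivity, specificity):
    """Return P(I|T) for one positive test, using Bayes' theorem."""
    # P(T|I) * P(I): infected and tested positive
    true_pos = sensitivity * prior
    # P(T|not I) * P(not I): healthy but tested positive
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


# Values from the problem statement: P(I) = 0.1%, P(T|I) = 99%, P(not T|not I) = 98%
p_single = posterior(prior=0.001, sensitivity=0.99, specificity=0.98)
print(f"P(I|T) = {p_single:.3%}")
```

Because the population size $N$ cancels in the fraction, the function works directly with probabilities instead of head counts.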

For the second test we have a group of $N' = 2097$ people, of whom $99$ are infected. Of these, $0.99 \cdot 99 \approx 98$ test positive a second time (true positives). Of the $1998$ healthy people, $0.02 \cdot 1998 \approx 40$ still receive a false positive result. Therefore, the probability of infection is $P(I|T_1 \cap T_2) \approx \frac{98}{98 + 40} \approx 71\,\text{\%}.$ After the second positive test result, Paul is probably infected, but we are still far from certainty.
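
Since the two tests are independent, the second result can be treated as a repeated Bayesian update in which the posterior after the first test becomes the prior for the second. A short sketch under that assumption follows; the helper name `update` and the variable names are illustrative only.

```python
def update(prior, sensitivity=0.99, specificity=0.98):
    """One Bayesian update of P(infected) after a positive test result."""
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


p_after_first = update(0.001)           # prior is P(I) = 0.1%
p_after_second = update(p_after_first)  # first posterior serves as the new prior

print(f"after one positive test:  {p_after_first:.1%}")
print(f"after two positive tests: {p_after_second:.1%}")
```

Working with unrounded probabilities gives essentially the same result as the head-count argument with the rounded values $98$ and $40$.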