Problem Set 2

DNA tests are 100% accurate at matching a sample with a subject if done correctly and less accurate after accounting for human error. At a police department, you have dozens of unsolved cases with DNA samples taken at the scene of the crime. In order to help crack these cold-cases, your police department has instituted a new program of comparing the DNA of all arrested suspects with these existing DNA samples. In preparation, you have done validation of your department's procedures and found that the rate of false positives (i.e., identifying a match when there is not one) is 0.1% and your rate of false negatives (failing to identify a match when there is one) is 0.05%. Let us assume that the probability of each cold-case match is independent. We will further assume that we happen to arrest each perpetrator within one year and we arrest 100 unique people per week. Using this information, answer the following questions symbolically.

Question 1

Use $X$ to indicate the result of the DNA test and $Z$ to indicate if a suspect's DNA truly matches the DNA sample. Write out the given information about these two rvs you can identify from the prompt.

$$ P(X = 0 | Z = 0) = 0.999 $$$$ P(X = 1 | Z = 0) = 0.001 $$$$ P(X = 0 | Z = 1) = 0.0005 $$$$ P(X = 1 | Z = 1) = 0.9995 $$$$ P(Z = 0) = 1 - \frac{1}{100 \times 52} $$$$ P(Z = 1) = \frac{1}{100 \times 52} $$

Question 2

What is the probability that the suspect matches the DNA sample from the cold-case if your test shows a positive match?

We are being asked $P(Z = 1 | X = 1)$. Using Bayes' theorem: $$ P(Z = 1 | X = 1) = \frac{P(X = 1 | Z = 1) P(Z = 1)}{P(X = 1)} $$

We have each of these except for $P(X = 1)$. We can obtain that through marginalization of the conditional:

$$ P(X = 1) = P(X = 1 | Z = 0)P(Z = 0) + P(X = 1 | Z = 1) P(Z = 1) $$$$ P(X = 1) = 0.001 \times \frac{5199}{5200} + 0.9995 \times \frac{1}{5200} = 0.00119 $$$$ P(Z = 1 | X = 1) = \frac{0.9995}{5200 \times 0.001003} = 0.162 $$

Question 3

What is the probability that the you fail to identify a match because your test was negative even though the suspect truly matched the DNA sample?

We are being asked $P(Z = 1 | X = 0)$. Following the same procedure above:

$$ P(Z = 1 | X = 0) = \frac{P(X = 0 | Z = 1) P(Z = 1)}{P(X = 0)} $$$$ P(X = 0) = 0.999\times \frac{5199}{5200} + 0.0005 \times \frac{1}{5200} = 0.999 $$$$ P(Z = 1 | X = 0) = \frac{0.0005}{5200 \times 0.999} = 9.63 \times 10^{-8} $$

Question 4

If you have 35 cold-cases, what is the probability that someone will be falsely accused in at least one of the 35.

We can see from question 2 that 84% are false positives. We will invert this question to be easier. What is the probability that someone is not falsely accused? That would require them either being guilty or not getting a positive result 35 times in a row.

$$ P(X = 1, Z = 1) + P(X = 0, Z = 0) + P(X = 0, Z = 1) = 1 - P(X = 1, Z = 0) = 1 - P(X = 1 | Z = 0) P(Z = 0) $$$$ = 1 - 0.001 \times \frac{51999}{5200} = 0.999 $$

That is the probability they ARE NOT falsely accused. Now this needs to occur 35 times in a row: $0.999^{35} = 0.9656$. Any other outcome will be 1 or more of the 35 cold-cases resulting in a false accusation.

After inverting:

$$ 100 \times (1 - 0.9656) = 3.4\% $$