Lecture 5: Conditioning Continued, Law of Total Probability

Stat 110, Prof. Joe Blitzstein, Harvard University


Thinking conditionally is a condition for thinking.

How can we solve a problem?

  1. Try simple and/or extreme cases.
  2. Try to break the problem up into simpler pieces; recurse as needed.

Let $A_1, A_2, \dots, A_n$ be a partition of the sample space $S$: the $A_i$ are disjoint and their union is $S$. Then

\begin{align} P(B) &= P(B \cap A_1) + P(B \cap A_2) + \dots + P(B \cap A_n) \\ &= P(B|A_1)P(A_1) + P(B|A_2)P(A_2) + \dots + P(B|A_n)P(A_n) \end{align}

This is known as the Law of Total Probability. Conditional probability is important in its own right, but we can also use it to solve problems of unconditional probability, as above with $P(B)$.

Note that statistics is as much an art as it is a science, and choosing the right partition is key. A poor choice of partition may result in many sub-problems that are even harder to solve than the original.
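As a quick numerical check of the Law of Total Probability, here is a minimal Python sketch. The helper name `total_probability` and the three-part partition with its probabilities are hypothetical, chosen only for illustration; they do not come from the lecture.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """Return P(B) = sum_i P(B|A_i) P(A_i) for a partition A_1, ..., A_n."""
    if len(priors) != len(likelihoods):
        raise ValueError("need one likelihood P(B|A_i) per prior P(A_i)")
    if sum(priors) != 1:
        raise ValueError("the P(A_i) of a partition must sum to 1")
    return sum(l * p for l, p in zip(likelihoods, priors))

# Hypothetical partition: P(A_1), P(A_2), P(A_3) and the matching P(B|A_i).
priors = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
likelihoods = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]

p_b = total_probability(priors, likelihoods)
print(p_b)  # 5/12
```

Exact fractions are used so that the check that the partition sums to 1 is not affected by floating-point rounding.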

But conditional probability can be very subtle. You really need to think when using conditional probability.

Ex. 1

Let's consider a 2-card hand drawn from a standard playing deck. What is the probability that both cards are aces, given that we know at least one of the cards is an ace?

\begin{align} P(\text{both are aces | at least one ace}) &= \frac{P(\text{both are aces})}{P(\text{at least one ace})} \\ &= \frac{P(\text{both are aces})}{1 - P(\text{neither is ace})} \\ &= \frac{\binom{4}{2}/\binom{52}{2}}{1 - \binom{48}{2}/\binom{52}{2}} \\ &= \frac{1}{33} \end{align}

But now think about this: What is the probability of drawing 2 aces, knowing that one of the cards is the ace of spades?

\begin{align} P(\text{both are aces | ace of spades}) &= P(\text{other card is also an ace}) \\ &= \frac{3}{51} \\ &= \frac{1}{17} \end{align}

Notice how the fact that we know we have the ace of spades nearly doubles the probability of having 2 aces.
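Both answers can be confirmed by brute force, since all $\binom{52}{2} = 1326$ two-card hands are equally likely. The sketch below is one way to do the counting; the card encoding (rank character followed by suit character) and the helper `conditional` are illustrative choices, not part of the lecture.

```python
from fractions import Fraction
from itertools import combinations

ranks = "A23456789TJQK"
suits = "SHDC"
deck = [r + s for r in ranks for s in suits]   # e.g. "AS" is the ace of spades
hands = list(combinations(deck, 2))            # all 1326 equally likely hands

def conditional(event, given):
    """P(event | given), computed by counting equally likely hands."""
    given_hands = [h for h in hands if given(h)]
    favorable = sum(1 for h in given_hands if event(h))
    return Fraction(favorable, len(given_hands))

def both_aces(hand):
    return all(card[0] == "A" for card in hand)

def at_least_one_ace(hand):
    return any(card[0] == "A" for card in hand)

def has_ace_of_spades(hand):
    return "AS" in hand

p_given_some_ace = conditional(both_aces, at_least_one_ace)
p_given_ace_of_spades = conditional(both_aces, has_ace_of_spades)
print(p_given_some_ace, p_given_ace_of_spades)  # 1/33 1/17
```

The two conditioning events differ in size: 198 hands contain at least one ace, but only 51 contain the ace of spades, which is why the answers differ.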

Ex. 2

Suppose there is a test for a disease, and this test is touted as being "95% accurate". The disease in question afflicts 1% of the population. Now say that there is a patient who tests positive for this disease under this test.

First we define the events in question:

Let $D$ be the event that the patient actually has the disease.

Let $T$ be the event that the patient tests positive.

Since the phrase "95% accurate" is ambiguous, we need to make it precise. We will take it to mean

\begin{align} P(T|D) = P(T^c|D^c) = 0.95 \end{align}

In other words, conditioning on whether or not the patient has the disease, we will assume that the test is 95% accurate.

What exactly are we trying to find?

What the patient really wants to know is not $P(T|D)$, which is the accuracy of the test, but rather $P(D|T)$, the probability that she has the disease given that the test returns positive. Fortunately, we know how $P(T|D)$ relates to $P(D|T)$.

\begin{align} P(D|T) &= \frac{P(T|D)P(D)}{P(T)} ~~~~ & &\text{... Bayes' Rule} \\ &= \frac{P(T|D)P(D)}{P(T|D)P(D) + P(T|D^c)P(D^c)} ~~~~ & & \text{... by the Law of Total Probability} \\ &= \frac{(0.95)(0.01)}{(0.95)(0.01) + (0.05)(0.99)} ~~~~ & & \text{... the rarity of the disease competes with the rarity of false positives}\\ &\approx 0.16 \end{align}
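The same calculation can be written as a small Python function, which makes it easy to see how the answer changes with the prevalence of the disease. This is a minimal sketch: the function name `posterior` and the parameter names `sensitivity` (for $P(T|D)$) and `specificity` (for $P(T^c|D^c)$) are labels introduced here, not terms used in the lecture.

```python
def posterior(prior, sensitivity, specificity):
    """P(D|T) from P(D), P(T|D) and P(T^c|D^c), via Bayes' Rule."""
    p_pos_given_d = sensitivity
    p_pos_given_not_d = 1 - specificity          # false positive rate P(T|D^c)
    # Law of Total Probability over the partition {D, D^c}
    p_pos = p_pos_given_d * prior + p_pos_given_not_d * (1 - prior)
    return p_pos_given_d * prior / p_pos

p_disease_given_positive = posterior(prior=0.01, sensitivity=0.95, specificity=0.95)
print(round(p_disease_given_positive, 3))  # 0.161
```

With a hypothetical prevalence of 50% instead of 1%, the same function returns 0.95, which shows that the low posterior above is driven by the rarity of the disease rather than by the quality of the test.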

Common Pitfalls

  1. Mistaking $P(A|B)$ for $P(B|A)$. This is also known as the Prosecutor's Fallacy: instead of asking about the probability of guilt (or innocence) given all the evidence, we make the mistake of concerning ourselves with the probability of the evidence given guilt (or innocence). An example of the Prosecutor's Fallacy is the case of Sally Clark.

  2. Confusing the prior $P(A)$ with the posterior $P(A|B)$. Observing that event $A$ occurred does not mean that $P(A) = 1$. What is true is that $P(A|A) = 1$, while in general $P(A) \neq 1$.

  3. Confusing independence with conditional independence. This is more subtle than the other two.

Definition

Events $A$ and $B$ are conditionally independent given event $C$, if

\begin{align} P(A \cap B | C) = P(A|C)P(B|C) \end{align}

In other words, once we know that $C$ occurred, learning whether $A$ occurred gives us no additional information about $B$, and vice versa.

Does conditional independence given $C$ imply unconditional independence?

Ex. Chess Opponent of Unknown Strength

Short answer, no.

Consider playing a series of 5 games against a chess opponent of unknown strength. If we won all 5 games, then we would have a pretty good idea that we are the better chess player. So winning each successive game actually is providing us with information about the strength of our opponent.

If we had prior knowledge about the strength of our opponent, meaning we condition on the strength of our opponent, then winning one game would not provide us with any additional information on the probability of winning the next.

But if we do not condition on the strength of our opponent, meaning that we have no prior knowledge about our opponent, then winning a string of games actually does give us information about the probability of winning the next game.

So the games are conditionally independent given the strength of our opponent, but not independent unconditionally.
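To make this concrete, here is a small exact calculation under assumed numbers that are not from the lecture: the opponent is equally likely to be "weak" or "strong", we win each game with probability 9/10 against a weak opponent and 1/10 against a strong one, and games are conditionally independent given the opponent's strength.

```python
from fractions import Fraction

# Hypothetical model of the opponent's unknown strength.
strength_prior = {"weak": Fraction(1, 2), "strong": Fraction(1, 2)}
p_win_given = {"weak": Fraction(9, 10), "strong": Fraction(1, 10)}

# Law of Total Probability, conditioning on the opponent's strength.
p_win = sum(strength_prior[s] * p_win_given[s] for s in strength_prior)

# Given the strength, two games are independent, so P(W1, W2 | s) = P(W | s)^2.
p_win_both = sum(strength_prior[s] * p_win_given[s] ** 2 for s in strength_prior)

p_win2_given_win1 = p_win_both / p_win

print(p_win)              # 1/2
print(p_win_both)         # 41/100
print(p_win2_given_win1)  # 41/50
```

Unconditionally, $P(W_1 \cap W_2) = 41/100$ while $P(W_1)P(W_2) = 1/4$, so the games are not independent: winning the first game raises the probability of winning the second from 1/2 to 41/50 under these assumptions.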

Does unconditional independence imply conditional independence given $C$?

Ex. Popcorn and the Fire Alarm

Again, short answer is no.

You can see this in the case of some phenomenon with multiple causes.

Let $A$ be the event of the fire alarm going off.

Let $F$ be the event of a fire.

Let $C$ be the event of someone making popcorn.

Suppose that either $F$ (an actual fire) or $C$ (the guy downstairs popping corn) will result in $A$, the fire alarm going off. Further suppose that $F$ and $C$ are independent: knowing that there's a fire $F$ doesn't tell me anything about anyone making popcorn $C$; and vice versa.

But the probability of a fire given that the alarm goes off and no one is making any popcorn is given by $P(F|A,C^c) = 1$. After all, if the fire alarm goes off and no one is making popcorn, there can only be one explanation: there must be a fire.

So $F$ and $C$ may be independent, but they are not conditionally independent when we condition on event $A$. Knowing that nobody is making any popcorn when the alarm goes off can only mean that there is a fire.
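The example can be checked by enumerating the four possible outcomes for (fire, popcorn). The sketch below assumes $P(F) = 1/100$ and $P(C) = 1/5$, which are made-up values for illustration, and models the alarm as going off exactly when $F$ or $C$ occurs.

```python
from fractions import Fraction
from itertools import product

# Hypothetical marginal probabilities; F and C are independent by construction.
p_fire = Fraction(1, 100)
p_popcorn = Fraction(1, 5)

# Joint distribution over outcomes (fire, popcorn).
joint = {
    (f, c): (p_fire if f else 1 - p_fire) * (p_popcorn if c else 1 - p_popcorn)
    for f, c in product([True, False], repeat=2)
}

def prob(event):
    """Probability of the set of outcomes (f, c) for which event(f, c) is true."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

def alarm(f, c):
    return f or c   # the alarm goes off if there is a fire or popcorn

p_alarm = prob(alarm)
p_f_given_a = prob(lambda f, c: f and alarm(f, c)) / p_alarm
p_c_given_a = prob(lambda f, c: c and alarm(f, c)) / p_alarm
p_fc_given_a = prob(lambda f, c: f and c and alarm(f, c)) / p_alarm
p_f_given_a_no_c = (prob(lambda f, c: f and alarm(f, c) and not c)
                    / prob(lambda f, c: alarm(f, c) and not c))

print(p_fc_given_a, p_f_given_a * p_c_given_a)  # 1/104 125/2704
print(p_f_given_a, p_f_given_a_no_c)            # 5/104 1
```

Since $P(F \cap C | A) \neq P(F|A)P(C|A)$, the events $F$ and $C$ are not conditionally independent given $A$, even though they are independent unconditionally.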