**Dependence**: Two events E and F are dependent when knowing something about whether E happens gives us information about whether F happens (and vice versa).

**Independence**: Two events E and F are independent if the probability that they both happen is the product of the probabilities that each one happens: $P(E,F) = P(E) \cdot P(F)$

For example, when flipping a coin twice, knowing whether the first flip is Heads or Tails gives us no information about whether the second flip is Heads. However, knowing whether the first flip is Heads gives us information about whether **both flips** are Tails.
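The coin-flip claim can be checked empirically. A minimal simulation sketch (the sample size and seed are arbitrary choices): for two independent fair flips, the fraction of trials where both are Heads should come out close to the product of the individual Heads frequencies.

```python
import random

random.seed(42)
n = 100_000

# Simulate n pairs of fair coin flips (True = Heads).
flips = [(random.random() < 0.5, random.random() < 0.5) for _ in range(n)]

p_first_heads = sum(f1 for f1, _ in flips) / n
p_second_heads = sum(f2 for _, f2 in flips) / n
p_both_heads = sum(f1 and f2 for f1, f2 in flips) / n

# For independent events, P(E, F) should be close to P(E) * P(F).
print(f"P(E) * P(F) = {p_first_heads * p_second_heads:.3f}")
print(f"P(E, F)     = {p_both_heads:.3f}")
```

Both printed numbers land near 0.25, as the product rule predicts.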

**Sample Space**: The set of all possible outcomes. The "universe" in this picture above.

**Cardinality**: The number of elements in a set. The "pipe" symbol denotes cardinality: $|A|$ is the number of elements in A.

Let's define the event "people with cancer" as $A$ and "people with no cancer" as $\neg A$.

What is the probability of A?

$$P(A) = \frac{|A|}{|U|}$$

Say we are studying cancer, so we observe people and see whether they have cancer or not. If we take as our Universe all people participating in our study, then there are two possible outcomes for any particular individual: either they have cancer or they do not.

Questions:

What is the max probability of A?

Since $|A| \le |U|$ (the number of elements of A is at most the number of elements of U), $P(A) \le 1$ (probability is at most 100%).
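The counting definition of probability translates directly into code on sets. A small sketch with a made-up 10-person universe (the membership of $A$ here is hypothetical, purely for illustration):

```python
# Hypothetical universe of 10 people, identified by ids 0..9;
# suppose ids 0, 1, 2 are the people with cancer.
universe = set(range(10))
A = {0, 1, 2}  # event A: people with cancer

# P(A) = |A| / |U|
p_A = len(A) / len(universe)
print(p_A)  # 0.3

# Since A is a subset of U, |A| <= |U|, so P(A) can never exceed 1.
assert A <= universe and p_A <= 1
```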

Let's define the event "people who tested positive for cancer" as $B$ and "people who tested negative for cancer" as $\neg B$.

What is the probability of B?

$$P(B) = \frac{|B|}{|U|}$$

- Event A: People with cancer
- Event B: People who tested positive for cancer

What is the probability of AB?

$$P(AB) = \frac{|AB|}{|U|}$$

Given that the test is positive for a randomly selected individual, what is the probability that said individual has cancer?

$$P(A|B) = \frac{|AB|}{|B|}$$

Questions:

How would you describe the “cancer status” and “test status” of people in each portion of the diagram (by color)?

- Pink: cancer, negative test
- Purple: cancer, positive test
- Blue: no cancer, positive test
- White: no cancer, negative test

**Conditional Probability Notes**

The notation for this is P(A|B) and it is read “the probability of A given B”.

- Given that we are in region B, what is the probability that we are in region AB?
- If we make region B our new Universe, what is the probability of A?

What we’ve effectively done is change the Universe from U (all people), to B (people for whom the test is positive).

This is known as **transforming the sample space**.

Probability that one event occurs *given* that another event has occurred.

Probability of A given B (prob of cancer given that the test is positive)

$$ P(A|B) = \frac{P(AB)}{P(B)} $$

Probability of B given A (prob of testing positive given that you have cancer)

$$ P(B|A) = \frac{P(AB)}{P(A)} $$

Note that when writing a **joint probability** the order does not matter: $P(AB) = P(BA)$.

Researchers randomly assigned 72 chronic users of cocaine into three groups: desipramine (antidepressant), lithium (standard treatment for cocaine) and placebo. Results of the study are summarized below.

| | relapse | no relapse | total |
|---|---|---|---|
| desipramine | 10 | 14 | 24 |
| lithium | 18 | 6 | 24 |
| placebo | 20 | 4 | 24 |
| total | 48 | 24 | 72 |

*Write these questions on the board and have people solve them.*

Marginal Probability

P(relapsed) = 48 / 72 ~ 0.67

Joint Probability

P(relapsed and desipramine) = 10 / 72 ~ 0.14

Conditional Probability

- P(relapse | desipramine) = P(relapse and desipramine) / P(desipramine) = (10/72) / (24/72) ~ 0.42
- P(relapse | lithium) = 18 / 24 = 0.75
- P(relapse | placebo) = 20 / 24 ~ 0.83
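The marginal, joint, and conditional probabilities above all come from the same table of counts, so they can be computed in one pass. A sketch using the study's numbers:

```python
# Counts from the table above: treatment -> (relapse, no relapse).
counts = {
    "desipramine": (10, 14),
    "lithium": (18, 6),
    "placebo": (20, 4),
}
total = sum(r + nr for r, nr in counts.values())  # 72 participants

# Marginal probability: ignore treatment entirely.
p_relapse = sum(r for r, _ in counts.values()) / total
print(f"P(relapse) = {p_relapse:.2f}")  # 0.67

# Joint probability: both conditions, out of the whole Universe.
p_relapse_and_desipramine = counts["desipramine"][0] / total
print(f"P(relapse and desipramine) = {p_relapse_and_desipramine:.2f}")  # 0.14

# Conditional probability: shrink the Universe to one treatment group.
for treatment, (relapse, no_relapse) in counts.items():
    p = relapse / (relapse + no_relapse)
    print(f"P(relapse | {treatment}) = {p:.2f}")
```

The loop prints 0.42, 0.75, and 0.83, matching the hand calculations: conditioning on a treatment divides by that group's size (24), not by the full 72.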

Talk about how discriminative learning algorithms learn the DIFFERENCE between multiple classes, e.g. a logistic regression trying to find the best-fit line between the classes.

Generative learning models look at each class individually and try to learn that class in and of itself. Then, given a new observation, they check which per-class model it more closely resembles.
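A minimal sketch of the generative idea, using made-up 1-D data (the two classes and the new point are invented for illustration): fit each class its own model independently, here a normal distribution, then classify a new observation by which class's model gives it higher likelihood.

```python
import statistics

# Made-up 1-D training data for two classes.
class_a = [1.0, 1.2, 0.8, 1.1, 0.9]
class_b = [3.0, 3.3, 2.7, 3.1, 2.9]

# Generative step: model each class on its own, ignoring the other.
model_a = statistics.NormalDist(statistics.mean(class_a), statistics.stdev(class_a))
model_b = statistics.NormalDist(statistics.mean(class_b), statistics.stdev(class_b))

# Classification step: which class model does the new point resemble more?
new_point = 1.4
prediction = "A" if model_a.pdf(new_point) > model_b.pdf(new_point) else "B"
print(prediction)  # A
```

Contrast this with a discriminative model, which would instead fit one boundary between the two clusters directly.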

- General Assembly Data Science 8 DC Notebooks by Kevin Markham
- Andrew Ng CS229 Video Lectures / Lecture Notes
- https://oscarbonilla.com/2009/05/visualizing-bayes-theorem/
- https://docs.google.com/presentation/d/1psUIyig6OxHQngGEHr3TMkCvhdLInnKnclQoNUr4G4U/edit#slide=id.gfc69f484_023