From Wikipedia:
Basic setup: You are given two indistinguishable envelopes, each of which contains a positive sum of money. One envelope contains twice as much as the other. You may pick one envelope and keep whatever amount it contains. You pick one envelope at random but before you open it you are given the chance to take the other envelope instead.
Let $X$ denote the amount in the envelope you picked and $Y$ the amount in the other envelope. There is no indication as to which of $X$ and $Y$ contains the lesser/greater amount.
Let's consider two competing arguments:
\begin{align} &\text{argument 1: } &\quad \mathbb{E}(Y) &= \mathbb{E}(X) \\ \\ &\text{argument 2: } &\quad \mathbb{E}(Y) &= \mathbb{E}(Y|Y=2X) \, P(Y=2X) \, + \, \mathbb{E}(Y|Y=\frac{X}{2}) \, P(Y=\frac{X}{2}) \\ & & &= \mathbb{E}(2X) \, \frac{1}{2} \, + \, \mathbb{E}(\frac{X}{2}) \, \frac{1}{2} \\ & & &= \frac{5}{4} \, \mathbb{E}(X) \end{align}So which argument is correct?
Argument 1 is correct: by symmetry the two envelopes are interchangeable, so $\mathbb{E}(Y) = \mathbb{E}(X)$.

Argument 2 has a flaw: after conditioning on the two events, it silently replaces $\mathbb{E}(Y|Y=2X)$ with $\mathbb{E}(2X)$ and $\mathbb{E}(Y|Y=\frac{X}{2})$ with $\mathbb{E}(\frac{X}{2})$, dropping the conditioning. But the event $Y=2X$ carries information about $X$ (it tells us $X$ is the smaller of the two amounts), so in general $\mathbb{E}(Y|Y=2X) \neq \mathbb{E}(2X)$. A conditional expectation cannot simply be swapped for the corresponding unconditional one.
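As a quick sanity check (a minimal simulation sketch, not part of the original notes; the amount `small` and the trial count are arbitrary assumptions), we can compare always keeping the first envelope with always swapping:

```python
import random

def simulate_envelopes(trials=100_000, small=100):
    """One envelope holds `small`, the other 2*`small`; we pick one at random.
    Returns the average payoff of keeping (X) and of swapping (Y)."""
    keep_total = swap_total = 0
    for _ in range(trials):
        amounts = [small, 2 * small]
        random.shuffle(amounts)
        x, y = amounts            # x = the envelope we picked, y = the other one
        keep_total += x
        swap_total += y
    return keep_total / trials, swap_total / trials

print(simulate_envelopes())       # both averages are close to 150
```

Both estimates hover around $1.5 \times$ `small`; the supposed $\frac{5}{4}$ advantage from argument 2 never materializes.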
Continuing with a further example of conditional expectation, consider repeated flips of a fair coin. Let $W_{HT}$ be the number of flips until the pattern $HT$ first appears (an $H$ immediately followed by a $T$), and let $W_{HH}$ be the number of flips until $HH$ first appears.
So what we are really looking for are $\mathbb{E}(W_{HT})$ and $\mathbb{E}(W_{HH})$. Which do you think is greater? Are the two equal, is $\mathbb{E}(W_{HT}) \lt \mathbb{E}(W_{HH})$, or is $\mathbb{E}(W_{HT}) \gt \mathbb{E}(W_{HH})$?
If you think they are equal by symmetry, then you're wrong. By symmetry we know:
\begin{align} & & \mathbb{E}(W_{TT}) &= \mathbb{E}(W_{HH}) \\ & & \mathbb{E}(W_{HT}) &= \mathbb{E}(W_{TH}) \\ \\ &\text{but } & \mathbb{E}(W_{HT}) &\neq \mathbb{E}(W_{HH}) \end{align}Consider first $\mathbb{E}(W_{HT})$; we can solve for this without using conditional expectation by reasoning directly about the flips.
From the picture above, you can see that by the time we get the first $H$, we are actually halfway done. With this partial progress, all we need now is to see a $T$. If we see another $H$, that is OK: we keep our partial progress and simply continue waiting for a $T$. Let $W_{1}$ be the number of flips until the first $H$, and let $W_{2}$ be the number of additional flips after that until we see a $T$.
It is easy to recognize that $W_{1}, W_{2} \sim 1 + \operatorname{Geom}(\frac{1}{2})$ (the first-success distribution), where the $\operatorname{Geom}(\frac{1}{2})$ part has support $k \in \{0,1,2,\dots\}$.
So we have
\begin{align} \mathbb{E}(W_{HT}) &= \mathbb{E}(W_1) + \mathbb{E}(W_2) \\ &= \left[\frac{1 - {1/2}}{1/2} + 1 \right] + \left[\frac{1 - {1/2}}{1/2} + 1 \right] \\ &= 1 + 1 + 1 + 1 \\ &= \boxed{4} \end{align}Now let's consider $\mathbb{E}(W_{HH})$
In this case, if we get another $H$ immediately after seeing the first $H$, then we are done. But if we don't get $H$, then we have to start all over again and so we don't enjoy any partial progress.
Let's solve this using conditional expectation.
Similar to how we solved Gambler's Ruin by conditioning on the first toss, we have
\begin{align} \mathbb{E}(W_{HH}) &= \mathbb{E}(W_{HH} | \text{first toss is } H) \frac{1}{2} + \mathbb{E}(W_{HH} | \text{first toss is } T) \frac{1}{2} \\ &= \left( 2 \, \frac{1}{2} + (2 + \mathbb{E}(W_{HH}))\frac{1}{2} \right) \frac{1}{2} + \left(1 + \mathbb{E}(W_{HH}) \right) \frac{1}{2} \\ &= \left(\frac{1}{2} + \frac{1}{2} + \frac{\mathbb{E}(W_{HH})}{4} \right) + \left(\frac{1}{2} + \frac{\mathbb{E}(W_{HH})}{2} \right) \\ &= \frac{3}{2} + \frac{3 \, \mathbb{E}(W_{HH})}{4} \\ \Rightarrow \quad \mathbb{E}(W_{HH}) &= \boxed{6} \end{align}Genetics is a field where you might need to know about waiting times for strings of letters: not $H,T$, but rather $A,C,T,G$.
If you're interested, here's a good TED talk by Peter Donnelly on genetics and statistics.
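Before moving on, here is a quick numerical check of the two results above, $\mathbb{E}(W_{HT}) = 4$ and $\mathbb{E}(W_{HH}) = 6$ (a minimal simulation sketch, not part of the original notes; the function name, alphabet, and trial count are arbitrary choices). The same function works just as well for patterns over $\{A,C,T,G\}$.

```python
import random

def expected_wait(pattern, alphabet="HT", trials=100_000):
    """Monte Carlo estimate of the expected number of i.i.d. uniform draws
    from `alphabet` until `pattern` first appears as a contiguous run."""
    total = 0
    for _ in range(trials):
        recent, draws = "", 0
        while recent != pattern:
            # keep only the most recent len(pattern) symbols seen so far
            recent = (recent + random.choice(alphabet))[-len(pattern):]
            draws += 1
        total += draws
    return total / trials

print(expected_wait("HT"))  # close to 4
print(expected_wait("HH"))  # close to 6
```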
Consider $\mathbb{E}(Y | X=x)$: what is $X=x$?
It is an event, and we condition on that event.
\begin{align} &\text{discrete case: } &\quad &\mathbb{E}(Y|X=x) = \sum_{y} y \, P(Y=y|X=x) \\ \\ &\text{continuous case: } &\quad &\mathbb{E}(Y|X=x) = \int_{-\infty}^{\infty} y \, f_{Y|X}(y|x) \, dy = \int_{-\infty}^{\infty} y \, \frac{ f_{X,Y}(x,y) }{ f_{X}(x) } \, dy \end{align}Now let $g(x) = \mathbb{E}(Y|X=x)$. This is a function of $x$.
Then define $\mathbb{E}(Y|X) = g(X)$. For example, if $g(x) = x^2$, then $g(X) = X^2$. So $\mathbb{E}(Y|X)$ is itself a random variable, and it is a function of $X$, not of $Y$.
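To make the discrete formula above concrete, here is a tiny sketch (not from the original notes; the joint PMF values and the function name `g` are made-up assumptions) that computes $g(x) = \mathbb{E}(Y|X=x)$ from a joint PMF table:

```python
# Hypothetical joint PMF P(X=x, Y=y) on a small grid (the values are made up).
joint = {
    (0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
    (1, 0): 0.25, (1, 1): 0.05, (1, 2): 0.30,
}

def g(x):
    """g(x) = E(Y | X = x) = sum_y y * P(Y = y | X = x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)   # marginal P(X = x)
    return sum(y * p / p_x for (xi, y), p in joint.items() if xi == x)

print(g(0), g(1))   # g is an ordinary function of x
```

Evaluating $g$ at the random variable $X$ itself, rather than at a fixed number $x$, is exactly what produces the random variable $\mathbb{E}(Y|X) = g(X)$.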
Let $X,Y$ be i.i.d. $\operatorname{Pois}(\lambda)$.
Let $T = X + Y$; find the conditional PMF of $X$ given $T=n$.
\begin{align} P(X=k|T=n) &= \frac{P(T=n|X=k) \, P(X=k)}{P(T=n)} &\quad \text{by Bayes' Rule} \\ &= \frac{P(Y=n-k) \, P(X=k)}{P(T=n)} &\quad \text{by independence of } X, Y \\ &= \frac{ \frac{e^{-\lambda} \, \lambda^{n-k}}{(n-k)!} \, \frac{e^{-\lambda} \, \lambda^{k}}{k!}}{ \frac{e^{-2\lambda} \, (2\lambda)^n}{n!} } \\ &= \frac{n!}{(n-k)! \, k!} \, \left( \frac{1}{2} \right)^n \\ &= \binom{n}{k} \, \left( \frac{1}{2} \right)^n \\ \\ X | T=n &\sim \operatorname{Bin}(n, \frac{1}{2}) \\ \\ \mathbb{E}(X|T=n) &= \frac{n}{2} \Rightarrow \mathbb{E}(X|T) = \frac{T}{2} \end{align}Alternatively, notice the symmetry...
\begin{align} \mathbb{E}(X|X+Y) &= \mathbb{E}(Y|X+Y) &\quad \text{by symmetry (because i.i.d.)} \\ \\ \mathbb{E}(X|X+Y) + \mathbb{E}(Y|X+Y) &= \mathbb{E}(X+Y|X+Y) \\ &= X + Y \\ &= T \\ \\ \Rightarrow \mathbb{E}(X|T) &= \frac{T}{2} \end{align}The single most important property of conditional expectation is closely related to the Law of Total Probability.
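As a numerical sanity check of this example (a minimal sketch, not part of the original notes; `lam`, `n`, the seed, and the trial count are arbitrary assumptions), we can condition a simulation on the event $T=n$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, trials = 3.0, 6, 500_000

x = rng.poisson(lam, trials)
y = rng.poisson(lam, trials)
mask = (x + y == n)                       # keep only samples where T = n

print(x[mask].mean())                     # close to n/2 = 3
print(np.bincount(x[mask]) / mask.sum())  # close to the Bin(6, 1/2) PMF
```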
Recall that $\mathbb{E}(Y|X)$ is a random variable. That being so, it is natural to wonder what its expected value is.
Consider this:
\begin{align} \mathbb{E} \left( \mathbb{E}(Y|X) \right) &= \mathbb{E}(Y) \end{align}We will go into more detail next time, but the Poisson example above already illustrates it:
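Since $\mathbb{E}(X|T) = \frac{T}{2}$ and $T = X + Y$ with $X, Y$ i.i.d. $\operatorname{Pois}(\lambda)$,
\begin{align} \mathbb{E}\left( \mathbb{E}(X|T) \right) = \mathbb{E}\left( \tfrac{T}{2} \right) = \frac{\mathbb{E}(X) + \mathbb{E}(Y)}{2} = \frac{2\lambda}{2} = \lambda = \mathbb{E}(X), \end{align}exactly as the property asserts.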
View Lecture 26: Conditional Expectation Continued | Statistics 110 on YouTube.