$$X = \mu + \phi \alpha$$
where $\mu$ is the mean, $\phi$ contains the eigenvectors, and $\alpha$ is the vector of coefficients. By sampling different coefficients $\alpha$, we can generate new $X$ values.
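As a quick illustration, here is a minimal NumPy sketch of this kind of linear generative model; the dataset, its dimensionality, and the number of retained eigenvectors are arbitrary assumptions made only for the example.

```python
# Minimal sketch: fit the mean and eigenvectors of some data, then sample
# new coefficients alpha to synthesize new X values (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 10))          # hypothetical dataset: 500 samples of dim 10

mu = data.mean(axis=0)                     # mean vector
cov = np.cov(data - mu, rowvar=False)      # covariance of the centered data
eigvals, phi = np.linalg.eigh(cov)         # columns of phi are eigenvectors

# keep the top-k principal directions
k = 3
order = np.argsort(eigvals)[::-1][:k]
phi_k, lam_k = phi[:, order], eigvals[order]

# generate a new sample: draw coefficients alpha with variance given by the eigenvalues
alpha = rng.normal(size=k) * np.sqrt(lam_k)
x_new = mu + phi_k @ alpha
print(x_new.shape)                         # (10,)
```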
Beyond associating inputs to outputs, we want to:
Recognize objects in the world and their factors of variation: e.g., recognize a car and its different forms, such as doors open or closed
Understand and imagine how the world evolves
Detect surprising events in the world
Imagine and generate rich plans for the future
Establish concepts as useful for reasoning and decision making
If the number of hidden units is the same as the number of input and output units, then using an identity matrix as weights can regenerate the same input as the output.
To avoid this, we can make the number of hidden units smaller, so that the network must learn useful information from the input.
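A minimal PyTorch sketch of such a bottleneck autoencoder is shown below; the input dimension (784) and hidden size (32) are assumptions chosen only for illustration.

```python
# Minimal sketch of a bottleneck autoencoder (illustrative; sizes are assumptions).
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=32):
        super().__init__()
        # fewer hidden units than inputs forces the encoder to compress,
        # so the network cannot simply copy the input via an identity mapping
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Linear(hidden_dim, input_dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.rand(16, 784)                     # dummy batch of inputs
loss = nn.functional.mse_loss(model(x), x)  # reconstruction loss
loss.backward()
```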
Goal: modeling $p(x)$
Three ways to do this:
Fully observed models
Transformation models:
Review of graphical models: how to get the joint probability distribution $p(x_1, \dots, x_n)$
Undirected graphical model (here we have to deal with the partition function $Z$)
Directed graphical models capture the independence assumptions:
Example: $x_1\to x_2 \ \ x_2\to x_3 \ \ x_2 \to x_4 \ \ x_4\to x_3$
$p(x_1,x_2,x_3,x_4) = p(x_1)~p(x_2|x_1)~p(x_3|x_2,x_4)~p(x_4|x_2)$
In general, we can write: $p(x_1,\dots,x_n) = \prod_{i=1}^n p(x_i|\pi(x_i))$ where $\pi(x_i)$ represents the parents of node $x_i$.
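To make the factorization concrete, here is a small Python sketch that evaluates the joint for the example DAG above, using made-up conditional probability tables for binary variables.

```python
# Sketch: evaluate p(x1,x2,x3,x4) = p(x1) p(x2|x1) p(x3|x2,x4) p(x4|x2)
# for the example DAG, with hypothetical conditional probability tables.
import numpy as np

p_x1 = np.array([0.6, 0.4])                        # p(x1)
p_x2_given_x1 = np.array([[0.7, 0.3],              # rows: x1, cols: x2
                          [0.2, 0.8]])
p_x4_given_x2 = np.array([[0.9, 0.1],              # rows: x2, cols: x4
                          [0.4, 0.6]])
p_x3_given_x2_x4 = np.array([[[0.5, 0.5],          # indexed [x2][x4][x3]
                              [0.3, 0.7]],
                             [[0.8, 0.2],
                              [0.1, 0.9]]])

def joint(x1, x2, x3, x4):
    return (p_x1[x1] * p_x2_given_x1[x1, x2]
            * p_x3_given_x2_x4[x2, x4, x3] * p_x4_given_x2[x2, x4])

# the joint sums to 1 over all assignments, so no partition function is needed
total = sum(joint(a, b, c, d) for a in (0, 1) for b in (0, 1)
            for c in (0, 1) for d in (0, 1))
print(total)  # ~1.0
```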
$$p(x_1,\dots,x_n) = \frac{\prod_{i=1}^m f_i (\phi(x))}{Z}$$
In an undirected graphical model, we have to deal with the partition function $Z$.
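As a rough illustration of why $Z$ is the hard part, the sketch below defines two made-up factors over three binary variables and computes $Z$ by brute-force enumeration, which quickly becomes intractable as the number of variables grows.

```python
# Sketch: an unnormalized undirected model with two pairwise factors,
# normalized by computing the partition function Z via brute force.
import itertools
import numpy as np

def f1(x1, x2):                    # made-up pairwise factor on (x1, x2)
    return np.exp(2.0 * (x1 == x2))

def f2(x2, x3):                    # made-up pairwise factor on (x2, x3)
    return np.exp(1.0 * (x2 == x3))

def unnormalized(x):
    x1, x2, x3 = x
    return f1(x1, x2) * f2(x2, x3)

# partition function: sum of unnormalized scores over every configuration
Z = sum(unnormalized(x) for x in itertools.product((0, 1), repeat=3))

def p(x):
    return unnormalized(x) / Z

print(sum(p(x) for x in itertools.product((0, 1), repeat=3)))  # 1.0
```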