Chapter 10
Unsupervised Learning
Principal components allow us to summarize a set of correlated variables with a smaller number of representative variables that collectively explain most of the variability in the original set.
PCA finds a low-dimensional representation of a data set that contains as much of the variation as possible. The first principal component of a set of features $X_1, X_2, \ldots, X_p$ is the normalized linear combination of the features $$ Z_1 = \phi_{11}X_1 + \phi_{21}X_2 + \ldots + \phi_{p1}X_p $$
that has the largest variance. By normalized, we mean that $\sum_{j=1}^p \phi_{j1}^2 = 1$. In other words, the first principal component loading vector solves the optimization problem $$ \underset{\phi_{11},\ldots,\phi_{p1}}{\text{maximize}}\Big\{ \frac{1}{n} \sum_{i=1}^{n}\Big(\sum_{j=1}^{p}\phi_{j1}x_{ij}\Big)^2\Big\} $$ subject to $$ \sum_{j=1}^{p}\phi_{j1}^2=1. $$ This problem can be solved via an eigendecomposition of the covariance matrix.
Before PCA is performed, the variables should be centered to have mean zero.
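The steps above can be sketched directly with NumPy: center the data, form the covariance matrix, and take the eigenvector with the largest eigenvalue as the first loading vector. The small data matrix below is hypothetical, used only for illustration.

```python
import numpy as np

# Hypothetical data set: n = 6 observations, p = 3 features.
X = np.array([
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.9],
    [2.2, 2.9, 0.4],
    [1.9, 2.2, 0.8],
    [3.1, 3.0, 0.1],
    [2.3, 2.7, 0.6],
])

# Center each variable to have mean zero (required before PCA).
Xc = X - X.mean(axis=0)

# Sample covariance matrix with the 1/n scaling used in the objective above.
S = Xc.T @ Xc / Xc.shape[0]

# Eigendecomposition; eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)

# First principal component loading vector: eigenvector of the largest eigenvalue.
phi1 = eigvecs[:, -1]

# Scores z_i1 = phi_11*x_i1 + ... + phi_p1*x_ip for each observation.
z1 = Xc @ phi1
```

The loading vector satisfies the normalization constraint $\sum_j \phi_{j1}^2 = 1$ automatically, since `eigh` returns unit-norm eigenvectors, and the sample variance of `z1` equals the largest eigenvalue.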
Two different software packages will yield the same principal component loading vectors, although the signs of those loading vectors may differ, since each loading vector is unique only up to a sign flip.
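The sign ambiguity is easy to verify numerically: if $\phi$ is a loading vector, so is $-\phi$, because flipping the sign leaves the variance of the scores unchanged. A minimal check, using randomly generated data for illustration:

```python
import numpy as np

# Random data stand in for any data set; the sign argument is general.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Xc = X - X.mean(axis=0)          # center the variables

S = Xc.T @ Xc / Xc.shape[0]
_, vecs = np.linalg.eigh(S)
phi = vecs[:, -1]                # first PC loading vector

z_pos = Xc @ phi                 # scores from phi
z_neg = Xc @ (-phi)              # scores from the sign-flipped loadings

# Both sign choices yield scores with exactly the same variance,
# so either is a valid first principal component.
```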
How many principal components are worth examining is inherently subjective: it depends on how many interesting patterns are found in the data.