Chapter 10
Unsupervised Learning

1 Principal Component Analysis

Principal components allow us to summarize a set of correlated variables with a smaller number of representative variables that collectively explain most of the variability in the original set.

1.1 What Are Principal Components?

PCA finds a low-dimensional representation of a data set that contains as much of the variation as possible. The first principal component of a set of features $X_1, X_2, \ldots, X_p$ is the normalized linear combination of the features $$ Z_1 = \phi_{11}X_1 + \phi_{21}X_2 + \ldots + \phi_{p1}X_p $$

that has the largest variance. By normalized, we mean that $\sum_{j=1}^p \phi_{j1}^2 = 1$. In other words, the first principal component loading vector solves the optimization problem $$ \underset{\phi_{11},\ldots,\phi_{p1}}{\text{maximize}} \bigg\{ \frac{1}{n} \sum_{i=1}^{n}\bigg(\sum_{j=1}^{p}\phi_{j1}x_{ij}\bigg)^2 \bigg\} $$ subject to $$ \sum_{j=1}^{p}\phi_{j1}^2 = 1. $$ Since the $x_{ij}$ are centered, the objective is the sample variance of the scores $z_{i1}$. The problem can be solved via an eigen decomposition.
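As a small sketch of this (on synthetic data, not any data set from the text): the first loading vector $\phi_1$ is the eigenvector of the sample covariance matrix with the largest eigenvalue, and the variance of the resulting scores equals that eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X = X - X.mean(axis=0)            # center each variable to mean zero

# Sample covariance matrix, using the 1/n convention of the objective above
S = (X.T @ X) / X.shape[0]

# Eigen decomposition: phi_1 is the eigenvector with the largest eigenvalue
eigvals, eigvecs = np.linalg.eigh(S)
phi1 = eigvecs[:, np.argmax(eigvals)]

print(np.sum(phi1**2))            # normalization constraint: equals 1
Z1 = X @ phi1                     # scores of the first principal component
print(Z1.var())                   # equals the largest eigenvalue of S
```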

1.2 More on PCA

1.2.1 Scaling the Variables

The variables should be centered to have mean zero before performing PCA. The results also depend on whether the variables are individually scaled: when they are measured in different units, it is typical to scale each to have standard deviation one, since otherwise variables with large variances dominate the leading components.
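A hedged illustration of why scaling matters, using made-up data where one variable is simply measured on a 100x larger scale: without scaling, the first loading vector is dominated by the large-scale variable; after scaling, both variables contribute.

```python
import numpy as np

rng = np.random.default_rng(1)
z = rng.normal(size=200)
# Two correlated variables, the second measured on a much larger scale
X = np.column_stack([z + 0.1 * rng.normal(size=200),
                     100 * (z + 0.1 * rng.normal(size=200))])

X_centered = X - X.mean(axis=0)                  # mean zero
X_scaled = X_centered / X_centered.std(axis=0)   # unit standard deviation

def first_loading(X):
    """First principal component loading vector via eigen decomposition."""
    S = (X.T @ X) / X.shape[0]
    vals, vecs = np.linalg.eigh(S)
    return vecs[:, np.argmax(vals)]

print(np.abs(first_loading(X_centered)))  # dominated by the second variable
print(np.abs(first_loading(X_scaled)))    # both variables contribute
```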

1.2.2 Uniqueness of the Principal Components

Each principal component loading vector is unique, up to a sign flip: two different software packages will yield the same loading vectors, although their signs may differ, since $\phi_1$ and $-\phi_1$ define the same direction.
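A quick check of the sign ambiguity (on synthetic data): $\phi_1$ and $-\phi_1$ satisfy the same normalization constraint and produce scores with the same variance, so either is a valid first loading vector.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 2))
X = X - X.mean(axis=0)

S = (X.T @ X) / X.shape[0]
vals, vecs = np.linalg.eigh(S)
phi1 = vecs[:, np.argmax(vals)]

# phi1 and -phi1 give scores with identical variance
print(np.var(X @ phi1), np.var(X @ -phi1))
```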

1.2.3 Deciding How Many Principal Components to Use

This depends on how many interesting patterns are found in the data. A common heuristic is to examine a scree plot of the proportion of variance explained (PVE) by each component and look for an elbow: keep the components before the point where the PVE drops off.
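The PVE values can be computed directly from the eigenvalues of the covariance matrix; a minimal sketch on synthetic data (the values printed here are what one would plot in a scree plot):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 4))
X = X - X.mean(axis=0)

S = (X.T @ X) / X.shape[0]
eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]   # largest first

# Proportion of variance explained by each principal component
pve = eigvals / eigvals.sum()
print(pve)              # decreasing; plot these to look for an elbow
print(np.cumsum(pve))   # cumulative PVE reaches 1 with all components
```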

2 Clustering Methods

2.1 K-Means Clustering
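The section heading above is not fleshed out, so here is a minimal sketch of the standard K-means iteration (Lloyd's algorithm) on synthetic data: alternately assign each point to its nearest center, then move each center to the mean of its assigned points.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal sketch of K-means clustering (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialize centers at k randomly chosen data points
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Update step: each center moves to the mean of its cluster
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated blobs of illustrative data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
labels, centers = kmeans(X, k=2)
print(centers)
```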

2.2 Hierarchical Clustering
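Again only a heading in the notes; as a hedged sketch, hierarchical clustering builds a dendrogram bottom-up by repeatedly merging the closest clusters, and clusters are obtained by cutting the tree at some height. Using SciPy on synthetic data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Two well-separated groups of illustrative points
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(5, 0.5, (20, 2))])

# Build the dendrogram bottom-up with complete (maximum) linkage
Z = linkage(X, method="complete")

# Cut the dendrogram to obtain two clusters
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```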

