Collaborative Filtering

User-based Nearest Neighbor

  • Find other users who are similar to the test case (person of interest)
$$Pred(i) = \bar{r}_u+\frac{\sum_{n\in neighbors(u)} similarity(u,n) \cdot (r_{ni} - \bar{r}_{n})}{\sum_{n\in neighbors(u)} similarity(u,n)}$$
  • $\bar{r}_u$: average rating of user $u$
  • $r_{ni}$: rating that neighbor $n$ gave to item $i$
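
The prediction formula above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy `ratings` dictionary is made up, Pearson correlation is one possible choice for $similarity(u,n)$, and neighbors with non-positive similarity are dropped so that the denominator stays positive. The names `mean_rating`, `pearson`, and `predict_user_based` are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5, "d": 4},
    "carol": {"a": 1, "b": 5, "d": 2},
}

def mean_rating(user):
    """Average of all ratings given by `user` (r-bar in the formula)."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)

def pearson(u, n):
    """Pearson correlation over co-rated items (an assumed similarity choice)."""
    common = set(ratings[u]) & set(ratings[n])
    if len(common) < 2:
        return 0.0
    mu, mn = mean_rating(u), mean_rating(n)
    du = [ratings[u][i] - mu for i in common]
    dn = [ratings[n][i] - mn for i in common]
    num = sum(x * y for x, y in zip(du, dn))
    den = sqrt(sum(x * x for x in du)) * sqrt(sum(y * y for y in dn))
    return num / den if den else 0.0

def predict_user_based(u, item):
    """Mean of u plus the similarity-weighted deviation of the neighbors."""
    # Neighbors: other users who rated the item and correlate positively with u
    sims = {n: pearson(u, n) for n in ratings if n != u and item in ratings[n]}
    sims = {n: s for n, s in sims.items() if s > 0}
    denom = sum(sims.values())
    if denom == 0:
        return mean_rating(u)  # no usable neighbor: fall back to the user's mean
    num = sum(s * (ratings[n][item] - mean_rating(n)) for n, s in sims.items())
    return mean_rating(u) + num / denom

print(predict_user_based("alice", "d"))
```

In this toy data only `bob` correlates positively with `alice`, so the prediction is alice's mean plus bob's deviation from his own mean on item `d`.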

Item-based Nearest Neighbor

  • Find items, among those user $u$ has already rated, that are similar to the target item $i$
$$Pred(i) = \frac{\sum_{j\in ratedItems(u)} similarity(i,j) \cdot r_{uj}}{\sum_{j\in ratedItems(u)} similarity(i,j)}$$
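
A minimal sketch of the item-based prediction follows, again in plain Python. It assumes cosine similarity over the users who rated both items, which is one common choice and not necessarily the one intended here. The toy `ratings` data and the names `item_vector`, `cosine`, and `predict_item_based` are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5, "d": 4},
    "carol": {"a": 1, "b": 5, "d": 2},
}

def item_vector(item):
    """Ratings for one item, keyed by the users who rated it."""
    return {u: r[item] for u, r in ratings.items() if item in r}

def cosine(i, j):
    """Cosine similarity of items i and j over users who rated both (assumed choice)."""
    vi, vj = item_vector(i), item_vector(j)
    common = set(vi) & set(vj)
    if not common:
        return 0.0
    num = sum(vi[u] * vj[u] for u in common)
    den = sqrt(sum(vi[u] ** 2 for u in common)) * sqrt(sum(vj[u] ** 2 for u in common))
    return num / den if den else 0.0

def predict_item_based(u, item):
    """Similarity-weighted average of the ratings u gave to other items."""
    sims = {j: cosine(item, j) for j in ratings[u] if j != item}
    denom = sum(sims.values())
    if denom == 0:
        return None  # no rated item is similar to the target
    return sum(s * ratings[u][j] for j, s in sims.items()) / denom

print(predict_item_based("alice", "d"))
```

Because the weights are nonnegative here, the prediction is a weighted average and always falls between the user's lowest and highest rating.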

Latent Factor (Matrix Factorization)

  • Rating matrix $A_{users \times items}$
  • User feature matrix $U_{users \times features}$
  • Item feature matrix $M_{items \times features}$
  • Predicted matrix: $UM^T$ (a matrix product with the same shape as $A$)

How to find $U$ and $M$

  • Randomly initialize $U$, $M$, and the missing values in $A$
  • Repeat until convergence
    • Update $M$ to reduce $||A-UM^T||_F$ with $U$ held fixed $$M_{ij} = M_{ij} \frac{(A^T U)_{ij}}{(MU^TU)_{ij}}$$
    • Update $U$ to reduce $||A-UM^T||_F$ with $M$ held fixed $$U_{ij} = U_{ij} \frac{(AM)_{ij}}{(UM^TM)_{ij}}$$
    • Replace the missing values in $A$ with the corresponding values from $UM^T$

Frobenius norm: $||A||_F = \sqrt{\sum_{ij} A_{ij}^2}$
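
As a quick worked example of the definition (the matrix is chosen arbitrarily for illustration):

$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad ||A||_F = \sqrt{1^2 + 2^2 + 3^2 + 4^2} = \sqrt{30} \approx 5.477$$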

Example
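
The iteration described above might be sketched with NumPy as follows. This is an illustrative sketch under stated assumptions, not a reference implementation: the rating matrix is invented, `0` marks a missing rating, all values are kept nonnegative (the multiplicative updates require this), a small `eps` is added to the denominators to avoid division by zero, and a fixed iteration count stands in for a convergence test. The function name `factorize` and its parameters are hypothetical.

```python
import numpy as np

def factorize(A, mask, k=2, iters=500, eps=1e-9, seed=0):
    """Approximate A (users x items) by U @ M.T using multiplicative updates.

    mask is True where a rating is observed; missing entries are re-imputed
    from U @ M.T after every iteration.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = A.shape
    # Random nonnegative starting points for U and M
    U = rng.random((n_users, k)) + 0.1
    M = rng.random((n_items, k)) + 0.1
    A = A.astype(float)  # astype returns a copy, so the caller's array is untouched
    # Random initial guesses for the missing ratings, within the observed range
    A[~mask] = rng.uniform(A[mask].min(), A[mask].max(), size=int((~mask).sum()))
    for _ in range(iters):
        M *= (A.T @ U) / (M @ (U.T @ U) + eps)   # update M with U fixed
        U *= (A @ M) / (U @ (M.T @ M) + eps)     # update U with M fixed
        A[~mask] = (U @ M.T)[~mask]              # refresh the missing values
    return U, M

# Hypothetical 4 users x 4 items rating matrix; 0 means "not rated"
A = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 4, 4]], dtype=float)
mask = A > 0

U, M = factorize(A, mask)
pred = U @ M.T
print(np.round(pred, 2))
# Frobenius-style error restricted to the observed ratings
print(np.linalg.norm((A - pred)[mask]))
```

The entries of `pred` at positions where `mask` is `False` serve as the predicted ratings for items the user has not rated yet.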
