Tutorial 6 - Examples in Document Clustering

Assignment 1

Cluster terms and documents in your favorite document collection or book:

  1. Download the collection.
  2. Form the term $\times$ document matrix. Remove stop words, if necessary, and apply stemming, if possible.
  3. Normalize the weights, if necessary.
  4. Form the matrices $A$ and $A_n$.
  5. Cluster documents (and terms) into $k$ clusters using spectral $k$-partitioning of bipartite graphs.
  6. Comment on the solution.
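
The steps above can be sketched on a toy corpus as follows. This is only a minimal illustration, not a reference solution. It assumes that $A$ is the term $\times$ document count matrix and that $A_n$ denotes the scaled matrix $D_1^{-1/2} A D_2^{-1/2}$, where $D_1$ and $D_2$ are the diagonal matrices of row and column sums of $A$. It also assumes the usual recipe of clustering the rows of the scaled singular vectors $2, \ldots, l+1$ of $A_n$, with $l = \lceil \log_2 k \rceil$, by $k$-means. The six documents, the stop-word list, and the helper names (`term_document_matrix`, `kmeans`, `spectral_cocluster`) are made up for the example. Stemming and weight normalization are skipped, and the simple $k$-means routine stands in for whatever library routine you prefer.

```python
import re
import numpy as np

# Hypothetical toy collection: three "animal" and three "finance" documents.
docs = [
    "the cat chased the mouse",
    "a cat and a kitten play",
    "the kitten chased a mouse and a cat",
    "stock market prices fall",
    "investors watch the stock market",
    "market prices and investors play",
]
STOP_WORDS = {"the", "a", "and"}  # tiny illustrative list, not a real one


def term_document_matrix(docs):
    """Return (A, terms), where A[i, j] counts term i in document j."""
    tokenized = [[w for w in re.findall(r"[a-z]+", d.lower())
                  if w not in STOP_WORDS] for d in docs]
    terms = sorted({w for doc in tokenized for w in doc})
    index = {t: i for i, t in enumerate(terms)}
    A = np.zeros((len(terms), len(docs)))
    for j, doc in enumerate(tokenized):
        for w in doc:
            A[index[w], j] += 1.0
    return A, terms


def kmeans(X, k, n_iter=100):
    """Plain Lloyd iterations with deterministic farthest-first seeding."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    labels = np.full(len(X), -1)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for c in range(k):
            if np.any(labels == c):  # keep the old center if a cluster empties
                centers[c] = X[labels == c].mean(axis=0)
    return labels


def spectral_cocluster(A, k):
    """Cluster terms (rows) and documents (columns) of A into k groups."""
    d1 = A.sum(axis=1)  # term degrees (assumed nonzero)
    d2 = A.sum(axis=0)  # document degrees (assumed nonzero)
    An = A / np.sqrt(d1)[:, None] / np.sqrt(d2)[None, :]
    U, s, Vt = np.linalg.svd(An, full_matrices=False)
    l = int(np.ceil(np.log2(k)))
    # Skip the first (trivial) singular pair and rescale by D^{-1/2}.
    Z = np.vstack([U[:, 1:l + 1] / np.sqrt(d1)[:, None],
                   Vt[1:l + 1, :].T / np.sqrt(d2)[:, None]])
    labels = kmeans(Z, k)
    m = A.shape[0]
    return labels[:m], labels[m:]


A, terms = term_document_matrix(docs)
term_labels, doc_labels = spectral_cocluster(A, k=2)
print("document clusters:", doc_labels)
for t, lab in zip(terms, term_labels):
    print(f"{t:10s} -> cluster {lab}")
```

The two topics share only the word "play", so the bipartite graph is connected and the second singular pair of $A_n$ is well defined. For a real collection you would replace the toy tokenizer with proper stop-word removal and stemming, and use a sparse SVD instead of the dense one.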