Hierarchical Clustering

Example

p1 p2 p3 p4 p5 p6
p1 0.00
p2 0.00
p3 0.00
p4 0.00
p5 0.00
p6 0.00

Inter-cluster proximity

  • Min distance (single link)
  • Max distance (complete link)
  • Average distance (average link)

Single link

  • handle non-spherical shaped clusters
  • prevents breaking large clusters (k-means breaks large clusters)
  • very sensitive to noise limitation

Complete link

  • more robust to noise
  • similar porblem with k-means: breaking large clusters

Summary

K-means

  • Storage $O((N+k)d)$
  • Computation $O(Nkd \ Iterations)$

Hierarchical

  • Storage: $O(N^2)$
  • Computation $O(N^3)$ but using

In [ ]: