Forms of constraints:
Properties of constraints:
$CL(a,b) \ \&\ ML(b,c) \ \rightarrow \ CL(a,c)$
Coherence: measure the agreement between constraints w.r.t. a distance measure; i.e. nearby points are not part of Cannot-Link, etc.
High informtauveness and high coherence $\rightarrow$ high gain
In the cluster assignment phase of K-means algorithm, for any point, check if it violates the constraints or not.
A new objective function $\phi_{pckm}$
$$\phi_{pckm}(\Omega) = SSE(\Omega) + \text{penalty}_{ML}(\Omega) + \text{penalty}_{CL} (\Omega)$$
HMRF = Hidden Markov Random Field
A hybrid approach, between constraint-based and distance-based
$$\phi_{hmrfkm}(\Omega) = \sum \sum D(x,\omega_j) + \sum_{x_i,x_j\in ML} \omega_{ij} \xi_D 1[l_i \ne l_j] + \\ \sum_{x_i,x_j\in CL} \bar{\omega}_{ij} \left(\xi(D_{\max}) - \xi_D(x_i,x_j)\right) 1[l_i = l_j] + \\ \log Z$$Transform the data into a weighted graph $G=(V,E,W)$ where the set of vertices $V=\{v_1, v_2, .. v_n\}$ represent the original data points, and weights of edges represent the similarity between pairs of points.
Graph-cut problem: partition he graph into k connected components such that the weigth of edges, $Cut(C_a, C_b)=\displaystyle \sum_{i\in C_a, j\in C_b} w_{ij}$
Normalized Cut For a clustering $C=\{C_1, C_2, ..C_k\}$, the NCut value is defined by $NCut(C) = \displaystyle \sum_{j=1}^k \frac{Cut(C_j,\bar{C}_j)}{Vol(C_j)}$
Volume of a partition $Vol(C_a)=\displaystyle \sum_{i\in C_a} \sum_{j=1}^{n} w_{ij}$
Laplacian Matrix $L = D - W$
$$\displaystyle \min_{C_1, C_2, ..C_k} \text{Trace}\left(H^T L H \right) \text{ s.t. } H^T D H = I$$
Distance between two clusters in complete link: $$\text{dist}(\omega_i, \omega_j) = \max(\text{dist}(x, x') \ | \ x \in \omega_i, x' \in \omega_j)$$
Conditional Mutual Information $$MI(A;B|C) = MI(A;B,C) - MI(A;C)$$
Optimzation problem $$P^*_{C|X} = \displaystyle \underset{P_{C|X}\in \mathcal{P}}{\mathrm{argmax}} MI(C; Y|Z)$$
Constraint properties:
Cluster assignement
In [ ]: