Fuzzy Clustering and Fuzzy k-Means
Fuzzy clustering is the opposite of “Hard Clustering” (i.e., “Crispy Clustering”).
For example, every data point x would claim its percentage belongness to every cluster Ci (1≤i≤K where K is the number of clusters). However, the report will be too long as in this type of clustering representation.
Let uij be the probability of that xi belongs to Cluster j and VjKj=1 are the centers of the K clusters. Our objective is to minimize
N∑i=1K∑j=1(uij)q‖xi−Vj‖2which turns out to be TSSE (Total Sum of Squared Error) if uij is defined as a binary number (0 or 1).
Fuzzy k-means
Fuzzy k-means (FKM) is also called fuzzy C-means (FCM). This clustering method is stated as below:
Step 1. Guess K initial cluster centers {Vj} Kj=1 randomly given N data points.
Step 2. Update membership coefficients {uij} N,Ki=1,j=1
uij=(‖xi−Vj‖−2)1q−1∑Kl=1(‖xi−Vl‖−2)1q−1where ∑i,juij=1Step 3. Update centers.
V(new)j=∑i(uij)q⋅xi∑iuijStep 4. Repeat Step 2. and Step 3. until no big changes among uijN,Ki=1,j=1
Note that it is common that q>1 (e.g., q=2) so that this iterative clustering will be easier to converge. On the other hand, when q≈1 (i.e., q=1.001,1q−1=1000), the clustering result would be similar to the behaviour of Hard Clustering.
When Fuzzy k-means almost becomes Hard k-means
If q=1+ (e.g., q=1.001), then FkM almost becomes Hard k-means.
For example, when q=1.001 and thus 1q−1=1000. Assume there are 3 clusters currently, and
- |xi−v1|=1√50
- |xi−v2|=1√49
- |xi−v3|=1√48
then
- |xi−v1|−2=50
- |xi−v2|−2=49
- |xi−v3|−2=48
since uij is positively relative to [|xi−vj|−2]1q−1, thus
- ui1=501000501000+491000+481000=11+10−9+10−18≈1
- ui2=491000501000+491000+481000=10−91+10−9+10−18≈10−9
- ui3=481000501000+491000+481000=10−181+10−9+10−18≈10−18
The other disadvantage is that when q=1+, FkM converges slowly.