Contents
  1. 1.Fuzzy k-means
    1. 1.1.When Fuzzy k-means almost becomes Hard k-means
  2. 2.Reference

Fuzzy clustering is the opposite of “Hard Clustering” (i.e., “Crispy Clustering”).

For example, every data point x would claim its percentage belongness to every cluster Ci (1iK where K is the number of clusters). However, the report will be too long as in this type of clustering representation.

Let uij be the probability of that xi belongs to Cluster j and VjKj=1 are the centers of the K clusters. Our objective is to minimize

Ni=1Kj=1(uij)qxiVj2

which turns out to be TSSE (Total Sum of Squared Error) if uij is defined as a binary number (0 or 1).

Fuzzy k-means

Fuzzy k-means (FKM) is also called fuzzy C-means (FCM). This clustering method is stated as below:

Step 1. Guess K initial cluster centers {Vj} Kj=1 randomly given N data points.

Step 2. Update membership coefficients {uij} N,Ki=1,j=1

uij=(xiVj2)1q1Kl=1(xiVl2)1q1where i,juij=1

Step 3. Update centers.

V(new)j=i(uij)qxiiuij

Step 4. Repeat Step 2. and Step 3. until no big changes among uijN,Ki=1,j=1

Note that it is common that q>1 (e.g., q=2) so that this iterative clustering will be easier to converge. On the other hand, when q1 (i.e., q=1.001,1q1=1000), the clustering result would be similar to the behaviour of Hard Clustering.

When Fuzzy k-means almost becomes Hard k-means

If q=1+ (e.g., q=1.001), then FkM almost becomes Hard k-means.

For example, when q=1.001 and thus 1q1=1000. Assume there are 3 clusters currently, and

  • |xiv1|=150
  • |xiv2|=149
  • |xiv3|=148

then

  • |xiv1|2=50
  • |xiv2|2=49
  • |xiv3|2=48

since uij is positively relative to [|xivj|2]1q1, thus

  • ui1=501000501000+491000+481000=11+109+10181
  • ui2=491000501000+491000+481000=1091+109+1018109
  • ui3=481000501000+491000+481000=10181+109+10181018

The other disadvantage is that when q=1+, FkM converges slowly.

Reference

Contents
  1. 1.Fuzzy k-means
    1. 1.1.When Fuzzy k-means almost becomes Hard k-means
  2. 2.Reference