Monothetic clustering is often used in taxonomy. For example, when you see a strange animal, how do you know whether it has ever been reported before? You may need to ask $N$ true-or-false questions.

  • Q1. Is it an animal? (yes/no)
  • Q2. Does it have legs? (yes/no)

So “monothetic” means that at every step we use only a single attribute (variable) to split the data into clusters.
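To make the idea concrete, here is a minimal sketch of splitting on one yes/no attribute at a time. The function name `monothetic_split`, the sample animals, and the fixed question order are all assumptions for illustration; a real monothetic clustering algorithm would normally pick the splitting attribute by some criterion instead of taking the order as given.

```python
from collections import defaultdict


def monothetic_split(records, attributes):
    """Split records using one yes/no attribute per step.

    records: list of (name, {attribute: bool}) pairs (hypothetical layout).
    attributes: the questions to ask, in a fixed order (an assumption).
    Returns a dict mapping the tuple of answers to the records in that cluster.
    """
    clusters = {(): list(records)}
    for attr in attributes:
        next_clusters = defaultdict(list)
        for path, members in clusters.items():
            for name, features in members:
                # Only the single attribute `attr` decides this split.
                next_clusters[path + (features[attr],)].append((name, features))
        clusters = dict(next_clusters)
    return clusters


# Made-up example data, not taken from any real taxonomy.
animals = [
    ("dog", {"animal": True, "legs": True}),
    ("snake", {"animal": True, "legs": False}),
    ("oak", {"animal": False, "legs": False}),
]
result = monothetic_split(animals, ["animal", "legs"])
names = {path: [n for n, _ in members] for path, members in result.items()}
print(names)
```

Each key in `names` is the sequence of answers to Q1 and Q2, so every cluster can be described by the questions that led to it.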

Read More

Our main task here is to group the data into clusters and explain what each cluster means. We will try spatial clustering, temporal clustering, and a combination of both.

For each method of clustering, we will

  • try at least two values for each parameter of every algorithm.
  • explain the clustering result.
  • make some observations and compare the different methods and parameters.
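As a rough sketch of "two values per parameter", the loop below runs a toy temporal clustering over a small parameter grid. The gap-based method, the names `gap_clusters`, `max_gap`, and `min_size`, and the timestamps are all hypothetical stand-ins, not the algorithms or data used in the post.

```python
from itertools import product


def gap_clusters(times, max_gap, min_size):
    """Toy temporal clustering: start a new cluster whenever the gap between
    consecutive sorted timestamps exceeds max_gap, then drop clusters with
    fewer than min_size points."""
    groups, current = [], []
    for t in sorted(times):
        if current and t - current[-1] > max_gap:
            groups.append(current)
            current = []
        current.append(t)
    if current:
        groups.append(current)
    return [g for g in groups if len(g) >= min_size]


# Made-up timestamps and a grid with two values per parameter.
times = [1, 2, 3, 10, 11, 30]
grid = {"max_gap": [5, 8], "min_size": [1, 2]}

results = {}
for max_gap, min_size in product(grid["max_gap"], grid["min_size"]):
    results[(max_gap, min_size)] = gap_clusters(times, max_gap, min_size)

for params, clusters in results.items():
    print(params, clusters)
```

Keeping the results keyed by the parameter tuple makes it easy to put the outcomes side by side when comparing settings.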

Read More

What we want to do here is design three mining tasks, each with its own definition of a transaction, and find the rules behind them.

For each task, we should

  • Try at least two discretization methods (divided by 10, divided by 20, …)
  • Try at least two algorithms (Apriori, FP-growth, …) to find association rules.
  • List the interesting rules.
  • Compare the differences between them.
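The effect of the discretization step can be sketched in a few lines. Everything here is an assumption for illustration: the `(temp, hum)` readings are invented, `discretize` and `rules` are hypothetical helpers, and the rule search is a brute-force scan over item pairs, not Apriori or FP-growth.

```python
from itertools import combinations


def discretize(value, width):
    """Map a numeric value to the label of its equal-width bin."""
    low = (value // width) * width
    return f"{low}-{low + width}"


def to_transactions(rows, width):
    """Turn (temp, hum) rows into sets of discretized items."""
    return [
        {f"temp={discretize(t, width)}", f"hum={discretize(h, width)}"}
        for t, h in rows
    ]


def rules(transactions, min_support, min_confidence):
    """Brute-force single-item rules lhs -> rhs (a stand-in for Apriori)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            both = sum(1 for t in transactions if lhs in t and rhs in t)
            lhs_count = sum(1 for t in transactions if lhs in t)
            if lhs_count == 0:
                continue
            support, confidence = both / n, both / lhs_count
            if support >= min_support and confidence >= min_confidence:
                found.append((lhs, rhs, support, confidence))
    return found


# Invented readings; compare bin widths 10 and 20.
rows = [(23, 61), (27, 65), (31, 48), (24, 68)]
rules_10 = rules(to_transactions(rows, 10), min_support=0.5, min_confidence=0.8)
rules_20 = rules(to_transactions(rows, 20), min_support=0.5, min_confidence=0.8)
print(rules_10)
print(rules_20)
```

On this toy data the wider bins merge all temperatures into one item, so one of the rules found with width 10 no longer reaches the confidence threshold with width 20.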

Read More

Anomalies, also called outliers, are data points that differ considerably from the rest of the data. Common applications of anomaly detection include credit card fraud detection, telecommunication fraud detection, network intrusion detection, and fault detection.

The working assumption of anomaly detection is:

There are considerably more “normal” observations than “abnormal” observations (outliers/anomalies) in the data.
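One simple way to exploit this assumption is to describe the "normal" majority with robust statistics and flag whatever lies far from it. The sketch below uses the modified z-score based on the median and the median absolute deviation (MAD); the choice of this method, the conventional constants 0.6745 and 3.5, the name `mad_outliers`, and the sample amounts are assumptions for illustration, not the method used in the post.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The median and MAD are dominated by the majority of the data, so they
    stay stable as long as normal points clearly outnumber anomalies.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # More than half the values are identical; the score is undefined.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up transaction amounts with one obviously unusual value.
amounts = [20, 22, 19, 21, 23, 20, 500]
outliers = mad_outliers(amounts)
print(outliers)
```

If anomalies made up a large share of the data, the median and MAD would themselves be distorted, which is exactly why the working assumption matters.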

Read More