Analytical Data Clustering
Analytical clustering is a quick, automatic way to cluster data while preserving certain features of the input data. The method is analytical, deterministic, unsupervised, automatic, and noniterative.
Monothetic clustering is often used in taxonomy. For example, when you see a strange animal, how do you know whether it has been reported before? You may need to ask $N$ true-or-false questions.
So "monothetic" means that at each step we use only a single attribute (variable) to cluster.
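The idea of asking one true-or-false question at a time can be sketched as follows. This is a minimal illustration, not a full taxonomy method: the animal records, the attribute names, and the function `monothetic_clusters` are all made up for the example, and the order of the questions is simply given by the caller.

```python
def monothetic_clusters(records, attributes):
    """Split the records repeatedly, using exactly one attribute per split."""
    groups = [list(records)]
    for attribute in attributes:
        next_groups = []
        for group in groups:
            # One true-or-false question divides each group in two.
            yes = [name for name in group if records[name][attribute]]
            no = [name for name in group if not records[name][attribute]]
            next_groups.extend(g for g in (yes, no) if g)
        groups = next_groups
    return groups


# Hypothetical data: each animal answers two true-or-false questions.
animals = {
    "sparrow": {"has_feathers": True, "can_fly": True},
    "penguin": {"has_feathers": True, "can_fly": False},
    "bat": {"has_feathers": False, "can_fly": True},
    "cat": {"has_feathers": False, "can_fly": False},
    "eagle": {"has_feathers": True, "can_fly": True},
}

clusters = monothetic_clusters(animals, ["has_feathers", "can_fly"])
print(clusters)
```

Animals that give the same answer to every question end up in the same cluster, so a new animal that lands in an empty branch would be a candidate for "not reported before".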
Fuzzy clustering is the opposite of “hard clustering” (i.e., “crisp clustering”).
For example, every data point $x$ reports its percentage membership in every cluster $C_i$ ($1 \leq i \leq K$, where $K$ is the number of clusters). However, the report becomes long with this type of clustering representation, because every point carries $K$ values.
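To make the membership idea concrete, here is a small sketch that computes the membership of one point in each of $K$ clusters. It assumes the standard fuzzy c-means membership formula with fuzzifier `m`, which the text above does not specify; the function name `fuzzy_memberships` and the sample centers are hypothetical.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Return one membership value per cluster center; the values sum to 1."""
    dists = [math.dist(point, center) for center in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # Assumed fuzzy c-means rule: u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Hypothetical example with K = 2 cluster centers.
u = fuzzy_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
print(u)
```

A hard (crisp) assignment can be recovered by picking the cluster with the largest membership, at the cost of losing the information about how close the decision was.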
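Association rule mining is one of the most commonly used techniques in data mining, and the Apriori and FP-Growth algorithms are the best known. In this post, I introduce my implementations of these two algorithms and the experimental data from the results.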
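As a rough illustration of what Apriori computes, the sketch below finds frequent itemsets level by level and then derives the confidence of a rule. It is a minimal teaching version under assumed inputs, not the implementation from the post; the function names, the toy transactions, and the use of an absolute support count are all choices made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset seen in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from the frequent counts."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(baskets, min_support=2)
print(confidence(frequent, {"butter"}, {"bread"}))
```

FP-Growth aims at the same frequent itemsets but avoids generating candidates by compressing the transactions into a prefix tree, which is why the two are usually compared on running time rather than on output.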
Peak-climbing is also called “mode-seeking” or “valley-seeking”.
In general, there are two steps in Graph Methods.
Step 1. Construct a graph to connect all data (e.g., Minimum Spanning Tree, Relative Neighborhood Graph, Gabriel Graph, Delaunay Triangulation, …)
Step 2. Delete the edges that are too long (inconsistent edges)
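The two steps can be sketched with a minimum spanning tree. The sketch builds the tree with Kruskal's algorithm on the complete graph and then removes long edges; the rule used for "too long" (an edge longer than `factor` times the mean tree edge length) is only an assumption for this example, as are the function name `mst_clusters` and the sample points.

```python
import math
from itertools import combinations


def mst_clusters(points, factor=2.0):
    """Cluster points by cutting long edges of their minimum spanning tree."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Step 1: build the minimum spanning tree (Kruskal's algorithm).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    mst = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            mst.append((w, i, j))
    if not mst:
        return [[i] for i in range(n)]

    # Step 2: delete edges that are too long (assumed rule: > factor * mean length).
    mean_len = sum(w for w, _, _ in mst) / len(mst)
    kept = [(i, j) for w, i, j in mst if w <= factor * mean_len]

    # The connected components of the remaining edges are the clusters.
    parent[:] = range(n)
    for i, j in kept:
        parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 2-D data: two well-separated groups.
pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
print(mst_clusters(pts))
```

The function returns lists of point indices. Other definitions of an inconsistent edge compare each edge only with the edges in its neighborhood, which copes better with clusters of different densities.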
Our major task here is to turn data into different clusters and explain what each cluster means. We will try spatial clustering, temporal clustering, and a combination of both.
For each method of clustering, we will
What we want to do here is design 3 mining tasks, each with its own definition of a transaction, and find some rules behind them.
For each task, we should
In this report, I will do some data preprocessing and then get some basic information about the dataset, New York Citi Bike Trip Histories, via tools.
Anomalies, also called outliers, are data points that are considerably different from the remainder of the data. Common applications of anomaly detection include credit card fraud detection, telecommunication fraud detection, network intrusion detection, and fault detection.
The working assumption of anomaly detection is:
There are considerably more “normal” observations than “abnormal” observations (outliers/anomalies) in the data.
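One simple way to exploit this assumption is to describe the "normal" bulk of the data with robust statistics and flag whatever lies far from it. The sketch below uses the median and the median absolute deviation (MAD) with the commonly used modified z-score cutoff of 3.5. This particular detector is chosen here only for illustration and is not taken from the text above; the function name `mad_outliers` and the sample readings are hypothetical.

```python
import statistics


def mad_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold."""
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # More than half the values equal the median, so the score is undefined;
        # as a fallback, flag anything that differs from the median.
        return [v for v in values if v != median]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if abs(0.6745 * (v - median) / mad) > threshold]


# Hypothetical readings: mostly normal, with one abnormal observation.
readings = [10, 11, 9, 10, 12, 10, 95]
print(mad_outliers(readings))
```

Because the median and MAD are driven by the majority of the observations, a few extreme values cannot shift them much, which is exactly why the "many normal, few abnormal" assumption matters.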