IJCATR Volume 2 Issue 5

Detection of Outliers and Reduction of their Undesirable Effects for Improving the Accuracy of K-means Clustering Algorithm

Bahman Askari, Sattar Hashemi, Mohammad Hossein Yektaei
10.7753/IJCATR0205.1009
Keywords: clustering, k-means, outliers, outlier detection

Clustering is an unsupervised categorization technique and a widely used operation in data mining, in which a data set is divided into clusters according to similarity or dissimilarity criteria, so that the objects assigned to each cluster are more similar to one another than to the objects of other clusters. The k-means algorithm is one of the best-known clustering algorithms and is used in many data mining models. It partitions a set of objects into a given number of clusters. One of its most important weaknesses appears when the data set contains outliers: they pull the computed cluster centers away from the true centers and consequently reduce clustering accuracy. In this paper, we separate outliers from normal objects using a mechanism based on the dissimilarity of objects. The normal objects are then clustered using the k-means algorithm, and finally the outliers are assigned to the closest cluster. The experimental results show the accuracy and efficiency of the proposed method.
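The three steps named in the abstract (separate outliers, run k-means on the normal objects, assign the outliers to the closest cluster) can be sketched in Python as below. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the dissimilarity mechanism, so the score used here (mean Euclidean distance to all other objects, with a cutoff of mean plus `z` standard deviations) is a stand-in, and every function and parameter name is hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def split_outliers(points, z=2.0):
    """Return (normal_indices, outlier_indices).

    ASSUMPTION: dissimilarity of an object = its mean distance to all
    other objects; objects scoring above mean + z * std are outliers.
    The paper's actual mechanism is not given in the abstract.
    """
    n = len(points)
    scores = [sum(dist(p, q) for q in points) / (n - 1) for p in points]
    mean = sum(scores) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / n)
    cutoff = mean + z * std
    normal = [i for i, s in enumerate(scores) if s <= cutoff]
    outliers = [i for i, s in enumerate(scores) if s > cutoff]
    return normal, outliers


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means; returns the list of k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep the old
        # center if a group happens to be empty.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def cluster_with_outlier_handling(points, k, z=2.0, seed=0):
    """Cluster only the normal objects, then label every object
    (outliers included) with its closest cluster center."""
    normal, outliers = split_outliers(points, z)
    centers = kmeans([points[i] for i in normal], k, seed=seed)
    labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
    return centers, labels, outliers
```

Calling `cluster_with_outlier_handling(points, k=2)` on a list of coordinate tuples returns the centers computed without the outliers, a label for every object, and the indices flagged as outliers. Because the centers are fitted before the outliers are labeled, a distant object cannot drag a center away from its cluster.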
@article{b252013ijcatr02051009,
  Title = "Detection of Outliers and Reduction of their Undesirable Effects for Improving the Accuracy of K-means Clustering Algorithm",
  Author = "Bahman Askari and Sattar Hashemi and Mohammad Hossein Yektaei",
  Journal = "International Journal of Computer Applications Technology and Research (IJCATR)",
  Volume = "2",
  Number = "5",
  Pages = "552--556",
  Year = "2013",
  Doi = "10.7753/IJCATR0205.1009"
}