Abstract
High-dimensional clustering is often encountered in real applications, and projective clustering is an effective way to deal with it: it aims to capture the dense areas embedded in subsets of attributes (subspaces). Most projective clustering algorithms use hyper-rectangle structures of equal or varying width to identify the dense areas and their locations, so deciding the widths of these hyper-rectangles is a crucial task. Using the real data distribution directly to determine the widths of the dense structures is a natural and feasible approach. In this paper, we propose a projective clustering algorithm based on a hyper-rectangle structure whose width is estimated from the kernel distribution of the real data. In particular, we first define a structure called the Significant Local Dense Area (SLDA) using an efficient kernel density estimator, Rodeo; we then design a greedy search method to find all the SLDAs that cover the data distribution in the high-dimensional space; finally, we run a single-linkage clustering algorithm on the SLDAs to form the final clusters and identify the outliers. The main strength of the proposed algorithm is validated by experiments on synthetic and real-world data sets. Copyright © 2012 IEEE.
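The final step described in the abstract, grouping dense hyper-rectangles into clusters and flagging outliers, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes each SLDA is already available as an axis-aligned box (one `(low, high)` interval per dimension) and that two boxes are linked when they overlap, which is what single linkage yields when cut at distance zero. The names `boxes_overlap`, `single_linkage_boxes`, and `min_size` are hypothetical, and the abstract does not state the linkage criterion or the outlier rule actually used.

```python
from itertools import combinations


def boxes_overlap(a, b):
    """Return True if two axis-aligned boxes intersect in every dimension.

    Each box is a sequence of (low, high) intervals, one per dimension.
    """
    return all(lo1 <= hi2 and lo2 <= hi1
               for (lo1, hi1), (lo2, hi2) in zip(a, b))


def single_linkage_boxes(boxes, min_size=2):
    """Group overlapping boxes into connected components (union-find).

    Components with fewer than `min_size` boxes are reported as outliers.
    Both the overlap criterion and `min_size` are illustrative assumptions.
    """
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes that overlap.
    for i, j in combinations(range(len(boxes)), 2):
        if boxes_overlap(boxes[i], boxes[j]):
            parent[find(i)] = find(j)

    # Collect box indices by component root.
    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)

    clusters = [g for g in groups.values() if len(g) >= min_size]
    outliers = [i for g in groups.values() if len(g) < min_size for i in g]
    return clusters, outliers


# Three chained boxes form one cluster; the isolated box is an outlier.
example_boxes = [
    [(0.0, 1.0), (0.0, 1.0)],
    [(0.5, 1.5), (0.5, 1.5)],
    [(1.4, 2.0), (1.4, 2.0)],
    [(5.0, 6.0), (5.0, 6.0)],
]
clusters, outliers = single_linkage_boxes(example_boxes)
```

The first and third boxes do not overlap directly but end up in the same cluster through the second one, which is the chaining behaviour characteristic of single linkage.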
Original language | English |
---|---|
Title of host publication | Proceedings of The 2012 International Joint Conference on Neural Networks, IJCNN |
Place of Publication | Danvers, MA |
Publisher | IEEE |
ISBN (Print) | 9781467314909 |
DOIs | |
Publication status | Published - 2012 |