Abstract
In social tagging systems, a typical Web 2.0 application, users label digital data sources with tags, which are freely chosen textual descriptions. Tags serve as additional metadata for indexing, annotating and retrieving resources. Poor retrieval performance remains a major problem in most social tagging systems because tags are ambiguous, redundant and carry little semantic information. Clustering is a useful tool for improving information retrieval in such systems. In this paper, we propose a novel clustering algorithm named LIPC (Local Information Passing Clustering algorithm). The main steps of LIPC are: (1) we estimate a KNN directed graph G of tags and calculate the kernel density of each tag in its neighborhood; (2) we generate the local information, local coverage and local kernel of each tag; (3) we pass the local information on G using the I and O operators until they converge and the tag priory is generated; (4) we use the tag priory to find the clusters of tags. Experimental results on two real-world datasets, MedWorm and MovieLens, demonstrate the efficiency and superiority of the proposed method. Copyright © 2011 Springer-Verlag Berlin Heidelberg.
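Step (1) of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how tags are represented or which kernel is used, so the sketch assumes each tag is a numeric feature vector, uses Euclidean distance, and estimates density with a Gaussian kernel averaged over the k nearest neighbors. The function name `knn_graph_and_density` and the parameters `k` and `bandwidth` are hypothetical. The I and O operators of step (3) are not defined in the abstract and are not attempted here.

```python
import numpy as np


def knn_graph_and_density(X, k=2, bandwidth=1.0):
    """Build a directed KNN graph and a per-tag kernel density.

    X is an (n, d) array with one row per tag (assumed representation).
    Returns (neighbors, density): neighbors[i] holds the indices of the
    k nearest tags of tag i (directed edges i -> j), and density[i] is
    the mean Gaussian kernel value over that neighborhood.
    """
    X = np.asarray(X, dtype=float)

    # Pairwise squared Euclidean distances between all tags.
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)

    # A tag is not its own neighbor.
    np.fill_diagonal(d2, np.inf)

    # Indices of the k closest tags for each tag.
    neighbors = np.argsort(d2, axis=1)[:, :k]

    # Squared distances to those neighbors, then the Gaussian kernel.
    nd2 = np.take_along_axis(d2, neighbors, axis=1)
    density = np.exp(-nd2 / (2.0 * bandwidth ** 2)).mean(axis=1)
    return neighbors, density
```

Under these assumptions, tags in tightly packed regions receive higher density values than tags whose nearest neighbors are far away, which is the kind of local signal that steps (2) to (4) would then build on.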
Original language | English |
---|---|
Title of host publication | Database Systems for Advanced Applications: 17th International Conference, DASFAA 2012, International Workshops: FlashDB, ITEMS, SNSM, SIM3, DQDI, Busan, South Korea, April 15-18, 2012, Proceedings |
Editors | Jianliang XU, Ge YU, Shuigeng ZHOU, Rainer UNLAND |
Publisher | Springer |
Pages | 333-343 |
ISBN (Electronic) | 978364220244 |
ISBN (Print) | 9783642202438 |
Publication status | Published - Apr 2011 |