Abstract
A huge amount of text is now generated on the Web for social networking purposes. Keyword extraction from such text benefits many applications, such as advertising, search, and content filtering. Recent studies show that graph-based ranking is more effective than traditional approaches based on term or document frequency. However, most work in the literature constructs a word-to-word graph within a document or a collection of documents before applying some form of random walk. Such a graph does not account for the influence of document importance on keyword extraction. Moreover, social text such as a microblog post usually carries special social features, such as hashtags, which can help identify its topic. In this paper, we propose hashtag-biased ranking for keyword extraction from a collection of microblog posts. We first build a weighted word-post graph that takes the posts themselves into account. Then, a hashtag-biased random walk is applied to this graph, which guides our approach to extract keywords according to the hashtag topic. Finally, the ranking of a word is determined by its stationary probability after a number of iterations. We evaluate the proposed method on real Chinese microblog posts. Experiments show that our method is more effective than traditional word-to-word graph-based ranking in terms of precision. Copyright © 2015 Springer International Publishing Switzerland.
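The pipeline described in the abstract (a weighted word-post graph, a random walk biased toward a hashtag, and ranking by stationary probability) can be illustrated with a minimal sketch. The abstract does not give the paper's edge weights, transition probabilities, or bias scheme, so everything below is an assumption made for illustration: edges are weighted by raw term counts, the walk is a personalized-PageRank-style walk whose restart mass is spread uniformly over posts that carry the target hashtag, and the function name `hashtag_biased_rank` and the parameters `damping` and `iterations` are hypothetical.

```python
from collections import defaultdict


def hashtag_biased_rank(posts, hashtag, damping=0.85, iterations=50):
    """Rank words with a restart-biased random walk on a word-post graph.

    posts   -- list of (words, hashtags) pairs; words is a list of tokens,
               hashtags is a set of hashtag strings (assumed input format)
    hashtag -- the hashtag whose posts receive the restart probability
    Returns a list of (word, score) pairs sorted by descending score.
    """
    # Bipartite weighted graph: post <-> word, weight = term count (assumption).
    edges = defaultdict(dict)
    for i, (words, _tags) in enumerate(posts):
        post_node = ("post", i)
        for word in words:
            word_node = ("word", word)
            edges[post_node][word_node] = edges[post_node].get(word_node, 0.0) + 1.0
            edges[word_node][post_node] = edges[word_node].get(post_node, 0.0) + 1.0

    nodes = list(edges)

    # Restart distribution: uniform over posts tagged with the hashtag.
    # If no post carries it, fall back to a uniform restart over all nodes.
    biased = [
        ("post", i)
        for i, (_words, tags) in enumerate(posts)
        if hashtag in tags and ("post", i) in edges
    ]
    if not biased:
        biased = nodes
    restart = {node: 0.0 for node in nodes}
    for node in biased:
        restart[node] = 1.0 / len(biased)

    # Power iteration toward the stationary distribution.
    score = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        updated = {node: (1.0 - damping) * restart[node] for node in nodes}
        for node in nodes:
            total_weight = sum(edges[node].values())
            for neighbor, weight in edges[node].items():
                updated[neighbor] += damping * score[node] * weight / total_weight
        score = updated

    # Only word nodes are reported as keyword candidates.
    word_scores = {node[1]: s for node, s in score.items() if node[0] == "word"}
    return sorted(word_scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    example_posts = [
        (["rain", "flood", "city"], {"storm"}),
        (["flood", "rescue"], {"storm"}),
        (["pizza", "city"], {"food"}),
    ]
    for word, value in hashtag_biased_rank(example_posts, "storm"):
        print(word, round(value, 4))
```

In this toy input, words that occur in posts tagged with the chosen hashtag accumulate more probability mass than words that occur only in unrelated posts, which is the intuition the abstract describes. The actual weighting and bias used in the paper may differ.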
Original language | English |
---|---|
Title of host publication | Knowledge science, engineering and management: 8th International Conference, KSEM 2015, Chongqing, China, October 28-30, 2015, Proceedings |
Editors | Songmao ZHANG, Martin WIRSING, Zili ZHANG |
Publisher | Springer |
Pages | 348-359 |
ISBN (Electronic) | 9783319251592 |
ISBN (Print) | 9783319251585 |
Publication status | Published - 2015 |