Unsupervised keyword extraction from microblog posts via hashtags

Lin LI, Jinhang LIU, Yueqing SUN, Guandong XU, Jingling YUAN, Luo ZHONG

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Huge amounts of text are now generated for social networking purposes on the Web. Keyword extraction from such texts, for example microblog posts, benefits many applications such as advertising, search, and content filtering. Unlike traditional web pages, a microblog post usually carries special social features such as hashtags, which are topical in nature and generated by users. Extracting keywords related to hashtags can reflect users' intents and thus gives a better understanding of post content. In this paper, we propose a novel unsupervised keyword extraction approach for microblog posts that treats hashtags as topical indicators. Our approach consists of two hashtag-enhanced algorithms. The first is a topic model algorithm that infers hashtag-biased topic distributions over a collection of microblog posts and ranks words by their average topic probabilities. This algorithm not only finds the topics of a collection but also extracts hashtag-related keywords. The second is a random-walk-based algorithm. It first builds a weighted word-post graph that takes the posts themselves into account. A hashtag-biased random walk is then applied to this graph, guiding the algorithm to extract keywords according to hashtag topics. Finally, the ranking score of a word is given by its stationary probability after a number of iterations. We evaluate the proposed approach on a collection of real Chinese microblog posts. Experiments show that it is more effective in terms of precision than traditional approaches that do not consider hashtags, and that combining the two algorithms performs better than either algorithm alone. Copyright © 2018 The Authors.
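
As a rough illustration of the hashtag-biased random walk described in the abstract, the following Python sketch ranks words with a personalized-PageRank-style iteration whose restart distribution is concentrated on hashtag words. It is not the authors' implementation. The abstract does not give the details of the word-post graph, so this sketch assumes a simpler word co-occurrence graph (edge weight = number of posts in which two words appear together). The function names, the damping factor of 0.85, and the iteration count are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(posts):
    """Undirected word graph (assumed construction, not the paper's):
    edge weight = number of posts in which both words occur."""
    weights = defaultdict(lambda: defaultdict(float))
    for tokens in posts:
        for u, v in combinations(sorted(set(tokens)), 2):
            weights[u][v] += 1.0
            weights[v][u] += 1.0
    return weights


def hashtag_biased_walk(posts, hashtags, damping=0.85, iterations=50):
    """Rank words by a random walk that restarts only at hashtag words."""
    graph = build_graph(posts)
    words = sorted(graph)
    seeds = {w for w in words if w in hashtags}
    if not seeds:
        seeds = set(words)  # no hashtag in the graph: fall back to a uniform restart
    restart = {w: (1.0 / len(seeds) if w in seeds else 0.0) for w in words}
    out_weight = {w: sum(graph[w].values()) for w in words}
    score = {w: 1.0 / len(words) for w in words}
    for _ in range(iterations):
        updated = {}
        for w in words:
            # Probability mass flowing into w from each neighbour u,
            # proportional to the weight of the edge (u, w).
            inflow = sum(score[u] * graph[u][w] / out_weight[u] for u in graph[w])
            updated[w] = (1.0 - damping) * restart[w] + damping * inflow
        score = updated
    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


# Toy, made-up posts (already tokenized); "worldcup" plays the role of the hashtag.
posts = [
    ["worldcup", "football", "goal"],
    ["worldcup", "football", "match"],
    ["coffee", "morning"],
]
ranking = hashtag_biased_walk(posts, hashtags={"worldcup"})
for word, value in ranking:
    print(f"{word}\t{value:.4f}")
```

In this toy input, words connected to the hashtag accumulate most of the probability mass, while words in posts unrelated to the hashtag decay toward zero, which is the intended effect of biasing the restart toward hashtags.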

Original language: English
Pages (from-to): 93-120
Journal: Journal of Web Engineering
Volume: 17
Issue number: 1-2
Publication status: Published - Mar 2018

Citation

Li, L., Liu, J., Sun, Y., Xu, G., Yuan, J., & Zhong, L. (2018). Unsupervised keyword extraction from microblog posts via hashtags. Journal of Web Engineering, 17(1-2), 93-120.

Keywords

  • Keyword extraction
  • Microblog post
  • Hashtag
  • Topic model
  • Random walk
