SBTM: Topic modeling over short texts

Jianhui PANG, Xiangsheng LI, Haoran XIE, Yanghui RAO

Research output: Chapter in Book/Report/Conference proceeding › Chapter

4 Citations (Scopus)

Abstract

With the rapid development of social media services such as Twitter and Sina Weibo, short texts are becoming increasingly prevalent. However, inferring topics from short texts remains challenging for many content analysis tasks because word co-occurrence patterns in short texts are sparse. In this paper, we propose a classification model named sentimental biterm topic model (SBTM), which is applied to sentiment classification over short texts. To alleviate the sparsity problem, the similarity between words and documents is first estimated by singular value decomposition. Then, the most similar words are added to each short document in the corpus. Extensive evaluations on sentiment detection of short texts validate the effectiveness of the proposed method. Copyright © 2016 International Publishing Switzerland.
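
The expansion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `expand_short_docs`, the parameters `k` and `top_n`, the use of raw term counts, and the choice of the rank-k SVD reconstruction as the word-document similarity score are all assumptions made for illustration, and the toy documents are invented.

```python
import numpy as np

def expand_short_docs(docs, k=2, top_n=1):
    """Append to each tokenized document the top_n words (not already in it)
    with the highest score in a rank-k SVD reconstruction.

    Hypothetical helper: the scoring rule is an assumption, not taken from the paper.
    """
    vocab = sorted({w for d in docs for w in d})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document count matrix: rows are words, columns are documents.
    X = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d:
            X[index[w], j] += 1.0

    # Truncated SVD; the rank-k reconstruction smooths the sparse counts,
    # so a word absent from a document can still receive a nonzero score.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(k, len(s))
    scores = (U[:, :k] * s[:k]) @ Vt[:k, :]

    expanded = []
    for j, d in enumerate(docs):
        present = set(d)
        # Words ranked by descending score for document j (stable for ties).
        ranked = np.argsort(-scores[:, j], kind="stable")
        extra = [vocab[i] for i in ranked if vocab[i] not in present][:top_n]
        expanded.append(list(d) + extra)
    return expanded

# Toy corpus (invented for illustration only).
docs = [["great", "phone"], ["great", "camera"],
        ["awful", "phone"], ["awful", "battery"]]
for original, longer in zip(docs, expand_short_docs(docs, k=2, top_n=1)):
    print(original, "->", longer)
```

Each document keeps its original tokens and gains `top_n` additional words, which gives a downstream topic model more co-occurrence evidence to work with. In practice the counts would likely be weighted (for example with TF-IDF) and `k` tuned on held-out data; both are choices this sketch leaves open.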

Original language: English
Title of host publication: Database systems for advanced applications: DASFAA 2016 International Workshops: BDMS, BDQM, MoI, and SeCoP, Dallas, TX, USA, April 16-19, 2016 proceedings
Editors: Hong GAO, Jinho KIM, Yasushi SAKURAI
Place of Publication: Cham
Publisher: Springer
Pages: 43-56
ISBN (Electronic): 9783319320557
ISBN (Print): 9783319320540
Publication status: Published - 2016

Citation

Pang, J., Li, X., Xie, H., & Rao, Y. (2016). SBTM: Topic modeling over short texts. In H. Gao, J. Kim, & Y. Sakurai (Eds.), Database systems for advanced applications: DASFAA 2016 International Workshops: BDMS, BDQM, MoI, and SeCoP, Dallas, TX, USA, April 16-19, 2016 proceedings (pp. 43-56). Cham: Springer.

Keywords

  • Short text classification
  • Sentiment detection
  • Topic-based similarity
  • Biterm topic model