Leveraging sentiment distributions to distinguish figurative from literal health reports on Twitter

Rhys BIDDLE, Aditya JOSHI, Shaowu LIU, Cecile PARIS, Guandong XU

Research output: Chapter in Book/Report/Conference proceeding › Chapters

32 Citations (Scopus)

Abstract

Harnessing data from social media to monitor health events is a promising avenue for public health surveillance. A key step is the detection of reports of a disease (referred to as 'health mention classification') amongst tweets that mention disease words. Prior work shows that figurative usage of disease words can be challenging for health mention classification. Since the experience of a disease is associated with a negative sentiment, we present a method that utilises sentiment information to improve health mention classification. Specifically, our classifier for health mention classification combines pre-trained contextual word representations with sentiment distributions of words in the tweet. For our experiments, we extend a benchmark dataset of tweets for health mention classification, adding over 14k manually annotated tweets across diseases. We additionally annotate each tweet with a label that indicates if the disease words are used in a figurative sense. Our classifier outperforms current state-of-the-art approaches in detecting both health-related and figurative tweets that mention disease words. We also show that disease words in tweets are used figuratively more often than in a health-related context, which is challenging for classifiers targeting health-related tweets. Copyright © 2020 IW3C2 (International World Wide Web Conference Committee).

Original language: English
Title of host publication: Proceedings of The Web Conference 2020
Place of Publication: New York
Publisher: Association for Computing Machinery (ACM)
Pages: 1217-1227
ISBN (Electronic): 9781450370233
DOI: https://doi.org/10.1145/3366423.3380198
Publication status: Published - Apr 2020

Citation

Biddle, R., Joshi, A., Liu, S., Paris, C., & Xu, G. (2020). Leveraging sentiment distributions to distinguish figurative from literal health reports on Twitter. In Proceedings of The Web Conference 2020 (pp. 1217-1227). Association for Computing Machinery (ACM). https://doi.org/10.1145/3366423.3380198

Keywords

  • Health mention classification
  • Public health surveillance
  • Twitter
  • Figurative language
