Cantonese studies in the big data era: Applications and implications of the corpus of mid-20th century Hong Kong Cantonese

Research output: Contribution to conference › Papers

Abstract

This talk discusses how big data, in the form of linguistic corpora, can benefit Cantonese studies. Corpus data provide both quantitative and qualitative information about how the language is actually used. Taking a bottom-up approach, we can sometimes discover new patterns and research topics in the corpus.
The talk will be divided into three parts. The first part introduces The Corpus of Mid-20th Century Hong Kong Cantonese, including its design and sources of data. In the second part, I will discuss how the corpus data can be used for (a) language teaching; (b) pragmatic and discourse analysis; and (c) exploring the inter-relationship between language, society and cognition. Some of these topics have received little attention in previous studies on Cantonese.
The talk will conclude with a demonstration of phase 2 of the corpus. Copyright © 2018 WICL-4.
Original language: English
Publication status: Published - Jun 2018

Citation

Chin, A. (2018, June). Cantonese studies in the big data era: Applications and implications of the corpus of mid-20th century Hong Kong Cantonese. Paper presented at The 4th Workshop on Innovations in Cantonese Linguistics (WICL-4): Cantonese Linguistics in the Pacific Rim: Theory and Applications, University of British Columbia, Vancouver, Canada.
