Abstract
This paper describes the structure and compilation process of a Hong Kong Chinese Lexicon being developed for a forthcoming Cantonese keyboard. The lexicon contains more than 120,000 headwords, including common words and high-frequency named entities, and incorporates information such as Jyutping romanization, indicative word frequency, simple translations in five languages (English, Hindi, Indonesian, Nepali, Urdu), disambiguation, POS information, speech register and cross-register synonyms. The lexicon improves the Chinese learning experience of Chinese-as-an-Additional-Language (CAL) students in two ways: (a) by introducing and promoting the effective use of input methods and dictionaries that help learners locate the correct word or phrase for a particular context, and (b) by providing additional register-specific data that enable advanced learners to further expand their vocabulary. The lexicon can also be used in conjunction with WordNet or LIWC (Linguistic Inquiry and Word Count) for a range of natural language processing tasks. Copyright © 2023 World Scientific Publishing Co Pte Ltd.
Original language | English |
---|---|
Article number | 2350018 |
Journal | International Journal of Asian Language Processing |
Volume | 33 |
Issue number | 2 |
DOIs | https://doi.org/10.1142/S2717554523500182 |
Publication status | Published - Dec 2023 |
Citation
Lau, C. M., & Leung, W. S. S. (2023). The construction of a large-scale Hong Kong Chinese lexicon with multilingual translations for Chinese-as-an-Additional-Language (CAL) students. International Journal of Asian Language Processing, 33(2), Article 2350018. https://doi.org/10.1142/S2717554523500182
Keywords
- Hong Kong Chinese lexicon
- Chinese as an Additional Language
- Cantonese Jyutping keyboard
- Multilingual translation