The cultivation of a Chinese-English-Japanese trilingual parallel corpus from comparable patents

Bin LU, Ka Po CHOW, Ka Yin Benjamin TSOU

Research output: Contribution to conference (Paper)

Abstract

Many NLP applications, ranging from machine translation (MT) to cross-lingual information retrieval, require parallel corpora as critical resources. Given the phenomenal growth in patents and in the need to mediate between different languages, we explore a new but important area involving patents by investigating how a Chinese-English-Japanese trilingual parallel corpus can be cultivated from comparable patents. We also introduce our mined trilingual corpus, which demonstrates the considerable potential of cultivating large-scale parallel corpora from comparable patents.
Original language: English
Publication status: Published - 2011

Citation

Lu, B., Chow, K. P., & Tsou, B. K. (2011, September). The cultivation of a Chinese-English-Japanese trilingual parallel corpus from comparable patents. Paper presented at The Machine Translation Summit XIII, Xiamen University, Xiamen, China.
