Mining parallel knowledge from comparable patents

Bin LU, Ka Yin Benjamin TSOU, Tao JIANG, Jingbo ZHU, Oi Yee KWONG

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

5 Citations (Scopus)

Abstract

In recent years, the field of ontology learning from text has attracted much attention, resulting in a wide variety of approaches to mining knowledge from textual data. Since patent documents usually contain a large number of technical terms, it is possible to acquire technical vocabularies from patents and to learn the relations between the technical terms. In this chapter, the authors address some major issues in mining parallel knowledge from comparable Chinese-English patents, which contain both equivalent sentences and much noise. Based on a Chinese-English comparable corpus of patents, the authors attempt to mine two kinds of parallel knowledge, parallel sentences and parallel technical terms, and investigate the application of the mined knowledge to statistical machine translation. The extracted parallel sentences and technical terms could be a good basis for further acquisition of term relations and the translation of monolingual ontologies, as well as for statistical machine translation systems and other cross-lingual information access applications. Copyright © 2011, IGI Global.
Original language: English
Title of host publication: Ontology learning and knowledge discovery using the web: Challenges and recent advances
Editors: Wilson WONG, Wei LIU, Mohammed BENNAMOUN
Place of Publication: Hershey, PA
Publisher: Information Science Reference
Pages: 247-271
ISBN (Print): 9781609606251
Publication status: Published - 2011

Citation

Lu, B., Tsou, B. K., Jiang, T., Zhu, J., & Kwong, O. Y. (2011). Mining parallel knowledge from comparable patents. In W. Wong, W. Liu, & M. Bennamoun (Eds.), Ontology learning and knowledge discovery using the web: Challenges and recent advances (pp. 247-271). Hershey, PA: Information Science Reference.