In recent years, the field of ontology learning from text has attracted much attention, resulting in a wide variety of approaches to mining knowledge from textual data. Since patent documents usually contain a large number of technical terms, it is possible to acquire technical vocabularies from patents and to learn the relations between the technical terms. In this chapter, the authors address some major issues in mining parallel knowledge from comparable Chinese-English patents, which contain both equivalent sentences and much noise. Based on a Chinese-English comparable corpus of patents, the authors attempt to mine two kinds of parallel knowledge, parallel sentences and parallel technical terms, and investigate the application of the mined knowledge to statistical machine translation. The extracted parallel sentences and technical terms could be a good basis for further acquisition of term relations and the translation of monolingual ontologies, as well as for statistical machine translation systems and other cross-lingual information access applications. Copyright © 2011, IGI Global.
|Title of host publication||Ontology learning and knowledge discovery using the web: Challenges and recent advances|
|Editors||Wilson WONG, Wei LIU, Mohammed BENNAMOUN|
|Place of Publication||Hershey, PA|
|Publisher||Information Science Reference|
|Publication status||Published - 2011|