Differentiation between oriental and European scripts

Jie DING, Suk Wah Louisa LAM, Ching Y. SUEN

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

Abstract

Two types of techniques are usually adopted in language differentiation: token matching and statistical analysis. In this paper we present a method that uses a combined analysis of several discriminating statistical features to differentiate between European and oriental language scripts. Applied to more than 23 languages, it has proved effective in classifying documents printed in these different scripts.

Copyright © 1997 Oriental Languages Computer Society, Inc. (OLCS)
Original language: English
Title of host publication: Proceedings of the 17th International Conference on Computer Processing of Oriental Languages (ICCPOL '97)
Place of Publication: Hong Kong
Publisher: Hong Kong Baptist University
Pages: 35-40
Volume: 1
Publication status: Published - 1997

Citation

Ding, J., Lam, L., & Suen, C. Y. (1997). Differentiation between oriental and European scripts. In Proceedings of the 17th International Conference on Computer Processing of Oriental Languages (ICCPOL '97) (Vol. 1, pp. 35-40). Hong Kong: Hong Kong Baptist University.