A confidence-based entity resolution approach with incomplete information

Qi GU, Yan ZHANG, Jian CAO, Guandong XU, Alfredo CUZZOCREA

Research output: Chapter in Book/Report/Conference proceeding › Chapters

2 Citations (Scopus)

Abstract

Entity resolution identifies entities from different data sources that refer to the same real-world entity, and it is an important prerequisite for integrating data from multiple sources. Entity resolution relies mainly on similarity measures over data records. In practice, however, the quality of data sources is often poor. Web data sources in particular frequently provide only incomplete information, which makes it difficult to apply similarity measures directly to identify matching entities. To address this problem, the concept of confidence is introduced to measure the trustworthiness of a similarity calculation. An adaptive rule-based approach is used to calculate the similarity between records and to derive its confidence. The similarity and confidence are then propagated over the entity relational graph until a fixed point is reached. Finally, each pair of records is classified as matched or unmatched based on a threshold. We performed a series of experiments on real data sets, and the results show that our approach performs better than other approaches. Copyright © 2014 IEEE.
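
The pipeline outlined in the abstract (a similarity score with an attached confidence, propagation over a graph of record pairs until a fixed point, then a threshold) can be illustrated with a minimal Python sketch. This is not the authors' algorithm. The Jaccard token measure, the attribute weights, the definition of confidence as the share of attribute weight available in both records, the blending factor `alpha`, the update rule, and every function name are assumptions made for illustration only. As a further simplification, only the similarity is propagated, with confidence controlling how much a pair relies on its neighbors.

```python
def attribute_similarity(a, b):
    """Jaccard similarity over lowercase word tokens (an assumed measure)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def pair_score(r1, r2, weights):
    """Return (similarity, confidence) for two possibly incomplete records.

    Similarity is the weighted mean over attributes present in both records.
    Confidence (assumed definition) is the share of the total attribute
    weight that those shared attributes carry.
    """
    shared = [k for k in weights if r1.get(k) and r2.get(k)]
    total = sum(weights.values())
    used = sum(weights[k] for k in shared)
    if used == 0:
        return 0.0, 0.0  # nothing comparable: no evidence either way
    sim = sum(weights[k] * attribute_similarity(r1[k], r2[k]) for k in shared) / used
    return sim, used / total


def propagate(scores, neighbors, alpha=0.5, tol=1e-6, max_iter=100):
    """Iterate similarities over a pair graph until they stop changing.

    scores:    {pair: (initial_similarity, confidence)}
    neighbors: {pair: [related pairs]} (hypothetical relational graph)
    A pair with low confidence leans more on the average of its neighbors.
    """
    sims = {p: s for p, (s, _) in scores.items()}
    for _ in range(max_iter):
        new = {}
        for p, (s0, conf) in scores.items():
            nbrs = neighbors.get(p, [])
            if nbrs:
                nbr_avg = sum(sims[q] for q in nbrs) / len(nbrs)
                w = alpha * (1.0 - conf)  # neighbor weight grows as confidence drops
                new[p] = (1.0 - w) * s0 + w * nbr_avg
            else:
                new[p] = s0
        delta = max((abs(new[p] - sims[p]) for p in sims), default=0.0)
        sims = new
        if delta < tol:  # treated as the fixed point
            break
    return sims


def matched_pairs(sims, threshold=0.7):
    """Pairs whose propagated similarity reaches the (assumed) threshold."""
    return {p for p, s in sims.items() if s >= threshold}
```

For example, `pair_score({"name": "John Smith", "email": "js@example.com"}, {"name": "John Smith", "email": None}, {"name": 0.5, "email": 0.5})` compares only the `name` field, so the similarity is high but the confidence is reduced because the `email` value is missing from one record.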

Original language: English
Title of host publication: Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics, DSAA
Publisher: IEEE
Pages: 97-103
ISBN (Electronic): 9781479969913
DOI: https://doi.org/10.1109/DSAA.2014.7058058
Publication status: Published - 2014

Citation

Gu, Q., Zhang, Y., Cao, J., Xu, G., & Cuzzocrea, A. (2014). A confidence-based entity resolution approach with incomplete information. In Proceedings of the 2014 IEEE International Conference on Data Science and Advanced Analytics, DSAA (pp. 97-103). IEEE. https://doi.org/10.1109/DSAA.2014.7058058
