A Method to Discover Truth with Two Source Quality Metrics
In many web integration applications, several sources often describe the same entity with different values, which leads to numerous conflicts. Resolving these conflicts and finding the truth can improve the quality of integration or help build a high-quality knowledge base. In the single-truth conflict scenario, existing methods have difficulty distinguishing false negatives (also called missing data) from false positives, so their source quality measurements are inadequate. Therefore, in this paper we use recall and false positive rate to measure source quality and present a truth discovery method based on these two metrics. Experimental results on three real-world data sets show that the proposed algorithm can effectively distinguish missing data from false positives and improve the precision of truth discovery.
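To illustrate why two metrics can separate missing data from wrong data, the following is a minimal sketch, not the paper's algorithm. It assumes the standard definitions recall = TP / (TP + FN) and false positive rate = FP / (FP + TN), computed over candidate (entity, value) facts against a known ground truth. The function name `source_quality`, the data layout, and the toy data are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def source_quality(
    claims: Dict[str, Optional[str]],
    truth: Dict[str, str],
    candidates: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Return (recall, false positive rate) for one source.

    claims:     entity -> value the source provides, or None / absent if missing
    truth:      entity -> the single true value (assumed known for this sketch)
    candidates: entity -> all values claimed for that entity by any source
    """
    tp = fp = fn = tn = 0
    for entity, true_value in truth.items():
        claimed = claims.get(entity)  # None means the source is silent
        for value in candidates[entity]:
            is_true = value == true_value
            is_claimed = claimed is not None and value == claimed
            if is_true and is_claimed:
                tp += 1
            elif is_true:
                fn += 1  # true fact not provided (missing or replaced)
            elif is_claimed:
                fp += 1  # false fact asserted by the source
            else:
                tn += 1  # false fact correctly not asserted
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


# Hypothetical toy data: three entities, two candidate values each.
truth = {"a": "x", "b": "y", "c": "z"}
candidates = {"a": {"x", "w"}, "b": {"y", "v"}, "c": {"z", "u"}}

incomplete_source = {"a": "x", "b": "y"}            # silent about entity "c"
erroneous_source = {"a": "x", "b": "y", "c": "u"}   # wrong value for "c"

recall_missing, fpr_missing = source_quality(incomplete_source, truth, candidates)
recall_wrong, fpr_wrong = source_quality(erroneous_source, truth, candidates)
```

In this toy setting both sources have the same recall, but only the source that asserts a wrong value has a nonzero false positive rate, which is the distinction between missing data and false positives that the abstract refers to.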
truth discovery; data conflicts; data integration; single-truth scenario
Dong Yu, Derong Shen, Mingdong Zhu, Tiezheng Nie, Yue Kou, Ge Yu
College of Information Science and Engineering, Northeastern University, Shenyang, China
International conference
Jinan
English
161-164
2015-09-11 (date first posted online on the Wanfang platform; this is not necessarily the paper's publication date)