Name Disambiguation Using Atomic Clusters
Name ambiguity is a critical problem in many applications, particularly in online bibliography systems such as DBLP and CiteSeer. Several clustering-based methods have been proposed, but the problem remains a major challenge for both the research and industry communities. In this paper, we present a complementary study of the problem from another point of view. We propose an approach that finds atomic clusters to improve the performance of existing clustering-based methods. We conducted experiments on a dataset from a real-world system, Arnetminer.org. Experimental results show that the proposed atomic-cluster-finding approach yields significant improvements (about +8% and +27%, depending on the clustering method).
Feng Wang, Juanzi Li, Jie Tang, Jing Zhang, Kehong Wang
Department of Computer Science and Technology, Tsinghua University, East Main Building Room 10-201, Tsinghua University, Beijing 100084, China
International conference
The Ninth International Conference on Web-Age Information Management (WAIM 2008)
Zhangjiajie
English
2008-07-20 (date first posted online on the Wanfang platform; not the paper's publication date)