Cloud-based Name Disambiguation Algorithm

Abstract:

In scientific collaboration networks, it is very common for one author name to correspond to many author entities. Traditional name disambiguation algorithms perform inefficiently on massive data. This paper presents a parallel algorithm for the name disambiguation problem. It first merges authors with the same name and similar author information, then divides the scientific collaboration network into author communities; authors with the same name within one community are assumed, with high probability, to be the same entity. The algorithm runs on a cloud computing platform and can handle massive data. In our experiment, the algorithm efficiently processed massive data and achieved an average F-score of 0.93.
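
The two stages named in the abstract can be illustrated with a minimal single-machine sketch. It is not the paper's parallel cloud implementation, and several choices are assumptions made only for illustration: "author information" is reduced to the set of coauthor names on a paper, similarity is Jaccard similarity with a hypothetical threshold of 0.2, and connected components stand in for the community detection step. The function names (`jaccard`, `disambiguate`) and the sample data are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the groups containing a and b."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[rb] = ra


def disambiguate(papers, threshold=0.2):
    """Map each (paper index, author name) to an entity key.

    Assumes a name appears at most once per paper.
    """
    # One record per author occurrence: (paper index, name, coauthor names).
    records = [(p, name, set(authors) - {name})
               for p, authors in enumerate(papers) for name in authors]
    group = list(range(len(records)))
    for i, j in combinations(range(len(records)), 2):
        paper_i, name_i, co_i = records[i]
        paper_j, name_j, co_j = records[j]
        # Stage 1: merge same-name records with similar author information.
        if name_i == name_j and jaccard(co_i, co_j) >= threshold:
            union(group, i, j)
        # Stage 2 (stand-in for community detection): coauthors on one
        # paper are linked, so linked records form one community.
        if paper_i == paper_j:
            union(group, i, j)
    # Same name inside one community is treated as one entity.
    return {(p, name): (name, find(group, i))
            for i, (p, name, _) in enumerate(records)}


# Hypothetical sample data: four papers, three of which share a community.
papers = [
    ["Wei Zhang", "A", "B"],
    ["Wei Zhang", "C", "D"],
    ["A", "B", "C", "D"],
    ["Wei Zhang", "X", "Y"],
]
result = disambiguate(papers)
```

In this sample, the "Wei Zhang" records on the first two papers share no coauthors, so stage 1 alone does not merge them; they end up as one entity only because the third paper connects their coauthors into one community. The "Wei Zhang" on the fourth paper stays a separate entity.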

Keywords: Cloud Computing; Name Disambiguation; Similarity; Community Detection

Authors: Yang Juan, He Hua, Wu Bin

Affiliation: Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing, China 100876

Conference type: International conference

Conference name: 2010 International Conference of Information Science and Management Engineering (ISME 2010)

Conference location: Xi'an

Conference language: English

Pages: 735-738

Online publication date: 2010-08-07 (date first posted on the Wanfang platform; not necessarily the paper's publication date)
