Cloud-based Name Disambiguation Algorithm
In scientific collaboration networks, it is very common for one author name to correspond to many author entities. Traditional name disambiguation algorithms perform inefficiently on massive data. This paper presents a parallel algorithm for the name disambiguation problem: it first merges authors that have the same name and similar author information, then divides the scientific collaboration network into author communities; authors with the same name within one community are assumed, with high probability, to be one entity. The algorithm runs on a cloud computing platform and can handle massive data. In our experiments, the algorithm processed massive data efficiently and achieved an average F-score of 0.93.
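The two stages named in the abstract can be illustrated with a minimal single-machine sketch. It is not the paper's implementation, and everything specific in it is an assumption: the `(paper_id, name, affiliation)` mention format, the token-level Jaccard similarity on affiliation strings, the 0.5 threshold, and the helper names `DisjointSet`, `jaccard`, and `disambiguate` are hypothetical. Connected components of the co-authorship graph stand in for the community detection step, and the parallel, cloud-based execution is not modelled.

```python
from collections import defaultdict
from itertools import combinations


class DisjointSet:
    """Union-find over integer ids 0..n-1."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disambiguate(mentions, threshold=0.5):
    """mentions: list of (paper_id, name, affiliation) tuples (assumed format).

    Returns one entity label per mention; equal labels mean "same person".
    """
    n = len(mentions)
    by_name = defaultdict(list)
    by_paper = defaultdict(list)
    for i, (paper, name, _) in enumerate(mentions):
        by_name[name].append(i)
        by_paper[paper].append(i)

    # Stage 1: merge same-name mentions whose author information is similar.
    # Assumption: "author information" is the affiliation string, compared
    # by Jaccard similarity over lowercase tokens.
    entities = DisjointSet(n)
    tokens = [set(aff.lower().split()) for _, _, aff in mentions]
    for ids in by_name.values():
        for i, j in combinations(ids, 2):
            if jaccard(tokens[i], tokens[j]) >= threshold:
                entities.union(i, j)

    # Stage 2: group entities into communities. Stand-in only: connected
    # components of the co-authorship graph, not a real community detector.
    communities = DisjointSet(n)
    for i in range(n):
        communities.union(i, entities.find(i))  # merged mentions stay together
    for ids in by_paper.values():
        for i in ids[1:]:
            communities.union(ids[0], i)  # co-authors of one paper are linked

    # Same name inside one community is treated as one entity.
    for ids in by_name.values():
        for i, j in combinations(ids, 2):
            if communities.find(i) == communities.find(j):
                entities.union(i, j)

    return [entities.find(i) for i in range(n)]
```

In this sketch, a mention with no affiliation is not merged in the first stage, but it can still be joined to a same-name entity in the second stage if a shared co-author places both in the same community. On real data, connected components would be far too coarse, so a proper community detection method would be substituted at that step.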
Cloud Computing; Name Disambiguation; Similarity; Community Detection
Yang Juan, He Hua, Wu Bin
Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
International conference
Xi'an
English
735-738
2010-08-07 (date first posted online on the Wanfang platform; does not represent the paper's publication date)