Duplicated Record Detection Based on Improved RBF Neural Network
This paper presents a method based on a modified Radial Basis Function (RBF) neural network to improve the accuracy and recall rate of duplicated-record detection. First, the key fields of the records are clustered using Density-Based Spatial Clustering of Applications with Noise (DBSCAN), and all records are grouped into several classes that include duplicated records. Second, the similarity of corresponding fields of the records in each class is computed using the Jaro algorithm, and duplicated records are labeled manually. Finally, the Subtractive Clustering Method (SCM) and Particle Swarm Optimization (PSO) are used to optimize the parameters of the RBF neural network, so that a model for detecting duplicated records is built. The method is tested on different datasets. The experimental results show that the accuracy and recall rate of duplicated-record detection are improved significantly.
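The field-similarity step can be illustrated with a minimal sketch of the standard Jaro similarity, applied field by field to a pair of records. This is not the authors' implementation: the function names `jaro` and `field_similarities`, the dictionary record layout, and the example field names are assumptions made for illustration only.

```python
def jaro(s1: str, s2: str) -> float:
    """Standard Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two equal characters count as a match only within this distance.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def field_similarities(rec_a: dict, rec_b: dict, fields: list) -> list:
    """Hypothetical helper: one Jaro score per corresponding field."""
    return [jaro(str(rec_a.get(f, "")), str(rec_b.get(f, ""))) for f in fields]


if __name__ == "__main__":
    # Illustrative records; the field names are assumed, not from the paper.
    a = {"name": "MARTHA", "city": "Guilin"}
    b = {"name": "MARHTA", "city": "Guilin"}
    print(field_similarities(a, b, ["name", "city"]))
```

A vector of per-field scores like this is one plausible input representation for a classifier such as an RBF network, with the manually assigned duplicate label as the target; the record does not specify how the features are actually encoded.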
complex system; duplicated records; RBF neural network; PSO; SCM; parameter optimizing
Xinting Liu, Xiaodong Cai, Bo Li, Mingyao Chen
School of Computer and Information Security, Guilin University of Electronic Technology, China; Guilin Topintelligent Communication Technology Co., Ltd, China
International conference
Chongqing
English
2034-2037
2017-03-25 (date first posted online on the Wanfang platform; does not represent the paper's publication date)