Conference Topic

A Property Optimization Method in Support of Approximately Duplicated Records Detecting

When detecting approximately duplicated records in a large dataset, the data composition is complex and each record has many properties, so measurement accuracy is low and implementation cost is high. To address these problems, a grouping-based sub-fuzzy clustering property optimization method is proposed. First, the properties of the records in each group are processed to reduce the property dimension effectively and obtain a representation of the group; then a similarity comparison method is used to detect approximately duplicated records within groups. Theoretical analysis and experiments show that this method achieves higher detection accuracy and efficiency, and better solves the problem of recognizing approximately duplicated records in large datasets.

Property Optimization; Approximately Duplicated Records; Sub-Fuzzy Clustering; Similarity

Xiao Mansheng, Liu Youshi, Zhou Xiaoqi

School of Science, Hunan University of Technology, Zhuzhou, 412008, Hunan, China; College of Science and Technology, Hunan University of Technology, Zhuzhou, 412008, Hunan, China

International conference

2009 IEEE International Conference on Intelligent Computing and Intelligent Systems

Shanghai

English

1933-1937

2009-11-20 (date first made available online on the Wanfang platform; not the paper's publication date)