A New KNN Categorization Algorithm for Harmful Information Filtering

Abstract:

When filtering harmful text information, a classifier's predictions are biased toward the class with more samples, because samples containing harmful information are difficult to obtain. Constructing virtual samples is an effective way to address small-sample pattern recognition problems. This paper improves the traditional KNN algorithm by using an upsampling method to construct virtual samples at the data level: the small sample set is divided into clusters with K-means clustering, and virtual samples are generated and validated within each cluster. Experimental results show that the method constructs virtual samples whose characteristics are similar to those of the real samples, and that it improves the classification performance of the KNN algorithm.
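The pipeline described in the abstract (cluster the minority class with K-means, generate virtual samples inside each cluster, validate them, then classify with KNN) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives neither the generation rule nor the validity test, so the linear interpolation between two same-cluster samples and the "nearest real sample must be a minority sample" check are both assumptions, and all function names (`kmeans`, `make_virtual_samples`, `knn_predict`) are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns the list of non-empty clusters."""
    centers = rng.sample(points, k)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: dist(p, centers[i]))
            clusters[idx].append(p)
        # Keep the old center if a cluster ended up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return [c for c in clusters if c]


def knn_predict(train, query, k=3):
    """Majority vote among the k nearest (vector, label) pairs."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def make_virtual_samples(minority, majority, n_new, n_clusters=2, seed=0):
    """Upsample the minority class inside its k-means clusters.

    Assumed generation rule: interpolate between two samples of one cluster.
    Assumed validity check: keep a candidate only if its nearest real
    sample belongs to the minority class (label 1).
    """
    rng = random.Random(seed)
    clusters = [c for c in kmeans(minority, n_clusters, rng) if len(c) >= 2]
    real = [(p, 1) for p in minority] + [(p, 0) for p in majority]
    virtual = []
    attempts = 0
    while clusters and len(virtual) < n_new and attempts < 50 * n_new:
        attempts += 1
        a, b = rng.sample(rng.choice(clusters), 2)
        t = rng.random()
        candidate = [x + t * (y - x) for x, y in zip(a, b)]
        if knn_predict(real, candidate, k=1) == 1:
            virtual.append(candidate)
    return virtual
```

To use the sketch, the accepted virtual samples would be added to the training set with the minority label before calling `knn_predict`, so that the vote among the k nearest neighbours is no longer dominated by the larger class. Generating candidates within a cluster rather than across the whole minority set is meant to keep them close to regions where real samples already exist.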

Keywords: Small sample pattern recognition; Virtual sample; Harmful information filtering; Network information security

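Authors: Juan DU; Zhi an Yi

Author affiliation: Software College, Northeast Petroleum University, Daqing, China

Conference type: International conference

Conference name: 2012 Fifth International Symposium on Computational Intelligence and Design (ISCID 2012)

Conference location: Hangzhou

Conference language: English

Pages: 489-492

Online publication date: 2012-10-28 (date first posted on the Wanfang platform; not the publication date of the paper)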
