Learning to Detect Web Spam by Genetic Programming
Web spam techniques enable some web pages or sites to achieve undeserved relevance and importance. They can seriously deteriorate search engine ranking results. Combating web spam has become one of the top challenges for web search. This paper proposes to learn a discriminating function to detect web spam by genetic programming. The evolution computation uses multi-populations composed of some small-scale individuals and combines the selected best individuals in every population to gain a possible best discriminating function. The experiments on WEBSPAM-UK2006 show that the approach can improve spam classification recall performance by 26%, F-measure performance by 11%, and accuracy performance by 4% compared with SVM.
Web Spam; Information Retrieval; Genetic Programming; Machine Learning
Xiaofei Niu Jun Ma Qiang He Shuaiqiang Wang Dongmei Zhang
School of Computer Science and Technology, Shandong University, Jinan 250101, China; School of Comput; School of Computer Science and Technology, Shandong University, Jinan 250101, China; Department of Computer Science, Texas State University, San Marcos, US
International conference
11th International Conference, WAIM 2010 (11th International Conference on Web-Age Information Management)
Jiuzhaigou
English
18-27
2010-07-14 (date first posted online on the Wanfang platform; does not represent the paper's publication date)