Conference Topic

Improving Spamdexing Detection Via a Two-Stage Classification Strategy

Spamdexing is any of various methods to manipulate the relevancy or prominence of resources indexed by a search engine, usually in a manner inconsistent with the purpose of the indexing system. Combating spamdexing has become one of the top challenges for web search. Machine-learning-based methods have shown their superiority in being easy to adapt to newly developed spam techniques. In this paper, we propose a two-stage classification strategy to detect web spam, which is based on the predicted spamicity of learning algorithms and hyperlink propagation. Preliminary experiments on the standard WEBSPAM-UK2006 benchmark show that the two-stage strategy is reasonable and effective.

Guang-Gang Geng, Chun-Heng Wang, Qiu-Dan Li

Key Laboratory of Complex System and Intelligent Science, Institute of Automation, Chinese Academy of Sciences, Beijing 100080, P.R. China

International conference

4th Asia Information Retrieval Symposium (AIRS 2008)

Harbin

English

356-364

2008-01-16 (date first posted online on the Wanfang platform; does not represent the paper's publication date)