Conference Topic

MULTI-LAYER FEATURES BASED PERSONALIZED SPAM FILTERING

In this paper, we address a new challenge: the filter is expected to converge much faster, e.g. within 10 labeled SMSs or fewer. Topic-model-based dimension reduction can minimize the structural risk when training data is limited, but it compromises the completeness of the feature space. It is very difficult to achieve both a fast convergence rate and completeness with only one kind of feature. This paper uses supervised dual-PLSA for dimensionality reduction and presents a multi-layer features model, which employs two layers of features and adopts a novel method to combine them. Experiments show that the multi-layer features model achieves the best performance.

Spam Filtering; Personalized Filtering; PLSA; dual-PLSA; Multi-layer features

Weiran Xu, Zhanyi Wang, Dongxin Liu, Jun Guo, Rile Hu

School of Information and Communication Engineering, Beijing University of Posts and Telecommunications; Nokia Research Center (China), Beijing

International conference

2009 IEEE International Conference on Network Infrastructure and Digital Content (IEEE IC-NIDC2009)

Beijing

English

368-373

2009-11-06 (date first made available online on the Wanfang platform; not the paper's publication date)