A Novel Spam Image Filtering Framework with Multi-Label Classification

Abstract:

Gray images, which different recipients could reasonably consider either spam or ham, present significant obstacles to conventional binary spam filtering systems. The inconsistent labels of gray images inevitably degrade overall filter performance. In this paper, we present a novel framework named BFMLC (Binary Filtering with Multi-Label Classification) that takes both spam image filtering and user preferences into account. The BFMLC framework comprises two classification stages: filter-oriented binary classification and user-oriented multi-label classification. A filter based on the BFMLC framework can not only discriminate spam images from ham images, but also classify spam images into several predefined topics. According to the preference settings on the client side, specific spam images (gray images) are delivered to individual users. Moreover, the BFMLC framework can be generalized to deal with text, image, or mixed emails. We implement a spam image filtering system based on the BFMLC framework and conduct experiments on public personal datasets. The experimental results show that the system identifies spam images with an average accuracy of 96.309% and classifies spam images into predefined topics with an average precision of 89.42%.
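The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the classifiers are passed in as plain callables, the names `UserPreferences`, `route_image`, `toy_is_spam`, and `toy_predict_topics` are hypothetical, the topic labels are invented for illustration, and the delivery rule (deliver a spam image when at least one predicted topic is accepted by the user) is an assumption, since the abstract does not state the exact rule.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Set, Tuple


@dataclass
class UserPreferences:
    # Hypothetical client-side setting: spam topics this user still wants to receive.
    accepted_topics: Set[str] = field(default_factory=set)


def route_image(
    image: dict,
    is_spam: Callable[[dict], bool],
    predict_topics: Callable[[dict], Iterable[str]],
    prefs: UserPreferences,
) -> Tuple[str, Set[str]]:
    """Return (destination, predicted_topics) for one image."""
    # Stage 1: filter-oriented binary classification (spam vs. ham).
    if not is_spam(image):
        return "inbox", set()

    # Stage 2: user-oriented multi-label classification of the spam image.
    topics = set(predict_topics(image))

    # Assumed delivery rule: a spam image whose topics overlap the user's
    # accepted topics is treated as a gray image and delivered anyway.
    if topics & prefs.accepted_topics:
        return "inbox", topics
    return "spam", topics


# Toy stand-ins for trained classifiers; a real system would use image features.
def toy_is_spam(image: dict) -> bool:
    return image["spam_score"] >= 0.5


def toy_predict_topics(image: dict) -> Iterable[str]:
    return image["topics"]


prefs = UserPreferences(accepted_topics={"travel"})
ham = {"spam_score": 0.1, "topics": []}
gray = {"spam_score": 0.9, "topics": ["travel", "finance"]}
unwanted = {"spam_score": 0.9, "topics": ["finance"]}

for img in (ham, gray, unwanted):
    print(route_image(img, toy_is_spam, toy_predict_topics, prefs))
```

Keeping the preferences outside the classifiers mirrors the abstract's split between filter-oriented and user-oriented stages: the same trained models can serve every user, and only the final routing decision changes per recipient.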

Authors: Hongrong Cheng, Zhiguang Qin, Chong Fu, Yong Wang

Affiliation: School of Computer Science & Engineering, University of Electronic Science & Technology of China, Chengdu, Sichuan 611731, China

Conference type: International conference

Conference name: 2010 International Conference on Communications, Circuits and Systems

Conference location: Chengdu

Conference language: English

Pages: 282-285

Online publication date: 2010-06-28 (date first made available online on the Wanfang platform; not the paper's publication date)
