A Novel Reliable Negative Method Based on Clustering for Learning from Positive and Unlabeled Examples

Abstract:

This paper investigates a new approach for training text classifiers when only a small set of positive examples is available together with a large set of unlabeled examples. The key feature of this problem is that there are no negative examples for learning. Recently, a few techniques have been reported that build a classifier in two steps. In this paper, we introduce a novel method for the first step, which clusters the unlabeled and positive examples to identify reliable negative documents, and then runs SVM iteratively. We perform a comprehensive evaluation against two other methods and show experimentally that it is efficient and effective.
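The first step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the selection criterion, so the code assumes that a cluster containing no positive example is treated as a source of reliable negatives. Bisecting k-means is taken from the keyword list; the function names (`two_means`, `bisecting_kmeans`, `reliable_negatives`), the seeding rule, and the parameter `k` are hypothetical choices made for illustration. The iterative SVM step is omitted.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def two_means(vectors, ids, iters=20):
    """Split `ids` into two groups with plain 2-means.

    Seeding is deterministic (an assumption of this sketch): the point
    farthest from the centroid, then the point farthest from that seed.
    """
    c = centroid([vectors[i] for i in ids])
    a = max(ids, key=lambda i: dist(vectors[i], c))
    b = max(ids, key=lambda i: dist(vectors[i], vectors[a]))
    ca, cb = vectors[a], vectors[b]
    left, right = [], []
    for _ in range(iters):
        left = [i for i in ids if dist(vectors[i], ca) <= dist(vectors[i], cb)]
        in_left = set(left)
        right = [i for i in ids if i not in in_left]
        if not left or not right:
            break
        new_ca = centroid([vectors[i] for i in left])
        new_cb = centroid([vectors[i] for i in right])
        if (new_ca, new_cb) == (ca, cb):
            break  # assignments have stabilised
        ca, cb = new_ca, new_cb
    return left, right


def bisecting_kmeans(vectors, k):
    """Repeatedly bisect the largest cluster until k clusters exist."""
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        clusters.sort(key=len)
        biggest = clusters.pop()
        if len(biggest) < 2:
            clusters.append(biggest)
            break
        left, right = two_means(vectors, biggest)
        if not left or not right:
            clusters.append(biggest)  # cannot be split further
            break
        clusters += [left, right]
    return clusters


def reliable_negatives(positives, unlabeled, k=4):
    """Return indices (into `unlabeled`) of documents assumed reliably negative.

    Assumption: a cluster with no positive example yields reliable negatives.
    """
    vectors = list(positives) + list(unlabeled)
    n_pos = len(positives)
    reliable = []
    for cluster in bisecting_kmeans(vectors, k):
        if all(i >= n_pos for i in cluster):
            reliable.extend(i - n_pos for i in cluster)
    return sorted(reliable)
```

In a complete two-step pipeline, the returned documents would serve as the initial negative class for a classifier such as SVM that is then retrained iteratively. Document vectors here are plain tuples of floats; real text would first need a vectorization such as TF-IDF.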

Keywords: Semi-Supervised Learning; Text Classification; Bisecting k-means Clustering; Learning from Positive and Unlabeled Examples (LPU)

Authors: Bangzuo Zhang, Wanli Zuo

Affiliations: College of Computer Science and Technology, Jilin University, Changchun 130012, China; College of Computer Science and Technology, Jilin University, Changchun 130012, China

Conference type: International conference

Conference name: 4th Asia Information Retrieval Symposium (AIRS 2008)

Conference location: Harbin

Conference language: English

Pages: 385-392

Online publication date: 2008-01-16 (date first posted on the Wanfang platform; does not indicate when the paper was published)
