A Classification for Short Text Based on Category Distinguishing Features

Abstract:

Short text is characterized by sparseness and weak concept description, which makes traditional classification methods unsuitable for it. Existing classification methods for short text fall into two categories. The first expands the feature space with the help of external resources such as wiki. Methods of this type are time-consuming, and their results depend heavily on the quality of the external resources. The second selects features and instances in an iterative process, in which feature selection is the key to classification. In this paper, we follow the second approach and propose a short text classification method based on the category-distinguishing ability of features. First, we select the features with higher category-distinguishing ability and extract the training and test subsets that contain those selected features. Second, the training subset is used to pre-classify the test subset. The process is iterated until all test data are labeled. Experimental results show the effectiveness of the proposed method and its superiority over existing methods.
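The iterative procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define how category-distinguishing ability is measured, so the score used here (the share of training documents containing a feature that belong to its most frequent category), the `threshold` parameter, the score-weighted voting rule, and the majority-class fallback are all assumptions made for illustration. The function names `feature_scores` and `classify_iteratively` are hypothetical.

```python
from collections import Counter, defaultdict


def feature_scores(train):
    """Score each feature by how concentrated it is in one category.

    Returns {feature: (score, dominant_category)}. The score is an assumed
    stand-in for the paper's measure: the fraction of training documents
    containing the feature that belong to its most frequent category.
    """
    per_feature = defaultdict(Counter)
    for tokens, label in train:
        for tok in set(tokens):
            per_feature[tok][label] += 1
    scores = {}
    for tok, counts in per_feature.items():
        label, hits = counts.most_common(1)[0]
        scores[tok] = (hits / sum(counts.values()), label)
    return scores


def classify_iteratively(train, test, threshold=0.8):
    """Label every test document; `train` must be non-empty.

    train: list of (tokens, label) pairs; test: list of token lists.
    Returns the predicted labels in the order of `test`.
    """
    train = list(train)                # grows as test documents get labeled
    pending = dict(enumerate(test))    # test documents still unlabeled
    predictions = {}
    while pending:
        # Step 1: keep only features with high distinguishing ability.
        scores = feature_scores(train)
        selected = {f: v for f, v in scores.items() if v[0] >= threshold}
        # Step 2: pre-classify the test documents that contain such features.
        newly = {}
        for idx, tokens in pending.items():
            votes = Counter()
            for tok in set(tokens):
                if tok in selected:
                    score, label = selected[tok]
                    votes[label] += score
            if votes:
                newly[idx] = votes.most_common(1)[0][0]
        if not newly:
            # Assumed fallback so the loop terminates: no remaining document
            # shares a selected feature, so assign the majority category.
            majority = Counter(lbl for _, lbl in train).most_common(1)[0][0]
            newly = {idx: majority for idx in pending}
        # Step 3: move newly labeled documents into the training set.
        for idx, label in newly.items():
            predictions[idx] = label
            train.append((pending.pop(idx), label))
    return [predictions[i] for i in range(len(test))]


train_docs = [
    (["cheap", "flight", "deal"], "travel"),
    (["hotel", "flight"], "travel"),
    (["goal", "match"], "sport"),
    (["match", "score", "team"], "sport"),
]
test_docs = [["flight", "delay"], ["team", "win"], ["late", "delay"]]
print(classify_iteratively(train_docs, test_docs))
```

In this toy run the third document shares no feature with the original training data; it can only be labeled in the second pass, after the first document (containing "delay") has been pre-classified and added to the training set. This is the effect the iteration is meant to achieve on sparse short texts.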

Keywords: Feature selection; Sparseness; Short text classification

Authors: Xuegang Hu, Chaoqun Yang, Yuhong Zhang

Affiliation: School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China

Conference type: International conference

Conference name: International Conference on Computational Science and Engineering Applications (CSEA2015)

Conference location: Sanya

Conference language: English

Pages: 304-310

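Online publication date: 2015-12-26 (the date the record first went online on the Wanfang platform; it does not represent the paper's publication date)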
