Conference Topic

A Classification for Short Text Based on Category Distinguishing Features

  Short text is characterized by sparseness and weak concept description, which makes traditional classification methods unsuitable for it. Existing classification methods for short text fall into two categories. The first expands the feature space with the help of external resources such as wikis. Methods of this type are time-consuming, and their results depend largely on the quality of the external resources. The second selects features and instances in an iterative process, in which feature selection is the key to classification. In this paper, we adopt the latter approach and propose a short text classification method based on the category-distinguishing ability of features. First, we select the features with the highest category-distinguishing ability and extract the training and test subsets that contain those selected features. Second, the training subset is used to pre-classify the test subset. The process is iterated until all test data are labeled. Experimental results show the effectiveness of the proposed method and its superiority over existing methods.
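The iterative procedure summarized in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not define how the category-distinguishing ability is measured, so the score used here (purity of the dominant class weighted by its document count), the `top_k` parameter, the majority-class fallback, and all function names are assumptions made for the example.

```python
from collections import Counter, defaultdict


def class_counts_per_feature(labeled):
    """Map each token to a Counter of how many labeled documents of each class contain it."""
    counts = defaultdict(Counter)
    for tokens, label in labeled:
        for tok in set(tokens):
            counts[tok][label] += 1
    return counts


def distinguishing_score(class_counts):
    """Assumed score (not from the paper): dominant-class purity times its support."""
    total = sum(class_counts.values())
    top = max(class_counts.values())
    return (top / total) * top


def classify_iteratively(train, test, top_k=2):
    """Label every test document; `train` must be a non-empty list of (tokens, label)."""
    labeled = list(train)
    remaining = dict(enumerate(test))
    predictions = {}
    while remaining:
        counts = class_counts_per_feature(labeled)
        test_vocab = {tok for tokens in remaining.values() for tok in tokens}
        # Only features seen in both labeled and unlabeled data can select a test subset.
        candidates = [tok for tok in counts if tok in test_vocab]
        if not candidates:
            # Assumed fallback: no shared features remain, so use the majority class.
            fallback = Counter(label for _, label in labeled).most_common(1)[0][0]
            for idx in remaining:
                predictions[idx] = fallback
            break
        candidates.sort(key=lambda tok: (-distinguishing_score(counts[tok]), tok))
        selected = set(candidates[:top_k])
        # Pre-classify the test subset: documents containing a selected feature
        # receive votes from the labeled documents that contain the same feature.
        newly_labeled = []
        for idx, tokens in remaining.items():
            votes = Counter()
            for tok in set(tokens) & selected:
                votes.update(counts[tok])
            if votes:
                label = votes.most_common(1)[0][0]
                predictions[idx] = label
                newly_labeled.append((idx, tokens, label))
        # Newly labeled documents join the training data for the next round.
        for idx, tokens, label in newly_labeled:
            labeled.append((tokens, label))
            del remaining[idx]
    return [predictions[i] for i in range(len(test))]


train = [
    (["cheap", "flight", "ticket"], "travel"),
    (["hotel", "booking", "deal"], "travel"),
    (["python", "code", "bug"], "tech"),
    (["java", "code", "compile"], "tech"),
]
test = [["flight", "delay"], ["code", "review"], ["delay", "refund"], ["review", "merge"]]
print(classify_iteratively(train, test))  # ['travel', 'tech', 'travel', 'tech']
```

In this toy run, the last two test documents share no words with the original training data; they become classifiable only after the first round labels the documents containing "delay" and "review", which is the point of iterating.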

Feature selection; Sparseness; Short text classification

Xuegang Hu, Chaoqun Yang, Yuhong Zhang

School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009, China

International conference

International Conference on Computational Science and Engineering Applications (CSEA2015)

Sanya

English

304-310

2015-12-26 (date first posted online on the Wanfang platform; does not represent the paper's publication date)