A Novel Text Representation Model for Text Classification

The text representation in text classification is usually a sequence of terms. As the number of terms becomes very large, existing text categorization tasks become very time-consuming. In this paper we present a novel text representation model for text classification that greatly reduces the required resources. The model represents a text with several features, each corresponding to a theme that emerges from a set of related articles. We also introduce an efficient way to build the model. The proposed model has been applied to a naïve Bayes classifier, and experiments on the Reuters-21578 corpus show that efficiency is greatly improved without sacrificing classification accuracy, even when the dimension of the input space is significantly reduced.
Jun Wang, Yiming Zhou
School of Computer Science and Engineering, Beihang University, Beijing, 100191, P.R. China
International conference
Wuhan
English
2008-11-01 (date first posted online on the Wanfang platform; not the paper's publication date)