Conference Topic

CHINESE TEXT CATEGORIZATION BASED ON CCIPCA AND SMO

The vector space model is commonly used to represent text in text categorization. Reducing the dimensionality of the feature space is a key problem for practical text classification, because classical decomposition algorithms cannot handle high-dimensional, large-scale text categorization problems. This paper presents an approach to improving text categorization performance that uses candid covariance-free incremental principal component analysis (CCIPCA) and the sequential minimal optimization (SMO) algorithm. The experimental results show that the proposed method for Chinese text categorization is practicable and effective.

Text categorization; Dimension reduction; Candid covariance-free incremental principal component analysis (CCIPCA); Sequential minimal optimization algorithm (SMO)

XIN-FU LI, HAI-BIN HE, LEI-LEI ZHAO

College of Mathematics and Computer Science, Hebei University, Baoding 071002, China

International conference

2008 International Conference on Machine Learning and Cybernetics

Kunming

English

2514-2518

2008-07-12 (date first made available online on the Wanfang platform; not the paper's publication date)