Conference Topic

TCBPLK: A NEW METHOD OF TEXT CATEGORIZATION

This paper presents a new text categorization method based on P-L theory and the Kohonen network, called the TCBPLK method. When a Kohonen network is applied to text categorization, it suffers from slow training. For high-dimensional text vectors the defect is more pronounced, and a categorization result may not be obtained at all. The new method builds a vector space model of term weights using P-L theory, which strengthens the role of words from the viewpoint of categorization effect and reduces the vector dimension by eliminating redundant features. Experimental results confirm that the TCBPLK method reduces the vector dimension and improves the generalization and precision of text categorization.
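The pipeline the abstract outlines (weight terms, drop redundant features, then train a Kohonen network on the shorter vectors) can be illustrated with a minimal sketch. This is not the TCBPLK method itself: the record does not define the P-L weighting, so per-term variance is used here purely as a stand-in selection criterion, and the function names `reduce_dimensions`, `best_matching_unit`, and `train_kohonen` are hypothetical.

```python
import math
import random


def reduce_dimensions(vectors, keep):
    """Keep the `keep` columns with the largest variance.

    ASSUMPTION: variance is only a placeholder for the P-L based term
    weighting, which this record does not specify.
    """
    n = len(vectors)
    dims = len(vectors[0])

    def variance(j):
        mean = sum(v[j] for v in vectors) / n
        return sum((v[j] - mean) ** 2 for v in vectors) / n

    ranked = sorted(range(dims), key=variance, reverse=True)
    kept = sorted(ranked[:keep])
    return [[v[j] for j in kept] for v in vectors], kept


def best_matching_unit(units, x):
    """Return the index of the unit closest to x (squared Euclidean distance)."""
    def dist(i):
        return sum((w - a) ** 2 for w, a in zip(units[i], x))
    return min(range(len(units)), key=dist)


def train_kohonen(vectors, n_units, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a one-dimensional Kohonen map with a Gaussian neighbourhood.

    Learning rate and neighbourhood width both decay linearly per epoch.
    """
    rng = random.Random(seed)
    # Initialise each unit from a randomly chosen training vector.
    units = [list(rng.choice(vectors)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.01)
        order = list(vectors)
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(units, x)
            for i in range(n_units):
                # Units near the winner on the map are pulled harder toward x.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                units[i] = [w + lr * h * (a - w) for w, a in zip(units[i], x)]
    return units


if __name__ == "__main__":
    # Toy term-weight vectors: the last two columns carry no information.
    docs = [
        [1.0, 0.0, 0.5, 0.0],
        [0.9, 0.1, 0.5, 0.0],
        [0.0, 1.0, 0.5, 0.0],
        [0.1, 0.9, 0.5, 0.0],
    ]
    reduced, kept = reduce_dimensions(docs, keep=2)
    units = train_kohonen(reduced, n_units=2)
    print("kept columns:", kept)
    print("unit per document:", [best_matching_unit(units, d) for d in reduced])
```

Training cost per input grows with both the number of units and the vector length, which is why shortening the vectors before training, as the abstract describes, directly reduces training time.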

Text categorization; P-L theory; Kohonen network; Vector space model

JIAN-SUO XU

School of Economy, Beijing University, Beijing, 100871, China

International conference

2007 International Conference on Machine Learning and Cybernetics (IEEE 6th International Conference on Machine Learning and Cybernetics)

Hong Kong

English

3889-3892

2007-08-19 (date first posted online on the Wanfang platform; not the paper's publication date)