Conference topic

USING CLUSTERING TO ENHANCE TEXT CLASSIFICATION

Enlarging the training set is a general way to obtain more precise classification results. In the traditional approach, however, training sets are collected manually, so it is difficult to obtain a training set large enough to improve classification performance, because the cost in human labor is prohibitive. To address this problem, this paper proposes a model that builds training sets automatically. The model combines similarity-based clustering using LSA with a classification algorithm. Experimental results show that this approach improves classification performance, and that further gains may be obtainable through further work, which requires more research on feature selection, clustering, classification, and semantic similarity calculation algorithms.

Latent Semantic Analysis (LSA) Semantic Clustering Text classification clustering

Lei Liu, Lei Li, Yixin Zhong

Center for Intelligence Science and Technology Research, Beijing University of Posts and Telecommunications, Beijing 100876, China

International conference

China-Ireland International Conference on Information and Communications Technologies 2008 (CIICT 2008)

Beijing

English

1-4

2008-09-26 (date first made available online on the Wanfang platform; not the paper's publication date)