Study on Feature Selection Algorithm in Topic Tracking
Text classification is the key technology for topic tracking, and the vector space model (VSM) is one of the simplest and most effective models for topic representation. Feature selection in the VSM is an important data pre-processing step: it reduces the dimension of the vector space and improves the generalization ability of the algorithm. Feature selection algorithms therefore merit in-depth and extensive research. We study how the feature space dimension and the choice of feature selection algorithm affect topic tracking, derive the pattern by which they affect tracking performance, and summarize their optimal values for topic tracking. Finally, evaluation with the TDT methodology shows that the optimal topic tracking performance based on weight of evidence for text is 8.762% higher than that based on mutual information.
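The two feature selection criteria compared in the abstract can be illustrated with a minimal Python sketch. It assumes the common textbook definitions, which the abstract itself does not spell out: mutual information as MI(t, c) = log(P(t|c) / P(t)), taken here as the maximum over classes, and weight of evidence for text as WET(t) = Σ_c P(c) · P(t) · |log(odds(c|t) / odds(c))|. The function names `feature_scores` and `select_top_k`, the document-frequency estimates, and the clipping constant `eps` are illustrative assumptions and may differ from the formulation used in the paper.

```python
import math
from collections import Counter


def feature_scores(docs, labels, eps=1e-6):
    """Score each term by mutual information and weight of evidence for text.

    docs   : list of token lists, one per document
    labels : list of class labels, aligned with docs
    eps    : clipping constant (an assumption) that keeps logs and odds finite
    """
    n = len(docs)
    class_count = Counter(labels)
    term_count = Counter()   # number of documents containing term t
    joint_count = Counter()  # number of documents of class c containing t
    for tokens, c in zip(docs, labels):
        for t in set(tokens):
            term_count[t] += 1
            joint_count[(t, c)] += 1

    def clip(p):
        return min(max(p, eps), 1.0 - eps)

    def odds(p):
        p = clip(p)
        return p / (1.0 - p)

    mi, wet = {}, {}
    for t, n_t in term_count.items():
        p_t = n_t / n
        best_mi = -math.inf
        wet_t = 0.0
        for c, n_c in class_count.items():
            p_c = n_c / n
            p_t_given_c = joint_count[(t, c)] / n_c
            p_c_given_t = joint_count[(t, c)] / n_t
            # Pointwise mutual information between term t and class c.
            best_mi = max(best_mi, math.log(clip(p_t_given_c) / p_t))
            # Weight of evidence: how far seeing t shifts the odds of class c.
            wet_t += p_c * p_t * abs(math.log(odds(p_c_given_t) / odds(p_c)))
        mi[t] = best_mi
        wet[t] = wet_t
    return mi, wet


def select_top_k(scores, k):
    """Return the k highest-scoring terms, i.e. the reduced feature space."""
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

Varying `k` in `select_top_k` corresponds to varying the feature space dimension, so the same two functions can be used to sweep over both factors the abstract describes.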
svm feature selection tdt evaluation topic tracking
Shengdong Li, Xueqiang Lv, Yuqin Li, Shuicai Shi
Chinese Information Processing Research Center, Beijing Information Science and Technology University
International conference
Chengdu
English
319-324
2010-06-23 (date first posted online on the Wanfang platform; not the paper's publication date)