STUDY ON THE CONSTRUCTION OF DOMAIN TEXT CLASSIFICATION MODEL WITH THE HELP OF DOMAIN KNOWLEDGE
Traditional text classification models obtain features by statistical methods. When discriminating between domain and non-domain text categories, however, these methods do not take domain knowledge relations into account. This paper presents a domain text classification model. The model uses the support vector machine (SVM) learning algorithm, obtains domain classification feature words by combining statistically selected words with domain words, and constructs a domain classification feature space from them. Using domain knowledge relations, it computes the relevance between domain concepts and derives the weights of the domain classification features; domain text classification is then carried out. An experiment in the Yunnan tourism domain confirms that domain knowledge relations have a positive effect on domain text classification: the classification accuracy is 0.04 higher than that of the improved TFIDF method.
Text classification; Feature selection; Domain knowledge; Relations
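The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea, under stated assumptions: a plain TF-IDF weight is boosted by a term's average relation strength to the other domain concepts in the same document. The functions `tfidf_weights`, `domain_relevance`, and `domain_weighted`, the parameter `alpha`, the relation table, and the toy documents are all hypothetical and are not taken from the paper. The resulting weights could then be passed to an SVM classifier, which is not shown here.

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """Plain TF-IDF weights; docs is a list of token lists."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            t: (c / len(doc)) * math.log(n / df[t] + 1.0)
            for t, c in tf.items()
        })
    return weights

def domain_relevance(term, doc_terms, relations):
    """Assumed relevance measure: mean relation strength between `term`
    and the other domain concepts that occur in the same document."""
    others = [t for t in doc_terms if t != term and t in relations]
    if term not in relations or not others:
        return 0.0
    return sum(relations[term].get(o, 0.0) for o in others) / len(others)

def domain_weighted(docs, relations, alpha=1.0):
    """Scale each TF-IDF weight by (1 + alpha * relevance).
    This scaling rule is an illustrative assumption, not the paper's formula."""
    out = []
    for doc, base in zip(docs, tfidf_weights(docs)):
        terms = set(doc)
        out.append({
            t: w * (1.0 + alpha * domain_relevance(t, terms, relations))
            for t, w in base.items()
        })
    return out

# Hypothetical relation strengths between two tourism concepts.
relations = {
    "lijiang": {"old_town": 0.8},
    "old_town": {"lijiang": 0.8},
}
docs = [
    ["lijiang", "old_town", "ticket"],  # domain text
    ["stock", "market", "ticket"],      # non-domain text
]
base = tfidf_weights(docs)
boosted = domain_weighted(docs, relations)
```

In this toy setup, only terms that are related domain concepts gain weight; words without an entry in the relation table, and every word in the non-domain document, keep their plain TF-IDF weight.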
ZHENG-TAO YU LU HAN CUN-LI MAO JIAN-YI GUO XIANG-YAN MENG ZHI-KUN ZHANG
The School of Information Engineering and Automation, Kunming University of Science and Technology, The School of Information Engineering and Automation, Kunming University of Science and Technology, The Institute of Intelligent Information Processing, Computer Technology Application Key Laboratory
International conference
2008 International Conference on Machine Learning and Cybernetics
Kunming
English
2612-2617
2008-07-12 (date first made available online on the Wanfang platform; not necessarily the paper's publication date)