Novel top-down methods for Hierarchical Text Classification
One common approach to classifying large-scale text corpora is hierarchical text classification, in which documents are classified in a top-down manner. Top-down classification methods scale well and cope with changes to the category tree. However, they all suffer from a common problem: a document misclassified at a high level of the hierarchy cannot be recovered. We define a virtual subclass for each non-leaf category to help rejected documents go back to the ancestor category, thus improving overall performance. Our experiments using Support Vector Machine (SVM) classifiers on the 20 Newsgroups collection show that the proposed methods all reduce blocking and improve classification accuracy. The experiments also show that the virtual category method delivers the best performance.
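The virtual-subclass idea in the abstract can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: the toy category tree, the keyword sets, and the functions `score` and `classify` are all hypothetical, and a simple keyword-overlap score stands in for the SVM classifiers used in the paper. When no real child of a non-leaf category accepts a document, the document is treated as falling into that category's virtual subclass and is handed back to the parent, which then tries its remaining children.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy hierarchy: each non-leaf category lists its real children.
TREE: Dict[str, List[str]] = {
    "root": ["comp", "rec"],
    "comp": ["comp.graphics", "comp.windows"],
    "rec": ["rec.autos", "rec.sport"],
}

# Hypothetical keyword sets; a stand-in for per-node SVM classifiers.
KEYWORDS: Dict[str, Set[str]] = {
    "comp": {"software", "driver", "engine", "image", "window"},
    "rec": {"race", "car", "team", "game"},
    "comp.graphics": {"image", "render", "pixel"},
    "comp.windows": {"window", "desktop"},
    "rec.autos": {"race", "engine", "car", "tuned"},
    "rec.sport": {"team", "game", "score"},
}


def score(category: str, tokens: Set[str]) -> int:
    """Number of category keywords that occur in the document."""
    return len(KEYWORDS[category] & tokens)


def classify(text: str) -> Optional[str]:
    """Top-down routing with a virtual subclass at every non-leaf node."""
    tokens = set(text.lower().split())
    path = ["root"]
    # For each node, the children that already rejected this document.
    rejected: Dict[str, Set[str]] = {}
    while True:
        node = path[-1]
        if node not in TREE:
            return node  # reached a leaf category
        candidates = [c for c in TREE[node] if c not in rejected.get(node, set())]
        best = max(candidates, key=lambda c: score(c, tokens), default=None)
        if best is None or score(best, tokens) == 0:
            # No real child accepts the document: it falls into the
            # virtual subclass and goes back to the ancestor category.
            if node == "root":
                return None  # nothing left to try
            path.pop()
            rejected.setdefault(path[-1], set()).add(node)
        else:
            path.append(best)


print(classify("race driver tuned the engine"))
print(classify("render the image"))
```

In the first call the root wrongly routes the document to `comp`, neither child of `comp` accepts it, and the virtual subclass sends it back to the root, which then routes it through `rec` to `rec.autos`. Without the virtual subclass, a plain top-down classifier would be forced to pick a leaf under `comp`, which is the blocking problem the abstract describes.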
hierarchical classification; virtual category; top-down approach
CAO Ying; DUAN Run-ying
Modern Educational Technology and Information Center, Jiangxi University of Science and Technology, Ga; Department of Computer Science and Technology, GuanZhou University Sontan College, Zengcheng, GuangZhou
International conference
International Conference on Advances in Engineering 2011 (ICAE2011)
Nanjing
English
329-334
2011-12-17 (date first made available online on the Wanfang platform; not the paper's publication date)