Investigating the performance of Cosine value and Jensen-Shannon Divergence in the kNN algorithm
K Nearest Neighbor (kNN) is a commonly used text categorization algorithm. Previous studies mainly focused on improving the algorithm by modifying feature selection and the choice of k. This research investigates the possibility of using Jensen-Shannon Divergence as the similarity measure in the kNN classifier, and compares the performance in terms of classification accuracy. The experiments show that the kNN algorithm based on Jensen-Shannon Divergence outperforms the one based on Cosine value, while performance also depends largely on the number of categories and the number of documents per category.
kNN; Jensen-Shannon Divergence; Text Categorization; Performance
Xiangdong Li, Han Jia, Li Huang
The School of Information Management, Wuhan University, Wuhan 430072, China; Wuhan University Library, Wuhan University, Wuhan 430072, China
International conference
Xi'an
English
1455-1459
2012-08-24 (date first made available on the Wanfang platform; does not represent the paper's publication date)
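The abstract above describes swapping the Cosine value used in kNN text categorization for Jensen-Shannon Divergence. The paper's own implementation is not reproduced in this record, so the following Python sketch is only an illustration of the idea: the function names, the base-2 KL formulation, the smoothing constant eps, and the toy data are all assumptions.

```python
import numpy as np
from collections import Counter

def jensen_shannon_divergence(p, q, eps=1e-12):
    # Normalize term-count vectors into probability distributions,
    # then compute JSD = 0.5*KL(p||m) + 0.5*KL(q||m) with m = (p+q)/2.
    p = np.asarray(p, dtype=float); q = np.asarray(q, dtype=float)
    p = p / (p.sum() + eps)
    q = q / (q.sum() + eps)
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0                      # 0*log(0) is taken as 0
        return np.sum(a[mask] * np.log2(a[mask] / (b[mask] + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def cosine_similarity(p, q, eps=1e-12):
    p = np.asarray(p, dtype=float); q = np.asarray(q, dtype=float)
    return np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q) + eps)

def knn_classify(query, train_vectors, train_labels, k=5, measure="jsd"):
    # JSD is a divergence (lower = more similar); cosine is a similarity
    # (higher = more similar), so the neighbor ranking is reversed.
    if measure == "jsd":
        scores = [jensen_shannon_divergence(query, v) for v in train_vectors]
        neighbor_idx = np.argsort(scores)[:k]
    else:
        scores = [cosine_similarity(query, v) for v in train_vectors]
        neighbor_idx = np.argsort(scores)[::-1][:k]
    votes = Counter(train_labels[i] for i in neighbor_idx)
    return votes.most_common(1)[0][0]

# Toy usage (hypothetical term-count vectors and labels):
docs = np.array([[3, 0, 1], [2, 1, 0], [0, 4, 2], [1, 3, 3]], dtype=float)
labels = ["sports", "sports", "politics", "politics"]
print(knn_classify(np.array([0.0, 2.0, 3.0]), docs, labels, k=3, measure="jsd"))
```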