Conference topic

Investigating the performance of Cosine value and Jensen-Shannon Divergence in the kNN algorithm

  K Nearest Neighbor (kNN) is a commonly used text categorization algorithm. Previous studies have mainly focused on improving the algorithm by modifying feature selection and the selection of the k value. This research investigates the possibility of using Jensen-Shannon Divergence as the similarity measure in the kNN classifier and compares its performance, in terms of classification accuracy, with that of the cosine value. The experiments indicate that the kNN algorithm based on Jensen-Shannon Divergence outperforms the one based on the cosine value, while performance also depends largely on the number of categories and the number of documents in a category.
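The comparison described in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes documents are represented as non-negative term-count vectors, that Jensen-Shannon Divergence is computed with base-2 logarithms on the normalized counts, and that the kNN classifier uses a simple majority vote. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(p, q):
    """Cosine of the angle between two term vectors (higher = more similar)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def _normalize(v):
    """Turn raw term counts into a probability distribution."""
    total = sum(v)
    if total == 0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / total for x in v]


def _kl(p, m):
    """Kullback-Leibler divergence KL(p || m) in bits; terms with p_i = 0 contribute 0."""
    return sum(a * math.log2(a / b) for a, b in zip(p, m) if a > 0)


def js_divergence(p, q):
    """Jensen-Shannon Divergence (lower = more similar), in [0, 1] with log base 2."""
    p, q = _normalize(p), _normalize(q)
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


def knn_classify(query, train, k, measure="cosine"):
    """Label `query` by majority vote among its k nearest training documents.

    `train` is a list of (vector, label) pairs. `measure` is "cosine" or "jsd".
    """
    if measure == "cosine":
        scored = [(cosine_similarity(query, vec), label) for vec, label in train]
        scored.sort(key=lambda t: t[0], reverse=True)  # most similar first
    elif measure == "jsd":
        scored = [(js_divergence(query, vec), label) for vec, label in train]
        scored.sort(key=lambda t: t[0])  # smallest divergence first
    else:
        raise ValueError("measure must be 'cosine' or 'jsd'")
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Toy term-count vectors (hypothetical data, for illustration only).
train = [
    ([3, 0, 1], "sports"),
    ([2, 1, 0], "sports"),
    ([0, 3, 1], "finance"),
    ([0, 2, 2], "finance"),
]
print(knn_classify([2, 0, 1], train, k=3, measure="cosine"))
print(knn_classify([2, 0, 1], train, k=3, measure="jsd"))
```

Note the opposite sort orders: cosine is a similarity, so neighbors are ranked from highest to lowest, whereas Jensen-Shannon Divergence is a dissimilarity, so neighbors are ranked from lowest to highest.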

kNN; Jensen-Shannon Divergence; Text Categorization; Performance

Xiangdong Li, Han Jia, Li Huang

The School of Information Management, Wuhan University, Wuhan 430072, China; Wuhan University Library, Wuhan University, Wuhan 430072, China

International conference

2012 2nd International Conference on Materials Science and Information Technology (MSIT 2012)

Xi'an

English

1455-1459

2012-08-24 (date first made available online on the Wanfang platform; not the paper's publication date)