THESES CLUSTER BASED ON BILINGUAL AND SYNONYMOUS KEYWORD SETS USING MUTUAL INFORMATION

Abstract:

Searching published papers is an essential part of the research process. Because articles are written in various languages, precise queries are hard to achieve. In this paper, we propose an automatic thesis clustering method based on bilingual and synonymous keyword sets that include Chinese and English keywords. We also use cluster computing to speed up the operation. The system first automatically generates the bilingual and synonymous keyword sets and then clusters the theses based on them. The method not only overcomes the weaknesses of using digital dictionaries for clustering, but also limits the errors that arise when querying by bilingual and synonymous keywords. The system was implemented with cluster computing technology to address the performance problems of traditional document clustering systems. By spreading the work over many computing processes, the system not only saves a considerable amount of time but also achieves high availability and effective load balancing. Preliminary experiments indicate that the system clusters theses effectively.

Keywords: Document clustering; Keyword set; Bilingual and synonymous keyword

Authors: CHUNG-YI HUANG; RUNG-CHING CHEN

Affiliations: Department of Computer Science and Engineering, National Chung-Hsing University, Taichung 402, Taiwan; Department of Information Management, Chaoyang University of Technology, Taichung 41349, Taiwan

Conference type: International conference

Conference name: 2009 International Conference on Machine Learning and Cybernetics

Conference location: Baoding

Conference language: English

Pages: 2999-3004

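Online publication date: 2009-07-12 (the date the record first went online on the Wanfang platform; it does not represent the paper's publication date)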
