CLUSTERING SYNONYMOUS ENGLISH AND CHINESE KEYWORDS FOR CROSS-LANGUAGE QUERIES

Abstract:

In this paper, we propose an automatic clustering method for finding synonymous terms, including cross-language keywords, in Chinese and English thesis documents. First, Chinese and English keyword pairs are collected from an existing database. Then, the system calculates the support and confidence values of the keyword pairs. Next, keyword pairs with high confidence and support values are selected. Subsequently, a clustering algorithm merges the selected keyword pairs so that pairs with similar meanings fall into the same subset. Finally, applications such as cross-language or synonymous queries can be built on the resulting subsets of words. In the experiments, the system achieved 98.4% precision in identifying correct terms across 1220 keyword pair clusters from the collected subsets. The preliminary experimental results show that the system can provide effective information for users when making queries online.
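The pipeline outlined in the abstract (score keyword pairs, keep the high-scoring ones, merge them into subsets) can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not define support or confidence, so the usual association-rule definitions are assumed here, and a simple union-find merge stands in for the unspecified clustering algorithm. The function names, thresholds, and sample keyword pairs are all hypothetical.

```python
from collections import Counter, defaultdict


def score_pairs(records):
    """Score (chinese_keyword, english_keyword) pairs.

    Assumed definitions (not taken from the paper):
      support    = count(pair) / total number of pairs
      confidence = count(pair) / count(chinese keyword)
    """
    total = len(records)
    pair_count = Counter(records)
    zh_count = Counter(zh for zh, _ in records)
    return {
        (zh, en): (count / total, count / zh_count[zh])
        for (zh, en), count in pair_count.items()
    }


def cluster_pairs(scores, min_support, min_confidence):
    """Merge the pairs that pass both thresholds into keyword subsets."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (zh, en), (support, confidence) in scores.items():
        if support >= min_support and confidence >= min_confidence:
            # Tag each node with its language so that identical strings
            # in different languages are never conflated.
            root_zh, root_en = find(("zh", zh)), find(("en", en))
            if root_zh != root_en:
                parent[root_en] = root_zh

    clusters = defaultdict(set)
    for node in parent:
        clusters[find(node)].add(node[1])
    return list(clusters.values())


# Hypothetical sample data: one tuple per keyword pair found in a thesis record.
records = (
    [("資料探勘", "data mining")] * 3
    + [("資料挖掘", "data mining")] * 2
    + [("分群", "clustering")] * 2
    + [("分群", "data mining")]  # a noisy pairing that should be filtered out
)

scores = score_pairs(records)
clusters = cluster_pairs(scores, min_support=0.2, min_confidence=0.6)
```

With these sample thresholds, the noisy pair is dropped and the two Chinese variants that share the English keyword "data mining" end up in the same subset. A query for any member of that subset could then be expanded to the others.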

Keywords: Synonymous terms; Keyword pairs; Cross-language; Keyword clustering

Authors: RUNG-CHING CHEN, CHUNG-YI HUANG, YU-LEN HUANG

Affiliations: Department of Information Management, Chaoyang University of Technology, Taichung 41349, Taiwan; Computer Center, Chienkuo Technology University, Changhua 50094, Taiwan; Department of Computer Science and Information Engineering, Tunghai University, Taichung 40704, Taiw

Conference type: International conference

Conference name: 2007 International Conference on Machine Learning and Cybernetics (IEEE Sixth International Conference on Machine Learning and Cybernetics)

Conference location: Hong Kong

Conference language: English

Pages: 1875-1880

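Online publication date: 2007-08-19 (the date the record first went online on the Wanfang platform; it does not represent the paper's publication date)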
