English-Persian Text Retrieval Using Concept Graph

Abstract:

Cross-language information retrieval (CLIR) is the process of retrieving documents in one language in response to queries posed in another. A key challenge in this field is resolving lexical ambiguity when translating queries. In this paper, we propose a technique that computes translation probabilities from a concept graph built over the query terms, in order to select the correct translation sense of each query term for English-Persian text retrieval. We present an efficient statistical method for constructing this graph. We test the proposed disambiguation method on the Hamshahri collection, which is standardized according to CLEF standards. Evaluation on this collection shows that the proposed method is highly effective.

Keywords: Text retrieval; Concept graph; Term weighting; Translation disambiguation

Authors: Farnaz Teymoorian, Mehran Mohsenzadeh, MirAli Seyyedi

Affiliations: Department of Computer Engineering, Islamic Azad University - North Tehran Branch, Tehran, Iran; Department of Computer Engineering, Islamic Azad University - Science & Research Center, Tehran, Iran; Department of Computer Engineering, Islamic Azad University - South Tehran Branch, Tehran, Iran

Conference type: International conference

Conference name: 2009 2nd IEEE International Conference on Computer Science and Information Technology (ICCSIT 2009)

Conference location: Beijing

Conference language: English

Pages: 2395-2399

Online publication date: 2009-08-08 (date first posted on the Wanfang platform; not the paper's publication date)
