An Improved Relation-Based Information Retrieval Technique for Bioinformatics

Abstract:

One limitation of current relationship-based IR models is that a relation is often recorded in binary form, such as R(Term1, Term2), which carries only general information about a pair of terms that are semantically and syntactically related to each other. To tackle this problem, this paper defines a triple as a data structure that integrates a pair of concepts with a verb phrase, or sometimes a special noun, extracted from the sentence as the relation between the two concepts. We applied an advanced ontology-based approach to extract generic concepts and relations using both UMLS and WordNet, and implemented a new approach to rank passages retrieved from documents, in line with the system performance measures used in the TREC 2007 Genomics Track. We built a new version (IRIRS) of the relation-based IR system (RIRS) developed by the DM & Bioinformatics Lab of Drexel University in 2004. We use IRIRS to search for answers in tests of English reading comprehension and to improve the retrieval results of all official runs in the TREC 2004 Genomics Track. Experiments on the two collections show more promising performance for IRIRS than for RIRS. The character-based MAP (Mean Average Precision), which measures passage-level retrieval performance, rises significantly from 64.44% (RIRS) to 74.28% for 64 topics from the first collection. The MAP for 50 topics from the second collection rises from 21.71% (TREC) and 37.58% (RIRS) to 40.14%.
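As a rough illustration of the two ideas in the abstract, the sketch below contrasts a binary relation with a triple and computes a plain rank-based MAP. It is a minimal sketch under assumptions: the class names, field names, and the example terms are hypothetical and not taken from the paper, and the MAP shown is the standard rank-based form, not the character-based passage-level variant the abstract refers to.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryRelation:
    """R(Term1, Term2): records only that two terms are related."""
    term1: str
    term2: str


@dataclass(frozen=True)
class Triple:
    """A concept pair plus the verb phrase (or special noun) relating them.

    Field names are assumptions made for this sketch.
    """
    concept1: str
    relation: str
    concept2: str


def average_precision(ranked_relevance: Sequence[bool], total_relevant: int) -> float:
    """Standard average precision for one topic's ranked result list."""
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / total_relevant


def mean_average_precision(topics: Sequence[tuple[Sequence[bool], int]]) -> float:
    """Mean of per-topic average precision; each topic is (ranking, total_relevant)."""
    if not topics:
        return 0.0
    return sum(average_precision(r, n) for r, n in topics) / len(topics)


# Hypothetical example terms, chosen only to show the shape of the data.
binary = BinaryRelation("BRCA1", "BARD1")
triple = Triple("BRCA1", "interacts with", "BARD1")
```

The triple keeps the relation text that the binary form discards, which is the extra information the abstract says the ranking can use.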

Authors: Yan Li, Jian Wen, Zhoujun Li

Affiliations: School of Computer, National University of Defense Technology, Changsha, Hunan Province, 410073, China; School of Computer Science & Engineering, Beihang University, Beijing, 100083, China

Conference type: International conference

Conference name: 2008 IEEE International Conference on Information and Automation

Conference location: Zhangjiajie

Conference language: English

Pages: 1536-1541

Online publication date: 2008-06-20 (date first posted on the Wanfang platform; not the paper's publication date)
