Retrofitting Concept Vector Representations of Medical Concepts to Improve Estimates of Semantic Similarity and Relatedness
Estimation of semantic similarity and relatedness between biomedical concepts has utility for many informatics applications. Automated methods fall into two categories: methods based on distributional statistics drawn from text corpora, and methods using the structure of existing knowledge resources. Methods in the former category disregard taxonomic structure, while those in the latter fail to consider semantically relevant empirical information. In this paper, we present a method that retrofits distributional context vector representations of biomedical concepts using structural information from the UMLS Metathesaurus, such that the similarity between vector representations of linked concepts is augmented. We evaluated it on the UMNSRS benchmark. Our results demonstrate that retrofitting of concept vector representations leads to better correlation with human raters for both similarity and relatedness, surpassing the best results reported to date. They also demonstrate a clear improvement in performance on this reference standard for retrofitted vector representations, as compared to those without retrofitting.
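The abstract does not state the update rule used, so the following is only a minimal sketch of the general idea, assuming a generic iterative retrofitting update in the style of Faruqui et al. (2015): each concept vector is pulled toward the average of its linked concepts while staying anchored to its original distributional vector. The function names `retrofit` and `cosine`, the `alpha` weight, the iteration count, and the toy concept identifiers are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def retrofit(vectors, links, alpha=1.0, iterations=10):
    """Pull each concept vector toward its linked concepts (illustrative sketch).

    vectors: dict mapping concept id -> 1-D numpy array (distributional vector)
    links:   dict mapping concept id -> list of linked concept ids
    alpha:   hypothetical weight that anchors a concept to its original vector
    """
    original = {c: np.asarray(v, dtype=float) for c, v in vectors.items()}
    updated = {c: v.copy() for c, v in original.items()}
    for _ in range(iterations):
        for concept, neighbours in links.items():
            # Ignore links to concepts that have no vector.
            neighbours = [n for n in neighbours if n in updated]
            if concept not in updated or not neighbours:
                continue
            beta = 1.0 / len(neighbours)  # neighbour weights sum to 1
            neighbour_sum = sum(beta * updated[n] for n in neighbours)
            # Weighted average of the original vector and the linked vectors.
            updated[concept] = (alpha * original[concept] + neighbour_sum) / (alpha + 1.0)
    return updated

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy data: "A" and "B" are linked, "C" has no links (identifiers are made up).
toy_vectors = {
    "A": np.array([1.0, 0.0]),
    "B": np.array([0.0, 1.0]),
    "C": np.array([1.0, 1.0]),
}
toy_links = {"A": ["B"], "B": ["A"]}
toy_retrofitted = retrofit(toy_vectors, toy_links)
```

In this toy setup, the cosine similarity between the linked concepts "A" and "B" increases after retrofitting, while the unlinked concept "C" keeps its original vector. Evaluation against a human-rated benchmark would then compare such cosine scores with the rater scores, a step this sketch omits.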
Semantics; Natural Language Processing; Unified Medical Language System
Zhiguo Yu, Byron C. Wallace, Todd Johnson, Trevor Cohen
The University of Texas School of Biomedical Informatics at Houston, Houston, Texas, USA
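International conference
16th World Congress on Medical and Health Informatics (MEDINFO 2017), 2nd World Chinese-Language Forum on Medical and Health Informatics (WCHIS 2017), 15th National Conference on Medical Informatics (CMIA 2017)
Suzhou
English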
657-661
2017-08-21 (date first made available online on the Wanfang platform; not the paper's publication date)