Conference paper

Improving Medical Ontology Based on Word Embedding

  Medical ontology learning, or improving, is the automatic acquisition of knowledge in ontology format from medical data, mainly text data. With the rise of word vector spaces, improving ontologies using word embeddings has become a hot topic. Most previous studies have focused on how to acquire different ontological elements using various learning technologies; few have focused on the prior knowledge in a given ontology. In essence, ontology learning or improving is still a learning process based on existing samples, so the type and amount of knowledge acquired are limited by the existing samples in a given ontology. This paper first formalizes several kinds of prior knowledge for classes in a given ontology. We then propose a method, named improving medical ontology based on word embeddings (IMO-WE), to enrich different types of knowledge from medical text according to the characteristics of the different types of prior knowledge. Finally, the paper collects PubMed Central (PMC) data and the PHARE ontology and conducts a series of experiments to evaluate IMO-WE. The experimental results yield the following conclusions. First, the data-rich model achieves higher accuracy for IMO-WE under the same training settings, so collecting and training on large-scale medical data is a viable way to learn more useful knowledge. Second, IMO-WE can be used to improve ontology knowledge when medical data is sufficiently abundant and the ontology has appropriate prior knowledge. Moreover, in the task of improving synonymous labels through similarity distance, the accuracy of IMO-WE is significantly better than that of the Random Indexing method.

word embedding; medical ontology improving; prior knowledge

Mingxia Gao, Furong Chen, Rifeng Wang

College of Computer Science and Technology, Beijing University of Technology, Beijing 100124, China; TravelSky Technology Limited, Beijing, China; Guangxi University of Science and Technology, Liuzhou, P.R. China

International conference

2018 6th International Conference on Bioinformatics and Computational Biology (ICBCB 2018)

Chengdu

English

121-127

2018-03-12 (date first posted online on the Wanfang platform; not the paper's publication date)