Conference topic

Developing Methodologies to Find Abbreviated Laboratory Test Names in Narrative Clinical Documents by Generating High Quality Q-Grams

  Laboratory test names are basic information used to diagnose diseases. However, this kind of medical information is usually written in natural language. Lexicon-based methods have been good solutions for finding this information, but they cannot find terms whose abbreviated expressions are not in the lexicon, such as "neuts", which means "neutrophils". Similar-word matching can address this issue, but it can produce a significant number of false positives. Moreover, processing time increases as the set of terms grows. Therefore, we suggest a novel q-gram based algorithm, named modified triangular area filtering, to find abbreviated laboratory test terms in clinical documents while minimizing the risk of impairing the lexicon's precision. In addition, the method found the terms within a reasonable processing time. The results show that this method achieves 92.54% precision, 87.72% recall, and a 90.06% F1-score on the test sets when the edit distance threshold (τ) is 3.
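The abstract names two ingredients, q-grams and an edit distance threshold τ, without describing modified triangular area filtering itself. The sketch below is therefore not the authors' algorithm. It illustrates only the generic idea that q-gram approaches usually build on: a cheap q-gram count filter discards unlikely lexicon entries before the more expensive edit distance is computed. The function names, the padding characters `#` and `$`, the choice q = 2, and the sample lexicon are all assumptions made for illustration.

```python
from collections import Counter


def qgrams(term: str, q: int = 2) -> Counter:
    """Return the multiset of q-grams of a term, padded at both ends."""
    padded = "#" * (q - 1) + term.lower() + "$" * (q - 1)
    return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def passes_count_filter(a: str, b: str, tau: int, q: int = 2) -> bool:
    """Classic count filter: strings within edit distance tau must share
    at least max(|a|, |b|) + q - 1 - q * tau padded q-grams."""
    shared = sum((qgrams(a, q) & qgrams(b, q)).values())
    needed = max(len(a), len(b)) + q - 1 - q * tau
    return shared >= needed


def match_terms(token: str, lexicon: list[str], tau: int = 3, q: int = 2):
    """Return (term, distance) pairs within tau of the token, closest first."""
    token = token.lower()
    hits = []
    for term in lexicon:
        candidate = term.lower()
        if abs(len(candidate) - len(token)) > tau:
            continue  # length filter: distance is at least the length gap
        if not passes_count_filter(token, candidate, tau, q):
            continue  # too few shared q-grams to be within tau
        distance = edit_distance(token, candidate)
        if distance <= tau:
            hits.append((term, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical lexicon and token, chosen only to exercise the filters.
lexicon = ["glucose", "neutrophils", "hemoglobin"]
print(match_terms("gluc", lexicon, tau=3))
```

With this plain formulation, a short form such as "neuts" is more than three edits away from "neutrophils" (the lengths alone differ by six), so the sketch would not link them at τ = 3. It makes no attempt to reproduce how the paper handles such cases.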

Medical Informatics; Medical Informatics Computing; Natural Language Processing

Kyungmo Kim; Jinwook Choi

Interdisciplinary Program for Bioengineering, Seoul National University, Seoul 03080, South Korea; Interdisciplinary Program for Bioengineering, Seoul National University, Seoul 03080, South Korea; Depar

International conference

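16th World Congress on Medical and Health Informatics (MEDINFO 2017), 2nd World Chinese-Language Forum on Medical and Health Informatics (WCHIS 2017), and 15th National Conference on Medical Informatics (CMIA 2017)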

Suzhou

English

452-456

2017-08-21 (date first posted online on the Wanfang platform; not the paper's publication date)