Automatically Identifying Topics of Consumer Health Questions in Chinese
In the development of health question answering (QA) systems, identifying the topic of a question is crucial for understanding users' information needs and for facilitating answer extraction. This paper presents a machine-learning method that automatically identifies the topics of health-related questions asked in Chinese by the general public. We collected 2,000 questions from a Chinese consumer health website and characterized them using 17 types of features, including lexical, grammatical, statistical, and semantic features. The method was applied to identify six health question topics: Condition Management, Healthy Lifestyle, Diagnosis, Health Provider Choosing, Treatment, and Epidemiology. The average F1-scores for these six topics were 99.63%, 99.13%, 98.55%, 96.35%, 76.02%, and 71.77%, respectively.
Information Storage and Retrieval; Machine Learning; Medical Informatics
Haihong Guo, Xu Na, Jiao Li
Institute of Medical Information & Library, Chinese Academy of Medical Sciences, Beijing, China
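International conference
16th World Congress on Medical and Health Informatics (MEDINFO 2017), 2nd World Chinese-Language Forum on Medical and Health Informatics (WCHIS 2017), and 15th National Conference on Medical Informatics (CMIA 2017)
Suzhou
English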
388-392
2017-08-21 (date the record first went online on the Wanfang platform; not the paper's publication date)