A Method of Naming and Identifying Chinese Medical Cases Based on Multi-feature Template Modification
A Chinese medical case is a record of the diagnosis and treatment activities of TCM experts. Identifying its named entities is of great significance to the standardization and informatization of Chinese medical cases. To address the vague expressions and unclear naming in the text of Chinese medical cases, this paper proposes a named entity identification approach based on conditional random fields (CRFs) with multi-feature template modification. First, sentence extraction and automatic word segmentation were performed on the texts of Chinese medical cases. Then, character features, part-of-speech features, left-right designator features and term features were labelled for the segmented corpora. Finally, CRF models were trained on the labelled data to identify three entity types: the four diagnostic methods of TCM, syndrome patterns and therapies. The identified entities were used to build triples of the form (four diagnostic methods of TCM, syndrome pattern, therapy), providing a reference and basis for the scientific argumentation of syndrome differentiation and treatment. With 12,000 Chinese medical cases from cardiovascular outpatient specialists at the Second Affiliated Hospital of Shandong University of Traditional Chinese Medicine as the data source, identification accuracy was further improved by trying different feature combinations and adjusting the context window size. The average accuracy, recall and F-measure reached 90.68%, 90.45% and 90.56%, respectively.
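The feature labelling step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual templates, so everything here is an assumption: the function name `token_features`, the `<PAD>` marker, the cue-word and term sets, and the sample sentence are all hypothetical. The sketch only shows how character, part-of-speech, left-right designator and term features might be gathered over an adjustable context window before being passed to a CRF toolkit.

```python
from typing import Dict, List, Set, Tuple, Union

Token = Tuple[str, str]  # (segmented word, part-of-speech tag)
Feature = Union[str, bool]

# Hypothetical resources for illustration only; not taken from the paper.
LEFT_CUES: Set[str] = {"证属"}
RIGHT_CUES: Set[str] = {"治宜"}
TERMS: Set[str] = {"气虚", "血瘀"}


def token_features(sent: List[Token], i: int, window: int = 1) -> Dict[str, Feature]:
    """Build a feature dict for token i using a symmetric context window."""
    feats: Dict[str, Feature] = {}
    # Word and part-of-speech features for every offset in the window.
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(sent):
            word, pos = sent[j]
        else:
            word, pos = "<PAD>", "<PAD>"  # position falls outside the sentence
        feats[f"w[{offset}]"] = word
        feats[f"pos[{offset}]"] = pos
    current = sent[i][0]
    # Character features: first and last character of the current word.
    feats["first_char"] = current[0]
    feats["last_char"] = current[-1]
    # Term feature: does the word appear in the (assumed) term dictionary?
    feats["is_term"] = current in TERMS
    # Left-right designator features: cue words adjacent to the current word.
    feats["left_cue"] = i > 0 and sent[i - 1][0] in LEFT_CUES
    feats["right_cue"] = i + 1 < len(sent) and sent[i + 1][0] in RIGHT_CUES
    return feats


sentence: List[Token] = [("证属", "v"), ("气虚", "n"), ("血瘀", "n"), ("治宜", "v")]
narrow = token_features(sentence, 1, window=1)
wide = token_features(sentence, 1, window=2)
```

Widening `window` adds more neighbouring words and tags to each token's feature set, which is the kind of adjustment the abstract reports tuning; the resulting dicts would be paired with entity labels and given to whichever CRF implementation is in use.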
named entity identification; Chinese medical case; conditional random field; multi-feature
Shouqiang Chen, Yang Chen, Feng Yuan, Lili Zhao, Wenrong An
Center of Hear of the Second Affiliated Hospital of Shandong University of Traditional Chinese Medic School of Information Science and Engineering, Shandong Normal University, Jinan, China; University of W Key Laboratory of TCM Data Cloud Service in Universities of Shandong (Shandong Management University)
International conference
Xi'an
English
541-550
2018-09-21 (date first posted online on the Wanfang platform; does not represent the paper's publication date)