Anaphora Resolution in Chinese for Analysis of Medical Q&A Platforms
On medical Q&A platforms, patients share information about their diagnoses, give advice and consult with doctors. This creates a large amount of data containing valuable knowledge about drug side effects, patients' actions and symptoms. This information is widely considered to be the most important in the field of computer-aided medical analysis. Nevertheless, messages on the Internet are difficult to analyze because of their unstructured form. Thus, the purpose of this study is to develop a program for anaphora resolution in Chinese and to apply it to the analysis of user-generated content on a medical Q&A platform. Experiments are conducted on three models: BERT, NeuralCoref and BERT-Chinese+SpanBERT. BERT-Chinese+SpanBERT achieves the highest accuracy, 68.5%, on the OntoNotes 5.0 corpus. The best-performing model was then tested on messages from the medical Q&A platform haodf.com. The results of the study might contribute to improving the diagnosis of hereditary diseases.
Anaphora resolution Chinese Natural Language Processing (NLP) User-generated content BERT
Alena Tsvetkova
HSE University, 20 Myasnitskaya, Moscow 101000, Russia; Semantic Hub, 4 Ilyinka, Moscow 109012, Russia
International conference
9th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC 2020)
Zhengzhou
English
1341-1348
2020-10-14 (date first posted online on the Wanfang platform; not the paper's publication date)