Anaphora Resolution in Chinese for Analysis of Medical Q&A Platforms

Abstract:

On medical Q&A platforms, patients share information about their diagnoses, give advice, and consult with doctors. This creates a large amount of data that contains valuable knowledge on the side effects of drugs and on patients' actions and symptoms. This information is widely considered to be the most important in the field of computer-aided medical analysis. Nevertheless, messages on the Internet are difficult to analyze because of their unstructured form. Thus, the purpose of this study is to develop a program for anaphora resolution in Chinese and to apply it to the analysis of user-generated content on a medical Q&A platform. The experiments are conducted on three models: BERT, NeuralCoref, and BERT-Chinese+SpanBERT. BERT-Chinese+SpanBERT achieves the highest accuracy, 68.5%, on the OntoNotes 5.0 corpus. The best-performing model was then tested on messages from the medical Q&A platform haodf.com. The results of the study might contribute to improving the diagnosis of hereditary diseases.

Keywords: Anaphora resolution; Chinese Natural Language Processing (NLP); User-generated content; BERT

Author: Alena Tsvetkova

Affiliation: HSE University, 20 Myasnitskaya, Moscow 101000, Russia; Semantic Hub, 4 Ilyinka, Moscow 109012, Russia

Conference type: International conference

Conference name: 9th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC 2020)

Conference location: Zhengzhou

Conference language: English

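Pages: 1341-1348

Online publication date: 2020-10-14 (the date the record first appeared on the Wanfang platform; it does not represent the paper's publication date)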
