Learning to Recognize Protected Health Information in Electronic Health Records with Recurrent Neural Network
De-identification of electronic health records is a prerequisite for distributing medical records for further clinical data processing or mining. In this paper, we introduce a framework based on recurrent neural networks to solve the de-identification problem, and compare it with state-of-the-art methods. The framework is integrated: it comprises record skeleton generation, chunk representation, and protected information labeling. We evaluate it on three datasets: two English datasets from the i2b2 de-identification challenge and a Chinese dataset that we created. To the best of our knowledge, we are the first to apply an RNN model to the Chinese de-identification problem. The experimental results indicate that our framework not only achieves high performance but also has strong generalization ability.
De-identification; Electronic Health Record; Recurrent Neural Network
Kun Li; Yumei Chai; Hongling Zhao; Xiaofei Nan; Yueshu Zhao
Information Engineering School, Zhengzhou University, Zhengzhou, China; Collaborative Innovation Center for Internet Healthcare, Zhengzhou University, Zhengzhou, China; Collaborative Innovation Center for Internet Healthcare, Zhengzhou University, Zhengzhou, China; The Thi
International conference
The 5th Conference on Natural Language Processing and Chinese Computing (NLPCC-ICCPOL2016)
Kunming
English
1-8
2016-12-02 (date first made available online on the Wanfang platform; not the paper's publication date)