Using Collaborative Training Method to build Vietnamese Dependency Treebank
To address the difficulty of annotating Vietnamese dependency trees, this paper proposes a method that combines the MST algorithm with an improved Nivre algorithm to build a Vietnamese dependency treebank. The method takes full advantage of the characteristics of collaborative training (co-training). First, we built a small set of annotated samples. Second, we used these samples to train two weak learners on two fully redundant views. The two learners then annotated a large number of unannotated samples for each other. Next, we selected the high-confidence samples for retraining and built a dependency parsing system. Finally, we ran ten-fold cross-validation on 5000 manually annotated Vietnamese sentences and obtained an accuracy of 76.33%. The experimental results show that the proposed method can make full use of unannotated corpora to effectively improve the quality of the dependency treebank.
Dependency Treebank; Vietnamese; Collaborative Training; Dependency Parsing
Guoke Qiu, Jianyi Guo, Zhengtao Yu, Yantuan Xian, Cunli Mao
The School of Information Engineering and Automation, Kunming University of Science and Technology, Ku; The Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technolog
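Domestic conference
The 15th China National Conference on Computational Linguistics (CCL2016) and the 4th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (NLP-NABD-2016)
Yantai
English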
1-14
2016-10-14 (date first posted online on the Wanfang platform; not the publication date of the paper)