Joint Tokenization, Parsing, and Translation
Natural language processing is all about ambiguities. In machine translation, tokenization and parsing mistakes caused by segmentation and structural ambiguities can introduce translation errors. A well-known solution is to offer more alternatives through compact representations such as lattices and forests. In this talk, I will introduce a technique that goes beyond lattices and forests by integrating tokenization, parsing, and translation in one system. As a result, tokenization, parsing, and translation can interact with and benefit one another within a discriminative framework. Experimental results show that such integration significantly improves tokenization and translation performance.
Yang Liu
Institute of Computing Technology (ICT), Chinese Academy of Sciences
International conference
2010 4th International Universal Communication Symposium (IUCS 2010)
Beijing
English
1
2010-10-18 (date first posted online on the Wanfang platform; not the paper's publication date)