Preliminary Study on the Construction of Bilingual Phrase Structure Treebank
Treebanks are an important resource for natural language processing. Most existing treebanks are monolingual, yet bilingual treebanks are an important basis for syntactic models in machine translation. This paper describes the preliminary construction of a bilingual phrase structure treebank intended for machine translation, which adopts the POS tagsets and syntactic tagsets of the U-Penn English Treebank and Chinese Treebank as its tagging system. The Chinese-English sentence pairs in the treebank, drawn from machine translation evaluation data, were pre-processed, POS-tagged, and annotated with phrase structures, and all processed data were then proofread. An analysis of the phrase structures that were modified during proofreading shows that the usages of Chinese functional words play an important role in Chinese phrase structure grammar.
Bilingual Treebank; Phrase Structure Grammar; Chinese Functional Word Usages
Kunli Zhang, Hongying Zan, Yingjie Han, Lingling Mu
School of Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001, China
International conference
Chinese Lexical Semantics 15th Workshop (CLSW 2014)
Macau
English
403-413
2014-06-09 (date first posted online on the Wanfang platform; does not indicate the paper's publication date)