AUTOMATIC IDENTIFYING OF MAXIMAL LENGTH NOUN PHRASE
Automatic recognition of the maximal-length noun phrase (MNP) supports shallow parsing. In this paper, automatic labeling of Chinese MNPs is treated as a sequential labeling task, and a Support Vector Machine (SVM) model is employed. We propose a 2-phase hybrid approach that first identifies base chunks and then identifies MNPs, so that base chunk features can be exploited to improve MNP recognition performance. In addition, both left-to-right and right-to-left sequential labeling are applied to identify Chinese MNPs, and their outputs are combined by bidirectional sequence labeling merging. The data set used in the experiments is selected from the Penn Chinese Treebank 5.0 corpus and split into training, development, and test sets in the proportion 4:4:1. Experimental results show an F1-measure of 90.13%.
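As a rough illustration of how a sequential labeling output can be scored with the F1-measure mentioned in the abstract, the sketch below converts per-token tags into phrase spans and compares predicted spans with gold spans. The B/I/O tag scheme, the exact-match span criterion, and the names `bio_to_spans` and `f1_score` are assumptions made for this example; the abstract does not state the tag set or the evaluation procedure actually used.

```python
from typing import List, Optional, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect phrase spans from a B/I/O tag sequence (assumed tag scheme)."""
    spans: Set[Span] = set()
    start: Optional[int] = None
    for i, tag in enumerate(tags):
        if tag == "B" or (tag == "I" and start is None):
            # "B" opens a new phrase; a stray "I" with no open phrase is treated like "B".
            if start is not None:
                spans.add((start, i))
            start = i
        elif tag == "O" and start is not None:
            # "O" closes the currently open phrase.
            spans.add((start, i))
            start = None
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def f1_score(gold: Set[Span], pred: Set[Span]) -> float:
    """Harmonic mean of precision and recall over exactly matching spans."""
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical six-token sentence: the first predicted phrase is one token too short.
gold_tags = ["B", "I", "I", "O", "B", "I"]
pred_tags = ["B", "I", "O", "O", "B", "I"]
score = f1_score(bio_to_spans(gold_tags), bio_to_spans(pred_tags))
```

Under this exact-match criterion a phrase counts as correct only when both of its boundaries agree with the gold annotation, so the truncated first phrase in the example contributes to neither precision nor recall.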
MNP; Base chunk feature; Bidirectional sequence labeling merging; 2-phase
Yegang Li; Heyan Huang
School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; Depar School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
International conference
Hangzhou
English
1927-1930
2012-10-30 (date first posted online on the Wanfang platform; does not represent the paper's publication date)