STATISTIC-BASED CHINESE ORGANIZATION NAME RECOGNITION
Organization names are a frequently occurring but ever-changing class of proper nouns in text. Chinese organization name recognition is a non-trivial task in named entity recognition (NER). Compared with other entities such as person and location names, Chinese organization names are the most difficult to identify. Statistic-based approaches to automatic NER are currently widely studied. In this paper, we try to clarify several puzzling problems in statistic-based Chinese organization name recognition and present experimentally grounded conclusions. Does the encoding scheme used in a classification-based recognition system affect performance, and if so, by how much? Should we build a single identification model for all types of named entities, or one model for each type? Does recognizing Chinese organization names after person and location identification outperform the parallel approach? Which is better, word-based or character-based Chinese organization name recognition? Our conclusions are based on the SIGHAN Bakeoff datasets for NER.
Chinese Organization Name Recognition Conditional Random Fields Model Feature
Ying Qin, Xiaojie Wang, Yixin Zhong
National Research Centre for Foreign Language Education, Beijing Foreign Studies University, Beijing; Beijing University of Posts and Telecommunications, Beijing 100876, China
International conference
Beijing
English
1-5
2008-09-26 (date first made available online on the Wanfang platform; not the paper's publication date)