Conference Topic

STATISTIC-BASED CHINESE ORGANIZATION NAME RECOGNITION

Organization names are a frequently occurring but ever-changing kind of proper noun in text. Chinese organization name recognition is a non-trivial task in named entity recognition (NER). Compared with other entities such as person and location names, Chinese organization names are the most difficult to identify. Statistic-based approaches to automatic NER are currently widely studied. In this paper, we try to clarify several puzzling problems of statistic-based Chinese organization name recognition and present experimental conclusions. Does the encoding scheme used in a classification-based recognition system affect performance, and if so, by how much? Should we build one identification model for all types of named entity, or one model for each? Does recognizing Chinese organization names after person and location identification outperform the parallel approach? Which is better, word-based or character-based Chinese organization name recognition? Our conclusions are drawn from the SIGHAN Bakeoff datasets for NER.

Chinese Organization Name Recognition; Conditional Random Fields; Model; Feature

Ying Qin, Xiaojie Wang, Yixin Zhong

National Research Centre for Foreign Language Education, Beijing Foreign Studies University, Beijing; Beijing University of Posts and Telecommunications, Beijing 100876, China

International conference

China-Ireland International Conference on Information and Communications Technologies 2008 (CIICT 2008)

Beijing

English

1-5

2008-09-26 (date first made available online on the Wanfang platform; not the paper's publication date)