Conference Topic

Recognition and Extraction of Honorifics in Chinese Diachronic Corpora

  Honorifics in this paper refer to names of official positions and titles of nobility or honor. They can be found in various written records in different periods and have great historical significance. This paper introduces a machine learning system to recognize the honorifics in diachronic corpora. A tagged corpus of four classic novels written in the Ming and Qing dynasties is used to train the system. The system is then used to automatically recognize and extract the honorifics in pre-Qin classics, Tang-dynasty poems, and modern Chinese news. Experimental results show that the system can achieve relatively good results in recognizing the honorifics in the pre-Qin classics and Tang-dynasty poems. This work is an attempt to improve the performance of automatic recognition of honorifics in diachronic corpora. The system can be a helpful tool in the studies on the evolution of honorifics throughout Chinese history.

Honorifics; Chinese diachronic corpora; Machine learning algorithm

Dan Xiong, Jian Xu, Qin Lu, Fengju Lo

Department of Computing, The Hong Kong Polytechnic University, Hong Kong, China; Department of Chinese Linguistics and Literature, Yuan Ze University, Zhongli, Taiwan

International conference

Chinese Lexical Semantics 15th Workshop (CLSW 2014) (第十五届汉语词汇语义学国际研讨会)

Macau

English

305-316

2014-06-09 (date first made available online on the Wanfang platform; not the paper's publication date)