Research on Knowledge Elements in Exponential Language Model
This paper presents an exponential language model (ELM) for modeling and managing knowledge elements. The model is developed on the basis of the Minimum Sample Risk (MSR) algorithm, a discriminative training method. The ELM uses features to capture global, domain-level, or sentence-level language phenomena, including named entities, part-of-speech strings, personal usage words, word positions, sentence mood, and sentence tense. We study how different kinds of knowledge elements perform on the task of Chinese Pinyin-to-Chinese-character (PTC) conversion for Internet language (Chinese mobile short messages and Chinese QQ chat records). Combining different kinds of knowledge elements in the ELM yields different performance, but every ELM with additional knowledge elements outperforms the ELM that uses only the probability knowledge computed by baseline n-gram models with Ney smoothing.
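As a rough illustration of the kind of model the abstract describes, the sketch below scores conversion candidates with a weighted sum of features and normalizes the exponentiated scores. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature names (`ngram_logprob`, `pos_pattern`), the candidate strings, and all weights are hypothetical, and the weights are set by hand here rather than learned with MSR.

```python
import math

def score(features, weights):
    """Linear score: sum of weight * feature value (missing weights count as 0)."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

def rank_candidates(candidates, weights):
    """Return (text, probability) pairs, highest probability first.

    Probabilities come from exponentiating the linear scores and
    normalizing over the candidate set (a log-linear model).
    """
    scores = [score(feats, weights) for _, feats in candidates]
    max_s = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - max_s) for s in scores]
    z = sum(exps)
    ranked = [(text, e / z) for (text, _), e in zip(candidates, exps)]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical candidates for one Pinyin input; feature values are made up.
candidates = [
    ("事实", {"ngram_logprob": -4.0, "pos_pattern": 1.0}),
    ("实施", {"ngram_logprob": -3.8, "pos_pattern": 0.0}),
]

# Baseline: only the n-gram log-probability feature is used.
baseline_weights = {"ngram_logprob": 1.0}
# Enriched: one extra (hypothetical) knowledge-element feature is added.
enriched_weights = {"ngram_logprob": 1.0, "pos_pattern": 0.5}

baseline_ranking = rank_candidates(candidates, baseline_weights)
enriched_ranking = rank_candidates(candidates, enriched_weights)
```

In this toy setup the extra feature changes which candidate ranks first. In a discriminatively trained model the weights would instead be chosen to reduce conversion errors on training data.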
exponential language models; minimum sample risk; knowledge elements; MSR-ELM
Huixing JIANG Xiaojie WANG
Center for Intelligence Science and Technology, Beijing University of Posts and Telecommunications
International conference
Dalian
English
1-5
2009-09-24 (date first posted online on the Wanfang platform; not the paper's publication date)