Conference Topic

Unsupervised Text Pattern Learning Using Minimum Description Length

Knowledge of text patterns in a domain-specific corpus is valuable in many natural language processing (NLP) applications, such as information extraction and question answering. In this paper, we propose a simple but effective probabilistic language model for modeling the indecomposability of text patterns. Under the minimum description length (MDL) principle, an efficient unsupervised learning algorithm is implemented, and an experiment on an English critical writing corpus shows promising coverage of patterns compared with a human summary.

Ke Wu, Jiangsheng Yu, Hanpin Wang, Fei Cheng

International conference

2010 4th International Universal Communication Symposium (IUCS 2010)

Beijing

English

160-165

2010-10-18 (date first made available online on the Wanfang platform; not the paper's publication date)