Conference Topic

Incremental Chinese Lexicon Extraction with Minimal Resources on a Domain-Specific Corpus

This article presents an original lexical unit extraction system for Chinese. The method is based on an incremental process driven by an association score, featuring a minimal-resource, statistically aided linguistic approach. We also introduce a linguistics-based lexical unit definition and use it to describe an evaluation protocol dedicated to the task. The experimental results on a domain-specific corpus show that the method performs better than other approaches. The extraction results, evaluated on a random sample of the working corpus, show a recall of 68.4% and a precision of 37.1%.

Gael Patin

Texts, Computer Science and Multilingualism Research Center (Ertim), National Institute of Oriental Languages and Civilizations (Inalco); Arisem, Thales Company

International conference

The 23rd International Conference on Computational Linguistics

Beijing

English

963-971

2010-08-01 (date first posted online on the Wanfang platform; not the paper's publication date)