Conference Topic

FIN3E Approach: Identification of Named Entities from Extracted Terms

1. Introduction

Named Entities (NE) are classically defined as the names of People, Places, and Organizations. Other NE classes also exist, such as Documents (e.g. software, hardware) and Sciences (e.g. illnesses, medications). To identify NE, many systems rely on the presence of uppercase letters. This technique can be ineffective on non-standard documents (e.g. emails, blogs, forums, and texts or text fragments written entirely in uppercase or lowercase). In this work, we do not use this kind of information to identify NE. Formally, two important criteria characterize NE: (1) referential uniqueness (i.e. a proper noun refers to one referential entity) and (2) denominative stability (i.e. few possible variations). Our work relies on the latter criterion to identify NE among the Noun-Noun terms obtained by terminology extraction methods. Our method follows a cognitive process that simulates human reasoning: (1) expressing a term differently by means of a reformulation technique, and (2) judging the relevance of this reformulation to identify NE.

Mathieu Roche

LIRMM, CNRS, Univ. Montpellier 2

International conference

The 7th International Conference on Cognitive Science (ICCS 2010)

Beijing

English

356-358

2010-08-01 (date first made available online on the Wanfang platform; this is not the paper's publication date)