A Method of Building Chinese Field Association Knowledge from Wikipedia
Field Association (FA) terms form a limited set of discriminating terms that provide the knowledge needed to identify document fields. The primary goal of this research is to build a system that imitates the way humans recognize a document's field by looking at a few Chinese FA terms in it. This paper proposes a new approach for automatically building a Chinese FA term dictionary from Wikipedia. 104,532 FA terms are added to the dictionary. FA terms obtained from this dictionary are then used to recognize the fields of 5,841 documents, with an average accuracy of 92.04% in the experiment. The results show that the presented method is effective for automatically building FA terms from Wikipedia.
Field association terms; Feature fields; Wikipedia; Chinese documents; Field recognition
Li WANG, Susumu YATA, El-sayed ATLAM, Masao FUKETA, Kazuhiro MORITA, Hiroaki BANDO, Jun-ichi AOE
Department of Information Science and Intelligent Systems, Faculty of Engineering, University of Tokushima, Minamijosanjima 2-1, Tokushima, 770-8506, Japan
International conference
Dalian
English
1-5
2009-09-24 (date first posted online on the Wanfang platform; does not represent the paper's publication date)