Resources for Nepali Word Sense Disambiguation
Word Sense Disambiguation (WSD) is the process of identifying the intended meaning of a word that has multiple meanings. It is regarded as one of the most challenging problems in Natural Language Processing (NLP). The Nepali language also has words with multiple meanings, so the WSD problem arises in it as well. In this paper, we investigate the impact of NLP resources such as a Morphology Analyzer (MA) and a Machine Readable Dictionary (MRD) on ambiguity resolution. Our results show that WSD accuracy improves when NLP resources such as a morph analyzer and an MRD are available. The Lesk algorithm was used to solve the WSD problem with a sample Nepali WordNet containing a few sets of Nepali nouns, so the system can disambiguate only these nouns. The system was tested on a small data set with a limited number of nouns. Accuracy was between 50% and 70%, depending on the sample data provided. When the same data was tested with manual morph analysis, accuracy was considerably higher (80%).
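To illustrate the kind of gloss-overlap matching the Lesk algorithm performs, here is a minimal sketch of the simplified Lesk variant. It is not the authors' system: the English sense inventory `SENSES`, the stopword list, and the `naive_stem` function (a crude stand-in for a morph analyzer) are all hypothetical and chosen only for illustration.

```python
import re
from typing import Callable, Dict, Optional, Set

# Hypothetical toy sense inventory (an assumption, not the paper's Nepali WordNet).
SENSES: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank.finance": "an institution that accepts deposits and lends money",
        "bank.river": "the sloping land beside a river or stream of water",
    }
}

# Hypothetical stopword list for the toy glosses above.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "at", "beside"}


def naive_stem(token: str) -> str:
    """Crude plural stripping; a stand-in for a real morph analyzer."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token


def tokens(text: str, normalize: Callable[[str], str]) -> Set[str]:
    """Lowercase, split on word characters, drop stopwords, then normalize."""
    words = re.findall(r"\w+", text.lower())
    return {normalize(w) for w in words if w not in STOPWORDS}


def simplified_lesk(
    word: str,
    context: str,
    senses: Dict[str, Dict[str, str]],
    normalize: Callable[[str], str] = lambda t: t,
) -> Optional[str]:
    """Return the sense whose gloss shares the most tokens with the context.

    Returns None when no gloss overlaps the context at all.
    """
    context_tokens = tokens(context, normalize)
    context_tokens.discard(normalize(word.lower()))  # ignore the target word itself
    best_sense, best_overlap = None, 0
    for sense_id, gloss in senses.get(word, {}).items():
        overlap = len(context_tokens & tokens(gloss, normalize))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


if __name__ == "__main__":
    sentence = "They walked along rivers near the bank"
    print(simplified_lesk("bank", sentence, SENSES))              # surface forms only
    print(simplified_lesk("bank", sentence, SENSES, naive_stem))  # with crude stemming
```

In this toy setting, the inflected form "rivers" fails to match the gloss word "river" unless both are normalized first, which is one plausible way morphological analysis can affect overlap-based disambiguation.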
Language; Lesk Algorithm; Nepali WordNet; WSD
Niraj SHRESTHA, Patrick A.V. HALL, Sanat K. BISTA
Information and Language Processing Research Lab, Kathmandu University, Dhulikel, Kavre, NEPAL; Department of Computer Science & Engineering, Kathmandu University, Dhulikel, Kavre, NEPAL
International conference
Beijing
English
2008-10-19 (date first posted online on the Wanfang platform; not the paper's publication date)