Conference Topic

The Study of Rule-based Automatic Segmentation of Inflectional Affixes of the Kazakh Language

This paper addresses the automatic segmentation of inflectional affixes in the Kazakh language (KL), based on a study of a KL corpus. First, drawing on an analysis of how inflectional affixes combine, it constructs a finite-state automaton (FSA) for inflectional affixes and uses it to segment them. Second, it constructs dedicated finite-state automata for nouns and verbs, the parts of speech with the most varied and complex inflection in KL. Third, it applies bidirectional omni-word segmentation together with lexical analysis to perform stemming and fine-grained segmentation of KL inflectional affixes. Finally, it discusses the segmentation of ambiguous inflectional affixes. The aim is to improve the accuracy and speed of stemming and inflectional-affix segmentation for KL.
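To make the idea of affix segmentation with a finite-state automaton concrete, here is a minimal sketch. It is not the authors' automaton, which this record does not reproduce. It assumes a heavily reduced noun template (stem + plural + possessive + case), a toy stem lexicon, and a handful of affix variants; the names `AFFIXES`, `ORDER`, `STEMS`, and `segment` are hypothetical. The word is read from right to left, and each state only allows affix classes that may still appear further to the left.

```python
# Toy affix inventory (assumption): a real system would cover the full
# Kazakh paradigm, including all vowel-harmony and consonant variants.
AFFIXES = {
    "CASE": ["да", "де"],      # locative
    "POSS": ["ым", "ім"],      # 1st person singular possessive
    "PLURAL": ["лар", "лер"],  # plural
}

# Affix classes in the order they are met when reading right to left.
ORDER = ["CASE", "POSS", "PLURAL"]

# Toy stem lexicon (assumption): "child" and "house".
STEMS = {"бала", "үй"}


def segment(word):
    """Return (stem, affixes in left-to-right order), or None if no parse."""

    def walk(rest, state, found):
        # Accept as soon as the remaining string is a known stem.
        if rest in STEMS:
            return rest, found
        # Otherwise try to strip one affix from any class still allowed.
        for i in range(state, len(ORDER)):
            for affix in AFFIXES[ORDER[i]]:
                if rest.endswith(affix) and len(rest) > len(affix):
                    result = walk(rest[: -len(affix)], i + 1, [affix] + found)
                    if result is not None:
                        return result
        return None

    return walk(word, 0, [])


if __name__ == "__main__":
    for w in ["балаларымда", "үйлерімде", "баладалар"]:
        print(w, segment(w))
```

In this sketch a form whose affixes appear in an order the template does not allow, such as plural after case, is rejected rather than segmented. Handling genuinely ambiguous affixes, which the abstract mentions, would need additional machinery beyond this toy.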

Kazakh Affixes Segmentation; Finite-state Automaton; Bidirectional Omni-word Segmentation; Bayes Classifier

Gulila Altenbek, Dawel Abilhayer, Muheyat Niyazbek

College of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046, P.R. China

International conference

2008 International Conference on Advanced Intelligence

Beijing

English

2008-10-18 (date first posted online on the Wanfang platform; not the paper's publication date)