An Adaptive Markov Model for Text Categorization
Existing methods for text categorization assume that a document is a bag of words. While computationally efficient, such a representation is unable to capture sequential information. In this paper, a document is treated as a sequence of characters or words, and preprocessing for text categorization, such as word segmentation and feature selection, is not required. Statistical dependencies among the neighboring terms of a sequence are captured by Markov models of different orders. We propose a sequence classification method based on an adaptive Markov model. Our method automatically and effectively blends Markov models of different orders for text categorization. We present an extensive experimental evaluation of our method on an English collection and one Chinese corpus. The results show the high recall and precision of our method.
Jin Li, Kun Yue, WeiYi Liu
School of Software, Yunnan University; School of Information Science and Engineering, Yunnan University
International conference
Xiamen
English
802-807
2008-11-17 (date first posted online on the Wanfang platform; not the paper's publication date)
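
The abstract describes classifying a document as a character sequence by blending Markov models of different orders. The sketch below illustrates that general idea only. The abstract does not say how the adaptive model chooses its blend, so this example substitutes fixed interpolation weights (uniform by default) as a stand-in; it is not the paper's adaptive scheme. The class name `BlendedMarkovClassifier`, its parameters, the add-one smoothing, and the uniform class prior are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class BlendedMarkovClassifier:
    """Character-level classifier mixing Markov models of orders 0..max_order.

    Hypothetical illustration: the mixing weights are fixed, not adapted.
    """

    def __init__(self, max_order=2, weights=None):
        self.max_order = max_order
        # Assumption: one fixed weight per order, uniform unless given.
        self.weights = weights or [1.0 / (max_order + 1)] * (max_order + 1)
        # label -> list (one entry per order) of context -> next symbol -> count
        self.counts = {}
        self.alphabet = set()

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            models = self.counts.setdefault(
                label,
                [defaultdict(lambda: defaultdict(int))
                 for _ in range(self.max_order + 1)],
            )
            for i, ch in enumerate(text):
                self.alphabet.add(ch)
                for k in range(self.max_order + 1):
                    if i >= k:  # need k preceding characters as context
                        models[k][text[i - k:i]][ch] += 1
        return self

    def _prob(self, models, k, context, ch):
        # Add-one smoothing; the extra +1 reserves mass for unseen symbols.
        nxt = models[k].get(context, {})
        total = sum(nxt.values())
        vocab = len(self.alphabet) + 1
        return (nxt.get(ch, 0) + 1) / (total + vocab)

    def log_likelihood(self, text, label):
        models = self.counts[label]
        ll = 0.0
        for i, ch in enumerate(text):
            # Use only the orders for which enough context exists.
            orders = [k for k in range(self.max_order + 1) if i >= k]
            wsum = sum(self.weights[k] for k in orders)
            p = sum(
                self.weights[k] * self._prob(models, k, text[i - k:i], ch)
                for k in orders
            ) / wsum
            ll += math.log(p)
        return ll

    def predict(self, text):
        # Assumption: uniform class prior, so the likelihood alone decides.
        return max(self.counts, key=lambda lab: self.log_likelihood(text, lab))
```

A classifier of this kind is trained with `fit(docs, labels)` and queried with `predict(text)`. Because it works directly on characters, it needs no word segmentation or feature selection, which matches the preprocessing-free setting the abstract describes. Replacing the fixed weights with weights learned from data would be the natural place to introduce adaptivity.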