An Adaptive Markov Model for Text Categorization

Abstract:

Existing methods for text categorization assume that a document is a bag of words. While computationally efficient, such a representation is unable to capture sequential information. In this paper, a document is treated as a sequence of characters or words, and preprocessing for text categorization, such as word segmentation and feature selection, is not required. Statistical dependencies among the neighboring terms of a sequence are captured by Markov models of different orders. We propose a sequence classification method based on an adaptive Markov model. Our method automatically and effectively blends Markov models of different orders for text categorization. We present an extensive experimental evaluation of our method on an English collection and one Chinese corpus. The results show that our method achieves high recall and precision.
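The abstract does not say how the models of different orders are blended, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes character-level models, add-one smoothing, fixed linear interpolation weights (uniform by default) in place of the paper's adaptive weighting, and a uniform class prior. The class name `BlendedMarkovClassifier`, the `PAD` symbol, and all parameters are hypothetical.

```python
import math
from collections import defaultdict

PAD = "\x02"  # start-of-text padding symbol (hypothetical choice)


class BlendedMarkovClassifier:
    """One set of character Markov models (orders 0..max_order) per class."""

    def __init__(self, max_order=2, weights=None):
        self.max_order = max_order
        n = max_order + 1
        # Assumption: fixed blend weights, one per order, summing to 1.
        # The paper's adaptive weighting is not given in the abstract.
        self.weights = list(weights) if weights else [1.0 / n] * n
        self.tables = {}   # label -> list indexed by order: context -> next char -> count
        self.vocab = set()

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            if label not in self.tables:
                self.tables[label] = [
                    defaultdict(lambda: defaultdict(int))
                    for _ in range(self.max_order + 1)
                ]
            tables = self.tables[label]
            self.vocab.update(text)
            # Pad so that every position has a full-length context.
            padded = PAD * self.max_order + text
            for i in range(self.max_order, len(padded)):
                for k in range(self.max_order + 1):
                    tables[k][padded[i - k:i]][padded[i]] += 1
        return self

    def log_likelihood(self, text, label):
        tables = self.tables[label]
        v = len(self.vocab) + 1  # +1 reserves mass for unseen characters
        padded = PAD * self.max_order + text
        total = 0.0
        for i in range(self.max_order, len(padded)):
            ch = padded[i]
            p = 0.0
            for k, w in enumerate(self.weights):
                nxt = tables[k].get(padded[i - k:i], {})
                # Add-one smoothed estimate of P(ch | previous k characters).
                p += w * (nxt.get(ch, 0) + 1) / (sum(nxt.values()) + v)
            total += math.log(p)
        return total

    def predict(self, text):
        # Uniform class prior assumed: pick the class with the highest likelihood.
        return max(self.tables, key=lambda label: self.log_likelihood(text, label))
```

In this sketch, two training texts with identical character frequencies, such as "abababab" and "aabbaabb", can only be told apart by the order-1 and order-2 terms, which is the kind of sequential information a bag-of-words representation discards. Because the models run directly over characters, no word segmentation step is needed.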

Authors: Jin Li, Kun Yue, WeiYi Liu

Affiliations: School of Software, Yunnan University; School of Information Science and Engineering, Yunnan University

Conference type: International conference

Conference name: 2008 3rd International Conference on Intelligent System and Knowledge Engineering (ISKE 2008)

Conference location: Xiamen

Conference language: English

Pages: 802-807

Online publication date: 2008-11-17 (the date the record first went online on the Wanfang platform; not the paper's publication date)
