Conference Topic

Text Categorization Based on Topic Model

In the text literature, many topic models have been proposed to represent documents and words as topics or latent topics in order to process text effectively and accurately. In this paper, we propose LDACLM, the Latent Dirichlet Allocation Category Language Model, for text categorization, and estimate the model parameters by variational inference. As a variant of the Latent Dirichlet Allocation model, LDACLM regards the documents of a category as a language model and uses variational parameters to obtain maximum a posteriori estimates for terms. Experiments show that LDACLM is effective for text categorization, outperforming the standard Naive Bayes and Rocchio methods.

Latent Dirichlet Allocation; Variational Inference; Category Language Model

Shibin Zhou, Kan Li, Yushu Liu

School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, P.R. China

International conference

The Third International Conference on Rough Sets and Knowledge Technology (RSKT 2008)

Chengdu

English

572-579

2008-05-17 (date first posted online on the Wanfang platform; not the paper's publication date)