USER INTEREST MODELING BY LABELED LDA WITH TOPIC FEATURES
It is well known that a user's interests are reflected in his or her web browsing history and can be mined from it. This paper presents an innovative method for extracting a user's interests from that browsing history. We first apply an efficient algorithm to extract useful text from the web pages in the user's sequence of browsed URLs. We then propose Labeled Latent Dirichlet Allocation with Topic Feature (LLDA-TF) to mine the user's interests from the extracted text. Unlike other approaches, which need a large amount of training data to train a model that incorporates supervised information, we introduce the raw supervised information directly into the LLDA-TF procedure. The experimental results show that the output of LLDA-TF fits the predefined categories well. Furthermore, the LLDA-TF model can name each user interest with a category word and provide a keyword list for each category.
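The abstract does not specify how LLDA-TF works, so the following is only a minimal sketch of the label constraint used in standard Labeled LDA, on which the method's name suggests it builds: each document's tokens may only be assigned to topics in that document's label set. The topic feature component is not implemented here because the abstract gives no details about it. The function names, hyperparameter values, and toy data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def labeled_lda_gibbs(docs, labels, num_topics, alpha=0.1, beta=0.01,
                      iters=50, seed=0):
    """Collapsed Gibbs sampling for plain Labeled LDA (not LLDA-TF).

    docs:   list of token lists, one per document
    labels: list of lists; labels[d] holds the topic ids allowed in doc d
    Returns (assignments, topic_word_counts).
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [defaultdict(int) for _ in docs]
    topic_word = [defaultdict(int) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assign = []

    # Initialise every token with a random topic from its document's labels.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.choice(labels[d])
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k_old = assign[d][i]
                doc_topic[d][k_old] -= 1
                topic_word[k_old][w] -= 1
                topic_total[k_old] -= 1
                # Resample, considering only the document's own labels.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in labels[d]
                ]
                k_new = rng.choices(labels[d], weights=weights)[0]
                assign[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][w] += 1
                topic_total[k_new] += 1
    return assign, topic_word


def top_words(topic_word, k, n=3):
    """Most frequent words of topic k, ties broken alphabetically."""
    items = [(w, c) for w, c in topic_word[k].items() if c > 0]
    items.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in items[:n]]


# Hypothetical toy data: topic 0 = "sports", topic 1 = "music".
CATEGORIES = ["sports", "music"]
DOCS = [
    ["football", "goal", "league", "goal"],
    ["guitar", "concert", "album", "concert"],
    ["goal", "concert", "league", "album"],
]
LABELS = [[0], [1], [0, 1]]
```

Because each topic corresponds to a named category, the per-topic word counts give both a category word and a keyword list for every interest, e.g. `top_words(topic_word, 0)` for the hypothetical "sports" category.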
Labeled Latent Dirichlet Allocation with Topic Feature (LLDA-TF); browsing history; topic model; user interest model; topic feature
Wenfeng Li, Xiaojie Wang, Rile Hu, Jilei Tian
Center of Intelligent Science and Technology, Beijing University of Posts and Telecommunications, Bei Nokia Research Center, Beijing, China
International conference
Beijing
English
6-11
2011-09-15 (date first made available online on the Wanfang platform; does not represent the paper's publication date)