
A Fast and Effective Framework for Lifelong Topic Model with Self-Learning Knowledge

  To discover semantically coherent topics, knowledge-based topic models have been proposed to incorporate prior knowledge into topic models. Moreover, some researchers have proposed lifelong topic models (LTM) to mine prior knowledge from topics generated from multi-domain corpora without human intervention. LTM incorporates the knowledge learned from multi-domain corpora into topic models by introducing the Generalized Polya Urn (GPU) model into Gibbs sampling. However, the GPU model is nonexchangeable, so topic inference for LTM is computationally expensive. Meanwhile, variational inference is an alternative to Gibbs sampling and tends to be faster. Variational inference is also flexible enough to infer topic models with knowledge, i.e., regularized topic models. In this paper, we propose a fast and effective framework for lifelong topic modeling, called the Regularized Lifelong Topic Model with Self-learning Knowledge (RLTM-SK), with lexical knowledge automatically learned from previous topic extractions, and design a variational inference method to estimate the posterior distributions of the hidden variables of RLTM-SK. We compare our method with 5 state-of-the-art baselines on a dataset of product reviews from 50 domains. Results show that the performance of our method is comparable to LTM and other knowledge-based topic models. Moreover, our model is consistently faster than the best baseline method, LTM.
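The GPU mechanism the abstract refers to can be illustrated in a few lines. The sketch below is a minimal Python illustration, assuming a plain topic-word count matrix and a mined must-link word list; the function name, the data layout, and the promotion weight are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

# Minimal sketch of a Generalized Polya Urn (GPU) update inside collapsed
# Gibbs sampling for a topic model. In a simple Polya urn, assigning word w
# to topic k increments only n_kw[k, w]. Under the GPU model, words related
# to w (prior knowledge mined from earlier domains, e.g. must-link pairs)
# are also promoted. This breaks exchangeability: the joint probability of
# an assignment sequence depends on its order, which is why GPU-based Gibbs
# sampling is computationally expensive compared to the standard urn.

def gpu_increment(n_kw, k, w, related, promotion=0.3):
    """Increment the count of word w in topic k, and partially promote
    words known to be related to w (illustrative promotion weight)."""
    n_kw[k, w] += 1.0
    for w2 in related.get(w, []):
        n_kw[k, w2] += promotion  # knowledge-driven extra mass

# Tiny usage example: 2 topics, vocabulary of 4 words,
# one mined must-link pair (word 0 <-> word 2).
n_kw = np.zeros((2, 4))
related = {0: [2], 2: [0]}
gpu_increment(n_kw, k=1, w=0, related=related)
print(n_kw)  # word 2 also gains mass in topic 1
```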

Variational Inference; Lifelong Topic Model; Knowledge-based Topic Model

Kang Xu, Feng Liu, Tianxing Wu, Sheng Bi, Guilin Qi

School of Computer Science and Engineering, Southeast University, Nanjing, China

Domestic conference

The 16th China National Conference on Computational Linguistics (CCL 2017) and the 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (NLP-NABD 2017)

Nanjing

English

1-12

2017-10-13 (date the paper first went online on the Wanfang platform; not the paper's publication date)