
A Fast and Effective Framework for Lifelong Topic Model with Self-Learning Knowledge

  To discover semantically coherent topics, knowledge-based topic models have been proposed to incorporate prior knowledge into topic models. Moreover, some researchers have proposed lifelong topic models (LTM) to mine prior knowledge from topics generated from multi-domain corpora without human intervention. LTM incorporates the knowledge learned from multi-domain corpora into topic models by introducing the Generalized Polya Urn (GPU) model into Gibbs sampling. However, the GPU model is nonexchangeable, so topic inference for LTM is computationally expensive. Meanwhile, variational inference is an alternative to Gibbs sampling and tends to be faster. Variational inference is also flexible enough to infer topic models with knowledge, i.e., regularized topic models. In this paper, we propose a fast and effective framework for lifelong topic modeling, called the Regularized Lifelong Topic Model with Self-learning Knowledge (RLTM-SK), with lexical knowledge automatically learned from previous topic extractions, and design a variational inference method to estimate the posterior distributions of the hidden variables of RLTM-SK. We compare our method with 5 state-of-the-art baselines on a dataset of product reviews from 50 domains. Results show that the performance of our method is comparable to LTM and other knowledge-based topic models. Moreover, our model is consistently faster than the best baseline method, LTM.
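The GPU mechanism the abstract refers to can be illustrated in a few lines. The sketch below is a minimal Python illustration, assuming a plain topic-word count matrix and a mined must-link word list; the function name, the data layout, and the promotion weight are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

# Minimal sketch of a Generalized Polya Urn (GPU) update inside collapsed
# Gibbs sampling for a topic model. In a simple Polya urn, assigning word w
# to topic k increments only n_kw[k, w]. Under the GPU model, words related
# to w (prior knowledge mined from earlier domains, e.g. must-link pairs)
# are also promoted. This breaks exchangeability: the joint probability of
# an assignment sequence depends on its order, which is why GPU-based Gibbs
# sampling is computationally expensive compared to the standard urn.

def gpu_increment(n_kw, k, w, related, promotion=0.3):
    """Increment the count of word w in topic k, and partially promote
    words known to be related to w (illustrative promotion weight)."""
    n_kw[k, w] += 1.0
    for w2 in related.get(w, []):
        n_kw[k, w2] += promotion  # knowledge-driven extra mass

# Tiny usage example: 2 topics, vocabulary of 4 words,
# one mined must-link pair (word 0 <-> word 2).
n_kw = np.zeros((2, 4))
related = {0: [2], 2: [0]}
gpu_increment(n_kw, k=1, w=0, related=related)
print(n_kw)  # word 2 also gains mass in topic 1
```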

Variational Inference; Lifelong Topic Model; Knowledge-based Topic Model

Kang Xu, Feng Liu, Tianxing Wu, Sheng Bi, Guilin Qi

School of Computer Science and Engineering, Southeast University, Nanjing, China

Domestic conference

The 16th China National Conference on Computational Linguistics (CCL 2017) and the 5th International Symposium on Natural Language Processing Based on Naturally Annotated Big Data (NLP-NABD 2017)

Nanjing

English

1-12

2017-10-13 (date the paper first went online on the Wanfang platform; not the paper's publication date)