Incorporating Entities in News Topic Modeling
News articles express information by concentrating on named entities such as who, when, and where. Yet extracting the relationships among entities, words, and topics from a large number of news articles is nontrivial. Topic models such as Latent Dirichlet Allocation (LDA) have been widely applied to mine hidden topics in text analysis and have achieved considerable performance. However, they cannot explicitly show the relationship between words and entities. In this paper, we propose a generative model, the Entity-Centered Topic Model (ECTM), to summarize the correlation among entities, words, and topics by taking each entity topic as a mixture of word topics. Experiments on real news data sets show that our model achieves lower perplexity and better entity clustering than the state-of-the-art entity topic model CorrLDA2. We also present an analysis of the ECTM results and further compare it with CorrLDA2.
news; named entity; generative entity topic models
Linmei Hu, Juanzi Li, Zhihui Li, Chao Shao, Zhixing Li
Dept. of Computer Sci. and Tech., Tsinghua University, China; Dept. of Computer Sci. and Tech., Beijing Information Science and Technology University, China
International conference
Second CCF Conference, NLPCC 2013 (2nd Conference on Natural Language Processing and Chinese Computing)
Chongqing
English
139-150
2013-11-15 (date first posted online on the Wanfang platform; not the paper's publication date)