Conference special topic

Incorporating Entities in News Topic Modeling

  News articles express information by concentrating on named entities such as who, when, and where. However, extracting the relationships among entities, words, and topics from a large number of news articles is nontrivial. Topic models such as Latent Dirichlet Allocation have been widely applied to mine hidden topics in text analysis and have achieved considerable performance. However, they cannot explicitly show the relationship between words and entities. In this paper, we propose a generative model, the Entity-Centered Topic Model (ECTM), to summarize the correlation among entities, words, and topics by taking each entity topic as a mixture of word topics. Experiments on real news data sets show that our model achieves lower perplexity and better entity clustering than the state-of-the-art entity topic model (CorrLDA2). We also present an analysis of the results of ECTM and further compare it with CorrLDA2.

news; named entity; generative entity topic models

Linmei Hu, Juanzi Li, Zhihui Li, Chao Shao, Zhixing Li

Dept. of Computer Sci. and Tech., Tsinghua University, China; Dept. of Computer Sci. and Tech., Beijing Information Science and Technology University, China

International conference

Second CCF Conference, NLPCC 2013 (Second Conference on Natural Language Processing and Chinese Computing)

Chongqing

English

139-150

2013-11-15 (date first posted online on the Wanfang platform; does not represent the paper's publication date)