MapReduce based Parallel Latent Dirichlet Allocation
Latent Dirichlet Allocation (LDA) has been widely applied to text mining. LDA is a probabilistic topic model that represents each document as a probability distribution over topics. This paper presents a parallel LDA based on the MapReduce model, which has become a major programming model for data-intensive applications. By distributing data among a number of computer nodes, the computation in LDA can be carried out in parallel.
MapReduce; job scheduling; data locality
Fan Tang; Yang Liu; Zelong Liu; Maozhen Li; Man Qi
State Grid Sichuan Electric Power Research Institute, Chengdu, China; School of Electrical Engineering and Information Systems, Sichuan University, Chengdu, 610065, China; School of Engineering and Design, Brunel University, Uxbridge, UB8 3PH, UK; Department of Computing, Canterbury Christ Church University, Canterbury, Kent, CT1 1QU, UK
International conference
Xiamen
English
756-760
2014-08-19 (date first posted online on the Wanfang platform; not the paper's publication date)