Mining micro-blogging users interest features via fingerprint generation
Micro-blogging is now widely used as a social network service for communication and information sharing, so mining the behavior features of micro-blogging users is important in both the economic and social fields. This paper proposes a framework for analyzing users' interest features. After data cleaning, word segmentation, POS (part-of-speech) filtering, and synonym merging, the keywords, called terms, are extracted from all the tweets posted by a typical user in 2011. A VSM (vector space model) is then used to generate the feature vector of the tweets from these terms. Furthermore, a k-bit binary code, called a fingerprint, is generated from the high-dimensional feature vector of the tweets using the Simhash algorithm. The interest features of micro-blogging users and their change patterns can be detected by analyzing the fingerprint sequences and the distance between each pair of adjacent fingerprints. Taking Sina micro-blogging as the background, a series of experiments is conducted to demonstrate the effectiveness of the algorithms.
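The fingerprinting step can be illustrated with a minimal sketch. It is not the authors' implementation: the MD5-based term hash, the use of raw term frequency as the VSM weight, the choice of k = 64, the Hamming distance as the distance measure, and the sample term lists are all assumptions made for illustration only.

```python
import hashlib
from collections import Counter


def simhash(weights, k=64):
    """Return a k-bit Simhash fingerprint for a {term: weight} mapping.

    Assumption: each term is hashed with MD5 and truncated to k bits,
    so k must not exceed 128.
    """
    totals = [0.0] * k
    for term, weight in weights.items():
        digest = hashlib.md5(term.encode("utf-8")).digest()
        h = int.from_bytes(digest, "big") & ((1 << k) - 1)
        for i in range(k):
            # Add the weight where the hash bit is 1, subtract it where it is 0.
            if (h >> i) & 1:
                totals[i] += weight
            else:
                totals[i] -= weight
    fingerprint = 0
    for i in range(k):
        # A positive total sets the corresponding fingerprint bit.
        if totals[i] > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# Hypothetical term lists for three consecutive time periods of one user.
periods = [
    ["football", "match", "goal", "league"],
    ["football", "match", "goal", "league"],
    ["movie", "actor", "cinema", "premiere"],
]

# Assumption: raw term frequency stands in for the VSM term weight.
fingerprints = [simhash(Counter(terms)) for terms in periods]

# Distance between each pair of adjacent fingerprints in the sequence.
distances = [hamming(a, b) for a, b in zip(fingerprints, fingerprints[1:])]
```

In this sketch, a small distance between adjacent fingerprints would suggest stable interests, while a large one would suggest a shift in topics.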
micro-blogging; interest feature; tweet; fingerprint
Dong Liu, Quanyuan Wu, Weihong Han
School of Computer, National University of Defense Technology, Changsha, China
International conference
Hangzhou
English
1470-1473
2013-03-22 (date first posted online on the Wanfang platform; not the publication date of the paper)