MICRO-BLOG CATEGORY BASED ON FEATURE-WORDS CATEGORY DISPERSION
Micro-blog information classification is an important preprocessing step in micro-blog data processing. Because of the unique properties of micro-blog text, traditional classification methods have limitations when applied to it. Considering that a single micro-blog text is brief, contains few effective feature-words, and is relatively colloquial in content, this paper proposes using similar words and collocations to extend the feature-words of the text, reducing the possibility of feature loss. For feature selection and weight calculation, it proposes a text classification method based on the category dispersion and dispersion degree of feature-words. Experiments show that the proposed method achieves good results in classifying micro-blog text and is better suited to micro-blog text classification scenarios.
Micro-blog text classification; Term weight; Category dispersed; Feature-word extension
Yingyou Chen, Qing Wu
Department of Computer, Hangzhou Dianzi University, Hangzhou 310018, China
International conference
Hangzhou
English
134-137
2012-10-30 (date first made available online on the Wanfang platform; not the paper's publication date)