An Improved Method of Short Text Feature Extraction Based on Words Co-occurrence
In Chinese text clustering, short text differs markedly from traditional long text, principally in its low word frequencies. As a result, traditional text feature extraction and weight calculation methods are not directly suitable for short text clustering. To solve the problem of clustering drift in short text segments, this paper proposes a feature extraction method that improves weight calculation based on word co-occurrence. Experiments show that the method achieves better performance in Chinese short text clustering than the traditional TF-IDF method.
short text; feature extraction; word co-occurrence; weight calculation
Wangzuli
Department of Network Engineering, Chengdu University of Information Technology, Chengdu 610025
International conference
Three Gorges
English
1350-1353
2012-05-18 (date first posted online on the Wanfang platform; does not represent the paper's publication date)