Conference Topic

An Improved Method of Short Text Feature Extraction Based on Words Co-occurrence

In Chinese text clustering, short text differs markedly from traditional long text, principally in its low word frequencies. As a result, traditional text feature extraction and weight calculation methods are not directly suitable for short-text clustering. To solve the problem of clustering drift in short text segments, this paper proposes a feature extraction method that improves weight calculation based on word co-occurrence. Experiments show that the method achieves better performance in Chinese short-text clustering than the traditional TF-IDF method.
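The abstract does not give the paper's weighting formula, so the following is only a minimal sketch of the general idea, not the author's method. It computes the classic TF-IDF baseline, counts word co-occurrence across texts, and then applies a hypothetical boost (the function `cooccurrence_boosted` and its parameter `alpha` are assumptions made for illustration). Texts are assumed to be already segmented into word lists.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf(docs):
    """Classic TF-IDF: (count / text length) * log(N / document frequency)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    weights = []
    for d in docs:
        tf = Counter(d)
        weights.append(
            {t: (c / len(d)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights


def cooccurrence(docs):
    """For each unordered word pair, count the texts that contain both words."""
    pairs = Counter()
    for d in docs:
        for a, b in combinations(sorted(set(d)), 2):
            pairs[(a, b)] += 1
    return pairs


def cooccurrence_boosted(docs, alpha=0.5):
    """Hypothetical scheme (an assumption, not taken from the paper):
    scale each TF-IDF weight by 1 + alpha * s, where s is the mean share of
    the texts containing the word that also contain each of its neighbours
    in the current text.
    """
    df = Counter(t for d in docs for t in set(d))
    pairs = cooccurrence(docs)
    boosted = []
    for d, base in zip(docs, tfidf(docs)):
        words = set(d)
        row = {}
        for t in words:
            others = words - {t}
            if others:
                s = sum(
                    pairs[tuple(sorted((t, u)))] / df[t] for u in others
                ) / len(others)
            else:
                s = 0.0  # a one-word text has no co-occurrence evidence
            row[t] = base[t] * (1 + alpha * s)
        boosted.append(row)
    return boosted


# Toy corpus of pre-segmented short texts (illustrative data only).
docs = [
    ["cheap", "flight", "ticket"],
    ["flight", "ticket", "booking"],
    ["football", "match", "ticket"],
]
weights = cooccurrence_boosted(docs)
```

In this toy corpus every word occurs at most once per text, so the term-frequency factor carries almost no information and the weights are driven by document frequency alone. That is the short-text limitation the abstract attributes to traditional weighting, and co-occurrence is one extra signal that can be added on top.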

short text; feature extraction; word co-occurrence; weight calculation

Wangzuli

Department of Network Engineering, Chengdu University of Information Technology, Chengdu 610025

International conference

2012 International Conference on Electric Technology and Civil Engineering (ICETCE 2012)

Three Gorges

English

1350-1353

2012-05-18 (date first posted online on the Wanfang platform; not the paper's publication date)