Conference topic

Chinese Keywords Extraction based on Association Rule and Improved TF*IDF

To improve the accuracy and efficiency of Chinese keyword extraction, this paper presents a method that combines association rules with an improved TF*IDF algorithm. First, it constructs foreground and background documents, segments them with the forward (positive) maximum matching algorithm, represents the segmented feature items in a vector space model, mines association rules, and selects feature items. It then uses the improved TF*IDF to calculate the weight of each feature item, sets a reasonable threshold, and extracts the keywords. Text length, the length and location of feature items, and effective compound words can all affect keyword extraction. These issues are addressed by standardizing text length, assigning corresponding values to the length and location of feature items, and identifying compound words effectively through association rules. The experimental results show that, compared with the conventional keyword extraction method, this method improves the accuracy and efficiency of extraction and can identify compound words.
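The two building blocks named in the abstract, maximum-matching segmentation and threshold-based TF*IDF selection, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the improved weighting (text-length standardization, length and location values) or the association-rule mining step, so the sketch uses plain TF*IDF with a common smoothed IDF. The function names, the toy dictionary, and the threshold are all hypothetical.

```python
import math
from collections import Counter


def forward_max_match(text, vocab, max_len=4):
    """Greedy segmentation: at each position take the longest dictionary word.

    Falls back to a single character when no dictionary word matches.
    """
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def tfidf_keywords(doc_tokens, corpus, threshold):
    """Return (term, weight) pairs whose TF*IDF weight reaches the threshold.

    Uses standard TF*IDF with a smoothed IDF (an assumption); the paper's
    improved weighting is not reproduced here.
    """
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus)
    scored = []
    for term, count in counts.items():
        df = sum(1 for tokens in corpus if term in tokens)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weight = (count / total) * idf
        if weight >= threshold:
            scored.append((term, weight))
    return sorted(scored, key=lambda pair: -pair[1])


# Toy dictionary (hypothetical) containing one compound word.
vocab = {"关联", "规则", "关联规则", "提取", "关键词"}
tokens = forward_max_match("关联规则提取关键词", vocab)
# tokens == ["关联规则", "提取", "关键词"]
```

Because the longest match wins, the compound "关联规则" is kept whole only when it is already in the dictionary; according to the abstract, the paper instead relies on association rules to identify such compounds.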

keywords extraction; association rules; TF*IDF

Xu-Simon Shouning Qu Wang Qin

School of Information Science and Engineering, University of Jinan, Jinan, China; Information Network Center, University of Jinan, Jinan, China

International conference

2011 3rd International Conference on Computer and Network Technology (ICCNT 2011) (2011 3rd IEEE International Conference on Computer and Network Technology)

Taiyuan

English

146-151

2011-02-26 (date first posted online on the Wanfang platform; not the paper's publication date)