Improve Unsupervised Automatic Keyword Extraction via Word Embeddings
The task of automatic keyword extraction can be described as follows: given a document, the extractor is expected to automatically pick out several words that best represent its content. Traditional methods in this area do not consider the semantic relationships between words. In this paper, we propose a keyword extraction approach that takes semantic association into account. We bring word embeddings into this task based on the observation that authors often use different expressions with the same meaning to make their writing less tedious, particularly in Chinese. With word embeddings, the distance between two words can be calculated easily, which makes it convenient to find the semantic relationships between words. Experimental results show that our approach is more reasonable and captures more of the semantic information in a document than traditional statistical methods.
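As a minimal sketch of the distance calculation mentioned in the abstract, the following assumes cosine similarity as the measure; the abstract does not say which measure the authors use. The vectors, the `EMBEDDINGS` table, and the helper names are hypothetical and chosen only for illustration.

```python
import math

# Toy 3-dimensional vectors invented for illustration only; real word
# embeddings come from a trained model and have many more dimensions.
EMBEDDINGS = {
    "car": [0.90, 0.10, 0.00],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.00, 0.20, 0.95],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to `word` (assumed measure)."""
    target = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine_similarity(target, embeddings[w]))


if __name__ == "__main__":
    print(most_similar("car", EMBEDDINGS))
```

Under this assumption, two expressions with the same meaning, such as "car" and "automobile" in the toy table, score closer to each other than to an unrelated word, which is the kind of semantic relationship the abstract refers to.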
Automatic keyword extraction; Word embeddings; Semantic extension
Fei-Jia WU, Tian-Fang YAO
School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University
Domestic conference
Hangzhou
English
1-6
2014-10-18 (date first made available online on the Wanfang platform; not the paper's publication date)