Improve Unsupervised Automatic Keyword Extraction via Word Embeddings

Abstract:

The task of automatic keyword extraction can be described as follows: given a document, the extractor is expected to automatically select several words that best represent its content. Traditional methods in this area do not consider the semantic relationships between words. In this paper, we propose a keyword extraction approach that takes semantic association into account. We bring word embeddings into this task based on the observation that authors often use different expressions with the same meaning to make the content less tedious, particularly in Chinese. With word embeddings, the distance between two words can be calculated easily, which makes it convenient to find semantic relationships between words. Experimental results show that, compared with traditional statistical methods, our approach is more reasonable and captures more of the semantic information in the document.
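The abstract's point about measuring the distance between two words can be illustrated with a minimal sketch. It assumes cosine similarity as the measure and uses hypothetical three-dimensional vectors invented for this example; the abstract does not say which distance, embedding model, or vocabulary the authors used, and the names `cosine_similarity`, `most_similar`, and `toy_embeddings` are illustrative only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, made up for illustration; real embeddings are
# learned from a corpus and typically have hundreds of dimensions.
toy_embeddings = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}


def most_similar(word, embeddings):
    """Return the other vocabulary word whose vector is closest to `word`."""
    target = embeddings[word]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in embeddings.items()
        if other != word
    ]
    return max(scored)[1]
```

Under these toy vectors, `most_similar("car", toy_embeddings)` picks "automobile" over "banana", which is the kind of same-meaning, different-expression pair the abstract describes.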

Keywords: automatic keyword extraction; word embeddings; semantic extension

Authors: Fei-Jia WU, Tian-Fang YAO

Affiliation: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University

Conference type: Domestic conference

Conference name: 2014 International Conference on Computer Science and Software Engineering

Conference location: Hangzhou

Conference language: English

Pages: 1-6

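Online publication date: 2014-10-18 (the date the record first went online on the Wanfang platform; it does not represent the paper's publication date)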
