A Rules and Statistical Learning Based Method for Chinese Patent Information Extraction

Abstract:

Patent documents are a kind of open scientific literature protected by law, and their abstracts often concisely summarize the main information. Extracting and analyzing information from these abstracts can contribute to better protection of intellectual property rights and promote enterprise technological innovation. This paper focuses on patent abstracts and treats information extraction from patent documents as a short-text categorization problem. A method combining rules and statistical learning is used to annotate and extract information about a patent's features, composition, and usage. Experiments show that the method can extract these three types of information from patent abstracts and achieves higher accuracy than either the rule-based method alone or an SVM, an efficient and commonly used statistical learning classifier.
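The abstract does not spell out the rules or how they are combined with the learner, so the following is only a minimal sketch of one plausible arrangement: cue-phrase rules are tried first, and a statistical classifier labels the sentences no rule covers. Everything specific here is an assumption made for illustration: the English cue phrases in `RULES`, the toy training sentences, and the small Naive Bayes classifier, which stands in for the statistical learner and is not the authors' model (the paper itself works on Chinese text and compares against an SVM).

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical cue phrases; the paper's actual rules are not given in the abstract.
RULES = [
    (re.compile(r"\b(is used for|used to|suitable for)\b"), "usage"),
    (re.compile(r"\b(comprises|consists of|is composed of)\b"), "composition"),
    (re.compile(r"\b(has the advantages? of|is characterized by)\b"), "feature"),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n / total)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def classify(sentence, model):
    """Apply the rules first; fall back to the learned model if none fires."""
    lowered = sentence.lower()
    for pattern, label in RULES:
        if pattern.search(lowered):
            return label
    return model.predict(sentence)


# Toy training data, invented for this sketch.
model = NaiveBayes().fit(
    [
        "the utility model has simple structure and low cost",
        "the device includes a motor a pump and a valve",
        "the invention can be applied to water treatment",
    ],
    ["feature", "composition", "usage"],
)

rule_hit = classify("The coating is used for corrosion protection.", model)
fallback = classify("It includes a pump and a valve.", model)
```

In this arrangement a rule match always overrides the learned model, which suits high-precision cue phrases; other combinations (for example, feeding rule matches to the classifier as features) are equally consistent with the abstract.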

Keywords: patent document; information extraction; rule-based method; statistical learning

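Authors: Feng Guangpu, Chen Xu, Peng Zhiyong

Affiliation: Computer School, Wuhan University, Wuhan, China

Conference type: International conference

Conference name: 8th National Conference on Web Information Systems and Applications (第8届全国web信息系统及应用学术会议)

Conference location: Chongqing

Conference language: English

Pages: 114-118

Online publication date: 2011-10-21 (date first made available online on the Wanfang platform; this is not the paper's publication date)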
