Conference Special Topic

A Rules and Statistical Learning Based Method for Chinese Patent Information Extraction

Patent documents are a kind of open scientific literature protected by law, and their abstracts typically summarize the main information concisely. Extracting and analyzing information from these abstracts can contribute to better protection of intellectual property rights and to the promotion of enterprise technological innovation. This paper focuses on patent abstracts and treats information extraction from patent documents as a short-text categorization problem. A method combining rules with statistical learning is used to annotate and extract information on patent features, composition, and usage. Experiments show that the method not only extracts these three types of information from patent abstracts, but also achieves higher accuracy than either the rule-based method alone or SVM, an efficient and commonly used statistical learning classification algorithm.

patent document; information extraction; rule-based method; statistical learning

Feng Guangpu, Chen Xu, Peng Zhiyong

Computer School, Wuhan University, Wuhan, China

International conference

The 8th National Academic Conference on Web Information Systems and Applications

Chongqing

English

114-118

2011-10-21 (date first made available online on the Wanfang platform; does not represent the paper's publication date)