A Rules and Statistical Learning Based Method for Chinese Patent Information Extraction
Patent documents are a kind of open scientific literature protected by law, and their abstracts usually summarize the main information concisely. Extracting and analyzing information from these abstracts can contribute to better protection of intellectual property rights and to the promotion of enterprise technological innovation. This paper focuses on patent abstracts and treats information extraction from patent documents as a short-text categorization problem. A method combining rules and statistical learning is used to annotate and extract three types of information: patent features, composition, and usage. Experiments show that the method not only extracts these three types of information from patent abstracts, but also achieves higher accuracy than either the rules-based method alone or SVM, an efficient and commonly used statistical learning classification algorithm.
patent document; information extraction; rules-based method; statistical learning
Feng Guangpu, Chen Xu, Peng Zhiyong
Computer School, Wuhan University, Wuhan, China
International conference
Chongqing
English
114-118
2011-10-21 (date first posted online on the Wanfang platform; not the paper's publication date)