Research on Method of Extracting Chinese Domain Terms Based on Rough and Fuzzy Clustering

Abstract:

Automatic extraction of domain terms is the basis of domain ontology learning. General linguistic resources such as WordNet and HowNet can be used to extract only some of the domain terms from unstructured domain texts. In this paper, we first extract a partial set of terms by using HowNet to calculate the domain relatedness between words. The extracted terms are then semantically clustered with a fuzzy c-means clustering algorithm based on properties of rough sets. Finally, more domain terms are extracted from unknown words according to the clustering results, using a machine learning method. The experimental results show that the method can not only extract as many domain terms as possible but also ensure higher precision.
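To make the clustering step concrete, the following is a minimal sketch of standard fuzzy c-means in Python with NumPy. It is not the authors' rough-set-based variant, whose details the abstract does not give. The function name `fuzzy_c_means`, its parameters, and the assumption that each term is already represented as a numeric feature vector (for example, one derived from HowNet) are illustrative only.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Standard fuzzy c-means (hypothetical helper, not the paper's code).

    X : (n, d) array of term feature vectors (assumed representation).
    c : number of clusters.
    m : fuzzifier, m > 1; larger values give softer memberships.
    Returns (centers, U), where U[i, k] is the membership of term i
    in cluster k and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Cluster centers: means of the data weighted by U**m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero

        # Membership update: u_ik proportional to d_ik ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U
```

Calling `fuzzy_c_means(X, c=2)` on a small array of term vectors returns the cluster centers and a soft membership matrix; taking `U.argmax(axis=1)` gives a hard assignment if one is needed. A rough-set extension would typically also separate terms that clearly belong to one cluster from those on a boundary between clusters, but how the paper does this is not stated here.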

Authors: Jie Liu, Xiao-zhong Fan, Kang Chen

Affiliation: School of Computer Science and Technology, Beijing Institute of Technology, Beijing, P.R. China 1000; School of Computer Science and Technology, Beijing Institute of Technology, Beijing, P.R. China 10008

Conference type: International conference

Conference name: Third International Conference on Semantics, Knowledge, and Grid (SKG 2007)

Conference location: Xi'an

Conference language: English

Online publication date: 2007-10-29 (date first posted on the Wanfang platform; not the paper's publication date)
