Semantic Annotation of Web Objects Using Constrained Conditional Random Fields
Semantic annotation of Web objects is a key problem for Web information extraction. The Web contains an abundance of useful semi-structured information about real-world objects, and empirical study shows that Web information about objects of the same type exhibits strong sequence characteristics across different Web sites. Conditional Random Fields (CRFs) are the state-of-the-art approach for exploiting such sequence characteristics to improve labeling. However, previous CRFs cannot efficiently handle the variety of logical constraints that hold between Web object elements. This paper presents a Constrained Conditional Random Fields (Constrained CRFs) model for semantic annotation of Web objects. The model incorporates a novel inference procedure based on integer linear programming and extends CRFs to naturally and efficiently support all kinds of logical constraints. Experimental results on a large amount of real-world data collected from diverse domains show that the proposed approach can significantly improve the semantic annotation accuracy of Web objects.
Yongquan Dong, Qingzhong Li, Yongqing Zheng, Xiaoyang Xu, Yongxin Zhang
School of Computer Science and Technology, Shandong University, Jinan, China
International conference
11th International Conference on Web-Age Information Management (WAIM 2010)
Jiuzhaigou
English
28-39
2010-07-14 (date first posted online on the Wanfang platform; not the paper's publication date)