Conference Topic

Extract Deep Web-Detail Pages with Simple Tree Match

In this paper, we present a method for extracting data from Deep Web detail pages. The method uses Simple Tree Matching (STM) to compute the maximum match value between two trees and the Hungarian algorithm to trace the result of the STM computation; a tree-merge step then generates the wrapper. Finally, we use term frequency to optimize the wrapper. In our experiments, we use the wrapper to extract data; the results show that, compared with other methods, our method is feasible and effective.
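To make the tree-matching step concrete, the following is a minimal sketch of the commonly described Simple Tree Matching recurrence: two subtrees score 0 if their root labels differ, and otherwise 1 plus the best order-preserving matching of their children, found by dynamic programming. The `Node` class, the function name, and the example trees are assumptions made for illustration; the record does not give the authors' implementation, and the Hungarian tracing, tree merge, and term-frequency steps are not shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical DOM-like node: a tag label plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def simple_tree_match(a: Node, b: Node) -> int:
    """Return the size of the maximum order-preserving matching of a and b."""
    # Subtrees whose roots differ contribute nothing.
    if a.label != b.label:
        return 0
    m, n = len(a.children), len(b.children)
    # best[i][j] = best score using the first i children of a
    # and the first j children of b.
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            pair = simple_tree_match(a.children[i - 1], b.children[j - 1])
            best[i][j] = max(
                best[i][j - 1],            # skip child j of b
                best[i - 1][j],            # skip child i of a
                best[i - 1][j - 1] + pair  # pair child i with child j
            )
    # Add 1 for the matched roots themselves.
    return best[m][n] + 1


# Two small, made-up page fragments that differ in row layout.
page_a = Node("table", [
    Node("tr", [Node("td"), Node("td")]),
    Node("tr", [Node("td")]),
])
page_b = Node("table", [
    Node("tr", [Node("td")]),
    Node("tr", [Node("td"), Node("td")]),
])

print(simple_tree_match(page_a, page_b))  # 5
```

Matching a tree against itself returns its node count (6 for `page_a`), which gives a natural upper bound when normalizing the score into a similarity.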

STM; Hungarian Algorithm; Web Extract; Deep Web

Wei ZHANG, Ye DENG, Ranran DU, Qiuhong Wang

Department of Computer Science and Technology, Ocean University of China, Qingdao, Shandong Province, China

International Conference

2011 6th IEEE Joint International Information Technology and Artificial Intelligence Conference (IEEE ITAIC 2011)

Chongqing

English

250-254

2011-08-20 (date first made available online on the Wanfang platform; not the paper's publication date)