Extract Deep Web-Detail Pages with Simple Tree Match
In this paper, we present a method for extracting data from Deep Web detail pages. The method uses Simple Tree Match (STM) to compute the maximum matching value between two trees and the Hungarian algorithm to trace the result of the STM computation; a tree-merge method then generates the wrapper. Finally, we use term frequency to optimize the wrapper. In our experiments, we use the wrapper to extract data; the results show that, compared with other methods, our method is feasible and effective.
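As an illustration of the matching step named in the abstract, the following is a minimal sketch of Simple Tree Match in its commonly described dynamic-programming form. It is not taken from the paper: the `(tag, children)` tuple representation and the function name `simple_tree_match` are assumptions made for this example, and the Hungarian-algorithm tracing, tree merging, and term-frequency steps are not shown.

```python
from typing import List, Tuple

# Hypothetical node representation (an assumption for this sketch):
# a node is a (tag, children) pair, where children is a list of nodes.
Node = Tuple[str, List["Node"]]


def simple_tree_match(a: Node, b: Node) -> int:
    """Return the size of the maximum matching between ordered trees a and b."""
    # Trees whose roots carry different labels do not match at all.
    if a[0] != b[0]:
        return 0

    kids_a, kids_b = a[1], b[1]
    m, n = len(kids_a), len(kids_b)

    # dp[i][j] = best matching between the first i subtrees of a
    # and the first j subtrees of b, preserving sibling order.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            w = simple_tree_match(kids_a[i - 1], kids_b[j - 1])
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], dp[i - 1][j - 1] + w)

    # Add 1 for the matched pair of roots.
    return dp[m][n] + 1


# Two small page skeletons that differ by one extra <span> node.
page_a: Node = ("html", [("body", [("div", []), ("p", [])])])
page_b: Node = ("html", [("body", [("div", []), ("span", []), ("p", [])])])
score = simple_tree_match(page_a, page_b)
```

Here `score` counts the matched node pairs (html, body, div, p), so two detail pages generated from the same template are expected to score high relative to their size.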
STM; Hungarian Algorithm; Web Extract; Deep Web
Wei ZHANG, Ye DENG, Ranran DU, Qiuhong WANG
Department of Computer Science and Technology, Ocean University of China, Qingdao, Shandong Province, China
International conference
Chongqing
English
250-254
2011-08-20 (date first made available online on the Wanfang platform; does not indicate the paper's publication date)