Conference Topic

Webpage Segmentation based on Gomory-Hu Tree Clustering in Undirected Planar Graph

We propose a novel web page segmentation algorithm based on finding the Gomory-Hu tree in a planar graph. The algorithm first extracts visual and structural information from a web page to construct a weighted undirected graph whose vertices are the leaf nodes of the DOM tree and whose edges represent the visible positional relationships between vertices. It then partitions the graph with a Gomory-Hu tree based clustering algorithm. Experimental results show that our algorithm achieves much higher precision and recall than both VIPS and Chakrabarti et al.'s graph-theoretic algorithm, and that its running time is far lower than that of Chakrabarti et al.'s algorithm.
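The two steps named in the abstract (build a weighted undirected graph, then cluster it through its Gomory-Hu tree) can be illustrated with a minimal sketch. This is not the authors' implementation: the node names and edge weights are hypothetical, the tree is built with Gusfield's method on top of a plain Edmonds-Karp max-flow, and cutting the k-1 lightest tree edges is only one common way to cluster with a Gomory-Hu tree, assumed here because the abstract does not state the exact rule.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Symmetric capacity map for an undirected weighted graph."""
    capacity = defaultdict(dict)
    for u, v, w in edges:
        capacity[u][v] = w
        capacity[v][u] = w
    return capacity


def min_cut(capacity, s, t):
    """Edmonds-Karp max-flow; returns (cut value, nodes on the s side)."""
    residual = defaultdict(lambda: defaultdict(float))
    for u in capacity:
        for v, w in capacity[u].items():
            residual[u][v] = w
    flow = 0.0
    while True:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No path left: the reachable nodes form the s side of a min cut.
            return flow, set(parent)
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def gomory_hu_tree(capacity):
    """Gusfield's method; returns tree edges as (child, parent, weight)."""
    nodes = list(capacity)
    root = nodes[0]
    parent = {v: root for v in nodes[1:]}
    weight = {}
    for s in nodes[1:]:
        t = parent[s]
        value, side = min_cut(capacity, s, t)
        weight[s] = value
        # Nodes on the s side that hung off t now hang off s.
        for v in nodes[1:]:
            if v != s and v in side and parent[v] == t:
                parent[v] = s
        # If t's own parent fell on the s side, swap the roles of s and t.
        if t != root and parent[t] in side:
            parent[s], parent[t] = parent[t], s
            weight[s], weight[t] = weight[t], value
    return [(v, parent[v], weight[v]) for v in nodes[1:]]


def cluster(capacity, k):
    """Drop the k-1 lightest tree edges; the components are the clusters."""
    kept = sorted(gomory_hu_tree(capacity), key=lambda e: e[2])[k - 1:]
    adjacent = {v: set() for v in capacity}
    for u, v, _ in kept:
        adjacent[u].add(v)
        adjacent[v].add(u)
    clusters, seen = [], set()
    for start in capacity:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adjacent[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical DOM leaf nodes: two tightly linked blocks joined by one weak edge.
page_graph = build_graph([
    ("nav1", "nav2", 5), ("nav2", "nav3", 5), ("nav1", "nav3", 5),
    ("body1", "body2", 5), ("body2", "body3", 5), ("body1", "body3", 5),
    ("nav3", "body1", 1),
])
clusters = cluster(page_graph, k=2)
print(clusters)
```

In this toy graph the only light tree edge is the one separating the "nav" leaves from the "body" leaves, so asking for two clusters recovers the two blocks. How edge weights would be derived from a real page's visual layout is left open here, since the abstract does not specify it.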

Xinyue Liu, Xianchao Zhang, Ye Tian, Hongfei Lin

School of Electronic and Information Engineering, Dalian University of Technology, Dalian, China, 116024; School of Software, Dalian University of Technology, Dalian, China, 116620

International conference

The Second International Symposium on Parallel Architectures, Algorithms and Programming

Nanning

English

192-205

2009-12-04 (date first made available online on the Wanfang platform; this does not represent the paper's publication date)