Conference Special Topic

An Efficient Wrapper for Web Data Extraction and its Application

A Web wrapper extracts data from given Web sources according to their corresponding extraction rules. Wrapper design is a key technology for Web information extraction and integration. This paper describes the design and implementation of a Web wrapper based on a pre-defined schema. It then validates the wrapper by extracting data from the new-book information pages of several publishing companies and analyses the extraction results. We find that the wrapper extracts the data from the Web sources accurately, and conclude that the proposed wrapper is feasible, efficient, and maintainable. It will be applied to Wrapper/Mediator-based Web data integration, on which we rely to develop a Web application for book information integration and querying.

Wrapper; extraction rule; information extraction; Web data integration; new book information

Suzhi Zhang, Peizhong Shi

College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, China

International conference

2009 4th International Conference on Computer Science & Education

Nanjing

English

1245-1250

2009-07-25 (date first made available online on the Wanfang platform; not the paper's publication date)