Application of VM-Based Computations to Speedup the Web Crawling Process on Multi-Core Processors

Abstract:

A Web crawler is an important component of a Web search engine. It demands a large amount of hardware resources to crawl data from the rapidly growing and changing Web. The crawling process should be performed continuously to keep the data up to date. This paper develops a new approach to speed up the crawling process on a multi-core processor by utilizing the concept of virtualization. In this approach, the multi-core processor is divided into a number of virtual machines (VMs), which can concurrently perform different crawling tasks on different initial data. The paper presents a description, implementation, and evaluation of a VM-based distributed Web crawler. The speedup factor achieved by the VM-based crawler over a crawler without virtualization is estimated for crawling various numbers of documents. The effect of the number of VMs on the speedup factor is also investigated.

Keywords: Web search engine; Web crawler; virtualization; virtual machines; distributed crawling; multi-core processor; processor-farm methodology

Authors: Hussein Al-Bahadili; Hamzah Qtishat

Affiliations: Faculty of Information Technology, University of Petra, Amman, Jordan; Faculty of Information Technology, Middle East University, Amman, Jordan

Conference type: International conference

Conference name: The 12th International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES 2013)

Conference location: London, UK

Conference language: English

Pages: 157-161

Online publication date: 2013-09-02 (the date the record first went online on the Wanfang platform; not the paper's publication date)
