Conference paper

Research and Improvement on Content-based Web Search Engine

  The World Wide Web contains a vast amount of information, and obtaining the relevant resources quickly and accurately through content-based search engines has become a research focus. Most current full-text web search tools, such as Lucene, a widely used open source retrieval library in the information retrieval field, are purely keyword based, which may not be sufficient for users retrieving information from the web. In this paper, we employ a method to overcome the limitations of current full-text search engines, represented by Lucene. We propose a Query Expansion and Information Retrieval approach that helps users acquire more accurate content from the web. The Query Expansion component finds expanded candidate words for the query word through WordNet, which contains synonyms in several different senses. In the Information Retrieval component, the query word and its candidate words are used together as the input of the search module to obtain the result items. Furthermore, the result items can be put into different classes based on the expansion. Experiments and their results are described in the latter part of this paper.
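The two components described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `TOY_SYNSETS` dictionary is a hypothetical stand-in for WordNet, the small in-memory inverted index stands in for Lucene, and all function names and sample documents are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical stand-in for WordNet: query word -> {sense label: [synonyms]}.
TOY_SYNSETS = {
    "car": {
        "motor vehicle": ["automobile", "auto"],
        "railway carriage": ["railcar", "carriage"],
    },
}

# Hypothetical document collection standing in for crawled web pages.
DOCS = {
    1: "a used automobile for sale",
    2: "the railcar left the station",
    3: "car rental prices",
    4: "fresh fruit market",
}


def expand_query(word, synsets=TOY_SYNSETS):
    """Query Expansion: return {sense: [candidate words]} for the query word."""
    return {sense: list(words) for sense, words in synsets.get(word, {}).items()}


def build_index(docs):
    """Build a simple inverted index mapping each token to a set of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(word, index, synsets=TOY_SYNSETS):
    """Information Retrieval: search with the query word and its candidates,
    grouping the hits into one class per sense of the expansion."""
    results = {"original": sorted(index.get(word, set()))}
    for sense, candidates in expand_query(word, synsets).items():
        hits = set()
        for candidate in candidates:
            hits |= index.get(candidate, set())
        results[sense] = sorted(hits)
    return results


index = build_index(DOCS)
grouped = search("car", index)
```

Here a purely keyword-based lookup for "car" would return only document 3, while the expanded search also reaches documents 1 and 2 and places them in separate classes according to the sense that matched.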

information retrieval; web search engine; query expansion

Zhichao Lin, Lei Sun, Xiao Liu

Information of Science and Technology Institute, East China Normal University, Shanghai, China

International conference

2012 2nd International Conference on Materials Science and Information Technology (MSIT2012)

Xi'an

English

1282-1286

2012-08-24 (date first posted online on the Wanfang platform; not the paper's publication date)