Reputation-based Contents Crawling in Web Archiving System
The size of web archives is increasing exponentially, and many national libraries are making efforts to preserve born-digital scientific, artistic, and cultural content. However, crawling and storing such a huge volume of digital information raises social, legal, and technical problems that are very hard to resolve. In this paper, from the viewpoint of long-term preservation of digital content with a good reputation for trustworthiness, uniqueness, and value, we discuss strategies for preserving the monotonically increasing digital content on web servers. Experimental results with our reputation model indicate that it makes it possible to crawl socially valuable content for archiving.
Web Archive; Web Crawling; Reputation Management
Hiroyuki Kawano
Nanzan University, Aichi 4890863
International conference
The Seventh International Symposium (ISORA08) (Seventh International Symposium on Operations Research and Its Applications)
Lijiang, Yunnan
English
317-324
2008-10-31 (date first posted online on the Wanfang platform; does not represent the paper's publication date)