Metadata Extraction Based on Mutual Information in Digital Libraries
As a main infrastructure of Internet2, digital libraries have developed rapidly and achieved substantial results in recent years. A key problem remains: how to help users find satisfactory resources more efficiently among the abundant content held in the heterogeneous repositories of digital libraries. Metadata, a kind of structured data about data, can describe the content, semantics and services of data. As the foundation for defining and organizing the resources of a digital library, it plays a pivotal role in constructing those resources. Metadata extraction, semantic retrieval and semantic annotation are therefore challenging research tasks in automatic metadata management. Each kind of metadata can be regarded as a class, so metadata extraction amounts to classifying every document block. This paper focuses on automatic metadata extraction based on mutual information, a widely used information-theoretic measure of the stochastic dependency between discrete random variables. Metadata extraction is performed using maximum mutual information, including linear and non-linear feature transformations. Entropy is used and extended to find suitable features in digital library systems.
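To make the central quantity concrete, the following is a minimal sketch of the standard mutual information estimate for two discrete variables, I(X;Y) = sum over (x, y) of p(x,y) log2( p(x,y) / (p(x) p(y)) ), computed from paired samples. It is not the paper's implementation: the function name `mutual_information`, the binary block feature, and the labels "email" and "title" are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from paired samples of two discrete variables."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical example: a binary feature per document block
# (e.g. "block contains an @ sign") against an assumed metadata label.
feature = [1, 1, 0, 0]
label = ["email", "email", "title", "title"]
score = mutual_information(feature, label)
```

In this sketch, a feature that determines the label completely scores highest, while a feature that is statistically independent of the label scores zero, which is why such a score can be used to rank candidate features for classifying document blocks.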
Digital Library; Metadata Extraction; Mutual Information
Lizhen Liu, Guoqiang He, Xuling Shi, Hantao Song
Information Engineering College, Capital Normal University, Beijing, P.R. China; Department of Computer, Beijing Institute of Technology, Beijing, P.R. China
International conference
Kunming
English
2007-11-23 (date first posted on the Wanfang platform; not the paper's publication date)