An Architecture for Unstructured Data Management
With the arrival of the information age, a vast amount of information is available on the Internet, and most data on the Web is unstructured. Significant data should nevertheless be organized and stored in a form suitable for future use, and the management of unstructured data remains an unsolved problem. Unstructured data such as presentations, spreadsheets, text documents, memos, images, and web pages becomes difficult to manage as its scale grows and as users bring different requirements and interests. In this paper, we propose an architecture for unstructured data management that integrates source query, data collection, and data management to address these problems. The data collection layer extracts the data of interest: existing tools extract it automatically, and data can also be added to the repository manually. The data management layer manages all collected data by classifying it, selecting the nodes on which to store it, and managing it centrally through an index. The source query layer allows users to query and retrieve diverse data through an adaptive query service and a recommendation service. Finally, we implemented a prototype system, OCourse, based on this architecture to demonstrate its feasibility and efficiency.
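The data management and source query layers described in the abstract can be illustrated with a minimal sketch. This is not the OCourse implementation: the class and function names, the extension-based classification rule, and the hash-based node selection are all assumptions made for illustration, since the abstract does not specify how classification or node selection is performed.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Item:
    """A collected piece of unstructured data (hypothetical structure)."""
    name: str
    content: str


# Assumed mapping from file extension to category; the paper's actual
# classification method is not given in the abstract.
CATEGORIES = {
    "ppt": "presentation",
    "xls": "spreadsheet",
    "txt": "text document",
    "html": "web page",
    "jpg": "image",
}


def classify(item: Item) -> str:
    """Classify an item by its file extension (illustrative rule only)."""
    ext = item.name.rsplit(".", 1)[-1].lower() if "." in item.name else ""
    return CATEGORIES.get(ext, "other")


class Repository:
    """Data management layer: classify, pick a storage node, keep a central index."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}
        self.index = {}  # item name -> (category, node name)

    def add(self, item: Item) -> str:
        category = classify(item)
        names = sorted(self.nodes)
        # Assumed node selection: a stable hash of the item name.
        node = names[zlib.crc32(item.name.encode("utf-8")) % len(names)]
        self.nodes[node][item.name] = item
        self.index[item.name] = (category, node)
        return node

    def query(self, category: str):
        """Source query layer, reduced to a lookup by category in the index."""
        return sorted(n for n, (c, _) in self.index.items() if c == category)
```

A repository built with `Repository(["node-a", "node-b"])` accepts items through `add`, whether they come from an automatic extractor or are entered manually, and `query("presentation")` returns the names of the stored presentations. The adaptive query and recommendation services mentioned in the abstract are outside the scope of this sketch.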
unstructured data; classification; storage
Yaohu Lin, Xuelian Lin
School of Economics and Management, Beihang University, Beijing, P.R. China; The Institute of Advanced Computing Technology, Beihang University, Beijing, P.R. China
International conference
Taiyuan
English
454-457
2012-12-08 (date first made available online on the Wanfang platform; not the paper's publication date)