DATA TRANSMISSION BETWEEN RELATIONAL DATABASE AND HDFS
Data exchange functions provide a set of methods for transferring data from one storage system to another. HDFS is the distributed file system implemented by Hadoop; it is highly fault-tolerant, provides high-throughput access to application data, and is suitable for applications with large data sets. Traditionally, large data sets are stored on FTP servers or in SQL databases. Using the Hadoop distributed framework for large-scale computation therefore requires transferring data from FTP servers or SQL databases to HDFS. This paper discusses parallel data exchange between SQL databases and HDFS, examines the performance of Hadoop's data exchange functions, DBInputFormat and DBOutputFormat, and puts forward some strategies to improve that performance.
Hadoop; HDFS; RDBMS; Data exchange; DBInputFormat/DBOutputFormat
Bin Wu, Yu Jia, Xinxin Ge
Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
International conference
Hangzhou
English
419-423
2012-10-30 (date first made available online on the Wanfang platform; does not indicate the paper's publication date)