STUDY ON THE SMALL FILES PROBLEM OF HADOOP
Although Hadoop is widely used, its full potential has not yet been realized because of several issues, the small files problem being one of them. First, the paper analyzes the causes of the small files problem in Hadoop. Then, current solutions to the small files problem are introduced, including Hadoop's own solutions and other application-specific solutions, and the advantages and disadvantages of each option are analyzed. Finally, we present two research ideas: one is to use a combination of an RDBMS and Hadoop; the other is to have the DataNode cache some metadata of the small files.
Hadoop Distributed File System (HDFS); Small Files Problem; Hadoop Archives; Sequence files; RDBMS
Xiaojun Liu, Zhengquan Xu, Xin Gu
LIESMARS, Wuhan University, Wuhan 430079, China; Huanggang Normal University, Huanggang 438000, China LIESMARS, Wuhan University, Wuhan 430079, China
International conference
Hangzhou
English
278-281
2012-10-30 (date first posted online on the Wanfang platform; does not represent the paper's publication date)