Conference Topic

STUDY ON THE SMALL FILES PROBLEM OF HADOOP

  Although Hadoop is widely used, its full potential has not yet been realized because of several issues, the small files problem being one of them. First, the paper analyzes the causes of the small files problem in Hadoop. Then, existing solutions to the small files problem are introduced, including Hadoop's own solutions and other application-specific solutions, and the advantages and disadvantages of each option are analyzed. Finally, we present two research ideas: one is to use a combination of an RDBMS and Hadoop; the other is to have the DataNode cache some metadata of the small files.

Hadoop Distributed File System (HDFS); Small Files Problem; Hadoop Archives; Sequence files; RDBMS

Xiaojun Liu, Zhengquan Xu, Xin Gu

LIESMARS, Wuhan University, Wuhan 430079, China; Huanggang Normal University, Huanggang 438000, China; LIESMARS, Wuhan University, Wuhan 430079, China

International conference

2012 2nd IEEE International Conference on Cloud Computing and Intelligence Systems (IEEE CCIS2012)

Hangzhou

English

278-281

2012-10-30 (date first made available online on the Wanfang platform; not the paper's publication date)