Conference Topic

Improvement of TF-IDF Algorithm Based on Hadoop Framework

  The TF-IDF algorithm is widely used in search engines, text similarity computation, web data mining, and similar applications. These applications often have to process massive amounts of data, so computing TF-IDF quickly and efficiently is very important. In this paper, we present a TF-IDF algorithm based on the Hadoop framework. Experiments show that, for massive-data computation, the new method using the Hadoop framework is more efficient than traditional methods.
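  This record does not describe the authors' algorithm, so the following is only a minimal sketch of how a TF-IDF computation can be split into map and reduce steps of the kind Hadoop runs in parallel. It assumes the common textbook definition (term frequency = term count divided by document length, inverse document frequency = log(N / document frequency)) and runs in a single process; the function names `map_term_counts`, `reduce_tf_idf`, and `tf_idf` are hypothetical and are not taken from the paper or from Hadoop.

```python
import math
from collections import Counter, defaultdict


def map_term_counts(doc_id, text):
    """Map step: emit ((term, doc_id), (count, doc_length)) for one document."""
    words = text.lower().split()
    counts = Counter(words)
    return [((term, doc_id), (n, len(words))) for term, n in counts.items()]


def reduce_tf_idf(records, num_docs):
    """Reduce step: group records by term, then compute tf * idf per document."""
    by_term = defaultdict(list)
    for (term, doc_id), (n, doc_len) in records:
        by_term[term].append((doc_id, n / doc_len))  # normalized term frequency

    scores = {}
    for term, postings in by_term.items():
        # Assumed definition: idf = log(N / number of documents containing term).
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings:
            scores[(term, doc_id)] = tf * idf
    return scores


def tf_idf(docs):
    """Run the map step over every document, then the reduce step once."""
    records = []
    for doc_id, text in docs.items():
        records.extend(map_term_counts(doc_id, text))
    return reduce_tf_idf(records, len(docs))


if __name__ == "__main__":
    docs = {"a": "hadoop hadoop search", "b": "search engine"}
    for key, score in sorted(tf_idf(docs).items()):
        print(key, round(score, 4))
```

  In a real Hadoop job the grouping by term would be done by the framework's shuffle phase rather than by an in-memory dictionary, and the computation is often split across several chained jobs; this sketch only illustrates the data flow.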

Hadoop; TF-IDF; distributed computing

Bin Li Yuan Guoyong

Department of Computer Science, College of Information Science & Technology, Jinan University, Guangzhou, China

International conference

2012 2nd International Conference on Computer Application and System Modeling (ICCASM-2012)

Shenyang

English

391-393

2012-07-27 (date first made available online on the Wanfang platform; not necessarily the paper's publication date)