Conference paper

The changing relevance of the TLB

A little over a decade ago, Goto and van de Geijn wrote about the importance of the treatment of the translation lookaside buffer (TLB) on the performance of matrix multiplication [1]. Crucially, they did not say how important, nor did they provide results that would allow the reader to make his own judgement. In this paper, we revisit their work and look at the effect on the performance of their algorithm when built with different assumed data TLB sizes. Results on three different processors, one relatively modern, two contemporary with Goto and van de Geijn's writings ([1] and [2]), are examined and compared within a real-world context. Our findings show that, although important when aiming for a place in the TOP500 [3] list, these features have little practical effect, at least on the architectures we have chosen. We conclude, then, that the relative importance of the various factors that must be taken into account when tuning matrix multiplication (GEMM, the heart of the High Performance LINPACK benchmark, and hence of the TOP500 table) differs dramatically between processors.

BLAS; GEMM; performance; TLB; HPL; Linpack; HPC; high performance computing; optimisation; optimization; BLAS optimization

Jessica R. Jones, Russell Bradford

University of Bath, Bath, BA2 7AY, U.K.

International conference

The 12th International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES 2013)

London, U.K.

English

110-114

2013-09-02 (date first made available online on the Wanfang platform; does not represent the paper's publication date)