Conference Special Topic

Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs

Challenge in GPU computing: there is a big gap between processing capacity and memory access. Computing is fast: each core (ALU) can finish one or two operations per cycle, so 1000 cores × 1 GHz = 1 TFLOPS. But one arithmetic operation requires two reads and one write.
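The arithmetic in the abstract can be sketched as a back-of-the-envelope calculation. This is only an illustration, not taken from the talk: the function names are hypothetical, and the 4-byte operand size (single-precision floats) is an assumption the abstract does not state.

```python
# Rough estimate of the compute/memory gap described in the abstract.
# Assumption (not in the source): every operand is a 4-byte float and
# every read/write goes to memory with no caching or register reuse.

def peak_ops_per_second(cores: int, clock_hz: int, ops_per_cycle: int = 1) -> int:
    """Peak arithmetic throughput: cores x clock x operations per cycle."""
    return cores * clock_hz * ops_per_cycle


def required_bytes_per_second(ops_per_second: int,
                              reads_per_op: int = 2,
                              writes_per_op: int = 1,
                              bytes_per_operand: int = 4) -> int:
    """Memory traffic needed if each operation does its reads and writes in memory."""
    return ops_per_second * (reads_per_op + writes_per_op) * bytes_per_operand


ops = peak_ops_per_second(cores=1000, clock_hz=1_000_000_000)  # 1000 cores x 1 GHz
bandwidth = required_bytes_per_second(ops)

print(f"peak compute: {ops / 1e12:.1f} TFLOPS")
print(f"bandwidth needed without reuse: {bandwidth / 1e12:.1f} TB/s")
```

Under these assumptions, sustaining 1 TFLOPS would need about 12 TB/s of memory traffic, which is why data reuse in registers, caches, and shared memory matters so much on GPUs.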

Xiaowen Chu (褚晓文)

Department of Computer Science, Hong Kong Baptist University

Domestic conference (China)

2017 China Big Data Technology Conference (2017中国大数据技术大会)

Beijing

English

1-42

2017-12-01 (date first posted online on the Wanfang platform; this is not the paper's publication date)