Conference Session

Flexible Parameter Sharing Networks

  Deep learning models have flourished in recent years, but training them remains a complex optimization problem because the parameters of each layer are independent. Although this problem can be alleviated by coefficient-vector-based parameter sharing methods, those methods raise a new problem: parameters of different sizes cannot be generated from a fixed-size global parameter template, which may truncate latent connections among parameters. To generate parameters of different sizes from the same parameter template, we propose a Flexible Parameter Sharing Scheme (FPSS). We exploit the asymmetric characteristics of convolution operations to resize and transform the template into layer-specific parameters. As a generalization of coefficient-vector-based methods, FPSS applies 2-dimensional convolution operations rather than linear combinations to transform the global template. Since all parameters are generated from the same template, FPSS can be viewed as building latent connections among parameters through the global template. Meanwhile, each layer needs far fewer parameters, which reduces the search space and makes the network easier to train. Furthermore, we present two deep models as applications of FPSS, Hybrid CNN and Adaptive DenseNet, which share the global template across different modules and blocks. One can easily find the similar parts of a deep network through our method. Experimental results on several text datasets show that the proposed models are comparable to or better than state-of-the-art models.
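To make the template-to-parameter idea concrete, the following PyTorch sketch generates weight matrices of different shapes from one shared, fixed-size global template by choosing a layer-specific 2-D convolution kernel size (an unpadded convolution shrinks its input, so the output size can be controlled per layer). This is only an illustration under assumptions, not the authors' exact formulation: the class name TemplateParamGenerator, the template size, and the initialization scale are all hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TemplateParamGenerator(nn.Module):
    """Hypothetical sketch: turn a shared, fixed-size global template into a
    layer-specific weight matrix of shape (out_dim, in_dim) via a 2-D convolution."""

    def __init__(self, template_size, out_dim, in_dim):
        super().__init__()
        t_h, t_w = template_size
        # A valid (unpadded) convolution shrinks the template:
        #   out_dim = t_h - k_h + 1  =>  k_h = t_h - out_dim + 1   (requires out_dim <= t_h)
        # so the kernel size alone determines the shape of the generated parameters.
        self.kernel = nn.Parameter(
            0.01 * torch.randn(1, 1, t_h - out_dim + 1, t_w - in_dim + 1))

    def forward(self, template):
        # template: shared tensor of shape (1, 1, t_h, t_w)
        return F.conv2d(template, self.kernel).squeeze(0).squeeze(0)

# Usage: two layers of different sizes are generated from one shared template.
template = nn.Parameter(0.01 * torch.randn(1, 1, 64, 64))
gen_a = TemplateParamGenerator((64, 64), out_dim=32, in_dim=48)
gen_b = TemplateParamGenerator((64, 64), out_dim=16, in_dim=64)
print(gen_a(template).shape, gen_b(template).shape)  # torch.Size([32, 48]) torch.Size([16, 64])

Because only the small per-layer kernels and the single template are learned, every generated weight matrix depends on the same underlying parameters, which is the sense in which the scheme builds latent connections among layers while reducing the number of free parameters.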

Deep learning · Parameter sharing · Template

Chengkai Piao, Jinmao Wei, Yapeng Zhu, Hengpeng Xu

College of Computer Science, Nankai University, Tianjin 300071, China; Institute of Big Data, Nankai University

International Conference

9th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC 2020)

Zhengzhou

English

346-358

2020-10-14 (date the record first went online on the Wanfang platform; not the paper's publication date)