Survival Analysis Using Neural Network Hazard Model with Incomplete Covariate Data

Abstract:

This paper presents a new procedure for performing survival analysis when some covariate data are unavailable. A neural network hazard model is used to model the relationship between the covariates and the hazard. To accommodate incomplete covariates, the hidden layer targets are represented as binary random variables. This allows the training of the two-layer neural network hazard model to be decomposed into the training of two single-layer structures. Training the input-hidden structure then becomes a logistic estimation problem in which part of the input and all of the output (the hidden layer targets) are missing. This logistic estimation has two major problems, however: it requires an assumption about the distribution of the partially observed covariates, and estimating the logistic function becomes complicated when the input data contain missing values. Therefore, instead of the logistic function, the general location model is adopted to represent the mixed data set with missing values. Training the input-hidden structure thus becomes maximisation of the likelihood of mixed continuous data (the covariates) and categorical data (the hidden layer targets) within the general location model. The hidden layer targets link the two single-layer structures and are updated iteratively. After each update, the expected values of the hidden layer targets are used to train the hidden-output structure of the neural network hazard model. This structure is then equivalent to a generalised linear model (GLM) and is trained by the iteratively reweighted least squares (IRLS) approach. Training of the input-hidden and hidden-output structures iterates until the estimates converge. The new approach is applied to a set of bearing data, parts of which are deliberately deleted to create different realisations of an incomplete covariate set.
The numerical study demonstrates that the new approach can handle incomplete covariate data in survival analysis and that its results outperform those of conventional approaches to handling incomplete covariates.
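
As an illustration of the IRLS step named in the abstract, the following is a minimal sketch of IRLS for a GLM, not the authors' implementation. The abstract does not state the GLM family or link, so a binomial family with a logit link is assumed here; the function name `irls_logistic`, the synthetic data, and all parameter values are hypothetical.

```python
import numpy as np

def irls_logistic(X, y, n_iter=25, tol=1e-8):
    """Fit GLM weights by IRLS, assuming a binomial family with logit link."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta                              # linear predictor
        mu = 1.0 / (1.0 + np.exp(-eta))             # mean under the logit link
        w = np.clip(mu * (1.0 - mu), 1e-10, None)   # IRLS weights (variance function)
        z = eta + (y - mu) / w                      # working response
        XtW = X.T * w                               # X^T W with W = diag(w)
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)  # weighted least squares step
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Synthetic data for illustration only (not the bearing data from the paper).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(2000), rng.normal(size=(2000, 2))])
true_beta = np.array([-0.5, 1.0, -2.0])
y = (rng.random(2000) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)
beta_hat = irls_logistic(X, y)
```

Each pass solves a weighted least squares problem on a working response, which is why the procedure is described as iteratively reweighted. In the setting of the abstract, the inputs to this step would be the expected values of the hidden layer targets rather than raw covariates.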

Keywords: survival analysis; neural network hazard model; incomplete covariates; general location model; expectation-maximisation

Authors: Yi YU, Lin MA, Yong SUN, Yuan-Tong GU

Affiliation: Cooperative Research Centre for Infrastructure and Engineering Asset Management (CIEAM); School of Engineering Systems, Faculty of Built Environment and Engineering, Queensland University of Technology, Brisbane, Australia

Conference type: International conference

Conference name: 2011 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering (ICQR2MSE 2011), held jointly with the Second International Conference on Maintenance Engineering

Conference location: Xi'an

Conference language: English

Pages: 239-242

Online publication date: 2011-06-17 (date first made available on the Wanfang platform; not the paper's publication date)
