Variable selection and estimation in generalized linear models with the seamless L0 penalty
We propose a seamless L0 (SELO) penalized likelihood approach to variable selection and estimation in generalized linear models (GLMs). The SELO penalty is a smooth function that closely approximates the discontinuous L0 penalty. We develop an efficient algorithm to fit the model and show that the SELO-GLM procedure has the oracle property when the number of variables diverges. We propose a Bayesian Information Criterion (BIC) for selecting the tuning parameter and show that, under some regularity conditions, the resulting SELO-GLM/BIC procedure consistently selects the true model. Simulation studies of finite-sample performance show that SELO-GLM outperforms several existing methods, especially when the number of variables is large and the signals are weak. We apply SELO-GLM to a breast cancer genetic dataset to identify SNPs associated with breast cancer risk.
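The abstract describes the SELO penalty only in words. As a minimal sketch, the snippet below evaluates the penalty in the form commonly given in the SELO literature, p(b) = (lam / log 2) * log(|b| / (|b| + tau) + 1), and compares it with the L0 penalty lam * I(b != 0). The formula, the parameter names `lam` and `tau`, and the function names are assumptions for illustration; this record does not state them, and the paper's exact parameterization may differ.

```python
import math


def selo_penalty(beta, lam=1.0, tau=0.01):
    """Assumed SELO form: (lam / log 2) * log(|beta| / (|beta| + tau) + 1).

    lam is the tuning parameter; tau > 0 controls how closely the smooth
    curve tracks the L0 penalty (smaller tau gives a closer match).
    """
    a = abs(beta)
    return (lam / math.log(2.0)) * math.log(a / (a + tau) + 1.0)


def l0_penalty(beta, lam=1.0):
    """Discontinuous L0 penalty: lam if the coefficient is nonzero, else 0."""
    return lam if beta != 0 else 0.0


if __name__ == "__main__":
    # Print both penalties side by side for a few coefficient values.
    for b in (0.0, 0.001, 0.05, 0.5, 2.0):
        print(f"beta={b:6.3f}  SELO={selo_penalty(b):.4f}  L0={l0_penalty(b):.1f}")
```

Under this assumed form the penalty is exactly zero at zero, is continuous there, and approaches `lam` for coefficients that are large relative to `tau`, which is the sense in which it resembles the L0 penalty.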
BIC; Consistency; Coordinate descent algorithm; Model selection; Oracle property; Penalized likelihood methods; SELO penalty; Tuning parameter selection
Zilin Li; Sijian Wang; Xihong Lin
Department of Mathematics, Tsinghua University, Beijing, China; Department of Biostatistics & Medical Informatics and Statistics, University of Wisconsin, Madison; Department of Biostatistics, Harvard University
International conference
Second Joint Biostatistics Symposium (2nd International Symposium on Biostatistics, 2012)
Beijing
English
420-447
2012-07-08 (date first posted online on the Wanfang platform; not necessarily the paper's publication date)