New Procedures for Cox's Model with High Dimensional Predictors

Open Access
Yu, Ye
Graduate Program:
Doctor of Philosophy
Document Type:
Date of Defense:
September 08, 2015
Committee Members:
  • Runze Li, Dissertation Advisor
  • Matthew Logan Reimherr, Committee Member
  • Lingzhou Xue, Committee Member
  • Lan Kong, Committee Member
Keywords:
  • Cox's model
  • Partial Likelihood
  • ultrahigh dimensional survival data
  • model selection consistency
  • Penalized Likelihood
Abstract:
This thesis studies feature screening and variable selection procedures for ultrahigh dimensional Cox's models, and the asymptotic behavior of tuning parameter selectors such as the AIC, BIC, and GCV for penalized partial likelihood. Survival data with ultrahigh dimensional covariates, such as genetic markers, have been collected in medical studies and other fields. In our first project, we propose a feature screening procedure for the Cox model with ultrahigh dimensional covariates. The proposed procedure differs from the existing sure independence screening (SIS) procedures (Fan, Feng and Wu, 2010; Zhao and Li, 2012) in that it is based on the joint likelihood of potential active predictors and is therefore not a marginal screening procedure. It can effectively identify active predictors that are jointly dependent on, but marginally independent of, the response, without performing an iterative procedure. We develop an effective algorithm to carry out the proposed procedure and establish the ascent property of the algorithm. We further prove that the proposed procedure possesses the sure screening property: with probability tending to one, the selected variable set includes the actual active predictors. We conduct Monte Carlo simulations to evaluate the finite sample performance of the proposed procedure and to compare it with existing SIS procedures. The proposed methodology is also demonstrated through an empirical analysis of a real data example.

Motivated by the need to study the theoretical properties of variable selection procedures for Cox's model, we study the asymptotic behavior of the partial likelihood for the Cox model in our second project. We find that the partial likelihood does not behave like an ordinary likelihood, whose sample average typically tends in probability to its expected value, a finite number. Under some mild conditions, we prove that the sample average of the partial likelihood tends to infinity in probability at the rate of the logarithm of the sample size. This is an interesting and surprising result because the maximum partial likelihood estimate has the same asymptotic behavior as the ordinary maximum likelihood estimate. We further apply the asymptotic results on the partial likelihood to study tuning parameter selection for penalized partial likelihood. Our findings indicate that the penalized partial likelihood with the generalized cross-validation (GCV) tuning parameter selector proposed in Fan and Li (2002) enjoys the model selection consistency property. This is another surprising result because it is well known that the GCV, AIC, and $C_p$ are all equivalent in the context of linear regression models and are not model selection consistent. Our empirical studies, via Monte Carlo simulation and a real data example, confirm our theoretical findings.
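The logarithmic growth described above can be seen in a simple special case. The following is a minimal sketch, not the thesis's derivation or code: it assumes no censoring, no tied event times, and evaluates the log partial likelihood at a zero coefficient vector, where it equals -log(n!) exactly, so its sample average is roughly -(log n - 1). The function name `cox_partial_loglik` and the simulated data are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def cox_partial_loglik(beta, X, time, event):
    """Cox log partial likelihood, assuming no tied event times.

    beta  : (p,) coefficient vector
    X     : (n, p) covariate matrix
    time  : (n,) observed times
    event : (n,) 1 if the event was observed, 0 if censored
    """
    order = np.argsort(-time)            # sort subjects by descending time
    eta = X[order] @ beta                # linear predictors in that order
    d = event[order]
    # Running log-sum-exp: entry i covers every subject whose time is
    # at least as large as subject i's time, i.e. subject i's risk set.
    log_risk = np.logaddexp.accumulate(eta)
    return float(np.sum(d * (eta - log_risk)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)       # illustrative seed (assumption)
    for n in (100, 1000, 10000):
        X = rng.normal(size=(n, 3))
        time = rng.exponential(size=n)   # continuous times, so no ties
        event = np.ones(n)               # no censoring (assumption)
        avg = cox_partial_loglik(np.zeros(3), X, time, event) / n
        # Compare with the closed form -log(n!)/n and with -log(n).
        print(n, avg, -math.lgamma(n + 1) / n, -math.log(n))
```

In this special case the magnitude of the sample average grows with log n rather than settling at a finite limit, which is consistent with the rate stated in the abstract. The general result under censoring and nonzero coefficients requires the conditions established in the thesis.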