Regularization Parameter Selection for Variable Selection in High-dimensional Modelling

Open Access
Author:
Zhang, Yiyun
Graduate Program:
Statistics
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
February 24, 2009
Committee Members:
  • Prof Runze Li, Dissertation Advisor
  • Runze Li, Committee Chair
  • Bing Li, Committee Member
  • David Russell Hunter, Committee Member
  • Vernon Michael Chinchilli, Committee Member
Keywords:
  • GLIM
  • LASSO
  • Penalized Likelihood
  • SCAD
  • Variable Selection
Abstract:
Variable selection is an important issue in statistical modelling. Classical approaches select models by applying a penalty related to the size of the candidate model. These classical methods require an exhaustive search over candidate models, which is impractical in high-dimensional modelling. Continuous penalties such as the LASSO and the SCAD make it possible to cope with high dimensionality. As in classical methods, the amount of regularization plays a crucial role in the asymptotic properties of the resulting estimators. For classical methods, it is well known that AIC-like criteria are asymptotically loss efficient in the sense that they choose the minimum loss model when the true model is infinite dimensional. In contrast, when there is a finite dimensional correct model, BIC-like criteria are consistent in the sense that they choose the smallest correct model with probability tending to one. Parallel properties for the penalized estimators are studied in this thesis. Extending the results of Wang, Li and Tsai (2007), we show that a consistent tuning parameter selector yields a penalized estimator that is also consistent in a general likelihood setting. On the other hand, it is shown that the tuning parameter selector constructed from an efficient criterion is also asymptotically loss efficient for linear regression. Under the conditions imposed in this thesis, the efficiency result can also be extended to generalized linear models in terms of Kullback-Leibler loss. Our simulation studies suggest that the finite-sample performance is in line with the theory we present. A real-data application is discussed to advocate the use of penalized likelihood variable selection procedures.
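The contrast between BIC-like and AIC-like tuning parameter selection described in the abstract can be illustrated with a minimal sketch. It is not the thesis's own procedure: it assumes a linear model fitted by the LASSO only (no SCAD), a plain coordinate-descent solver, a simulated data set, and criteria of the commonly used form log(RSS/n) + df * c, with c = log(n)/n for the BIC-type selector and c = 2/n for the AIC-type selector. The function names, the lambda grid and the simulated coefficients are all hypothetical choices made for illustration.

```python
import math
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator, the closed-form LASSO update for one coordinate."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta, resid


def select_lambda(X, y, lambdas, penalty_per_df):
    """Return (criterion, lambda, beta) minimizing log(RSS/n) + df * penalty_per_df."""
    n = len(y)
    best = None
    for lam in lambdas:
        beta, resid = lasso_cd(X, y, lam)
        rss = sum(r * r for r in resid)
        df = sum(1 for b in beta if b != 0.0)  # number of nonzero coefficients
        crit = math.log(rss / n) + df * penalty_per_df
        if best is None or crit < best[0]:
            best = (crit, lam, beta)
    return best


# Simulated example (assumed setup): 3 of the 8 true coefficients are nonzero.
rng = random.Random(0)
n, p = 100, 8
true_beta = [3.0, 1.5, 0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + rng.gauss(0.0, 1.0) for row in X]
lambdas = [0.01 * 1.5 ** k for k in range(15)]

bic_fit = select_lambda(X, y, lambdas, math.log(n) / n)  # BIC-type penalty
aic_fit = select_lambda(X, y, lambdas, 2.0 / n)          # AIC-type penalty
print("BIC-type: lambda =", bic_fit[1], "nonzero =", sum(b != 0.0 for b in bic_fit[2]))
print("AIC-type: lambda =", aic_fit[1], "nonzero =", sum(b != 0.0 for b in aic_fit[2]))
```

Because log(n)/n exceeds 2/n once n is larger than about 8, the BIC-type selector never picks a model with more nonzero coefficients than the AIC-type selector on the same grid. This mirrors, in a small way, the trade-off between selecting the smallest correct model and minimizing loss that the abstract describes.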