Tests of Hypotheses on Regression Coefficients in High-Dimensional Regression Models
Restricted (Penn State Only)
- Author:
- Zhao, Ye Alex
- Graduate Program:
- Statistics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 17, 2022
- Committee Members:
- Ephraim Hanks, Professor in Charge/Director of Graduate Studies
Rongling Wu, Outside Unit & Field Member
Le Bao, Major Field Member
Runze Li, Chair & Dissertation Advisor
Ethan Fang, Major Field Member
- Keywords:
- high-dimensional inference
regression coefficient testing
hypothesis testing
- Abstract:
- Statistical inference in high-dimensional settings has become an important area of research due to the increased production of high-dimensional data in a wide variety of areas. However, few approaches to simultaneous hypothesis testing of high-dimensional regression coefficients have been proposed. In the first project of this dissertation, we introduce a new method for simultaneous tests of the coefficients in a high-dimensional linear regression model. Our new test statistic is based on the sum-of-squares of the score function mean with an additional power-enhancement term. The asymptotic distribution and power of the test statistic are derived, and our procedure is shown to outperform existing approaches. We conduct Monte Carlo simulations to demonstrate performance improvements over existing methods and apply the testing procedure to a real data example. In the second project, we propose a test statistic for regression coefficients in a high-dimensional setting that applies to generalized linear models (GLMs). Building on previous work on testing procedures for high-dimensional linear regression models, we extend this approach to create a new testing methodology for GLMs, with specific illustrations for the Poisson and logistic regression scenarios. The asymptotic distribution of the test statistic is established, and both simulation studies and a real data analysis are conducted to illustrate the performance of our proposed method. The final project of this dissertation introduces two new approaches for testing high-dimensional regression coefficients in the partial linear model setting and, more generally, for linear hypothesis tests in linear models. Our proposed statistic is motivated by the profile least squares method and the decorrelated score method for high-dimensional inference, which we show to be equivalent in these particular cases. We examine the empirical performance of the new test statistic with simulation studies and real data examples. These results indicate generally satisfactory performance under a wide range of settings and applicability to real-world data problems.