Statistical Inference for High Dimensional Models
Restricted (Penn State Only)
- Author:
- Cui, Shijie
- Graduate Program:
- Statistics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 14, 2022
- Committee Members:
- Lingzhou Xue, Major Field Member
Yanyuan Ma, Major Field Member
Rongling Wu, Outside Unit & Field Member
Bing Li, Major Field Member
Runze Li, Chair & Dissertation Advisor
Ephraim Mont Hanks, Professor in Charge/Director of Graduate Studies
- Keywords:
- statistics
statistical inference
high dimension
partially linear single index
matrix
rank
double robust
measurement error
misspecified
sparse SVD
- Abstract:
- Statistical inference for high dimensional models has attracted much attention because of its wide range of applications. In this dissertation, I propose new methods for statistical inference in high dimensional models from three aspects: inference in high dimensional semiparametric models, inference for high dimensional matrix-valued data, and inference in high dimensional misspecified measurement error models. The first project studies statistical inference in high dimensional partially linear single index models. First, a profile partial penalized least squares estimator of the model parameters is proposed, and its asymptotic properties are established. Then an F-type test statistic for testing the parametric components is proposed, and its theoretical properties are established. I then propose a new test for the specification testing problem of the nonparametric components. Finally, simulation studies and an empirical analysis of a real-world data set are conducted to illustrate the performance of the proposed testing procedures. The second project proposes new testing procedures for high dimensional matrix-valued data. Rank is an essential attribute of a matrix. A new type of statistic is proposed for making inference on the rank of matrix-valued data. I first establish the theoretical properties of its oracle version. To overcome the accumulation of empirical error, a new sparse SVD method is proposed, and its theoretical properties are given. Based on this sparse SVD method, I provide a sample version of the statistic and establish its theoretical properties. Simulation studies and two applications to surveillance video data are provided to illustrate the performance of the proposed method. The third project proposes a new testing method for misspecified measurement error models. The method remains applicable when the model is potentially misspecified and subject to measurement error. Its properties are first studied in the low dimensional setting, and the method is then extended to the high dimensional setting. Further, I propose a method that adapts to the sparsity level of the true parameters in the high dimensional setting. Simulation studies and one application to a clinical trial data set are given.