Functional Principal Component Analysis and Sparse Functional Regression

Restricted (Penn State Only)
Petrovich, Justin Peter
Graduate Program:
Doctor of Philosophy
Document Type:
Date of Defense:
June 24, 2018
Committee Members:
  • Matthew Logan Reimherr, Dissertation Advisor
  • Matthew Logan Reimherr, Committee Chair
  • Runze Li, Committee Member
  • Ephraim Mont Hanks, Committee Member
  • Stephanie Trea Lanza, Outside Member
  • Carrie Daymont, Special Member
Keywords:
  • functional data analysis
  • sparse functional data
  • FPCA
  • scalar-on-function regression
The focus of this dissertation is on functional data that are sparsely and irregularly observed. Such data require special consideration, as classical functional data methods and theory were developed for densely observed data. As in much of functional data analysis, the functional principal components (FPCs) play a key role in current sparse functional data methods via the Karhunen-Loève expansion. Thus, after a review of relevant background material in Chapter 1, this thesis is divided roughly into two parts, the first focusing on theoretical properties of FPCs and the second on regression for sparsely observed functional data. Chapter 2 discusses functional principal component analysis and, in particular, provides a theoretical framework that relaxes the commonly made assumption of distinct eigenvalues. This is done by shifting the analysis from individual FPCs to the projection of FPCs. In addition, it is shown that one can still obtain asymptotic normality of the FPC projections. In Chapter 3, the focus shifts to scalar-on-function regression. Bridging ideas from the missing data literature with more traditional sparse functional data methods, we propose a multiple imputation approach to imputing the sparsely observed (functional) covariate in a scalar-on-function regression model. The proposed methodology is applied to both linear and logistic scalar-on-function regression. Extensive simulations are performed to validate the proposed approach, and consistency of the resulting estimated coefficient function is established. Finally, we apply our new approach to a study on childhood macrocephaly and show that the development of pathological conditions is linked both to an average level of head circumference and to the velocity of head circumference growth.
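For readers less familiar with the terminology, the main objects named in the abstract can be written out as follows. This is a sketch in standard textbook notation, not notation taken from the dissertation itself: the symbols $X$, $\mu$, $C$, $v_j$, $\xi_j$, $\lambda_j$, $\alpha$, $\beta$, the domain $\mathcal{T}$, and the measurement-error model in the last line are all assumptions introduced here for illustration.

```latex
% Karhunen-Loeve expansion of a random function X with mean \mu and
% covariance C(s,t) = Cov(X(s), X(t)); (\lambda_j, v_j) are the
% eigenvalue/eigenfunction pairs of C, and the v_j are the FPCs.
\begin{align*}
  C(s,t) &= \sum_{j=1}^{\infty} \lambda_j \, v_j(s) \, v_j(t), \\
  X(t)   &= \mu(t) + \sum_{j=1}^{\infty} \xi_j \, v_j(t),
  \qquad
  \xi_j = \int_{\mathcal{T}} \bigl( X(t) - \mu(t) \bigr) v_j(t) \, dt, \\
  \operatorname{E}[\xi_j] &= 0,
  \qquad
  \operatorname{Var}(\xi_j) = \lambda_j .
\end{align*}

% Scalar-on-function regression: scalar response Y_i, functional
% covariate X_i, coefficient function \beta.
\begin{align*}
  \text{linear:}   \quad & Y_i = \alpha + \int_{\mathcal{T}} X_i(t) \, \beta(t) \, dt + \varepsilon_i, \\
  \text{logistic:} \quad & \operatorname{logit} \Pr(Y_i = 1 \mid X_i)
                           = \alpha + \int_{\mathcal{T}} X_i(t) \, \beta(t) \, dt .
\end{align*}

% Sparse, irregular observation (assumed form): curve i is seen only at
% a few subject-specific times t_{i1}, ..., t_{i m_i}, possibly with noise.
\begin{align*}
  W_{ij} = X_i(t_{ij}) + \delta_{ij},
  \qquad j = 1, \dots, m_i .
\end{align*}
```

Under this notation, the integral in the regression model cannot be computed directly when only the few values $W_{ij}$ are available for each curve, which is why the unobserved part of $X_i$ must be imputed or otherwise estimated before the coefficient function $\beta$ can be fit.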