Dimension reduction for functional data

Restricted (Penn State Only)
Author:
Song, Jun
Graduate Program:
Statistics
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
May 16, 2017
Committee Members:
  • Bing Li, Dissertation Advisor
  • Bing Li, Committee Chair
  • Naomi S. Altman, Committee Member
  • Runze Li, Committee Member
  • Christopher Parker, Outside Member
  • Matthew Reimherr, Committee Member
Keywords:
  • dimension reduction
  • random operator
  • supervised machine learning
  • unsupervised machine learning
  • functional data analysis
  • handwriting data
  • speech recognition data
Abstract:
In regression problems, sufficient dimension reduction (SDR), which can be regarded as a part of supervised machine learning, reduces the dimension of the predictor variables without losing any regression information. New theories and methodologies are increasingly needed to handle complex types of data of drastically higher dimension, such as infinite-dimensional functional data. In this work, we establish theories and methods of dimension reduction for functional data in three settings: (1) nonlinear supervised dimension reduction, (2) linear supervised dimension reduction, and (3) nonlinear unsupervised dimension reduction. The fundamental idea is to construct a feature space over the function space where the data live. We construct this feature space by using reproducing kernel Hilbert spaces (RKHS) in a nested way, which we call the nested RKHS; this allows us to treat functional data and capture nonlinear characteristics of the data at the same time. In addition, the nested RKHS can be used to develop weak conditional moments, on which general theories and methods for linear dimension reduction are built. We employ an additive structure over the functional data so that the methods also work for multivariate functional data. We develop two methods of nonlinear SDR for functional data, three methods of linear SDR for functional data, and a general framework for nonlinear functional PCA. For some of the methods, we also study asymptotic results, dimension determination, and its consistency. Simulation studies and real-data applications show that the methods can reduce the dimension of functional data and can be used effectively for functional classification.
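
The nested construction described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation, and it rests on several assumptions: the curves are observed on a common grid, the first-level inner product is approximated by an L2 inner product with trapezoidal quadrature (the dissertation uses an RKHS at this level), the second-level kernel is Gaussian, and the unsupervised step is ordinary kernel PCA on the resulting Gram matrix. The function names `trapezoid_weights`, `nested_gram`, and `kernel_pca_scores` are hypothetical.

```python
import numpy as np


def trapezoid_weights(t):
    """Quadrature weights w such that f @ w approximates the integral of f over t."""
    t = np.asarray(t, dtype=float)
    w = np.zeros_like(t)
    dt = np.diff(t)
    w[:-1] += dt / 2.0
    w[1:] += dt / 2.0
    return w


def nested_gram(X, t, gamma=1.0):
    """Second-level Gram matrix for curves stored as rows of X (n_curves x n_grid).

    First level (assumption): squared L2 distance between curves on the grid t.
    Second level (assumption): Gaussian kernel applied to that distance.
    """
    X = np.asarray(X, dtype=float)
    w = trapezoid_weights(t)
    diffs = X[:, None, :] - X[None, :, :]   # shape (n, n, n_grid)
    sq_dist = (diffs ** 2) @ w              # shape (n, n)
    return np.exp(-gamma * sq_dist)


def kernel_pca_scores(K, n_components=2):
    """Kernel PCA scores from a Gram matrix: center it, then eigendecompose."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n     # centering matrix
    Kc = H @ K @ H
    Kc = (Kc + Kc.T) / 2.0                  # remove floating-point asymmetry
    eigvals, eigvecs = np.linalg.eigh(Kc)   # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[idx], 0.0)     # guard against tiny negative values
    return eigvecs[:, idx] * np.sqrt(lam)
```

Calling `nested_gram(X, t)` on an array of sampled curves and passing the result to `kernel_pca_scores` yields one low-dimensional score vector per curve, which could then be fed to a standard classifier.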