# Machine Learning Methods for Regression and Classification with Functional Data

Restricted (Penn State Only)

- Author:
- Wright, Isaac
- Graduate Program:
- Statistics (PHD)
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- May 27, 2022
- Committee Members:
- Ephraim Hanks, Professor in Charge/Director of Graduate Studies

- Jia Li, Major Field Member
- Matthew Reimherr, Chair & Dissertation Advisor
- John Liechty, Outside Unit & Field Member
- Francesca Chiaromonte, Major Field Member
- Keywords:
- Functional Data
- Boosting
- Decision Trees
- Classification
- Regression
- Machine Learning
- Supervised Learning
- Unsupervised Learning
- Abstract:
- Data sets with repeated measures collected over time are prevalent across a wide variety of settings in academia, government, and industry. Examples include insurance telematics data, stock price data, and weather data. One way to model such repeated-measures data is with Functional Data Analysis (FDA) techniques. In the FDA literature, functional regression is a widely studied topic. Functional regression models can generally be classified into three broad categories, depending on the form of the response and the predictor. If the response is a scalar and the predictor is a function, the model is called scalar-on-function regression. If the response is a function and the predictor is a scalar, it is called function-on-scalar regression. Finally, if both the response and the predictor are functions, it is called function-on-function regression. Although there is a vast FDA literature on each of these topics, there is still a need for additional non-linear techniques that have superior performance on prediction-based tasks. In this dissertation, we develop both unsupervised and supervised learning methods to deal with complex non-linear relations present in the data. These methods include (1) a time-varying distance metric based on the Wasserstein distance, (2) classification and regression trees for the scalar-on-function setting, and (3) non-linear function-on-function regression with functional boosting.
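For orientation, the three regression categories named in the abstract are often written in their standard linear forms as below. This is a sketch in conventional FDA textbook notation, not notation taken from the dissertation: the symbols $Y_i$, $X_i$, $x_i$, $\alpha$, $\beta$, and $\varepsilon_i$ and the domains $\mathcal{S}$, $\mathcal{T}$ are assumptions made for illustration. The non-linear methods the dissertation develops would go beyond these linear forms.

$$
\begin{aligned}
\text{scalar-on-function:} \quad & Y_i = \alpha + \int_{\mathcal{T}} X_i(t)\,\beta(t)\,dt + \varepsilon_i, \\
\text{function-on-scalar:} \quad & Y_i(t) = \beta_0(t) + x_i\,\beta_1(t) + \varepsilon_i(t), \\
\text{function-on-function:} \quad & Y_i(t) = \alpha(t) + \int_{\mathcal{S}} X_i(s)\,\beta(s,t)\,ds + \varepsilon_i(t).
\end{aligned}
$$

In each line the response is on the left and the predictor enters on the right, so the name of each category reads as "response-on-predictor".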