Explainable Predictive Modeling and Causal Effect Estimation from Complex Time-varying Data

Open Access
- Author:
- Hsieh, Tsung Yu
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 24, 2021
- Committee Members:
- David Miller, Major Field Member
Wang-Chien Lee, Major Field Member
Vasant Honavar, Chair & Dissertation Advisor
Matthew Reimherr, Outside Unit & Field Member
Chitaranjan Das, Program Head/Chair
- Keywords:
- Explainable Machine Learning
Time Series Learning
Deep Neural Networks
Attention Networks
Functional Data Analysis
Causal Inference
- Abstract:
- Time-varying data are prevalent in a wide variety of real-world applications, for example health care, environmental studies, finance, and motion capture. Time-varying data are complex in nature and pose unique challenges. For example, time-varying data observed in real-world applications almost always exhibit nonstationary characteristics that challenge ordinary time-series methods built on stationarity assumptions. In addition, one may only have access to irregularly sampled data, which rules out models that assume regularly observed samples. At the same time, as machine learning and data mining algorithms have begun to make an impact on real-world applications, merely providing accurate predictions is no longer sufficient. There is a growing need for interpretations and explanations of how machine learning models make predictions, so that end-users can fully trust and adopt these models. In this thesis, we explore time-varying data in various practical scenarios and aim to enhance model explainability and understanding of the data.

First, we study the problem of building explainable classifiers for multivariate time series data by means of joint variable and time interval selection. We introduce a modular framework, the LAXCAT model, consisting of a convolution-based feature extraction network and a dual attention mechanism. The convolution-based feature extraction network produces variable-specific representations by considering local time interval context. The dual attention mechanisms, namely the variable attention network and the temporal attention network, work in concert to simultaneously select the variables and time intervals that are discriminative for the classification task. We present results of extensive experiments with several benchmark data sets showing that the proposed method outperforms state-of-the-art baseline methods on multivariate time series classification.
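As a toy illustration of the dual-attention idea, temporal attention can weight convolution-derived local interval features within each variable, while variable attention weights the pooled per-variable features. This is a minimal sketch, not the thesis's actual LAXCAT implementation; all function names, shapes, and the scoring scheme are assumptions for illustration:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def conv1d_valid(x, w):
    """'Valid' 1-D convolution of series x with kernel w (local context)."""
    k = len(w)
    return np.array([x[i:i + k] @ w for i in range(len(x) - k + 1)])

def dual_attention_features(X, w, u_var):
    """X: (V, T) multivariate series -> (feature, temporal wts, variable wts).

    Temporal attention scores each local interval within a variable;
    variable attention then scores the pooled per-variable features.
    """
    H = np.stack([conv1d_valid(X[v], w) for v in range(X.shape[0])])  # (V, L)
    alpha = softmax(H, axis=1)           # temporal attention per variable
    pooled = (alpha * H).sum(axis=1)     # (V,) interval-weighted features
    beta = softmax(u_var * pooled)       # variable attention over variables
    return (beta * pooled).sum(), alpha, beta

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 20))         # 3 variables, 20 time steps
f, alpha, beta = dual_attention_features(
    X, w=np.array([0.5, 0.5, 0.5]), u_var=np.ones(3))
```

The attention weights sum to one along each axis they select over, which is what makes them readable as importance scores for intervals and variables.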
The results of our case studies demonstrate that the variables and time intervals identified by the proposed method make sense relative to available domain knowledge.

Second, to obtain a better understanding of the input multivariate time series, we study dynamic structure learning, which aims to jointly discover hidden state transitions and state-dependent inter-variable connectivity structures. To address this problem, we introduce a novel state-regularized dynamic autoregressive framework, the SrVARM model, featuring a state-regularized recurrent neural network and a dynamic autoregressive model. The state-regularized recurrent unit learns to discover the hidden state transition dynamics from the data, while the autoregressive function learns to encode state-dependent inter-variable dependencies as directed acyclic graph structures. A smooth characterization of the acyclicity constraint is exploited to train the model in an efficient, unified framework. We report results of extensive experiments with simulated data as well as a real-world benchmark showing that SrVARM outperforms state-of-the-art baselines in recovering the unobserved state transitions and discovering the state-dependent relationships among variables.

Third, functional data analysis offers another promising perspective on time-varying data. However, the representation learning capability of neural network-based methods has not been fully explored for functional data. We study unsupervised representation learning from functional data and introduce the functional autoencoder network, which generalizes the standard autoencoder to the functional data setting. The functional autoencoder copes with functional input by leveraging functional weights and the inner product for real-valued functions. We derive, from first principles, a functional gradient-based algorithm for training the resulting network.
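The "smooth characterization of the acyclicity constraint" is plausibly the trace-exponential penalty popularized by NOTEARS, h(A) = tr(exp(A ∘ A)) − d, which is zero exactly when the weighted adjacency matrix A encodes a directed acyclic graph. A minimal sketch under that assumption (the series-based `expm` here is an illustrative stand-in for a proper matrix exponential):

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential via truncated Taylor series (fine for small M)."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def acyclicity(A):
    """h(A) = tr(exp(A * A)) - d; zero iff the weighted graph is a DAG."""
    d = A.shape[0]
    return np.trace(expm(A * A)) - d

A_dag = np.array([[0.0, 1.0],
                  [0.0, 0.0]])   # single edge 0 -> 1: acyclic
A_cyc = np.array([[0.0, 1.0],
                  [1.0, 0.0]])   # 0 -> 1 -> 0: a cycle
```

Because h is smooth in A, it can be added as a differentiable penalty (or augmented-Lagrangian term) to the autoregressive training loss, which is what makes a single unified gradient-based training loop possible.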
We present results of experiments demonstrating that functional autoencoders outperform state-of-the-art baseline methods. Besides providing a solution to the problem of functional data representation learning, the proposed model offers a fundamental building block for other functional data learning tasks, such as classification and regression networks.

Fourth, we study the problem of treatment effect estimation from networked time series data. Such data arise in settings where individuals are linked by a network of relations, e.g., social ties, and the observations for each individual are naturally represented as time series. We propose a novel representation learning approach to treatment effect estimation from networked time series data, consisting of a temporal convolution network, a graph attention network, and a treatment-specific outcome predictor network. We use an adversarial learning framework for domain adaptation to learn a representation of individuals that makes treatment assignment independent of the treatment outcome. We present results of experiments showing that the proposed framework outperforms state-of-the-art baselines in estimating treatment effects from networked time series data. We conclude with a brief summary of the main contributions of the thesis and some directions for further research.
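The adversarial domain-adaptation idea in the fourth contribution can be caricatured with a two-head objective: a factual-outcome loss on treatment-specific prediction heads, minus a treatment-discriminator loss on the shared representation, so the encoder is pushed toward representations that carry no treatment-assignment information (typically realized with gradient reversal). This is a generic sketch, not the thesis's actual networks; all shapes, names, and the simple MLP encoder are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def balancing_objective(X, t, y, W_rep, W_out, w_disc, lam=1.0):
    """Factual-outcome MSE minus lam * treatment-discriminator cross-entropy.

    X: (n, d) covariates, t: (n,) 0/1 treatment, y: (n,) observed outcome.
    W_out[0] / W_out[1] are the control / treated outcome heads on phi.
    """
    phi = np.tanh(X @ W_rep)                      # shared representation (n, h)
    y_hat = np.einsum('nh,nh->n', phi, W_out[t])  # treatment-specific heads
    outcome_loss = np.mean((y_hat - y) ** 2)
    p = sigmoid(phi @ w_disc)                     # discriminator: P(t=1 | phi)
    disc_loss = -np.mean(t * np.log(p + 1e-9)
                         + (1 - t) * np.log(1 - p + 1e-9))
    # The encoder minimizes outcome error while maximizing the discriminator's
    # loss, encouraging phi to be uninformative about treatment assignment.
    return outcome_loss - lam * disc_loss, outcome_loss, disc_loss

rng = np.random.default_rng(1)
n, d, h = 8, 5, 4
X = rng.standard_normal((n, d))
t = rng.integers(0, 2, size=n)
y = rng.standard_normal(n)
total, out_l, disc_l = balancing_objective(
    X, t, y, rng.standard_normal((d, h)),
    rng.standard_normal((2, h)), rng.standard_normal(h))
```

In the networked time-series setting described above, `phi` would instead come from the temporal convolution and graph attention networks; the adversarial structure of the objective is unchanged.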