Neural Network Architectures in the Setting of Dynamical Systems

Open Access
- Author:
- Gunn, Sean R
- Graduate Program:
- Mechanical Engineering
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- March 07, 2019
- Committee Members:
- Asok Ray, Thesis Advisor/Co-Advisor
- Brian Foley, Committee Member
- Keywords:
- Deep Learning
- Dynamical Systems in Neural Networks
- Abstract:
- Deep learning models have traditionally been interpreted as learning data representations through multiple nested compositions of affine transformations followed by nonlinear activations. These models learn complex structures in the data by using backpropagation through each layer of the network to determine the parameters of the model. Deep learning has become the focus of many researchers due to its breakthroughs on speech, text, and image datasets; however, it still lacks a rigorous mathematical formulation that explains why these models work. The main focus of this thesis is to understand and formulate neural network architectures in the form of a state space model. An entire class of neural networks is defined by a set of differential or difference equations, with varying levels of smoothness on the layer-wise transformation, that can be algebraically manipulated into state space form. From this representation, closed form solutions of an additive dense network as well as a $k^{th}$ order smooth network are found. Furthermore, the results demonstrate that imposing $k$-many skip connections in a network architecture with $d$-many nodes per layer makes the effective embedding dimension of the data manifold $k \cdot d$-many dimensions, which the network exploits to successfully classify the data. The proposed theory was partially validated on carefully designed experiments and numerical data ranging from benchmark image classification tasks to mechanical engineering applications. The objective of this thesis is in part to understand why skip connections have been so successful, and to motivate future research in developing algebraically defined networks.
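
The state space reading of the abstract follows a standard reduction: a $k^{th}$ order difference equation in $\mathbb{R}^d$ can be rewritten as a first-order system in the stacked state $(x_t, x_{t+1}, \dots, x_{t+k-1}) \in \mathbb{R}^{k \cdot d}$, which is one way to interpret the claimed $k \cdot d$ effective embedding dimension. The sketch below (not code from the thesis; the layer width, depth, step size $h$, and tanh activation are illustrative assumptions) shows the first-order case, where a skip-connection layer acts as one forward Euler step of the dynamical system $\dot{x} = \tanh(Wx + b)$:

```python
import numpy as np

def residual_layer(x, W, b, h=0.1):
    """One skip-connection (residual) layer read as a forward Euler step
    of the continuous-time system x'(t) = tanh(W x + b):
        x_{t+1} = x_t + h * tanh(W x_t + b)."""
    return x + h * np.tanh(W @ x + b)

# Illustrative parameters (assumed, not from the thesis):
# d nodes per layer, 10 layers of depth, small random weights.
rng = np.random.default_rng(0)
d = 4
x = rng.standard_normal(d)        # input state
for _ in range(10):
    W = 0.1 * rng.standard_normal((d, d))
    b = 0.1 * rng.standard_normal(d)
    x = residual_layer(x, W, b)   # hidden state evolves like a discretized ODE
print(x)
```

Without the `x +` skip term, the same loop is an ordinary feed-forward stack; the additive connection is what turns depth into discrete time steps of a dynamical system.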