Principles of Riemannian Geometry in Neural Networks

Open Access
Author:
Hauser, Michael B
Graduate Program:
Mechanical Engineering
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
March 30, 2018
Committee Members:
  • Asok Ray, Dissertation Advisor and Committee Chair
  • Shashi Phoha, Committee Member
  • Daniel Connell Haworth, Committee Member
  • Kenneth Jenkins, Outside Member
Keywords:
  • Neural networks
  • Algebraic geometry
  • Riemannian manifolds
  • Machine learning
  • Non-abelian gauge theories
Abstract:
The first part of this dissertation treats neural networks as geometric transformations acting on the coordinate representations of the underlying data manifold from which the data is sampled. It forms part of an attempt to construct a formalized general theory of neural networks in the setting of algebraic and Riemannian geometry. Geometry allows for a rigorous formulation, and therefore understanding, of neural networks as they act on data manifolds, which is of great importance as these tools become ubiquitous in engineering applications. From this perspective, the following theoretical results are developed and proven for feedforward networks. First, it is shown that residual neural networks are finite difference approximations to dynamical systems of first-order differential equations, as opposed to ordinary networks, which are static; this implies that a residual network learns systems of differential equations governing the coordinate transformations that represent the data. Second, it is shown that a closed-form solution for the metric tensor on the underlying data manifold can be found by backpropagating the coordinate representation through the neural network. This is formulated abstractly as a sequence of pullback Lie group actions on the metric fibre space in the principal and associated bundles over the data manifold, where backpropagation is shown to be the pullback of tensor bundles. A model based on perturbation theory is also developed and used to understand how neural networks treat testing data differently from training data.

The second part of this dissertation uses neural networks to forecast probability distributions of time series in terms of discrete symbols quantized from real-valued data. The developed framework casts forecasting into a probabilistic paradigm of predicting densities over symbols. The main advantage of formulating probabilistic forecasting in the symbolic setting is that density predictions are obtained without significantly restrictive assumptions, such as reliance on second-order statistics alone. The efficacy of the proposed method is demonstrated by forecasting probability distributions on chaotic time series data collected from a laboratory-scale experimental apparatus.
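The first theoretical result can be made concrete with a short sketch. The residual update x_{l+1} = x_l + h f(x_l) is exactly a forward-Euler step of the ordinary differential equation dx/dt = f(x) with step size h. The code below is an illustration assumed for this entry, not code from the dissertation; the layer function f, the step size, and the dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical residual branch f(x) = tanh(W x + b); W and b are illustrative.
W = rng.standard_normal((3, 3)) * 0.1
b = rng.standard_normal(3) * 0.1

def f(x):
    """Residual branch; also the vector field of the ODE dx/dt = f(x)."""
    return np.tanh(W @ x + b)

def residual_block(x, h=1.0):
    """One residual block: x_{l+1} = x_l + h * f(x_l)."""
    return x + h * f(x)

def euler_step(x, h):
    """Forward-Euler step of dx/dt = f(x) with step size h."""
    return x + h * f(x)

x0 = rng.standard_normal(3)

# With the same step size the two updates coincide, so a stack of
# residual blocks is a finite difference discretization of the flow.
print(np.allclose(residual_block(x0, h=0.1), euler_step(x0, h=0.1)))  # True
```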
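The second result states that the metric seen in the input coordinates is the pullback g = Jᵀ g′ J of the output-space metric g′ through the layer Jacobian J, which is the same quantity that backpropagation transports through the network. Below is a minimal numerical sketch under assumptions made for this entry: a single tanh layer and a Euclidean metric on the output coordinates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical coordinate map phi(x) = tanh(W x + b).
W = rng.standard_normal((3, 3))
b = rng.standard_normal(3)

def phi(x):
    return np.tanh(W @ x + b)

def jacobian(x):
    """Analytic Jacobian of phi: diag(1 - tanh(Wx+b)^2) @ W."""
    s = np.tanh(W @ x + b)
    return (1.0 - s**2)[:, None] * W

x = rng.standard_normal(3)
J = jacobian(x)

g_out = np.eye(3)       # Euclidean metric in the output coordinates
g_in = J.T @ g_out @ J  # pullback metric in the input coordinates

# Sanity check: to first order, the squared length of a small displacement
# measured by g_out after the map equals its length measured by g_in before.
v = 1e-6 * rng.standard_normal(3)
d = phi(x + v) - phi(x)
print(np.allclose(d @ g_out @ d, v @ g_in @ v, rtol=1e-3))  # True
```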
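For the second part, the pipeline is: quantize the real-valued series into a finite alphabet of symbols, then forecast a probability distribution over the next symbol. The sketch below substitutes a simple count-based conditional model for the dissertation's neural network forecaster, plainly as a stand-in; the logistic-map data, alphabet size, and uniform partition are all illustrative assumptions.

```python
import numpy as np

# Illustrative chaotic series: the logistic map (a stand-in for the
# laboratory-scale experimental data used in the dissertation).
x = np.empty(5000)
x[0] = 0.3
for t in range(1, len(x)):
    x[t] = 3.9 * x[t - 1] * (1.0 - x[t - 1])

# Quantize into an alphabet of K symbols using a uniform partition.
K = 8
edges = np.linspace(x.min(), x.max(), K + 1)[1:-1]
s = np.digitize(x, edges)  # symbol sequence in {0, ..., K-1}

# Count-based estimate of p(next symbol | current symbol), with Laplace
# smoothing; the dissertation uses a neural network in place of this.
counts = np.ones((K, K))
for a, b in zip(s[:-1], s[1:]):
    counts[a, b] += 1
p_next = counts / counts.sum(axis=1, keepdims=True)

# Forecast the density over the next symbol given the last observed one.
print("forecast distribution:", np.round(p_next[s[-1]], 3))
```

Because the forecast is a full distribution over symbols rather than a point estimate, no Gaussian or other second-order assumption is needed to describe the predictive density.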