Coordinated Local Learning Algorithms for Continuously Adaptive Neural Systems

Open Access
- Author:
- Ororbia, Alexander Gabriel
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 15, 2018
- Committee Members:
- C. Lee Giles, Dissertation Advisor/Co-Advisor
C. Lee Giles, Committee Chair/Co-Chair
David Reitter, Committee Member
Vasant Honavar, Committee Member
Daniel Kifer, Outside Member
- Keywords:
- artificial intelligence
artificial neural networks
local learning
neurocognitive
machine learning
lifelong learning
- Abstract:
- It is common statistical learning practice to build models, nowadays largely connectionist models, on very large, static, fully annotated datasets of independently and identically distributed samples. However, this setup raises several questions that expose the brittleness of models fit to such datasets. What if the data contains few, if any, labels? Labels are generally difficult to come by, and fully and properly annotating a dataset with ground truth requires intensive human labor. Alternatively, what if the data is sequential in nature and contains dependencies that span large gaps of time? When modeling the characters of text, for example, a model must learn how to spell words as well as how to arrange them coherently in order to produce a meaningful sentence. Finally, what if the distribution to be learned is dynamic and samples are presented over time, whether drawn from a stream or from a sequence of task datasets? A model must learn to predict the future well and yet retain previously acquired knowledge. In these settings, traditional machine learning approaches no longer directly apply. Motivated by these questions, we ultimately address the problem of lifelong learning and the nature of systems that adapt themselves to such distributions. The goal of this thesis is to propose a new family of algorithms that enable connectionist systems to operate in the lifelong learning setting. In this dissertation, we specifically: 1) propose a class of deep neural architectures, along with their learning and inference procedures, that are capable of learning from data with few class labels, 2) develop a novel framework for incorporating simple and effective forms of long-term memory into recurrent neural networks so that they better capture long-term dependencies in sequential data, and 3) propose a new family of learning algorithms motivated by neurocognitive theory, and develop an architecture that can learn without unfolding over time. Finally, we investigate the challenging problem of sequential and cumulative learning. The work carried out in this dissertation is meant to serve as a crucial stepping stone toward tackling the greatest challenge facing machine learning and artificial intelligence research: developing an agent that can continually learn.
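To make the third contribution concrete, the sketch below illustrates the general idea of training a recurrent layer with purely local, per-step updates, in contrast to backpropagation through time, which must store past activations and unroll the network. This is a minimal illustration only, not the dissertation's actual algorithm: the delta-rule-style update, the reuse of the output weights as a feedback path, and all names and sizes are assumptions chosen for clarity.

```python
# Minimal sketch (illustrative assumptions throughout): a recurrent layer
# trained with a local, per-step rule instead of backpropagation through time.
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 8, 16  # toy sizes

# Weights held in a dict so the step function can mutate them in place.
W = {
    "in":  rng.normal(0.0, 0.1, (n_hid, n_in)),   # input  -> hidden
    "rec": rng.normal(0.0, 0.1, (n_hid, n_hid)),  # hidden -> hidden
    "out": rng.normal(0.0, 0.1, (n_in, n_hid)),   # hidden -> output
}

def step(W, x_t, y_t, h_prev, lr=0.01):
    """One time step: forward pass plus purely local weight updates.

    Every quantity used to change a weight is available at this step,
    so no past activations are stored and the network is never unrolled.
    """
    h = np.tanh(W["in"] @ x_t + W["rec"] @ h_prev)  # hidden state
    y_hat = W["out"] @ h                            # prediction
    e_out = y_hat - y_t                             # output error, local to W["out"]
    # Project the error one layer down to form a local target for h; a fixed
    # random feedback matrix would also serve here (cf. feedback alignment).
    e_hid = (W["out"].T @ e_out) * (1.0 - h ** 2)   # gated by tanh'(h)

    # Delta-rule / Hebbian-style outer-product updates, all local in time:
    W["out"] -= lr * np.outer(e_out, h)
    W["in"]  -= lr * np.outer(e_hid, x_t)
    W["rec"] -= lr * np.outer(e_hid, h_prev)
    return h

# Toy usage: train the layer online to echo the previous input.
h = np.zeros(n_hid)
xs = rng.normal(size=(200, n_in))
for t in range(1, len(xs)):
    h = step(W, xs[t], xs[t - 1], h)
```

Because each update uses only the current step's activations and errors, memory cost is constant in sequence length; this is the practical appeal of learning without unfolding over time that the abstract alludes to.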