Electronic Theses and Dissertations for Graduate School
SAVING COMPUTATIONS BY EARLY INFERENCE TERMINATION
Restricted (Penn State Only)
Computer Science and Engineering
Master of Science
Date of Defense: November 17, 2017
Chitaranjan Das, Thesis Advisor
Vijaykrishnan Narayanan, Committee Member
John Morgan Sampson, Committee Member
Keywords: Deep Neural Network, Convolutional Neural Network
Machine learning algorithms have seen a revival and rapid growth in popularity due to the recent increase in training data and in the processing capability of computers. They are used in many tasks, including image classification, object detection, and speech recognition. Deep neural networks (DNNs) can be trained to achieve high inference accuracy, and deeper networks generally yield higher overall accuracy, but they also incur greater total computation. However, most networks are flat n-way classifiers, which expend equal effort on every class in a dataset. This thesis proposes a framework to identify subsets of classes that can be classified with high accuracy using only features extracted from earlier network layers, in order to reduce the average computational cost of inference. We apply our framework to the MNIST and CIFAR-10 datasets and demonstrate how our approach makes these networks more amenable to deployment on compute-limited endpoint devices. We show up to 52% computation savings (42% latency reduction) for CIFAR-10 with accuracy losses of no more than 1.8%.
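The core idea of terminating inference early when an intermediate classifier is already confident can be sketched as follows. This is an illustrative example only, not the thesis's actual framework: the layer functions, the confidence threshold, and the two-head structure are all hypothetical stand-ins for the real network and class-subset identification procedure.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_predict(x, early_layers, early_head,
                       late_layers, late_head, threshold=0.9):
    """Run the early layers first; if the early classification head is
    confident enough (max probability >= threshold), terminate inference
    and skip the remaining, more expensive layers."""
    h = x
    for layer in early_layers:          # cheap, shallow feature extraction
        h = layer(h)
    p_early = softmax(early_head(h))
    if p_early.max() >= threshold:      # confident: stop here, save compute
        return int(p_early.argmax()), "early"
    for layer in late_layers:           # otherwise run the deep remainder
        h = layer(h)
    p_late = softmax(late_head(h))
    return int(p_late.argmax()), "late"

# Toy usage with identity "layers" and identity "heads" (hypothetical):
early_layers = [lambda h: h]
late_layers = [lambda h: h * 10.0]
identity_head = lambda h: h

# A strongly separable input exits early; an ambiguous one does not.
pred_a, path_a = early_exit_predict(np.array([5.0, 0.0]),
                                    early_layers, identity_head,
                                    late_layers, identity_head)
pred_b, path_b = early_exit_predict(np.array([0.2, 0.0]),
                                    early_layers, identity_head,
                                    late_layers, identity_head)
```

Averaged over a workload, inputs that take the early path pay only for the shallow layers, which is the source of the computation savings the abstract reports.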