Audio Recognition with Spiking Neural Networks

Open Access
- Author:
- Sargent, Kevin
- Graduate Program:
- Physics
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- July 30, 2024
- Committee Members:
- Mauricio Terrones, Program Head/Chair
- Dezhe Jin, Chair & Dissertation Advisor
- Abhronil Sengupta, Outside Field Member
- Vladimir Itskov, Outside Unit Member
- Ying Liu, Major Field Member
- Keywords:
- speech recognition
- spiking neural networks
- neuromorphic computing
- Abstract:
- Automatic speech recognition is one of the biggest challenges in machine learning. Despite significant advances, current models require large training datasets and long training times. This brings significant energy requirements, as well as the need for a connection to a cloud server to run the models. Neuromorphic computing has the potential to solve these problems by providing a fundamentally new way of approaching machine learning. In this work, we demonstrate template-matching-based birdsong syllable recognition. Spikes generated by a set of Support Vector Machines are sent through a Spiking Neural Network that uses synaptic delays to encode template sequences. Unlike traditional CPU implementations, the neuromorphic templates run asynchronously and in parallel, so every template is checked simultaneously. This enables us to encode the large number of templates necessary for accurate recognition without the increase in latency traditionally required as a trade-off. The neuromorphic template-matching network achieves performance competitive with the CPU implementation at significantly reduced latency. Finally, the network developed on birdsong is demonstrated on a proof-of-concept isolated, single-speaker word recognition task. These results show that neuromorphic computing is capable of performing powerful computations and demonstrate the final step of a potential end-to-end neuromorphic speech recognition system.
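
The idea of encoding a template sequence in synaptic delays can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: it assumes discrete time steps, unit synaptic weights, one input channel per classifier output, and a simple threshold unit per template. The names `build_template` and `match_times` are hypothetical. Each template position gets a delay chosen so that, when the input spikes arrive in the template's order, all delayed spikes reach the template neuron at the same time step and their sum crosses the threshold.

```python
from collections import defaultdict


def build_template(labels):
    """Turn a label sequence into (channel, delay) synapses.

    Position i gets delay (len - 1 - i), so a matching input sequence
    produces spikes that all arrive at the same time step.
    """
    last = len(labels) - 1
    return [(label, last - i) for i, label in enumerate(labels)]


def match_times(spikes, synapses, threshold):
    """Return the sorted time steps at which the template neuron fires.

    spikes: iterable of (time, channel) input events.
    The neuron fires when the number of delayed spikes arriving at one
    time step reaches the threshold (unit weights are an assumption).
    """
    arrivals = defaultdict(int)
    for t, channel in spikes:
        for syn_channel, delay in synapses:
            if channel == syn_channel:
                arrivals[t + delay] += 1
    return sorted(t for t, count in arrivals.items() if count >= threshold)


# Hypothetical input: channel 0 spikes at t=0, channel 1 at t=1, channel 2 at t=2.
spikes = [(0, 0), (1, 1), (2, 2)]

# Two illustrative templates; each is evaluated independently of the other,
# which is what allows them to be checked in parallel on suitable hardware.
templates = {"rising": [0, 1, 2], "falling": [2, 1, 0]}
results = {
    name: match_times(spikes, build_template(labels), threshold=len(labels))
    for name, labels in templates.items()
}
```

In this sketch the "rising" template fires once its last spike arrives, while the "falling" template never reaches threshold because its delayed spikes land on different time steps. Because no template depends on another's result, adding templates adds neurons rather than sequential work.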