Evolutionary-based Feature Extraction for Gesture Recognition Using a Motion Camera

Open Access
Author:
Ahn, Eun Yeong
Graduate Program:
Information Sciences and Technology
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
April 30, 2012
Committee Members:
  • John Yen, Dissertation Advisor
  • John Yen, Committee Chair
  • Dinghao Wu, Committee Member
  • Dongwon Lee, Committee Member
  • Patrick M Reed, Committee Member
  • Tracy Mullen, Special Member
Keywords:
  • Global Feature Extraction
  • Local Feature Extraction
  • DVS camera
  • Evolutionary Algorithm
  • Segmentation
Abstract:
Gesture recognition systems have garnered increasing interest for their potential to support more natural human-computer interaction. However, compared to other human-computer interaction technologies such as speech recognition, gesture recognition has not been widely applied to personal devices such as mobile phones or laptops, due to the spatial requirements of performing gestures as well as sensitivity to background noise. My research first formulates the problem of recognizing speed-sensitive finger gestures using a novel camera called the Dynamic Vision Sensor (DVS) camera, which detects the temporal luminance difference at each pixel with microsecond-level granularity and outputs a stream of on-events (brighter) and off-events (darker) to the hardware. As with other machine learning problems, the performance of a gesture classification task depends on how well representative features are extracted. The feature extraction process must therefore exploit device-specific data properties to maximize discriminative power while minimizing computational cost. My research studies two feature extraction methods, local and global, designed to maximize the performance of the DVS camera-based gesture recognition system.

First, the local feature extraction method aims to extract a small number of representative features from a long sequence of raw gesture events detected by the DVS camera using segmentation. This approach is called local feature extraction because each feature is computed from neighboring events only. Specifically, I propose bottom-up segmentation methods in which the event sequence is first divided into segments spanning equal time intervals (time-based segmentation) or containing equal numbers of events (event-based segmentation), and the segments are then repeatedly merged according to the event distributions of neighboring segments.
The experimental results show that event-based initial segmentation outperforms time-based segmentation across different classifiers and is more robust to noise. I also found that a Bayesian network classifier is more accurate than a hidden Markov model when features are well extracted using event-based segmentation. Second, the global feature extraction method aims to construct higher-level compound features by transforming the locally extracted features. Specifically, an evolutionary algorithm (EA) is employed to find a good set of simple and compound features. This is a challenging task due to the large search space and the risk of overfitting. I define a problem-specific representation, genetic operators, and evaluation methods, and analyze how the mutation and crossover operators control an individual's search space. The experimental results show that the proposed EA can extract a set of compound features that improves classification accuracy with a smaller number of features. Finally, I show how my evolutionary-based feature extraction approach can serve as a knowledge discovery process in the context of gesture recognition.
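The general shape of such an evolutionary feature search can be sketched as below. This is a generic, hypothetical sketch under simplifying assumptions: individuals are plain binary feature masks, fitness is an arbitrary user-supplied function, and the operators are textbook bit-flip mutation and one-point crossover. The dissertation's actual problem-specific representation, compound-feature construction, and operator designs differ from this.

```python
import random

# Hypothetical sketch of an evolutionary search over feature subsets.
# An individual is a binary mask over candidate features; fitness_fn
# stands in for a classifier-based evaluation (a placeholder here).

def mutate(individual, rate=0.1):
    # Flip each bit (include/exclude a feature) with probability `rate`.
    return [bit ^ 1 if random.random() < rate else bit for bit in individual]

def crossover(a, b):
    # One-point crossover: prefix from one parent, suffix from the other.
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:]

def evolve(num_features, fitness_fn, pop_size=20, generations=30):
    pop = [[random.randint(0, 1) for _ in range(num_features)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness_fn, reverse=True)
        elite = pop[: pop_size // 2]          # keep the fitter half
        children = [mutate(crossover(random.choice(elite), random.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=fitness_fn)
```

A fitness function that rewards accuracy while penalizing the number of selected features captures, in miniature, the abstract's goal of higher accuracy with fewer features; the elitist selection also illustrates why overfitting is a risk, since fitness is measured on the same evaluation data throughout the search.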