A GPU based implementation of Center Surround Distribution Distance Algorithm for Feature Recognition

Open Access
Rathi, Aditi
Graduate Program:
Electrical Engineering
Master of Science
Document Type:
Master Thesis
Date of Defense:
September 29, 2009
Committee Members:
  • Vijaykrishnan Narayanan, Thesis Advisor
  • Kenneth Jenkins, Thesis Advisor
Keywords:
  • HPC
  • Acceleration
  • CUDA
  • CSDD
  • GPU
Abstract:
General-purpose GPU programming environments such as NVIDIA CUDA provide broad access to computing performance that was once available only on supercomputers. The availability of such computational power has fostered the creation and redeployment of algorithms, new and old, creating entirely new classes of applications. This thesis presents a GPU implementation of the Center-Surround Distribution Distance (CSDD) algorithm for feature recognition within images and video frames. While an optimized CPU implementation requires anywhere from several seconds to tens of minutes to analyze an image, the GPU-based approach has the potential to improve on this by up to 28X within acceptable accuracy. The thesis also presents a scalable parallel computing model for the CSDD application and quantifies the impact of different CUDA optimizations on it. The experiments carried out in the course of this implementation exercise almost all of the GPU's capabilities and expose its limitations for a non-traditional problem like CSDD. The implementation shows promise of achieving real-time speeds given enhanced CUDA provisions for synchronization (the design bottleneck for CSDD), faster access to GPU memories (the performance bottleneck for CSDD), and faster double-precision computation (the computational speed bottleneck for CSDD, because of the limited number of double-precision units per SM). This work thus also establishes the suitability of GPUs for similar data-intensive and data-dependent problems.