Optimization and Hardware Acceleration of Consensus-based Matching and Tracking

Open Access
Snyder, Joshua Scott
Graduate Program:
Computer Science and Engineering
Master of Science
Document Type:
Master Thesis
Date of Defense:
March 31, 2015
Committee Members:
  • Vijaykrishnan Narayanan, Thesis Advisor
Keywords:
  • hardware acceleration
  • object tracking
  • FPGA
  • CMT
  • hardware architecture
  • vision system
  • computer vision
Abstract:
Image and video understanding has become an increasingly valuable capability for many emerging applications such as smart retail, intelligent surveillance, and autonomous robotic systems. The critical barrier to enabling these applications is the high execution latency of complex vision tasks, which makes real-time system constraints difficult, or impossible, to meet. One specific instance of a complex vision task is object tracking, which is the focus of this thesis. Object tracking is a necessary component of grocery shopping assistance applications that track a grocery item and a person’s hand and guide the hand to the item to pick it up. Although there are many object tracking algorithms to choose from, this work investigates the performance bottlenecks and optimizations of the Consensus-based Matching and Tracking (CMT) algorithm. To circumvent the limitations of standard optical-flow-based trackers, CMT uses a descriptor matching step to redetect an object’s key features that would be permanently lost in the standard approach. This allows an object to be hidden or occluded from view and redetected once it reappears in the view of the camera. For fully autonomous systems, in which re-initialization of a failed object track may be impossible or prohibitively costly, robustness of the tracker is of critical importance. As such, this work introduces an enhanced version of the CMT algorithm that exhibits improvements in accuracy and robustness as evaluated against a standardized benchmark. The improvement in accuracy and robustness of the enhanced CMT comes at the cost of a significant increase in computational latency. Accordingly, this work also proposes a hybrid system that integrates high-performance custom hardware accelerators with a traditional processor to alleviate these new performance bottlenecks and to support real-time throughput.