Human Body Pose Estimation and Segmentation with Hierarchical, Data-driven MCMC

Open Access
- Author:
- Rauschert, Ingmar Peter
- Graduate Program:
- Computer Science and Engineering
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- November 22, 2013
- Committee Members:
- Robert T Collins, Thesis Advisor/Co-Advisor
- Keywords:
- Human Body Pose
- Pedestrian
- Image Segmentation
- MCMC
- Bayesian Probability
- Computer Vision
- Abstract:
- We tackle the problem of human body detection, segmentation, and pose estimation in still images, which has applications in initializing an articulated human body tracker, action recognition, and image processing. The goal is to generate pixel-level segmentations of the body and its individual parts, along with estimates of the joint locations of an articulated human body model. Large variation in body shape and clothing across subject, pose, and viewpoint, together with the unpredictable nature of the scene background and the lack of temporal information, makes this a very challenging problem. We review the broad field of human pose estimation, discuss relevant work that has studied this problem from various perspectives, and identify several shortcomings in the current state of the art. The problem of human pose estimation is then formulated within a Bayesian framework, using a generative model approach. The key aspect of this approach lies in the joint estimation of body shape and image segmentation. Together, the multi-label image segmentation and the generative body model extract potential body part segments from the image, which are simultaneously evaluated against the expected shape and color distribution of a particular, pose-specific body model. This approach circumvents several shortcomings of previous work, such as the reliance on a local search over model parameters, which must be initialized with pose parameters close to the true body pose. Instead, this work uses a global search framework based on Markov Chain Monte Carlo (MCMC) sampling, an iterative hypothesize-and-test procedure that explores the vast search space of body poses and foreground/background segmentations to produce a generative explanation of the observed image data.
To search the large space of body poses effectively, a data-driven proposal function and a coarse-to-fine search scheme are employed to decompose the high-dimensional search space into a set of low-dimensional ones. This work further explores the particular application of estimating the walking direction of pedestrians and their style of clothing. Strong prior knowledge of possible body poses can then be used to constrain the vast parameter space of the employed body model more effectively. A quantitative evaluation of this work, in terms of the accuracy of pixel-level segmentations, estimated joint locations, and the walking direction of pedestrians, is performed on several public human pose datasets. Results demonstrate that a global, stochastic model evaluation using only a simple color likelihood function yields surprisingly good results, outperforming other approaches that are deterministic and bottom-up.
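The hypothesize-and-test loop with a data-driven proposal described in the abstract can be illustrated with a minimal Metropolis-Hastings sketch. This is not the thesis implementation: the one-dimensional `log_target` is a toy stand-in for the posterior over pose and segmentation, the `cues` list stands in for bottom-up image evidence, and every function name and parameter value below is a hypothetical choice made for illustration.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a 1-D Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def log_target(x):
    """Toy unnormalized log-posterior with two modes (assumed, not the thesis model)."""
    density = 0.3 * normal_pdf(x, -4.0, 0.5) + 0.7 * normal_pdf(x, 3.0, 0.5)
    return math.log(density) if density > 0.0 else -math.inf


def proposal_pdf(x_new, x_old, cues, w_local, local_std, cue_std):
    """Mixture density q(x_new | x_old): local random walk plus cue-centred jumps."""
    cue_part = sum(normal_pdf(x_new, c, cue_std) for c in cues) / len(cues)
    return w_local * normal_pdf(x_new, x_old, local_std) + (1.0 - w_local) * cue_part


def sample_proposal(x_old, cues, w_local, local_std, cue_std, rng):
    """Hypothesize: perturb the current state or jump near a data-driven cue."""
    if rng.random() < w_local:
        return rng.gauss(x_old, local_std)
    return rng.gauss(rng.choice(cues), cue_std)


def metropolis_hastings(n_steps, x0, cues, seed=0,
                        w_local=0.5, local_std=0.5, cue_std=1.0):
    """Run the chain and return the state visited at every step."""
    rng = random.Random(seed)
    x, log_p = x0, log_target(x0)
    samples = []
    for _ in range(n_steps):
        x_new = sample_proposal(x, cues, w_local, local_std, cue_std, rng)
        log_p_new = log_target(x_new)
        if log_p_new > -math.inf:
            # Test: the Hastings ratio uses the full mixture density in both directions.
            q_fwd = proposal_pdf(x_new, x, cues, w_local, local_std, cue_std)
            q_rev = proposal_pdf(x, x_new, cues, w_local, local_std, cue_std)
            log_alpha = log_p_new - log_p + math.log(q_rev) - math.log(q_fwd)
            if math.log(1.0 - rng.random()) < log_alpha:
                x, log_p = x_new, log_p_new
        samples.append(x)
    return samples


if __name__ == "__main__":
    chain = metropolis_hastings(20000, x0=-4.0, cues=[-4.2, 3.1], seed=1)
    right = sum(s > 0.0 for s in chain) / len(chain)
    print(f"fraction of samples in the right-hand mode: {right:.2f}")
```

In this toy setting the cue-centred jumps let the chain move between well-separated modes that a purely local random walk would rarely cross, which is the intuition behind combining a global stochastic search with data-driven proposals. In an actual pose estimator the state would be a high-dimensional pose and segmentation hypothesis rather than a scalar.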