Mixture Model Learning with Instance-level Constraints

Open Access
Author:
Zhao, Qi
Graduate Program:
Electrical Engineering
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
April 07, 2005
Committee Members:
  • David Jonathan Miller, Committee Chair
  • Constantino Manuel Lagoa, Committee Member
  • George Kesidis, Committee Member
  • Jia Li, Committee Member
Keywords:
  • semi-supervised learning with instance-level constraints
  • mixture modeling with constraints
  • constrained clustering
  • image segmentation with background information
Abstract:
Machine learning traditionally comprises two categories of methods: supervised learning and unsupervised learning. In recent years a third paradigm, semi-supervised learning, has attracted increasing interest, driven by large data sets with only partial label information in domains such as web search, text classification, and machine vision. However, much prior knowledge or problem-specific information cannot be expressed with labels and hence cannot be used by existing semi-supervised learning methods. In other words, we require extensions of semi-supervised learning that can encode these types of auxiliary information; of particular interest is information that can be encoded as instance-level constraints. This dissertation presents a mixture model-based method able to make use of domain-specific information to improve clustering. The domain-specific information is cast in the form of instance-level constraints, i.e., pairwise sample constraints. Most prior work on semi-supervised clustering with constraints assumes the number of classes is known, with each learned cluster assumed to be a class and hence subject to the given class constraints. When the number of classes is unknown or when the "one-cluster-per-class" assumption is not valid, the use of constraints may actually be deleterious to learning the ground-truth data groups. The proposed method addresses this by 1) allowing allocation of multiple mixture components to individual classes and 2) estimating both the number of components and the number of classes. The method also addresses new class discovery, with components devoid of constraints treated as putative unknown classes. The improvement in clustering performance for the case of partially labeled data sets is also illustrated. We also explore discriminative learning with instance-level constraints in this dissertation.
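To make the idea of pairwise sample constraints concrete, the following is a minimal toy sketch of EM for a 1-D Gaussian mixture in which must-link pairs have their responsibilities averaged after each E-step, softly pushing linked samples into the same component. This is an illustrative heuristic only, not the dissertation's actual algorithm; all function and variable names here are hypothetical.

```python
import numpy as np

def constrained_gmm_em(X, must_links, n_components=2, n_iter=50, seed=0):
    """Toy EM for a 1-D Gaussian mixture with must-link pairs.

    After each E-step, the responsibilities of must-linked samples are
    replaced by their average, encouraging the pair to share a component.
    Illustrative sketch only, not the dissertation's method.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    # Initialize means from random samples; unit variances; uniform weights.
    mu = rng.choice(X, n_components, replace=False).astype(float)
    var = np.ones(n_components)
    pi = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: component responsibilities under current parameters.
        dens = (pi * np.exp(-0.5 * (X[:, None] - mu) ** 2 / var)
                / np.sqrt(2 * np.pi * var))
        r = dens / dens.sum(axis=1, keepdims=True)
        # Constraint step: must-linked pairs share averaged responsibilities.
        for i, j in must_links:
            avg = 0.5 * (r[i] + r[j])
            r[i] = r[j] = avg
        # M-step: standard weighted parameter updates.
        nk = r.sum(axis=0)
        mu = (r * X[:, None]).sum(axis=0) / nk
        var = (r * (X[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        pi = nk / n
    return mu, var, pi, r
```

Note that this toy version still fixes the number of components in advance and equates clusters with classes; relaxing exactly those assumptions is what the proposed method contributes.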
Our proposed discriminative method assumes a family of discriminant functions specified by a set of parameters and uses the minimum relative entropy principle to find a distribution over these parameters. The final decision rule is obtained by averaging over all classifiers. The second major contribution of this dissertation is in image segmentation. The domain-specific information in images is spatial continuity, which can also be converted into instance-level constraints. Applying the proposed constrained mixture-model method to image segmentation yields a standard Markov random field potential objective function. Exploiting the structure of the constraints, a sequence-based forward/backward algorithm, i.e., a novel structured mean-field method, is presented; it obtains better performance than a standard mean-field annealing algorithm. An investigation of model selection techniques is another contribution of this dissertation. In the proposed mixture-model-based method, the number of clusters is determined via model selection. We provide an integrated learning and model selection framework that performs batch optimization over the components and has the character of deterministic annealing, with optimization performed over a sequence of decreasing "temperatures". By initializing a large number of components and retaining only the well-initialized ones, this method is less sensitive to local optima than the standard EM algorithm.
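The annealed model-selection idea can be sketched as follows: start with many components, run EM with responsibilities tempered by an inverse temperature beta (low beta flattens the posteriors, corresponding to high temperature), then prune components whose mixing weight collapses as beta increases. This is a generic deterministic-annealing-style sketch under assumed names; it does not reproduce the dissertation's integrated model-selection criterion.

```python
import numpy as np

def anneal_gmm(X, n_init=10, betas=(0.2, 0.5, 1.0),
               prune_tol=1e-2, iters=30, seed=0):
    """Deterministic-annealing-flavored EM sketch for a 1-D Gaussian
    mixture: many initial components, tempered E-steps at increasing
    beta (decreasing temperature), pruning of weak components.
    Illustrative only."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(X, n_init, replace=False).astype(float)
    var = np.full(n_init, X.var())
    pi = np.full(n_init, 1.0 / n_init)
    for beta in betas:  # increasing beta = decreasing temperature
        for _ in range(iters):
            # Tempered E-step: log-densities scaled by beta, so low beta
            # smooths responsibilities toward uniform.
            logp = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                    - 0.5 * (X[:, None] - mu) ** 2 / var)
            w = beta * logp
            w -= w.max(axis=1, keepdims=True)  # numerical stability
            r = np.exp(w)
            r /= r.sum(axis=1, keepdims=True)
            # Standard M-step.
            nk = r.sum(axis=0)
            mu = (r * X[:, None]).sum(axis=0) / nk
            var = (r * (X[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
            pi = nk / len(X)
        # Prune weakly supported components before the next temperature.
        keep = pi > prune_tol
        mu, var, pi = mu[keep], var[keep], pi[keep] / pi[keep].sum()
    return mu, var, pi
```

Running EM over a temperature schedule in this way, rather than from a single random initialization, is what makes the procedure less dependent on the starting point, which is the property the abstract highlights.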