Keywords: visualization, high dimensionality, dimension reduction, subspace, Gaussian mixture model
Abstract:
We develop a new method for visualizing high-dimensional data via a Gaussian mixture model (GMM) whose component means are constrained to lie in a pre-selected subspace. An EM-type estimation algorithm is derived. We prove that, for a GMM with a common covariance matrix, the subspace containing the component means also contains the class means and the modes of the density. This motivates us to find the subspace by applying weighted principal component analysis to the class means and the modes. We further prove that the method has a dimension reduction property, in the sense of being informative for classification or clustering. Experiments on real data sets indicate that our method, with the simple technique of spanning the subspace by the class means only, often outperforms reduced-rank mixture discriminant analysis (MDA) when the subspace dimension is very low. Visualization results on independent test data show that the proposed method yields more distinct class-wise separation of high-dimensional data in 2D or 3D subspaces than reduced-rank MDA.
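As a rough illustration of the simplest variant mentioned above (spanning the subspace by class means only), the following Python sketch applies a weighted principal component analysis to the class means and projects the data onto the leading two directions. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `class_mean_subspace`, the choice of class proportions as weights, and the synthetic data are all hypothetical, and the density modes and the constrained EM-type estimation step are omitted entirely.

```python
import numpy as np

def class_mean_subspace(X, y, dim=2):
    """Weighted PCA on class means (illustrative only).

    Assumption: each class mean is weighted by its class proportion;
    the abstract does not specify the weighting scheme.
    Returns an orthonormal basis of shape (d, dim) and the weighted center.
    """
    classes, counts = np.unique(y, return_counts=True)
    weights = counts / counts.sum()
    # One mean vector per class, stacked as rows.
    means = np.vstack([X[y == c].mean(axis=0) for c in classes])
    # Weighted grand mean of the class means.
    center = weights @ means
    centered = means - center
    # Weighted covariance matrix of the class means (d x d).
    cov = (centered * weights[:, None]).T @ centered
    # eigh returns eigenvalues in ascending order; keep the largest `dim`.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:dim]
    return eigvecs[:, order], center

# Synthetic data (hypothetical): three classes in 10 dimensions.
rng = np.random.default_rng(0)
d = 10
true_means = np.zeros((3, d))
true_means[0, 0] = 5.0
true_means[1, 1] = 5.0
true_means[2, 0] = -5.0
X = np.vstack([rng.normal(m, 1.0, size=(100, d)) for m in true_means])
y = np.repeat(np.arange(3), 100)

basis, center = class_mean_subspace(X, y, dim=2)
Z = (X - center) @ basis  # 2D coordinates suitable for a scatter plot
```

With three classes the centered class means span at most two dimensions, so `dim=2` captures all of the between-mean variation in this toy setting. A higher number of classes, or the inclusion of modes, would make the choice of `dim` a genuine truncation.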