Abstract Principal Component Analysis And Applications To Model Reduction

Open Access
Li, Tianjiang
Graduate Program:
Doctor of Philosophy
Document Type:
Date of Defense:
November 11, 2009
Committee Members:
  • Qiang Du, Dissertation Advisor
  • Qiang Du, Committee Chair
  • Ludmil Tomov Zikatanov, Committee Member
  • Xiantao Li, Committee Member
  • Anna L Mazzucato, Committee Member
  • Runze Li, Committee Member
Keywords:
  • Banach operator algebra
  • standing and traveling waves
  • signal decomposition
  • extraction of transient modes
  • abstract principal component analysis
  • data mining
Abstract:
This thesis develops a general reduction framework for complex systems, aimed at analyzing complex mixing wave signals generated by certain dynamical systems. The framework, named Abstract Principal Component Analysis (APCA), extends classical Principal Component Analysis to abstract operator spaces. The work is motivated by the growing need to facilitate both theoretical modeling and numerical computation for high dimensional complex systems. Modern scientific research often relies on complicated theoretical analysis and heavy computation with high dimensional models and data sets. On many occasions, a suitable low dimensional, simplified model that retains only the dominant features is sufficient to demonstrate the most essential properties of its high dimensional counterpart. The feature extraction process raises various issues, such as identifying the most significant information, extracting information at different scales and reconstructing high dimensional signals. The primary focus of this work is the development of a general APCA framework and its application to model reduction problems. This framework can be used as a data driven approach to study mixing dynamics data sets arising from physical and informational problems. This thesis is organized as follows. First, we review some major existing methods for model reduction: classical Principal Component Analysis (PCA) and extensions such as Nonlinear PCA and Kernel PCA, Multidimensional Scaling (MDS), ISOMAP, Locally Linear Embedding (LLE), Diffusion Map, Independent Component Analysis (ICA), Compressive Sensing, Empirical Mode Decomposition (EMD) and Principal Interval Decomposition (PID).
One particular PCA extension, Generalized Principal Component Analysis (GPCA), is introduced in detail, since it serves as the foundation of the general APCA framework and shares many features with APCA. The principle and algorithm of each method are introduced, and its applications are illustrated with examples from existing references. We also present experiments that apply existing methods to mode extraction problems. The failure of existing methods on mode extraction motivates the development of APCA, which serves as a universal framework for certain mode extraction problems. The general APCA framework is developed in the basic setting of a Banach algebra that includes operator elements. Different specializations of APCA solve model reduction problems such as segmenting sample points into linear subspaces and decomposing mixing signals into single wave modes. Mode extraction is implemented in the local single stage setting, in which wave motions preserve characteristic features such as traveling speed and scaling parameters. Basic wave motion types include scaling in the dependent variable (standing wave), scaling in the independent variable (scaling wave) and moving with a fixed profile (traveling wave). Complex wave motions include compositions of these three basic motion types. Processing schemes for these composite wave modes are consistent with compositions of the processing schemes for single wave modes. There is an alternative optimization approach for certain procedures in mode extraction, which is equivalent to the corresponding procedures in APCA. Synthetic numerical examples are presented to demonstrate the performance of the mode decomposition algorithms. Some complex motions are in fact simple single stage motions under a suitable coordinate transformation, and some examples are shown for illustration.
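The difference between a standing wave and a traveling wave, and the difficulty that a classical linear decomposition has with the latter, can be seen in a small numerical sketch. The code below is not the APCA algorithm from the thesis. It assumes a Gaussian profile, a cosine amplitude, a speed of 1.5 and a 99% energy threshold, all chosen purely for illustration, and it uses an uncentered SVD of the snapshot matrix as a stand-in for classical PCA.

```python
import numpy as np

# Illustrative grid and profile (assumptions, not taken from the thesis).
x = np.linspace(-10.0, 10.0, 401)
t = np.linspace(0.0, 4.0, 81)


def profile(z):
    """Fixed wave profile f; a Gaussian is assumed for this sketch."""
    return np.exp(-z ** 2)


# Standing wave: scaling in the dependent variable, u(x, t) = a(t) f(x).
standing = np.array([np.cos(tk) * profile(x) for tk in t])

# Traveling wave: fixed profile moving at speed c, u(x, t) = f(x - c t).
c = 1.5
traveling = np.array([profile(x - c * tk) for tk in t])


def modes_needed(snapshots, energy=0.99):
    """Number of singular modes needed to capture `energy` of the snapshot energy."""
    s = np.linalg.svd(snapshots, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(cumulative, energy) + 1)


print("standing wave modes: ", modes_needed(standing))
print("traveling wave modes:", modes_needed(traveling))
```

The standing wave is a rank one snapshot matrix, so a single linear mode describes it, while the traveling wave spreads its energy over several linear modes even though it is a single wave mode in the sense used above.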
Applications of the APCA model reduction framework are illustrated primarily with partial differential equations that describe wave signal propagation, such as the one dimensional Burgers' equation and the two dimensional Kadomtsev-Petviashvili (KP) equation. These equations admit solutions that are approximately superpositions of single independent wave modes. The most commonly discussed wave modes are tanh shaped modes for the Burgers' equation and soliton modes for the KP equation. We illustrate decomposition results by comparing mixing signals with the corresponding decomposed single modes. With mode characteristic parameters and mode functions reconstructed over a short time, long time solution behavior can be predicted without solving the partial differential equations over a long time period. In order to extract independent signal mode information from the global complex mode evolution, we have developed several techniques and additional processing procedures. Global rigid motion reduction is introduced, together with some earlier work designed to facilitate molecular dynamics simulations. This global rigid motion reduction serves as a preprocessing procedure on the training data. In the presence of noise, mode extraction results involve a moderate level of error. Attention is given to signal processing issues such as overestimation of the number of independent signal modes, estimation error in mode characteristic parameters and mode reconstruction error. Mode number selection schemes are used to choose the number of independent modes. An optimization approach can provide a more robust processing scheme for solving for mode parameters. We use different snapshots as multi-stage mode evolutions and use mode function alignment as a post-processing procedure to reduce noise effects on mode profiles. Global multi-stage complex motions are approximated by local single stage motions on different subintervals in time.
Multi-stage motion parameters are approximated by piecewise constant local single stage motion parameters. Multi-stage complex wave motions are approximated by piecewise single stage motions, and the approximation accuracy is determined by comparing the reconstructed snapshots with the original snapshots. Finally, we present conclusions, along with some processing issues and directions for future studies.
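The idea of approximating a multi-stage motion by piecewise constant single stage parameters can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the thesis: a tanh shaped front (the mode shape associated above with the Burgers' equation) travels at speed 1 and then at speed 3, the shift between consecutive snapshots is found by a brute force least squares search over hypothetical candidate shifts, and the time axis is split into two equal subintervals. The helper `estimate_shift` and all numerical values are invented for this example.

```python
import numpy as np


def estimate_shift(u_prev, u_next, x, candidates):
    """Return the candidate shift s that minimizes ||u_prev(x - s) - u_next(x)||^2."""
    errors = [np.sum((np.interp(x - s, x, u_prev) - u_next) ** 2) for s in candidates]
    return candidates[int(np.argmin(errors))]


# Illustrative grid and a two stage traveling front (assumed values).
x = np.linspace(-20.0, 20.0, 801)
t = np.linspace(0.0, 4.0, 41)
dt = t[1] - t[0]

# Front position: speed 1 up to t = 2, speed 3 afterwards.
position = np.where(t <= 2.0, 1.0 * t, 2.0 + 3.0 * (t - 2.0))
snapshots = np.array([-np.tanh(x - p) for p in position])

# Local single stage estimate: one shift per pair of consecutive snapshots.
candidates = np.arange(0.0, 0.5, 0.01)
shifts = np.array([
    estimate_shift(snapshots[k], snapshots[k + 1], x, candidates)
    for k in range(len(t) - 1)
])
local_speeds = shifts / dt

# Piecewise constant approximation: average the local speeds on two subintervals.
stage_speeds = local_speeds.reshape(2, -1).mean(axis=1)
print("piecewise constant speeds:", stage_speeds)
```

In this noise free setting the averaging step is redundant, since every local estimate within a stage agrees. With noisy snapshots, averaging over a subinterval is one simple way to stabilize the piecewise constant parameter.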