ADAPTIVE INFORMATION EXTRACTION FROM COMPLEX SYSTEMS VIA SYMBOLIC TIME SERIES ANALYSIS

Open Access
Author:
Li, Yue
Graduate Program:
Mechanical Engineering
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
June 05, 2016
Committee Members:
  • Asok Ray, Dissertation Advisor
  • Asok Ray, Committee Chair
  • Christopher Rahn, Committee Member
  • Hosam Kadry Fathy, Committee Member
  • Shashi Phoha, Outside Member
  • Minghui Zhu, Outside Member
  • Thomas Wettergren, Committee Member
Keywords:
  • Symbolic time series analysis
  • Hidden Markov modeling
  • Pattern recognition
  • Finite-state Automaton
  • Information Fusion
  • Sensor Networks
  • Battery SOC estimation
  • Battery SOH estimation
  • Recursive Bayes Filter
  • Image processing
Abstract:
This dissertation presents a framework for adaptive information extraction from complex systems via symbolic time series analysis (STSA). The key idea of STSA is to convert the original time series of digital signals into a sequence of (spatially discrete) symbols from which embedded dynamic information can be extracted and analyzed. The main challenges are: 1) selection of the symbol alphabet size; 2) identification of partitioning locations in the signal space of the time series; and 3) dynamic modeling of symbol sequences to extract embedded temporal patterns. In this context, probabilistic deterministic finite-state automata (PDFA), a special class of hidden Markov models (HMMs), are used to learn the temporal patterns embedded in symbol sequences. A novel unsupervised symbolization algorithm is developed to construct PDFA models from time series by maximizing the mutual information between the symbol set and the state set. It is demonstrated that the proposed approach captures the underlying dynamic information of time series more effectively than existing unsupervised symbolization methods such as equal-frequency partitioning and equal-width partitioning. To evaluate the information dependence and causality between two time series, a special class of PDFA, called ×D-Markov (pronounced "cross D-Markov") machines, is adopted in this dissertation. To quantify the information flow from one time series to the other, an information-theoretic measure derived from the concept of transfer entropy is introduced. The proposed STSA approaches are adapted and applied to three different types of complex systems for different purposes. The first is state estimation and parameter identification for single-input single-output (SISO) systems via symbolic dynamic modeling of synchronized input-output time series.
By considering input and output jointly instead of the system output alone, the proposed data-driven approach can provide robust results under fluctuating or varying input patterns. To overcome the deficiencies of solely model-based filtering and those of solely data-driven approaches, the estimation framework combines a model-based recursive Bayesian estimator with a data-driven measurement model. The second type of system under investigation is a large-scale sensor network. The problem is to identify and locate useful information in the presence of dynamic ambient noise from the environment, which affects each individual sensor differently. In general, changes in a dynamic environment may significantly degrade pattern recognition performance, because training scenarios are limited and assumptions about signal behavior are made for a static environment. Both symbol-level and feature-level fusion schemes are proposed to evaluate the information content of sensor nodes. The third research topic is dimensionality reduction of high-dimensional data (e.g., video) for information extraction. The main challenge is to develop feature extraction and information compression algorithms with low computational complexity, so that they can be applied to real-time analysis of video (i.e., a sequence of image frames) captured by a high-speed camera. In the proposed method, the sequence of images is converted to a sequence of symbols in which the embedded dynamic characteristics of the physical process are preserved.
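
As a rough illustration of the symbolization step described in the abstract, the sketch below implements only the two baseline partitioning schemes that the abstract names (equal-width and equal-frequency), followed by a depth-1 symbol transition estimate of the kind a D-Markov model would use. It is not the mutual-information-maximizing algorithm developed in the dissertation. All function names, the choice of depth D = 1, and the toy signal are assumptions made for this example.

```python
from bisect import bisect_right
from collections import Counter


def equal_width_edges(signal, alphabet_size):
    """Interior boundaries that split [min, max] into equally wide cells."""
    lo, hi = min(signal), max(signal)
    step = (hi - lo) / alphabet_size
    return [lo + k * step for k in range(1, alphabet_size)]


def equal_frequency_edges(signal, alphabet_size):
    """Interior boundaries placed at empirical quantiles, so that each
    cell holds roughly the same number of samples."""
    ordered = sorted(signal)
    n = len(ordered)
    return [ordered[(k * n) // alphabet_size] for k in range(1, alphabet_size)]


def symbolize(signal, edges):
    """Map each sample to the index of the partition cell it falls in."""
    return [bisect_right(edges, x) for x in signal]


def transition_matrix(symbols, alphabet_size):
    """Row-normalized symbol-to-symbol transition frequencies (assumed
    depth D = 1, so each state is simply the most recent symbol)."""
    counts = Counter(zip(symbols, symbols[1:]))
    matrix = []
    for i in range(alphabet_size):
        row = [counts[(i, j)] for j in range(alphabet_size)]
        total = sum(row)
        # Rows for symbols that never occur as a predecessor stay all-zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Toy signal (hypothetical data): seven small values and one outlier.
signal = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 10.0]
ew_symbols = symbolize(signal, equal_width_edges(signal, 2))
ef_symbols = symbolize(signal, equal_frequency_edges(signal, 2))
ef_transitions = transition_matrix(ef_symbols, 2)
```

On this toy signal the equal-width partition places seven of the eight samples in one cell because of the single outlier, while the equal-frequency partition splits the samples evenly. This sensitivity of the symbol sequence to the partition is the kind of issue that motivates choosing partition locations by an information-based criterion.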