Modeling Spatiotemporality for Multivariate Time Series in Urban Applications

Open Access
- Author:
- Tang, Xianfeng
- Graduate Program:
- Information Sciences and Technology
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- June 22, 2020
- Committee Members:
- Prasenjit Mitra, Thesis Advisor/Co-Advisor
Suhang Wang, Thesis Advisor/Co-Advisor
Xiang Zhang, Committee Member
Wang-Chien Lee, Committee Member
Mary Beth Rosson, Program Head/Chair
- Keywords:
- Machine learning
Data mining
Time series
Spatial-temporal
Smart city
- Abstract:
- With the rapid development of smart cities, a large amount of urban data is collected and stored. Most collected data can be formulated as multivariate time series (MTS) with geo-tagged information, such as public transportation, air quality, and weather data. Spatiotemporality, defined from both spatial and temporal aspects, is the most important dynamic of urban MTS data. Understanding and analyzing the spatiotemporality of MTS data would benefit a wide range of real-world applications, thus contributing to the urbanization process. The majority of prior work focuses on the characteristics of MTS data from specific tasks and successfully models spatiotemporal dynamics for those tasks. Despite the initial success of existing methods, real-world urban MTS data has several unique characteristics that have not been fully addressed, which brings both challenges and opportunities for research on modeling MTS. First, from the data characteristics perspective, the distribution of real-world MTS data is affected by various hidden factors such as daily human activity. Understanding and modeling those hidden factors as prior knowledge can potentially enhance existing machine learning algorithms for specific applications. Second, from the data quality perspective, collected MTS data is in most cases imperfect for many reasons (e.g., broken sensors). Common quality issues include missing values, noisy sequential data, and anomalous samples, all of which challenge existing MTS models. Third, from the task correlation perspective, many tasks in urban scenarios are highly correlated. Exploiting and reusing knowledge shared across tasks can benefit multiple tasks, but how to do so efficiently for MTS data remains a challenging problem. Therefore, in this thesis, we investigate the novel problem of tackling the three unique characteristics of spatiotemporal urban MTS data.
In particular, we first investigate how to leverage domain knowledge for spatiotemporal MTS modeling to address the first characteristic. We then study the low data quality issue jointly with various tasks (e.g., prediction, classification) and improve the robustness of our framework. The proposed solutions contribute to a wide range of real-world applications, such as public transportation and air quality monitoring.