Pattern Matching via Sequence Alignment: Analyzing Spatio-temporal Patterns and their Distances

Open Access
Author:
Stehle, Sam Kenneth
Graduate Program:
Geography
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
July 10, 2013
Committee Members:
  • Donna Jean Peuquet, Thesis Advisor
Keywords:
  • sequence alignment
  • pattern matching
  • space-time
  • time geography
  • events
  • geographic information
Abstract:
In many fields of study, researchers seek patterns to provide insight into processes and phenomena. In particular, time provides a measure to indicate consistency and change. Finding significant temporal patterns thus remains a popular and necessary topic of research in many fields, geography included. This research adds to this scholarship by extending sequence analysis techniques to evaluate the significance of a single established pattern by confirming its existence under new conditions. This effort investigates two questions: How well does the pattern fit in these new conditions? In what ways does the pattern change from the expectation it presents? This Master's thesis addresses these questions by using a variation on the sequence alignment algorithm developed in computational biology. Sequence alignment uses matrix representations and traversal to compare two sequences by modifying them until they approximate one another, and the result can be represented visually. Further, sequence alignment generates a measure of the total modification performed, which can be used to compare sequences of events quantitatively. This thesis extends sequence alignment to account for properties of spatio-temporal sequences that fall outside the original intent of the algorithm. Sequence alignment, developed for analysis of the physical structure of DNA, places a base pair at every possible location within a sequence. In this research, sequence alignment is amended to instead analyze the temporal intervals between successive events. Additionally, unlike the physical structure of DNA, in which only one base pair can occupy a given location in the sequence, multiple events can occupy the same temporal unit. This Master's thesis introduces the Temporal Deviation Distance as a measure of the amount of temporal deviation between an expected pattern of events and empirical data.
The output from this operation is color-coded to facilitate visual exploration of those deviations with respect to the times and locations in both sequences at which they take place. This thesis demonstrates the temporally aware sequence alignment algorithm on a statistically derived pattern of diplomatic events taking place in Yemen between February 2011 and March 2012. This pattern is compared to a series of events in time periods previous to those from which it was derived. I find that patterns change as a function of the temporal distance between those time periods, and thus conform to normal distance decay models. A similar analysis is undertaken that holds time constant rather than space. The pattern derived from events in Yemen is compared to events in the same time period in eight different countries, showing that spatial distance decay does not apply and that political and economic differences further affect the flow of patterns.
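
Illustrative sketch (not part of the thesis): the abstract describes sequence alignment as a matrix traversal that yields a total modification measure, applied to the intervals between successive events. The Python below shows one generic way such a computation could look, using a standard Needleman-Wunsch-style dynamic program that minimizes cost. It is not the thesis's Temporal Deviation Distance, whose definition is not given in the abstract. The function names `intervals` and `align_cost`, the absolute-difference substitution cost, and the `gap_penalty` value are all assumptions made for illustration.

```python
def intervals(times):
    """Convert sorted event times into gaps between successive events.

    Events sharing a time unit simply produce an interval of 0.
    """
    return [later - earlier for earlier, later in zip(times, times[1:])]


def align_cost(expected, observed, gap_penalty=1.0):
    """Minimal total cost of globally aligning two interval sequences.

    Hypothetical cost model (an assumption, not the thesis's):
    - pairing two intervals costs their absolute difference;
    - leaving an interval unpaired costs `gap_penalty`.
    """
    n, m = len(expected), len(observed)
    # cost[i][j] = minimal cost of aligning expected[:i] with observed[:j]
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap_penalty
    for j in range(1, m + 1):
        cost[0][j] = j * gap_penalty

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = cost[i - 1][j - 1] + abs(expected[i - 1] - observed[j - 1])
            skip_expected = cost[i - 1][j] + gap_penalty
            skip_observed = cost[i][j - 1] + gap_penalty
            cost[i][j] = min(pair, skip_expected, skip_observed)
    return cost[n][m]


if __name__ == "__main__":
    # Made-up event times in arbitrary units, for demonstration only.
    expected_times = [0, 2, 5, 9]
    observed_times = [0, 3, 6, 10]
    print(align_cost(intervals(expected_times), intervals(observed_times)))
```

Under these assumptions, a cost of zero means the observed intervals reproduce the expected pattern exactly, and larger values indicate more modification was needed to reconcile the two sequences.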