THE EFFECT OF SPATIAL SEGMENTATION ON SAFETY PERFORMANCE FUNCTION MODELING
Open Access
- Author:
- Wang, Xingsheng
- Graduate Program:
- Computer Science
- Degree:
- Master of Science
- Document Type:
- Master Thesis
- Date of Defense:
- October 13, 2017
- Committee Members:
- Jeremy Blum, Thesis Advisor/Co-Advisor
Thang N. Bui, Committee Member
Linda Null, Committee Member
Sukmoon Chang, Committee Member
Omar El Ariss, Committee Member
Hyuntae Na, Committee Member
- Keywords:
- safety performance functions
roadway segmentation
machine learning
Negative Binomial models
coordinate-descent approach
Weighted Absolute Percentage Error
clustering
generalizability of models
- Abstract:
- Building predictive models called safety performance functions (SPFs) is important for the study of roadway safety. The first step in SPF modeling is roadway segmentation, which partitions roadways into segments. The models are then trained on observations derived from this segmentation, each with its own geometric parameters and number of crashes. The observations should cover as many cases as possible in order to build better and more transferable models. Roadway segmentation is therefore not only an essential step but also a challenging one. Previous studies have found that segmentation approaches affect the models’ transferability, for example, their ability to predict future crashes or crashes on other roadways. Some researchers have found that a small shift in segmentation yields very different models. To find better approaches to segmentation, this thesis proposes a novel segmentation methodology driven by a machine learning clustering approach. While this approach is limited in its ability to improve model transferability, it does help to characterize the extent to which segmentation approaches affect conclusions drawn from the models. In the clustering step of this approach, roadway segmentation is based on a weighted distance between adjacent segments. The segmented roadway data are used to build models that allow for the estimation of the gradient of the error metric as a function of the segmentation weights. The weights are updated based on this gradient, and the process repeats, with the performance of the models guiding the updating of the weights and the resulting segmentation.
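The clustering and weight-update loop described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the merge threshold, the step size, and the finite-difference gradient estimate are all assumptions made for illustration. The abstract says only that the gradient is estimated from the fitted models, and the negative binomial fitting step is left out here, represented by a caller-supplied `error_fn`.

```python
from typing import Callable, List, Sequence, Tuple

# One short roadway unit: (geometric features, crash count). Hypothetical layout.
Unit = Tuple[Sequence[float], int]


def weighted_distance(a: Sequence[float], b: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Weighted absolute difference between the features of two adjacent units."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


def segment(units: List[Unit], weights: Sequence[float],
            threshold: float) -> List[List[Unit]]:
    """Group consecutive units; start a new segment when the weighted
    distance to the previous unit exceeds the (assumed) threshold."""
    if not units:
        return []
    segments = [[units[0]]]
    for prev, cur in zip(units, units[1:]):
        if weighted_distance(prev[0], cur[0], weights) <= threshold:
            segments[-1].append(cur)
        else:
            segments.append([cur])
    return segments


def wape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Weighted Absolute Percentage Error: sum|a - p| / sum|a|."""
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / total


def coordinate_step(error_fn: Callable[[Sequence[float]], float],
                    weights: Sequence[float], i: int,
                    step: float = 0.1, eps: float = 0.05) -> List[float]:
    """Update one weight using a finite-difference gradient estimate.

    error_fn stands in for "segment, fit a model, score it" (assumption);
    weights are kept non-negative.
    """
    up, down = list(weights), list(weights)
    up[i] += eps
    down[i] = max(0.0, down[i] - eps)
    grad = (error_fn(up) - error_fn(down)) / (up[i] - down[i])
    new = list(weights)
    new[i] = max(0.0, new[i] - step * grad)
    return new
```

In a full loop, `error_fn` would call `segment` with the candidate weights, fit a model on the resulting segments, and return its `wape` on held-out data. `coordinate_step` would then be applied to each weight in turn until the error stops improving. Because segmentation changes only when a distance crosses the threshold, a plain finite difference can be zero over wide ranges of weights, which may be one reason to estimate the gradient from the models themselves.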