ESTIMATION OF CRASH TYPE FREQUENCY ACCOUNTING FOR MISCLASSIFICATION IN CRASH DATA

Open Access
- Author:
- Mahmud, Asif
- Graduate Program:
- Civil Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- February 13, 2023
- Committee Members:
- Patrick Fox, Program Head/Chair
Eric Donnell, Major Field Member
S. Ilgin Guler, Major Field Member
Mosuk Chow, Outside Unit, Field & Minor Member
Rajesh Paleti, Special Member
Nikhil Menon, Special Member
Vikash Gayah, Chair & Dissertation Advisor
- Keywords:
- Crash frequency
Collision type
Misclassification
Underreporting
- Abstract:
- Individual crash types have different underlying causes, so the relationships between roadway/traffic characteristics and crash frequency are likely to differ across crash types. Two statistical approaches, univariate and multivariate formulations, have been widely used to estimate the impact of contributing factors on different crash types. To address the limitations of these methods, a two-stage approach was recently proposed in which one model predicts the total crash frequency and its prediction is combined with that of a second model that predicts the proportions of different crash types. More efficient one-stage joint models, in which the frequency and proportion models are estimated simultaneously and predictions are obtained more directly, have also been proposed for macro-level analysis. This study investigates the performance of this joint modeling paradigm in analyzing crash type frequencies on individual road segments. It also proposes the use of a multinomial logit (MNL) model to estimate the proportions of different collision types, which has not previously been done in the safety literature. The study compares the performance of all these methods in predicting crash frequency by crash type on two-way two-lane urban-suburban collector roadway segments in Pennsylvania. While the methodologies for crash type frequency estimation are well established, less attention has been given to the quality of the crash datasets to which they are applied. Crash misclassification (MC), e.g., a crash of one type or severity being mistakenly recorded as another, is a relatively common problem in transportation safety. Crash frequency models for individual crash categories estimated using datasets with MC errors could produce biased parameter estimates and thus lead to ineffective countermeasure planning.
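As a rough illustration of the frequency-and-proportion idea described above, the sketch below multiplies the expected total crash frequency from a log-linear count model by crash type shares from a multinomial logit (MNL) model. It is not taken from the dissertation: the function names, the covariate vector `x`, and all coefficient values are hypothetical, and the actual model specifications may differ.

```python
import math


def expected_total(beta, x):
    """Expected total crash frequency from a log-linear count model: exp(beta . x)."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))


def mnl_proportions(gammas, x):
    """MNL shares, one coefficient vector per crash type (the shares sum to 1)."""
    utilities = [sum(g * v for g, v in zip(gamma, x)) for gamma in gammas]
    shift = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - shift) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def expected_by_type(beta, gammas, x):
    """Expected frequency by crash type = expected total frequency x type share."""
    mu = expected_total(beta, x)
    return [mu * p for p in mnl_proportions(gammas, x)]


# Hypothetical segment covariates (intercept, scaled traffic volume) and coefficients.
x = [1.0, 0.5]
beta = [0.2, 0.8]
gammas = [[0.0, 0.0], [0.5, -0.3], [-0.4, 0.6]]  # first crash type is the reference
by_type = expected_by_type(beta, gammas, x)
```

By construction, the type-specific expected frequencies add up to the expected total, so the total frequency model and the proportion model stay consistent with each other.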
This study proposes a novel methodological formulation to directly account for this MC error and incorporates it into the two most common count data models used for crash frequency prediction: Poisson and Negative Binomial (NB) regression. The proposed framework introduces probabilistic MC rates among different crash types and modifies the likelihood function of the count models accordingly. The study also demonstrates how this approach can be integrated into reformulated models that express each count model as a discrete choice model. The capability of the proposed models to recover the true parameters in the presence of MC error is examined via simulation analysis. Then, the proposed models are applied to empirical data to examine the presence of MC in crash data and to further examine the robustness of the proposed models. Lastly, the ability of the proposed models to account for underreporting, another acute problem in crash data, is examined by comparing their performance with that of established frameworks.
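One way to picture how MC rates can enter a count-model likelihood is sketched below for the Poisson case. It assumes that true crash counts by type are independent Poisson variables and that each crash is recorded as type j with probability `mc[k][j]` given true type k. Under those assumptions the recorded counts are again independent Poisson, with means given by the MC-weighted sums of the true means. This is a simplified illustration, not the dissertation's formulation; the names `observed_means`, `mc_poisson_loglik`, and all numbers are hypothetical.

```python
import math


def observed_means(true_means, mc):
    """Means of recorded counts; mc[k][j] = P(recorded as j | true type k), rows sum to 1."""
    n = len(true_means)
    return [sum(true_means[k] * mc[k][j] for k in range(n)) for j in range(n)]


def poisson_loglik(counts, means):
    """Log-likelihood of independent Poisson counts with the given means."""
    return sum(y * math.log(m) - m - math.lgamma(y + 1) for y, m in zip(counts, means))


def mc_poisson_loglik(counts, true_means, mc):
    """Poisson log-likelihood of the recorded counts after adjusting the means for MC."""
    return poisson_loglik(counts, observed_means(true_means, mc))


# Hypothetical example with two crash types on one segment.
true_means = [3.0, 1.0]
mc = [[0.9, 0.1],   # 10% of type-0 crashes are recorded as type 1
      [0.2, 0.8]]   # 20% of type-1 crashes are recorded as type 0
recorded = [4, 0]
obs_means = observed_means(true_means, mc)
loglik = mc_poisson_loglik(recorded, true_means, mc)
```

With an identity MC matrix the adjusted likelihood reduces to the standard Poisson likelihood, which gives a simple sanity check for any implementation along these lines.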