Analytics and Modeling for Optimizing Screening and Early Diagnosis in Children with Developmental Disorders
Open Access
- Author:
- Chen, Yu Hsin
- Graduate Program:
- Industrial Engineering
- Degree:
- Doctor of Philosophy
- Document Type:
- Dissertation
- Date of Defense:
- June 11, 2024
- Committee Members:
- Ling Rothrock, Professor in Charge/Director of Graduate Studies
Paul Griffin, Major Field Member
Guodong Liu, Outside Field Member
Hui Zhao, Outside Unit Member
Qiushi Chen, Chair & Dissertation Advisor
Prakash Chakraborty, Major Field Member
- Keywords:
- Autism Spectrum Disorder
Screening
Health policy
Operations Research
Simulation
Optimization
Machine learning
Prediction
Clinical decision making
Health service capacity
- Abstract:
- Developmental disorders impose significant public health and economic burdens on society. These conditions present substantial social, communication, and behavioral challenges throughout an individual's lifetime. Early diagnosis is crucial for enabling timely and effective interventions, which improve long-term outcomes for this population. However, significant delays in the current diagnostic process cause children to miss the optimal window for early intervention, underscoring the urgent need for more effective approaches to early detection. This dissertation develops a data-driven analytical modeling framework, through a series of interrelated studies, to improve early detection by enhancing the accuracy of screening tools and the efficiency of diagnostic processes while accounting for healthcare capacity constraints. In particular, we focus on applications of the proposed framework to Autism Spectrum Disorder (ASD), one of the most common developmental disorders, for which awareness is increasing and meeting the population's service needs remains a challenge. In our first study, we developed machine learning prediction models to assess the risk of ASD in very young children based on clinical information. In current practice, ASD screening relies solely on a behavioral questionnaire, the Modified Checklist for Autism in Toddlers (M-CHAT), which does not incorporate children’s clinical information, although abundant evidence in the clinical literature shows associations between certain clinical symptoms and ASD.
To assess the predictive value of this clinical information, we used commercial insurance claims data consisting of demographics, medical diagnoses, and procedures to train and test LASSO logistic regression and random forest models that predict the risk of ASD in children at the ages of 18 and 24 months. These models achieved accuracy comparable to that of M-CHAT, and even outperformed M-CHAT when inpatient and outpatient visit data were differentiated. We identified key predictive features, including sex, diagnosis of other developmental disorders, respiratory infections, and gastrointestinal infections. To facilitate the adoption of risk prediction models that incorporate clinical information into practice, in our second study we aimed to further improve the prediction accuracy for ASD diagnosis by building on the existing screening instrument, M-CHAT. Using real-world electronic health record (EHR) data from the Children’s Hospital of Philadelphia (CHOP), we augmented the M-CHAT scores with additional demographic and clinical factors and achieved an area under the receiver operating characteristic curve (AUROC) of 0.83 at 18 months and 0.88 at 24 months using a gradient boosting model. To address the limited interpretability of conventional machine learning models, we further developed a risk-calibrated supersparse linear integer model that yields a score-based prediction model. The resulting score-based checklist served as an extension to the existing items in the M-CHAT, incorporating additional features such as sex, family history, and diagnoses related to developmental disorders. This approach yielded incremental improvements in AUROC from 0.74 to 0.81 at 18 months and from 0.76 to 0.83 at 24 months, and allowed for simple interpretation and calculation of the risk score.
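As a rough illustration of what a score-based checklist and its AUROC evaluation involve, the following minimal Python sketch adds integer points for the features named above and maps the total to a risk through a logistic link. The point values, the intercept, and the feature keys are hypothetical placeholders chosen for illustration; they are not the fitted model from the dissertation.

```python
import math

# Hypothetical integer points per checklist item (illustrative only,
# NOT the dissertation's fitted score-based model).
POINTS = {
    "mchat_positive": 3,
    "male": 1,
    "family_history": 2,
    "dev_disorder_dx": 2,
}
INTERCEPT = -4  # hypothetical offset on the log-odds scale


def risk_score(child):
    """Sum the integer points of every checklist item that applies."""
    return sum(pts for name, pts in POINTS.items() if child.get(name))


def predicted_risk(score):
    """Map an integer score to a probability with a logistic link."""
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + score)))


def auroc(scores, labels):
    """AUROC as the probability that a random positive case outranks
    a random negative case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because the score is a sum of small integers, it can be tallied by hand, which is the practical appeal of this model class over a gradient boosting model.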
In the third study, we examined how improvements in the accuracy of screening instruments translate into early diagnosis outcomes under constrained diagnostic service capacity. We developed an individual-level discrete event simulation model parameterized with EHR data from CHOP. We simulated the screening-to-diagnosis process, including the waitlist for diagnostic services, and projected the mean age at diagnosis and the percentage of early diagnoses (below the age of 4 years) under various combinations of screening sensitivity and specificity. Our analysis revealed that early diagnosis outcomes changed drastically with specificity but less markedly with sensitivity. A distinct threshold emerged at a specificity of around 0.95, indicating that specificity must exceed this threshold to ensure favorable early diagnosis outcomes given the current limitations in diagnostic service capacity. In our last study, we took a holistic approach to systematically optimize risk-based referral policies while accounting for risk transitions and waitlist dynamics. In a generic problem setting with repeated screenings, we formulated a non-convex quadratic mixed-integer program to minimize the average age at diagnosis for ASD. We developed an efficient solution approach based on the branch-and-bound algorithm and McCormick relaxation to solve the large-scale problem, which otherwise could not be solved by off-the-shelf optimization solvers within a reasonable computational time. We further evaluated the optimal policies under different policy structures and compared them with the status quo using a detailed discrete event simulation model.
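For readers unfamiliar with the McCormick relaxation mentioned above, its standard textbook form is sketched here. The symbols $x$, $y$, $w$ and the bounds are generic placeholders and are not taken from the dissertation's formulation. A bilinear term $w = xy$ with $x \in [x^L, x^U]$ and $y \in [y^L, y^U]$ is replaced by four linear inequalities:

```latex
\begin{align}
w &\ge x^L y + x\, y^L - x^L y^L, \\
w &\ge x^U y + x\, y^U - x^U y^U, \\
w &\le x^U y + x\, y^L - x^U y^L, \\
w &\le x^L y + x\, y^U - x^L y^U.
\end{align}
```

Each inequality follows from expanding a product of two nonnegative bound gaps, for example $(x - x^L)(y - y^L) \ge 0$ for the first. The relaxation tightens as the bounds shrink, which is why it pairs naturally with branch-and-bound.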
We found that, compared to the status quo, age-dependent optimal policies decreased the average age at diagnosis for referred children by up to 10 months while referring an additional 2% of children with ASD, which resulted in a 4% increase in the early diagnosis rate and doubled the positive predictive value. Our findings highlight the effectiveness of age-dependent referral policies, especially under constrained diagnostic capacity, and provide critical insights for improving ASD referral processes.
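The positive predictive value reported above depends on sensitivity, specificity, and prevalence through Bayes' rule. The short Python sketch below shows that relationship. The prevalence and accuracy values are hypothetical inputs chosen for illustration, not estimates from the dissertation.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of screen-positive children who truly have the condition."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs for illustration only.
PREVALENCE = 0.02
SENSITIVITY = 0.90

for spec in (0.90, 0.95, 0.99):
    ppv = positive_predictive_value(SENSITIVITY, spec, PREVALENCE)
    # 1 / PPV is the number of referrals generated per true case found.
    print(f"specificity={spec:.2f}  PPV={ppv:.3f}  "
          f"referrals per true case={1 / ppv:.1f}")
```

At low prevalence, false positives make up most referrals unless specificity is high, so a small gain in specificity can noticeably reduce the load on a capacity-constrained diagnostic waitlist.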