A COMPARISON OF REGRESSION MODELS FOR INCIDENT RATE PREDICTION IN A CANADIAN POWER COMPANY

Open Access
Author:
Park, Sunghae
Graduate Program:
Energy and Mineral Engineering
Degree:
Master of Science
Document Type:
Master Thesis
Date of Defense:
April 15, 2010
Committee Members:
  • William Arthur Groves, Thesis Advisor
Keywords:
  • OLS regression
  • Safety Intervention
  • PLS regression
Abstract:
This thesis presents a comparison of regression models used to characterize relationships between incident rate (IR) and safety interventions applied in a Canadian power company. Quantitative analyses were used to evaluate IR prediction models extracted using Partial Least Squares (PLS), Ordinary Least Squares (OLS), and Stepwise regression methods. Data consisting of weekly time sheets recording incidents and safety training (intervention) activities were collected from two Service Groups in the Hydro One Company. The percentages of total man-hours devoted to four categories of intervention activities (Factor A - Safety Awareness and Motivation Activities, Factor B - Skill Development and Training Activities, Factor C - New Tools and Equipment Design Methods and Activities, and Factor D - Equipment Related Activities) were treated as input variables for the regression models. The specific aims of this study were to: 1) Determine whether PLS regression models perform better than OLS regression models for IR prediction, 2) Identify and characterize the intervention application rates associated with the lowest incident rates, and 3) Evaluate whether models developed within the same company, but using data from different operating units, could successfully predict IR for new data sets. Results show that while PLS- and OLS-based regression models employing all main factors and interaction terms as input variables (15 parameters) performed well for calibration data sets (R² = 0.86-0.94, Mean Absolute Percent Error (MAPE) = 4-15%), performance was poor for validation data sets (R² = 0.70-0.86, MAPE = 200-700%). Differences in the characteristics and distribution of the input variables between calibration and validation data sets, together with possible seasonal trends, are likely to contribute to these large errors.
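As a rough illustration of the quantities named above, the sketch below enumerates the input terms for four factors and computes MAPE. It assumes that the 15 parameters are the 4 main effects plus the 6 two-way, 4 three-way, and 1 four-way interactions, and that MAPE follows the usual definition; neither is spelled out in the abstract. The function names are hypothetical and no thesis data are used.

```python
from itertools import combinations
from math import prod

# The four intervention factors named in the abstract.
FACTORS = ("A", "B", "C", "D")


def interaction_terms(factors=FACTORS):
    """Return every main effect and interaction as a tuple of factor names.

    Assumption: "all main factors and interaction terms" means every
    non-empty subset of the factors (4 + 6 + 4 + 1 = 15 terms).
    """
    return [
        combo
        for order in range(1, len(factors) + 1)
        for combo in combinations(factors, order)
    ]


def expand_row(row):
    """Map one week's factor percentages (a dict) to all model inputs."""
    return {"*".join(term): prod(row[f] for f in term) for term in interaction_terms()}


def mape(actual, predicted):
    """Mean Absolute Percent Error in percent; actual values must be non-zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)
```

Because each error is divided by the observed value, weeks with incident rates close to zero can inflate MAPE sharply, so a validation MAPE of several hundred percent can occur even when R² remains moderately high.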
The extracted PLS regression model was used to examine appropriate levels of intervention activities by employing the Excel® Solver function to identify intervention activity distributions that resulted in the lowest incident rates. Solver-identified intervention levels represented approximately 16-17% of total man-hours, which is slightly higher than previously published results of 11-15% of total man-hours for “optimal” resource allocation. It is not clear that slight modifications in safety intervention allocations would be feasible or desirable, given variability in the reporting and tracking of individual supervisors’ time sheets and uncertainty as to whether a cause-effect relationship exists between safety intervention activities and reported incidents. IR prediction models were also applied to different work groups to evaluate model generalizability. Results showed that neither Service Group’s prediction model could be successfully applied to the other group to estimate IR. Differences in data collection periods, and changes potentially affecting the quantity and quality of the safety interventions (such as equipment development, skills enhancement, employees, and safety culture), seem likely to contribute to the inability of IR prediction models to generalize. Any future attempt to model IR as a function of safety interventions should include an analysis using both quantitative and qualitative data characterizing intervention activities over longer time frames.
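The search for intervention levels with the lowest predicted IR, which the thesis carried out with Excel Solver, could be approximated outside Excel along the following lines. This is a minimal sketch using an exhaustive grid search rather than Solver's own algorithm. The function names, the grid bounds, and the placeholder predictor are all hypothetical; the predictor is a toy function, not the fitted PLS model.

```python
from itertools import product


def lowest_ir_allocation(predict_ir, step=1.0, max_each=10.0):
    """Grid-search four intervention percentages for the lowest predicted IR.

    predict_ir: any callable taking the four factor percentages (a, b, c, d).
    step, max_each: hypothetical grid spacing and upper bound per factor.
    """
    levels = [i * step for i in range(int(max_each / step) + 1)]
    best = min(product(levels, repeat=4), key=lambda alloc: predict_ir(*alloc))
    return best, predict_ir(*best)


def toy_predict_ir(a, b, c, d):
    """Placeholder predictor (NOT the thesis model): minimum at (2, 1, 3, 2)."""
    return (a - 2) ** 2 + (b - 1) ** 2 + (c - 3) ** 2 + (d - 2) ** 2 + 1.0


best_alloc, best_ir = lowest_ir_allocation(toy_predict_ir)
total_share = sum(best_alloc)  # combined percentage of man-hours at the optimum
```

Any fitted model can be passed in place of `toy_predict_ir`. Summing the returned allocation gives the total share of man-hours, the quantity the abstract reports as approximately 16-17%.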