XGBoost-SHAP and Unobserved Heterogeneity Modelling of Temporal Multivehicle Truck-Involved Crash Severity Patterns

Wimon Laphrom, Chamroeun Se, Thanapong Champahom, Sajjakaj Jomnonkwao, Warit Wipulanusatd, Thaned Satiennam, Vatanavongs Ratanavaraha

Abstract


This paper addresses multivehicle truck-involved crashes in developing regions, focusing on Thailand, by analyzing the factors that influence injury severity and comparing the predictive performance of two modelling approaches. Using a random parameters (heterogeneous) logit model and the XGBoost machine learning algorithm, we analyzed injury severity factors in multivehicle truck-involved crashes separately for weekdays and weekends. Our findings reveal that the XGBoost model significantly outperforms the heterogeneous logit model in predicting crash severity outcomes, with higher accuracy, sensitivity, specificity, precision, F1 score, and area under the curve (AUC) in both the training and testing phases. Key risk factors identified include motorcycle involvement, head-on collisions, and crashes occurring during late night/early morning hours; roadway and temporal factors such as the number of road lanes and weekend hours also play a significant role. The study introduces XGBoost as a novel and improved method for truck safety analysis, capable of capturing the complex interactions within multivehicle crash data and offering actionable insights for targeted interventions to reduce crash severity. By highlighting specific risk factors and the effectiveness of XGBoost, this research contributes to the development of data-driven strategies for enhancing truck safety in developing countries.
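The performance measures named in the abstract can be illustrated with a minimal sketch for a binary severity outcome. This is not the authors' code: the function names, the 0.5 threshold, and the toy labels and scores below are hypothetical, and the study's actual outcome coding may differ.

```python
# Illustrative only: toy data and names are assumptions, not taken from the study.

def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0

def classification_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary labels (1 = severe, 0 = non-severe)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = safe_div(tp, tp + fn)   # true positive rate (recall)
    specificity = safe_div(tn, tn + fp)   # true negative rate
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }

def auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))

# Hypothetical observed outcomes and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 cut-off

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, y_score)
print(metrics)
```

Comparing two models on these measures amounts to computing the same dictionary for each model's predictions on the same training and testing samples.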

 

DOI: 10.28991/CEJ-2024-010-06-011



Keywords


Truck-Involved Crashes; Injury Severities; Random Parameters; Machine Learning; eXtreme Gradient Boosting; SHAP.






Copyright (c) 2024 Wimon Laphrom, Chamroeun Se, Thanapong Champahom, Sajjakaj Jomnonkwao, Warit Wipulanusatd, Thaned Satiennam, Vatanavongs Ratanavaraha

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.