Optimized Feature Selection for Predicting the Number of Casualties in Traffic Crashes

Muamer Abuzwidah, Ahmed Elawady, Jaeyoung Jay Lee, Ghazi G. Al-Khateeb, Salah Haridy, Waleed Zeiada

Abstract


Traffic crash prediction remains a critical challenge in transportation safety management, with increasing emphasis on leveraging machine learning techniques for accurate casualty prediction. This study develops an optimized feature selection framework for traffic crash casualty prediction by comparing six selection techniques: Design of Experiments (DOE), Forward and Backward Sequential Feature Selection, Information Gain, Lasso Regularization, and Random Forest (RF) Feature Importance, and then integrating their results using the Borda count method. Using 517,000 UK traffic crash records (2019-2023), the study evaluated 25 machine learning models (linear models, decision trees, ensemble methods, and neural networks) across 12 critical attributes. eXtreme Gradient Boosting (XGBoost) performed best, with a Root Mean Square Error (RMSE) of 0.671 and a Mean Absolute Error (MAE) of 0.372 under the proposed Borda count integration method, while also reducing computation time (11.3 minutes compared to the baseline's 17 minutes). Five factors consistently emerged as the most influential predictors across all selection methods: number of vehicles involved, speed limit, police officer attendance, day of the week, and urban/rural classification. Environmental factors showed lower importance than traditionally assumed. The novel integration of multiple feature selection techniques through Borda count provides a more robust feature subset than any individual method, offering a favorable balance between computational efficiency and prediction accuracy. The framework enables transportation safety authorities to implement more efficient crash prediction systems and provides actionable insights into key risk factors for targeted interventions, particularly in support of Highway Safety Manual development.
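As an illustration of the Borda count integration step, the following minimal Python sketch combines several best-first feature rankings into one consensus ranking. It is not the authors' implementation: the function name `borda_aggregate`, the shortened feature names, and the three example rankings are hypothetical, and the sketch assumes that every selection method ranks the same set of features.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Combine best-first feature rankings using a Borda count.

    Assumption: every ranking lists the same features exactly once.
    With n features, the top feature in a ranking earns n - 1 points
    and the last one earns 0. Points are summed across rankings.
    """
    n = len(rankings[0])
    scores = defaultdict(int)
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            scores[feature] += n - 1 - position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical rankings from three selection methods (illustrative only,
# not results reported in the study).
example_rankings = [
    ["vehicles", "speed_limit", "police_attended", "day_of_week", "urban_rural"],
    ["speed_limit", "vehicles", "day_of_week", "police_attended", "urban_rural"],
    ["vehicles", "police_attended", "speed_limit", "urban_rural", "day_of_week"],
]

for feature, score in borda_aggregate(example_rankings):
    print(f"{feature}: {score}")
```

In this toy example, `vehicles` collects 4 + 3 + 4 = 11 points and ranks first. A feature subset could then be formed by keeping the top-scoring features, although the cut-off used in the study is not stated in the abstract.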

 

DOI: 10.28991/CEJ-2025-011-04-01



Keywords


Traffic Crash Analysis; Feature Selection; Machine Learning; Traffic Safety; Predictive Analytics.







Copyright (c) 2025 Muamer Abuzwidah, Ahmed Elawady, Jaeyoung Jay Lee, Ghazi G Al-Khateeb, Salah Haridy, Waleed Zeiada

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.