Machine Learning Predictive Performance in Road Accident Severity: A Case Study from Thailand

Journal article

Authors/Editors

Ittirit Mohamad

Strategic research themes

Publication details

Author list: Ittirit Mohamad, Sajjakaj JomnonKwao, Vatanavongs Ratanavaraha

Publisher: Elsevier

Publication year (CE): 2025

ISSN: 2590-1230

eISSN: 2590-1230

URL: https://www.sciencedirect.com/science/article/pii/S2590123025009089

Language: English-United States (EN-US)

Abstract

Traffic accidents remain a major cause of fatalities and economic losses worldwide, necessitating the development of accurate predictive models for enhancing road safety and minimizing risks. In Thailand, where road traffic injuries persist as a public health challenge, data-driven approaches can significantly contribute to accident prevention strategies. This study evaluates the predictive performance of multiple supervised machine learning algorithms in classifying accident severity, addressing the gap in prior research that lacks a comparative analysis of multiple models trained on large-scale crash data. Eight algorithms were assessed, including Decision Tree (DT), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (kNN), Neural Network (NN), Naïve Bayes (NB), Logistic Regression (LR), and Gradient Boosting (GB). A dataset comprising 112,837 road accidents over a five-year period in Thailand was analyzed, focusing exclusively on incidents where drivers were at fault. The dataset underwent extensive preprocessing, including missing value imputation, data balancing checks, and feature selection to ensure robustness. Among the models tested, Random Forest demonstrated superior performance in the binary classification task, achieving an average class AUC of 0.768, classification accuracy of 0.777, precision of 0.752, and recall of 0.777. Key predictive features include road type (highway), speeding, time of day (daylight), absence of lighting at night, and driver gender. While the model effectively classifies non-fatal accidents, its recall for fatalities remains limited (0.198), highlighting challenges in predicting fatal crashes due to the complex interplay of contributing factors. These findings reinforce the applicability of machine learning in traffic safety research and provide valuable insights for policymakers seeking data-driven interventions. Future work should explore advanced feature engineering and ensemble techniques to enhance fatality prediction accuracy.
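
The abstract reports an average recall (0.777) equal to the classification accuracy (0.777), alongside a much lower recall for the fatal class (0.198). One plausible reading, assuming the average is weighted by class size, is that the two figures coincide by construction: a class-size-weighted average of per-class recall always equals overall accuracy, so it can stay high while the minority class is mostly missed. The sketch below illustrates this with a hypothetical confusion matrix. The counts, the `BinaryConfusion` type, and the `summarize` helper are illustrative assumptions, not the study's data or code.

```python
from dataclasses import dataclass


@dataclass
class BinaryConfusion:
    """Counts for a fatal vs. non-fatal classifier (hypothetical values)."""
    tp: int  # fatal crashes predicted as fatal
    fn: int  # fatal crashes predicted as non-fatal
    fp: int  # non-fatal crashes predicted as fatal
    tn: int  # non-fatal crashes predicted as non-fatal


def summarize(cm: BinaryConfusion) -> dict:
    """Return per-class recall, accuracy, and class-size-weighted recall."""
    n_fatal = cm.tp + cm.fn
    n_nonfatal = cm.fp + cm.tn
    total = n_fatal + n_nonfatal

    recall_fatal = cm.tp / n_fatal
    recall_nonfatal = cm.tn / n_nonfatal
    accuracy = (cm.tp + cm.tn) / total

    # Weight each class's recall by its share of the data.
    weighted_recall = (
        n_fatal * recall_fatal + n_nonfatal * recall_nonfatal
    ) / total

    return {
        "recall_fatal": recall_fatal,
        "recall_nonfatal": recall_nonfatal,
        "accuracy": accuracy,
        "weighted_recall": weighted_recall,
    }


# Hypothetical imbalanced example: 200 fatal and 800 non-fatal crashes.
example = BinaryConfusion(tp=40, fn=160, fp=40, tn=760)
stats = summarize(example)

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example the weighted recall matches the accuracy even though only one in five fatal crashes is detected, which is why the per-class recall for fatalities is the more informative number when the minority class is the one of interest.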

Keywords

Artificial Intelligence, Big Data forecasting, Logistic regression model, Machine Learning, Probabilistic risk assessment modeling, Road accident