Enhancing diabetes follow-up period prediction through classification algorithms with feature selection techniques
Journal article
Publication details
Author list: Paisit Khanarsa, Sutikiat Suwanmanee, Supichaya Chumpong, Kittisak Chumpong
Publisher: Jiangsu Provincial Center for Disease Control and Prevention (JSCDC)
Publication year (CE): 2025
Journal abbreviation: JPHE
Volume number: 9
Issue number: 25
ISSN: 2520-0054
Abstract
Background
Despite the critical role of follow-up care in managing type 2 diabetes, limited research has focused on predicting follow-up periods using machine learning. Addressing this gap can improve patient management and optimize clinical resource allocation. Our objective is to develop and validate a machine learning-based model for predicting follow-up periods in patients with type 2 diabetes, using feature selection techniques to enhance predictive performance.
Methods
From 16,094 patient records retrieved from Pak Phanang Hospital, Thailand, 2,042 eligible follow-up records were retained after excluding patients younger than 35 years, records with missing or invalid values, and single-visit records. All included patients were diagnosed with type 2 diabetes in 2022. Follow-up periods were grouped into four categories: 1–4, 5–8, 9–12, and more than 12 weeks. Data preprocessing involved handling missing values, encoding categorical variables, scaling numerical features, and addressing class imbalance using the Synthetic Minority Oversampling Technique (SMOTE). Three feature selection methods were applied: filter, wrapper, and embedded. Six classifiers (Support Vector Machine, Random Forest, K-Nearest Neighbors, Extra Trees Classifier, Adaptive Boosting, and Artificial Neural Network) were evaluated using 5-fold cross-validation, with each fold consisting of 80% training and 20% testing data. Model performance was assessed using accuracy, precision, recall, and weighted F1-score.
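The four-way grouping of follow-up periods can be sketched as a small binning helper. This is only an illustration, not the authors' code: the function name `followup_category`, the label strings, and the assumption that intervals arrive as whole weeks (with anything below 1 week rejected) are hypothetical choices that the abstract does not specify.

```python
def followup_category(weeks: int) -> str:
    """Map a follow-up interval in whole weeks to one of the four classes.

    Assumption: the interval is already expressed in whole weeks (>= 1).
    """
    if weeks < 1:
        raise ValueError("follow-up interval must be at least 1 week")
    if weeks <= 4:
        return "1-4 weeks"
    if weeks <= 8:
        return "5-8 weeks"
    if weeks <= 12:
        return "9-12 weeks"
    return ">12 weeks"


# Made-up intervals, purely to show the mapping at the class boundaries.
example_weeks = [1, 4, 5, 8, 9, 12, 13, 30]
example_labels = [followup_category(w) for w in example_weeks]
```

The resulting labels would then serve as the target variable for the classifiers.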
Results
We analyzed 2,042 follow-up records from patients with type 2 diabetes diagnosed in 2022. The Extra Trees Classifier with Elastic Net feature selection achieved the highest performance, with a weighted F1-score of 90.69% [95% confidence interval (CI): 89.21–92.18%], precision of 89.54% (95% CI: 87.76–91.31%), and both weighted recall and accuracy of 91.97% (95% CI: 90.56–93.38%). Key predictors consistently identified included age, blood pressure, pulse, height, waist, fasting blood sugar, and creatinine. Elastic Net demonstrated strong feature selection performance, particularly with tree-based models.
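The identical figures for weighted recall and accuracy are expected rather than coincidental: when each class's recall is weighted by its share of the true labels, the sum reduces to the overall fraction of correct predictions. The sketch below shows this with support-weighted metrics in plain Python. The function name `weighted_scores` and the toy labels are assumptions for illustration, not the study's data or code.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall, weighted F1).

    Each per-class metric is weighted by the class's share of true labels.
    """
    n = len(y_true)
    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(hits.values()) / n
    w_prec = w_rec = w_f1 = 0.0
    for cls, sup in support.items():
        prec = hits[cls] / predicted[cls] if predicted[cls] else 0.0
        rec = hits[cls] / sup
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        weight = sup / n
        w_prec += weight * prec
        w_rec += weight * rec
        w_f1 += weight * f1
    return accuracy, w_prec, w_rec, w_f1


# Toy labels (made up) using three of the follow-up classes.
y_true = ["1-4", "1-4", "1-4", "5-8", "5-8", "9-12"]
y_pred = ["1-4", "1-4", "5-8", "5-8", "9-12", "9-12"]
acc, prec, rec, f1 = weighted_scores(y_true, y_pred)
```

With these toy labels, `acc` and `rec` agree up to floating-point rounding, while weighted precision and weighted F1 are free to differ from them, as they do in the reported results.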
Conclusions
Feature selection notably enhanced the predictive performance of machine learning models in classifying follow-up periods. The proposed model could assist clinicians in scheduling timely and personalized follow-up visits based on individual patient profiles, particularly key demographic and clinical features, thereby improving continuity of care, optimizing resource use, and supporting decision-making in real-world diabetes management.






