Utilizing Demographic Data and Insurance Claims History to Develop Machine Learning for Assessing Cardiovascular Disease Risk

Conference proceedings article

Authors/Editors

PAISIT KHAN-AR-SA

Strategic Research Themes

Digital Transformation

Publication Details

Author list: Napat Uraisomsurat, Paisit Khanarsa, Tanet Sriamorn

Publication year: 2025

URL: https://link.springer.com/chapter/10.1007/978-981-96-6400-9_1

Abstract

Cardiovascular diseases are challenging to treat and can lead to severe complications or death. Insurance companies have difficulty predicting future cardiovascular disease risk, which complicates health insurance renewals and preventive care for at-risk individuals. Early identification of those at risk is therefore essential for effective insurance management. This paper applies tree-based machine learning models to predict future cardiovascular disease risk using demographic, insurance claims, air quality, and illness summary features. Four tree-based models were implemented: Decision Tree, Random Forest, Extreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM). The performance of models using only age and gender was compared with that of models incorporating all available features. XGBoost demonstrated the best performance, achieving sensitivity ranging from 0.7 to 0.758 and specificity ranging from 0.496 to 0.772.
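
The abstract reports results as sensitivity (the share of truly at-risk individuals that a model flags) and specificity (the share of not-at-risk individuals that it correctly clears). The sketch below shows how these two metrics are commonly computed from binary predictions. It is an illustration only, not the authors' evaluation code: the function name `sensitivity_specificity` and the toy labels are assumptions made for this example and do not come from the paper.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, where 1 = at risk."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return sensitivity, specificity


# Toy labels invented purely for illustration (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Reporting both numbers matters in a screening setting such as this one, because a model can raise sensitivity simply by flagging more people, at the cost of lower specificity.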
