Student Dropout Prediction: A KMUTT Case Study
Conference proceedings article
Publication Details
Author list: Tenpipat, Warit; Akkarajitsakul, Khajonpong
Publisher: Hindawi
Publication year: 2020
ISBN: 9781728181066
ISSN: 0146-9428
eISSN: 1745-4557
Languages: English-Great Britain (EN-GB)
Abstract
Higher education is a key factor in Thai national development. Promoting educational quality and improving the quality of learners are challenges for our government. One of the most common educational problems is university dropout, which has a negative impact not only at the economic level but also on students personally. In this study, we therefore examine the factors affecting undergraduates' educational status and build binary classification models that predict whether a student's status will be Dropout or Other (i.e., any status other than dropout) while studying at King Mongkut's University of Technology Thonburi (KMUTT), Bangkok, Thailand. Applying data mining and machine learning techniques, we first collect data from KMUTT internal data sources, i.e., the Registrar's Office and the KMUTT Library. We then assess the completeness and quality of the data and investigate which features improve the accuracy and applicability of our models. The data used in our analysis cover 13,714 undergraduate students over 7 academic years, from the academic year 2012-2013 to 2019-2020. We develop 3 classifiers based on a decision tree, random forest, and gradient boosting. The results show that the prediction accuracies of the gradient boosting, decision tree, and random forest models are 93%, 92%, and 92%, respectively. Moreover, we found that the 5 most important features are the student's academic year, high-school GPA, university admission channel, faculty, and gender. In summary, the gradient boosting model outperforms the others in accuracy and recall, while the random forest model achieves the highest precision for the dropout status. © 2020 IEEE.
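The closing comparison in the abstract rests on three metrics: accuracy, recall, and precision for the Dropout status. The following is a minimal sketch of how they can be computed for a binary Dropout/Other classifier, and of how one model can lead on accuracy and recall while another leads on precision. The function name binary_metrics and the toy labels are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
def binary_metrics(y_true, y_pred, positive="Dropout"):
    """Return (accuracy, precision, recall), treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


# Hypothetical toy labels for 8 students (not data from the study).
y_true = ["Dropout", "Dropout", "Dropout", "Other", "Other", "Other", "Other", "Other"]

# Model A flags students liberally: it finds every dropout but raises one false alarm.
model_a = ["Dropout", "Dropout", "Dropout", "Dropout", "Other", "Other", "Other", "Other"]

# Model B flags students conservatively: no false alarms, but it misses two dropouts.
model_b = ["Dropout", "Other", "Other", "Other", "Other", "Other", "Other", "Other"]

metrics_a = binary_metrics(y_true, model_a)
metrics_b = binary_metrics(y_true, model_b)

print("model A (accuracy, precision, recall):", metrics_a)
print("model B (accuracy, precision, recall):", metrics_b)
```

In this toy case, model A has the higher accuracy and recall while model B has the higher Dropout precision. It is the same kind of trade-off the abstract reports between the gradient boosting and random forest models.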
Keywords
Dropout, Gradient Boosting