Customer churn prediction to enhance customer retention strategies in the banking industry: A study using seven machine learning algorithms

Authors
Kabbar, Eltahir
Herath, R.
Date
2025-06-11
Type
Journal Article
Keyword
banks and banking
customer retention
churn (business)
predictive analytics
artificial intelligence
Citation
Kabbar, E., & Herath, N. (2025). Customer churn prediction to enhance customer retention strategies in the banking industry: A study using seven machine learning algorithms. Journal of Software and Systems Development (JSSD), 2025, 1-10. https://doi.org/10.5171/2025.786386
Abstract
This study explores machine learning approaches to predict customer churn in the banking sector. Following Dietterich’s Machine Learning Problem Research Life Cycle, six key steps were implemented: data gathering, preparation, exploratory data analysis (EDA), model creation, training, evaluation, and hyperparameter tuning. A systematic literature review identified essential machine learning techniques and innovations applied to customer churn prediction since 2014, highlighting Random Forest, Gradient Boosting, and hybrid models as practical approaches. This review also emphasized the value of advanced preprocessing and explainable AI techniques in improving model accuracy and usability. A publicly available dataset of 10,000 entries was used to evaluate seven machine learning algorithms: K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression, Decision Tree, Random Forest, AdaBoost, and Gradient Boosting. The models were compared on accuracy, precision, recall, F1 score, and ROC-AUC. Among these, the Gradient Boosting Classifier emerged as the most effective model with an accuracy of 85.2% and an ROC-AUC of 0.87, demonstrating robust performance in predicting customer churn. The findings underscore the potential of Gradient Boosting for developing reliable churn prediction systems, aiding banks in devising targeted customer retention strategies. The results of this study can benefit both academic researchers and industry practitioners. Academics can use the findings to explore advanced machine learning applications and develop new churn prediction frameworks. Practitioners, particularly in banking and related industries, can leverage the Gradient Boosting model to improve customer retention strategies and reduce revenue losses associated with churn. Future research is recommended to validate these results using dynamic datasets.
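The five evaluation metrics named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for illustration and are unrelated to the study's 10,000-entry dataset.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = churned)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # stayers correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # stayers wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # churners missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive outscores a random negative.

    Tied scores count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a toy example but slow for large datasets.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up): true churn labels and a model's predicted churn probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, y_score)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas ROC-AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason both kinds of metric are commonly reported together.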
Publisher
IBIMA Publishing
DOI
https://doi.org/10.5171/2025.786386
Copyright holder
Authors
Copyright notice
CC BY Attribution 4.0 International