Customer Churn Prediction in the Banking Sector Using Machine Learning-Based Classification Models
Previous research has generally concentrated on identifying the variables that most significantly influence customer churn or has used customer segmentation to identify a subset of potential consumers, excluding its effects on forecast accuracy. Consequently, there are two primary research goals in this work. The initial goal was to examine the impact of customer segmentation on the accuracy of customer churn prediction in the banking sector using machine learning models. The second objective is to experiment, contrast, and assess which machine learning approaches are most effective in predicting customer churn.
This paper reviews the theoretical basis of customer churn, and customer segmentation, and suggests using supervised machine-learning techniques for customer attrition prediction.
In this study, we use different machine learning models such as k-means clustering to segment customers, k-nearest neighbors, logistic regression, decision tree, random forest, and support vector machine to apply to the dataset to predict customer churn.
The results demonstrate that the dataset performs well with the random forest model, with an accuracy of about 97%, and that, following customer segmentation, the mean accuracy of each model performed well, with logistic regression having the lowest accuracy (87.27%) and random forest having the best (97.25%).
Customer segmentation does not have much impact on the precision of predictions. It is dependent on the dataset and the models we choose.
The practitioners can apply the proposed solutions to build a predictive system or apply them in other fields such as education, tourism, marketing, and human resources.
The research paradigm is also applicable in other areas such as artificial intelligence, machine learning, and churn prediction.
Customer churn will cause the value flowing from customers to enterprises to decrease. If customer churn continues to occur, the enterprise will gradually lose its competitive advantage.
Build a real-time or near real-time application to provide close information to make good decisions. Furthermore, handle the imbalanced data using new techniques.