Classification of Debt Vulnerability in Sub-Saharan African Countries Using Various Machine Learning Tree Based Algorithms
Synopsis
In this work, we compared the classification accuracy of three tree-based models: Decision Tree, Random Forest, and Gradient Boosted Tree. We used a small dataset from the Kaggle repository containing 435 samples. We examined the feature importances assigned by each model as well as their training and testing accuracies. The Gradient Boosted Tree produced the highest testing accuracy at 84%, followed by Random Forest at 83% and Decision Tree at 82%. In addition to accuracy, we computed a confusion matrix for each model on the testing data. The Gradient Boosted Tree had the best true negative rate at 77.7%, while the Decision Tree had the worst at 55.5%.
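For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and the true negative rate can be derived from a binary confusion matrix. It is not the code used in this work: the function names and the label vectors are hypothetical and chosen only for illustration, and the coding of the negative class as 0 is an assumption.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive:
            counts["tp" if predicted == positive else "fn"] += 1
        else:
            counts["fp" if predicted == positive else "tn"] += 1
    return counts


def accuracy(c):
    """Share of all samples that were classified correctly."""
    total = c["tp"] + c["tn"] + c["fp"] + c["fn"]
    return (c["tp"] + c["tn"]) / total if total else float("nan")


def true_negative_rate(c):
    """Share of actual negatives that were predicted as negative."""
    negatives = c["tn"] + c["fp"]
    return c["tn"] / negatives if negatives else float("nan")


# Hypothetical labels for illustration only (not data from this study).
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 1, 0, 1, 0]

c = confusion_counts(y_true, y_pred)
print(f"accuracy = {accuracy(c):.3f}")
print(f"true negative rate = {true_negative_rate(c):.3f}")
```

Because the true negative rate is computed only over the actual negatives in the test set, it can differ widely between models even when their overall accuracies are within a few percentage points of each other.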
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.