Classification of Debt Vulnerability in Sub-Saharan African Countries Using Various Machine Learning Tree Based Algorithms

Authors

Danielle Shackley
Wentworth Institute of Technology, 550 Huntington Ave, Boston, Massachusetts 02115
Brendan Dao
Wentworth Institute of Technology, 550 Huntington Ave, Boston, Massachusetts 02115
Salem Othman
Wentworth Institute of Technology, 550 Huntington Ave, Boston, Massachusetts 02115

Synopsis

In this work, we compared the classification accuracy of three tree-based models: Decision Tree, Random Forest, and Gradient Boosted Tree. We used a small dataset of 435 samples taken from the Kaggle repository. For each model, we examined the feature importances it assigned as well as its training and testing accuracies. The Gradient Boosted Tree achieved the highest testing accuracy at 84%, followed by Random Forest at 83% and Decision Tree at 82%. In addition to accuracy, we report a confusion matrix for each model's predictions on the testing data. The Gradient Boosted Tree had the best true negative rate at 77.7%, while the Decision Tree had the worst at 55.5%.
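
The two metrics quoted in the synopsis, accuracy and true negative rate, can both be read off a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code: the `ConfusionMatrix` class, the `from_labels` helper, and the label lists are hypothetical, and it assumes the negative class is encoded as 0 (the synopsis does not say which class is treated as negative).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion-matrix counts (illustrative container, not from the paper)."""
    tn: int  # true negatives: actual 0, predicted 0
    fp: int  # false positives: actual 0, predicted 1
    fn: int  # false negatives: actual 1, predicted 0
    tp: int  # true positives: actual 1, predicted 1

    def accuracy(self) -> float:
        # Share of all samples that were classified correctly.
        total = self.tn + self.fp + self.fn + self.tp
        return (self.tn + self.tp) / total

    def true_negative_rate(self) -> float:
        # Share of actual negatives that were predicted as negative.
        return self.tn / (self.tn + self.fp)


def from_labels(y_true, y_pred) -> ConfusionMatrix:
    """Count the four cells from paired 0/1 labels (0 assumed to be the negative class)."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        elif actual == 1 and predicted == 0:
            fn += 1
        else:
            tp += 1
    return ConfusionMatrix(tn, fp, fn, tp)


# Hypothetical test labels and predictions, made up for illustration only.
y_true = [0, 0, 0, 1, 1, 1, 1, 1, 0, 1]
y_pred = [0, 1, 0, 1, 1, 0, 1, 1, 0, 1]

cm = from_labels(y_true, y_pred)
print(f"accuracy = {cm.accuracy():.3f}")
print(f"true negative rate = {cm.true_negative_rate():.3f}")
```

Because the true negative rate divides only by the number of actual negatives, two models with nearly identical accuracy can still differ widely on it, which is consistent with the pattern the synopsis reports.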

SIAIA22
Published
February 17, 2024
Online ISSN
2582-3922