Predicting Missing Data Using Multiple Imputation by Chained Process in Obesity Dataset

Authors

Gopika Venu
Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP
A. Sai Gnanika
Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP
B. Rajani
Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP
A. Yasmeen
Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP
K. Kiranmai
Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP

Synopsis

This paper demonstrates the effectiveness of multiple imputation (MI) using a chained process (MIC) for imputing missing values in an obesity dataset. MI generally outperforms single imputation, yet it remains underused because of its computational complexity and practitioners' lack of familiarity with it. MIC is implemented and its performance is compared with basic statistical imputation techniques. The results show that MIC yields lower error (MSE/RMSE) on numeric variables and higher accuracy on categorical variables than the statistical methods. MIC handles both numeric and categorical missing data well, provided the column variables are correlated. By providing a template for applying MIC, this project aims to encourage the use of MI and promote awareness of its benefits over single imputation for missing-data problems in medical research.
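The comparison described in the synopsis can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the data are synthetic (two correlated numeric columns standing in for height and weight, not the obesity dataset), only numeric variables are handled, and the function names (mean_impute, regress_update, chained_impute) are hypothetical. Each chained step regresses one column on the other and redraws the missing entries with residual noise. The several imputed datasets are then pooled by simple averaging, which is a simplification of how MI results are usually combined.

```python
import math
import random

def mean_impute(col):
    """Replace each missing entry (None) with the mean of the observed ones."""
    observed = [v for v in col if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in col]

def regress_update(target, predictor, missing, rng):
    """Refit target ~ predictor on rows where target was observed, then
    redraw the missing target entries as prediction plus residual noise."""
    xs = [p for p, m in zip(predictor, missing) if not m]
    ys = [t for t, m in zip(target, missing) if not m]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    sse = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    resid_sd = math.sqrt(sse / (n - 2))
    return [intercept + slope * p + rng.gauss(0, resid_sd) if m else t
            for t, p, m in zip(target, predictor, missing)]

def chained_impute(a, b, n_iter=10, n_imputations=5, seed=0):
    """Simplified chained imputation for two numeric columns (assumed setup)."""
    rng = random.Random(seed)
    miss_a = [v is None for v in a]
    miss_b = [v is None for v in b]
    total_a, total_b = [0.0] * len(a), [0.0] * len(b)
    for _ in range(n_imputations):
        cur_a, cur_b = mean_impute(a), mean_impute(b)  # starting fill
        for _ in range(n_iter):
            # Cycle through the columns, each conditioned on the other.
            cur_a = regress_update(cur_a, cur_b, miss_a, rng)
            cur_b = regress_update(cur_b, cur_a, miss_b, rng)
        total_a = [s + v for s, v in zip(total_a, cur_a)]
        total_b = [s + v for s, v in zip(total_b, cur_b)]
    # Pool the completed datasets by averaging (a simplification).
    return ([s / n_imputations for s in total_a],
            [s / n_imputations for s in total_b])

def rmse(truth, imputed, missing):
    """Root mean squared error over the entries that were masked."""
    errs = [(t - i) ** 2 for t, i, m in zip(truth, imputed, missing) if m]
    return math.sqrt(sum(errs) / len(errs))

# Synthetic, correlated columns; roughly 20% of each is masked at random.
data_rng = random.Random(42)
height = [data_rng.gauss(170, 10) for _ in range(300)]
weight = [70 + 1.2 * (h - 170) + data_rng.gauss(0, 5) for h in height]
height_obs = [None if data_rng.random() < 0.2 else h for h in height]
weight_obs = [None if data_rng.random() < 0.2 else w for w in weight]
miss_w = [v is None for v in weight_obs]

weight_mean = mean_impute(weight_obs)
_, weight_chain = chained_impute(height_obs, weight_obs)

rmse_mean = rmse(weight, weight_mean, miss_w)
rmse_chain = rmse(weight, weight_chain, miss_w)
print(f"RMSE, mean imputation:    {rmse_mean:.2f}")
print(f"RMSE, chained imputation: {rmse_chain:.2f}")
```

Because the two synthetic columns are strongly correlated, the chained estimate can exploit information that mean imputation ignores. If the correlation is weakened (for example by raising the noise term in weight), the gap between the two RMSE values should shrink, which matches the synopsis's caveat that MIC works well when the column variables are correlated.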

ICAMC2024
Published
March 17, 2025
Online ISSN
2582-3922