Predicting Missing Data Using Multiple Imputation by Chained Process in Obesity Dataset

Gopika Venu; A. Sai Gnanika; B. Rajani; A. Yasmeen; K. Kiranmai

Authors

Gopika Venu

Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP

A. Sai Gnanika

Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP

B. Rajani

Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP

A. Yasmeen

Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP

K. Kiranmai

Department of Computer Science & Technology, Madanapalle Institute of Technology & Science, Angallu, AP

DOI: https://doi.org/10.21467/proceedings.178.3

Synopsis

This paper demonstrates the effectiveness of multiple imputation (MI) using a chained process (MIC) for imputing missing values in an obesity dataset. Although MI has advantages over single imputation, it is often passed over because of its computational complexity and researchers' lack of familiarity with it. MIC is implemented and its performance is compared with that of basic statistical imputation techniques. The results show that MIC yields lower error (MSE/RMSE) on numeric variables and higher accuracy on categorical variables than the statistical methods. MIC handles both numeric and categorical missing data well, provided the columns are correlated. By providing a template for applying MIC, this project aims to encourage the use of MI and to promote awareness of its benefits over single imputation for missing-data problems in medical research.
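
The comparison described in the synopsis can be illustrated with a minimal sketch. This is not the authors' implementation: it uses synthetic data rather than the obesity dataset, covers numeric columns only, and every name in it (`chained_impute`, `multiple_impute`, the three generated columns) is hypothetical. Each incomplete column is regressed in turn on the other columns, missing cells are filled with the prediction plus a random residual draw, and the cycle is repeated. Running several such chains and averaging the imputed cells is a simplification used here only to obtain a single RMSE figure; a full MI analysis would pool model estimates across the imputed datasets instead.

```python
import math
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def fit_ols(xs, ys):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve(xtx, xty)

def predict(coef, x):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))

def others(row, j):
    """All values in the row except column j (the predictors for column j)."""
    return row[:j] + row[j + 1:]

def mean_impute(data):
    """Single-imputation baseline: replace each None with its column mean."""
    means = []
    for j in range(len(data[0])):
        obs = [row[j] for row in data if row[j] is not None]
        means.append(sum(obs) / len(obs))
    return [[means[j] if v is None else v for j, v in enumerate(row)] for row in data]

def chained_impute(data, rng, n_iter=10):
    """One chain: cycle over columns, regressing each on the others."""
    filled = mean_impute(data)  # starting values
    for _ in range(n_iter):
        for j in range(len(data[0])):
            obs = [i for i, row in enumerate(data) if row[j] is not None]
            mis = [i for i, row in enumerate(data) if row[j] is None]
            if not mis:
                continue
            coef = fit_ols([others(filled[i], j) for i in obs],
                           [data[i][j] for i in obs])
            resid = [data[i][j] - predict(coef, others(filled[i], j)) for i in obs]
            sd = math.sqrt(sum(e * e for e in resid) / len(resid))
            for i in mis:
                # Prediction plus a residual draw, so chains differ from each other.
                filled[i][j] = predict(coef, others(filled[i], j)) + rng.gauss(0.0, sd)
    return filled

def multiple_impute(data, m=5, seed=0):
    """Run m chains and average the imputed cells (simplified pooling)."""
    draws = [chained_impute(data, random.Random(seed + k)) for k in range(m)]
    pooled = [row[:] for row in data]
    for i, row in enumerate(data):
        for j, v in enumerate(row):
            if v is None:
                pooled[i][j] = sum(d[i][j] for d in draws) / m
    return pooled

def rmse(truth, imputed, mask):
    errs = [(truth[i][j] - imputed[i][j]) ** 2 for i, j in mask]
    return math.sqrt(sum(errs) / len(errs))

# Synthetic, correlated columns (an assumption; not the obesity data).
rng = random.Random(42)
truth = []
for _ in range(300):
    x0 = rng.gauss(0, 1)
    x1 = 0.9 * x0 + rng.gauss(0, 0.3)
    x2 = 0.5 * x0 + 0.5 * x1 + rng.gauss(0, 0.3)
    truth.append([x0, x1, x2])

# Remove about 20% of the cells completely at random.
data = [[None if rng.random() < 0.2 else v for v in row] for row in truth]
mask = [(i, j) for i, row in enumerate(data) for j, v in enumerate(row) if v is None]

mean_filled = mean_impute(data)
mic_filled = multiple_impute(data)
rmse_mean = rmse(truth, mean_filled, mask)
rmse_mic = rmse(truth, mic_filled, mask)
print("RMSE, mean imputation:   ", round(rmse_mean, 3))
print("RMSE, chained imputation:", round(rmse_mic, 3))
```

Because the synthetic columns are strongly correlated by construction, the chained approach has information to exploit that column-mean imputation ignores, which mirrors the synopsis's caveat that MIC works well when the variables are correlated. Categorical variables would need a classifier in place of the linear regression and are left out of this sketch.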