Application of Swarm-Based Feature Selection and Extreme Learning Machines in Lung Cancer Risk Prediction

Authors

Priya Garg
Department of Computer Science, Delhi Technological University, Bawana Road, Delhi, India
Deepti Aggarwal
Department of Computer Science, Delhi Technological University, Bawana Road, Delhi, India

Synopsis

Lung cancer risk prediction models help identify high-risk individuals for early CT screening. Such models can play a pivotal role in healthcare by reducing lung cancer mortality. Although many predictive models using various features have been developed, no specific guidelines exist on which features are crucial for lung cancer risk prediction. This study proposes novel risk prediction models that use bio-inspired swarm-based techniques for feature selection and extreme learning machines for classification. The proposed models are applied to a public dataset of 1000 patient records and 23 variables, including sociodemographic factors, smoking status, and lung cancer clinical symptoms. The models, validated using 10-fold cross-validation, achieve an AUC in the range of 0.985 to 0.989, an accuracy in the range of 0.986 to 0.99, and an F-measure in the range of 0.98 to 0.985. The study also identifies smoking habits, exposure to air pollution, occupational hazards, and some clinical symptoms as the most commonly selected features for lung cancer risk prediction. The study concludes that the developed models can be successfully applied to the early screening, diagnosis, and treatment of high-risk individuals.
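
As a rough illustration of the two building blocks named above, the following Python sketch shows a basic extreme learning machine (random, untrained hidden layer; output weights solved by pseudoinverse) and a fitness function that a swarm-based search could use to score a binary feature mask. It is not the authors' implementation: the class and function names, the sigmoid activation, the hidden-layer size, the hold-out split, and the synthetic data are all assumptions made for illustration, and the swarm optimizer itself is omitted.

```python
import numpy as np


class ELMClassifier:
    """Minimal extreme learning machine for binary labels (0/1). Illustrative only."""

    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random hidden layer (weights are never trained).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta >= 0.5).astype(int)


def mask_fitness(X_train, y_train, X_test, y_test, mask):
    """Hold-out accuracy of an ELM trained only on the columns where mask is True.

    A swarm-based feature selector would call a function like this for each
    candidate mask (particle) and prefer masks with higher fitness.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return 0.0  # an empty feature subset cannot classify anything
    model = ELMClassifier().fit(X_train[:, mask], y_train)
    return float(np.mean(model.predict(X_test[:, mask]) == y_test))


# Synthetic stand-in data (NOT the dataset from the study):
# only the first two of six features carry signal.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]

acc_informative = mask_fitness(X_train, y_train, X_test, y_test, [1, 1, 0, 0, 0, 0])
acc_noise = mask_fitness(X_train, y_train, X_test, y_test, [0, 0, 1, 1, 1, 1])
print(f"informative features: {acc_informative:.2f}, noise features: {acc_noise:.2f}")
```

In this toy setting the mask covering the two informative columns should score clearly higher than the mask covering only noise columns, which is the signal a swarm search would exploit. A fuller version would replace the single hold-out split with 10-fold cross-validation, as used in the study.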

ICTCon2021
Published
July 12, 2021
Online ISSN
2582-3922