Improving Statistical Learning Methods Via Features Selection without Replacement Sampling and Random Projection

Sulaiman Khan, Muhammad Ahmad, Fida Ullah, Carlos Fernando Aguilar Ibañez, José Eduardo Valdez Rodriguez

Abstract


Cancer is fundamentally a genetic disease: genetic and epigenetic alterations disrupt normal gene expression, leading to uncontrolled cell growth and metastasis. High-dimensional microarray datasets are difficult to classify because of the "small n, large p" problem, in which the number of features (genes) far exceeds the number of samples, so models tend to overfit. This study makes three key contributions: 1) We propose a machine-learning approach that integrates the Feature Selection Without Replacement (FSWOR) technique with a projection method to improve classification accuracy.
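
As a rough illustration of the pipeline described in the abstract, the sketch below draws a feature subset without replacement and then applies a Gaussian random projection. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the subset size (50), the projection dimension (10), and the synthetic data are all hypothetical, and reading the keyword "GRP" as Gaussian random projection is an assumption.

```python
import math
import random


def sample_features_without_replacement(n_features, subset_size, rng):
    """Return `subset_size` distinct column indices (hypothetical helper)."""
    return sorted(rng.sample(range(n_features), subset_size))


def gaussian_random_projection(rows, n_components, rng):
    """Project each row onto `n_components` random Gaussian directions."""
    n_features = len(rows[0])
    scale = 1.0 / math.sqrt(n_components)
    # Projection matrix with entries drawn from N(0, 1 / n_components).
    matrix = [
        [rng.gauss(0.0, 1.0) * scale for _ in range(n_components)]
        for _ in range(n_features)
    ]
    return [
        [sum(row[j] * matrix[j][k] for j in range(n_features))
         for k in range(n_components)]
        for row in rows
    ]


rng = random.Random(0)

# Synthetic "small n, large p" matrix: 6 samples, 200 features (not real data).
X = [[rng.gauss(0.0, 1.0) for _ in range(200)] for _ in range(6)]

# Step 1 (assumed): keep 50 of the 200 features, each chosen at most once.
cols = sample_features_without_replacement(200, 50, rng)
X_sub = [[row[j] for j in cols] for row in X]

# Step 2 (assumed): reduce the 50 selected features to 10 projected dimensions.
X_proj = gaussian_random_projection(X_sub, 10, rng)

print(len(X_proj), len(X_proj[0]))
```

The projected matrix `X_proj` could then be passed to any of the classifiers listed in the keywords; how the paper actually combines the two steps is not stated in the abstract.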

Keywords


Brain cancer, gene expression, machine learning, SVM, NB, LR, DT, KNN, dimension reduction, PCA, LDA, GRP, SRP
