Machine Learning, Missing Values, and Algorithm Selectors: The Untold Story

Anna Karen Gárate-Escamilla, José Carlos Ortiz-Bayliss, Hugo Terashima-Marín

Abstract


This paper presents a study of the potential benefits of incorporating missing values into the training process of algorithm selectors powered by machine learning algorithms, particularly those used for classification. This work analyzes various scenarios in which some of the data available for training is omitted and measures the performance of the resulting algorithm selectors to estimate how resistant they are to missing values in the training data. Our experiments open a new and exciting perspective on training algorithm selectors, one where it is possible to save computational resources by omitting some calculations, reducing the effort to produce such selectors without significantly harming their performance on unseen instances. For example, our results show that, given a proper training set and deciding which runs to omit completely at random, some machine learning strategies such as Neural Networks, Naïve Bayes Classifiers, and Support Vector Machines can correctly operate as algorithm selectors with up to 50% of the data missing (data about the solvers to choose from), without any further treatment of the missing values.

Keywords


Algorithm selection, bin packing problem, machine learning, missing values
