EMiner: A Tool for Selecting Classification Algorithms and Optimal Parameters

Rayrone Zirtany Nunes Marques, Luciano Reis Coutinho, Tiago Bonini Borchartt, Samyr Béliche Vale, Francisco José da Silva e Silva


In this paper, a genetic algorithm (GA) is used to search for the combinations of learning algorithms and associated parameters that maximize classification accuracy. An important feature of the approach is that the initial GA population is formed from parameter values gathered from ExpDB, a public database of data mining experiments. The approach was implemented in a tool called EMiner, built on top of a grid-based software infrastructure for developing collaborative applications in the medicine and healthcare domains (the ECADeG project). Experiments were performed on 16 datasets from the UCI repository. The results show that combining data from ExpDB with a GA is effective in finding classification models with good accuracy.
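The search described above can be illustrated with a minimal sketch. It is not EMiner's implementation. The search space (`SEARCH_SPACE`), the seed configurations standing in for values gathered from ExpDB (`SEED_CONFIGS`), the synthetic fitness function (`toy_accuracy`), and the GA operators are all assumptions made for illustration. A real run would replace `toy_accuracy` with the cross-validated accuracy of the candidate classifier on a dataset.

```python
import math
import random

# Hypothetical search space: algorithm name -> {parameter: (low, high)}.
SEARCH_SPACE = {
    "knn": {"k": (1, 30)},
    "svm": {"C": (0.01, 100.0)},
}

# Hypothetical stand-in for configurations gathered from an experiment database.
SEED_CONFIGS = [
    ("knn", {"k": 5}),
    ("svm", {"C": 1.0}),
    ("knn", {"k": 11}),
]


def toy_accuracy(candidate):
    """Synthetic fitness; a real run would measure cross-validated accuracy."""
    algo, params = candidate
    if algo == "knn":
        return 0.9 - 0.01 * abs(params["k"] - 7)
    return 0.85 - 0.05 * abs(math.log10(params["C"]) - 1)


def random_candidate(rng):
    """Draw an algorithm and a parameter value uniformly from the search space."""
    algo = rng.choice(sorted(SEARCH_SPACE))
    params = {}
    for name, (low, high) in SEARCH_SPACE[algo].items():
        if isinstance(low, int):
            params[name] = rng.randint(low, high)
        else:
            params[name] = rng.uniform(low, high)
    return (algo, params)


def mutate(candidate, rng, rate=0.3):
    """Perturb each parameter with probability `rate`, clamped to its range."""
    algo, params = candidate
    new_params = dict(params)
    for name, (low, high) in SEARCH_SPACE[algo].items():
        if rng.random() < rate:
            if isinstance(low, int):
                value = params[name] + rng.choice([-2, -1, 1, 2])
            else:
                value = params[name] * rng.uniform(0.5, 2.0)
            new_params[name] = min(high, max(low, value))
    return (algo, new_params)


def crossover(a, b, rng):
    """Mix parameters of same-algorithm parents; otherwise copy one parent."""
    if a[0] != b[0]:
        return rng.choice([a, b])
    child = {name: rng.choice([a[1][name], b[1][name]]) for name in a[1]}
    return (a[0], child)


def run_ga(fitness, seeds, pop_size=12, generations=20, seed=0):
    """Seed the population from known configurations, then evolve with elitism."""
    rng = random.Random(seed)
    population = list(seeds)[:pop_size]
    while len(population) < pop_size:
        population.append(random_candidate(rng))
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]  # the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = elite + children
    return max(population, key=fitness)


best = run_ga(toy_accuracy, SEED_CONFIGS)
print(best, toy_accuracy(best))
```

Because the best half of each generation is carried over unchanged, the returned candidate is never worse than the best seed configuration. In this sketch, that is the practical benefit of starting from previously recorded experiments rather than from a purely random population.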


Keywords: data mining; medicine and healthcare; algorithm selection; parameter optimization; genetic algorithms



